Liferay's out-of-the-box OpenOffice integration is robust and converts web content to a variety of document formats, including PDF. However, it requires a separate OpenOffice server to be running, which may not be desirable or feasible. JTidy and Flying Saucer (xhtmlrenderer) offer a self-contained alternative that runs inside the portlet. JTidy's Tidy object cleans up the web content's HTML so that it is well-formed and parses it into an XML (W3C DOM) document. Flying Saucer's ITextRenderer object then lays out that document and writes the PDF to the output stream.

Documentation on JTidy and Flying Saucer is available at the following links:

http://jtidy.sourceforge.net/

http://code.google.com/p/flying-saucer/

Example

This is a resource method in a Spring-based portlet that converts a web content article to a PDF:

    
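import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.OutputStream;

import javax.portlet.ResourceRequest;
import javax.portlet.ResourceResponse;

import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.portlet.bind.annotation.ResourceMapping;
import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;
import org.xhtmlrenderer.pdf.ITextRenderer;

// Liferay imports use the 6.x package names; adjust them for your portal version.
import com.liferay.portal.kernel.language.LanguageUtil;
import com.liferay.portal.kernel.util.WebKeys;
import com.liferay.portal.theme.ThemeDisplay;
import com.liferay.portlet.journal.model.JournalArticleDisplay;
import com.liferay.portlet.journalcontent.util.JournalContentUtil;

// Method of a Spring portlet controller; ARTICLE_ID is a String constant defined in that class.
@ResourceMapping("getPDF")
public void getPDF(@RequestParam(value = ARTICLE_ID, required = true) String articleId,
                   ResourceRequest request,
                   ResourceResponse response)
{
    try {
        ThemeDisplay themeDisplay = (ThemeDisplay) request.getAttribute(WebKeys.THEME_DISPLAY);
        long groupId = themeDisplay.getScopeGroupId();

        // Get the journal article rendered with its default template
        JournalArticleDisplay articleDisplay = JournalContentUtil.getDisplay(
                groupId, articleId, "", LanguageUtil.getLanguageId(request), themeDisplay);

        // Set up the response to deliver a PDF download
        response.reset();
        response.setContentType("application/pdf");
        response.setProperty("Content-disposition",
                "attachment; filename=\"" + articleDisplay.getTitle() + ".pdf\"");
        OutputStream outputStream = response.getPortletOutputStream();

        String articleHtml = "<html><body>" + articleDisplay.getContent() + "</body></html>";

        // Prepend the portal URL to relative document library URLs so the renderer can fetch them
        articleHtml = articleHtml.replaceAll(
                "src=\"/documents", "src=\"" + themeDisplay.getPortalURL() + "/documents");

        // Configure Tidy to read UTF-8, emit XHTML, and stay quiet about warnings
        Tidy tidy = new Tidy();
        tidy.setInputEncoding("UTF-8");
        tidy.setXHTML(true);
        tidy.setQuiet(true);
        tidy.setShowWarnings(false);

        // Create an InputStream for Tidy, using the same encoding it was told to expect
        InputStream is = new ByteArrayInputStream(articleHtml.getBytes("UTF-8"));

        // Create the XML Document from Tidy (null: do not write the tidied markup anywhere)
        Document doc = tidy.parseDOM(is, null);

        // Render the PDF straight to the portlet output stream
        ITextRenderer renderer = new ITextRenderer();
        renderer.setDocument(doc, null);
        renderer.layout();
        renderer.createPDF(outputStream);
    } catch (Exception e) {
        // Handle the error: at minimum, log it rather than swallowing it silently
    }
}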
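
The two string-handling steps in the method above (building the Content-disposition header and rewriting document library URLs) are easy to get wrong, so here is a minimal, dependency-free sketch of how they could be pulled into helpers. The class and method names (PdfExportStrings, absolutizeDocumentUrls, contentDisposition) and the fallback file name "article" are hypothetical and not part of Liferay, JTidy, or Flying Saucer. The sketch also assumes you want to handle single-quoted src attributes and to replace characters in the title that would break the quoted file name, which the original method does not do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names come from Liferay or the libraries above.
final class PdfExportStrings {

    // Matches src="/documents or src='/documents and captures the quote character.
    private static final Pattern DOCUMENT_SRC = Pattern.compile("src=([\"'])/documents");

    // Prefixes document library paths in src attributes with the portal URL.
    static String absolutizeDocumentUrls(String html, String portalUrl) {
        // quoteReplacement keeps any '$' or '\' in the URL from being read as a group reference.
        String replacement = "src=$1" + Matcher.quoteReplacement(portalUrl) + "/documents";
        return DOCUMENT_SRC.matcher(html).replaceAll(replacement);
    }

    // Builds a Content-disposition value whose quoted file name cannot be broken by the title.
    static String contentDisposition(String title) {
        // Replace quotes, slashes, line breaks, and other characters unsafe in file names.
        String safeTitle = title.replaceAll("[\"\\\\/:*?<>|\\r\\n]", "_").trim();
        if (safeTitle.isEmpty()) {
            safeTitle = "article"; // assumed fallback name
        }
        return "attachment; filename=\"" + safeTitle + ".pdf\"";
    }

    public static void main(String[] args) {
        String html = "<img src=\"/documents/10179/logo.png\"/>";
        System.out.println(absolutizeDocumentUrls(html, "http://localhost:8080"));
        System.out.println(contentDisposition("Q3 \"Final\" Report"));
    }
}
```

In the resource method, these would stand in for the inline replaceAll call and the header string concatenation, with themeDisplay.getPortalURL() and articleDisplay.getTitle() passed as arguments.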

Required Maven Dependencies

<dependency>
    <groupId>net.sf.jtidy</groupId>
    <artifactId>jtidy</artifactId>
    <version>r938</version>
</dependency>

<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-core</artifactId>
    <version>9.0.1</version>
</dependency>

<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-pdf</artifactId>
    <version>9.0.1</version>
</dependency>