Easily convert Docx to HTML in your Java applications

Sferyx JSyndrome DocxHTML Converter


Advanced Java Docx to HTML Converter component - easily convert Microsoft Word Docx files to HTML in Java

The Sferyx DocxHTMLConverter Component is an advanced Java Docx to HTML Converter and Generator component. It converts Microsoft Word Docx files to HTML in any Java application (Java Swing, JavaFX, SWT Eclipse and also Oracle Forms) and produces paginated documents that preserve the formatting, including page breaks, headers, footers and page numbers. With only a few lines of Java code it is possible to generate complex HTML files from almost any Word Docx source or location. The resulting HTML can be written to a local file or a java.io.OutputStream, or shown automatically inside the browser. The component supports all languages that can be represented in UTF-8, including Greek, Arabic, Cyrillic, Hebrew, Farsi, Chinese, Japanese, Hindi, Tamil and more. It is ready for use out of the box and does not depend on external packages.

You can create the files dynamically by adding the content on-the-fly directly inside your application and also generate page breaks when needed.


All Sferyx products are signed with a Trusted Code Signing Security Certificate from Thawte.

Download Sferyx DocxHTMLConverter - Java Docx to HTML Converter component
Version 22.0

Sferyx JSyndrome DocxHTML Converter Component Edition: Download DocxHTMLConverterDemo.zip
 
 

 

  • Pure Java Docx to HTML generation engine - allows fast and easy HTML creation from various sources and converts even very complex Docx documents with a single line of code. Developed 100% in house, it does not depend on external packages.
  • Quickly and easily converts Microsoft Word Docx documents directly to HTML files
  • Royalty free redistribution with your applications
  • Includes all images, including inline Base64 encoded images, as well as inline and linked CSS styles
  • Works with any JRE/JDK 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 9, 10, 11, 12, 13, 14, 15, 16 or higher
  • Support for Oracle Forms, with full generation of HTML from Docx from within Oracle Forms and from CLOB
  • Fully compatible with Java Swing, JavaFX, SWT Eclipse, Oracle Forms, Java Servlets, JSP
  • Compatible with Headless mode for server systems
  • Compact size and fast document generation
  • All hyperlinks inside the Docx document are now generated automatically as links (annotations) in the resulting HTML file
  • Support for disabling table breaking across multiple pages
  • Support for disabling list breaking across multiple pages
  • Support for the CSS page break elements page-break-before:always, page-break-after:always, page-break-inside:never
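
To make two of the features above concrete, here is a minimal sketch of what an inline Base64 image and a CSS page-break wrapper look like as HTML strings. It uses only the standard Java library; the class and method names (HtmlFragments, inlineImage, onNewPage) are hypothetical and are not part of the Sferyx API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Builds small HTML fragments of the kinds mentioned in the feature list. */
final class HtmlFragments {

    /** Returns an img tag whose data is embedded as a Base64 data URI. */
    static String inlineImage(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "<img src=\"data:" + mimeType + ";base64," + encoded + "\">";
    }

    /** Wraps content in a block that asks for a page break before it when printed. */
    static String onNewPage(String html) {
        return "<div style=\"page-break-before:always\">" + html + "</div>";
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real caller would read an actual image file.
        byte[] placeholder = "not a real image".getBytes(StandardCharsets.UTF_8);
        System.out.println(onNewPage(inlineImage(placeholder, "image/png")));
    }
}
```

Strings built this way are ordinary HTML, so they could be handed to any method that accepts HTML content.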

 

Example usage


Using the DocxHTMLConverter component is quite simple - with only a few lines of code it is possible to convert practically any Docx document to HTML.

Here are some examples of how to convert Microsoft Word Docx to HTML in Java with the Sferyx DocxHTMLConverter:

Convert Word Docx URL to HTML file

This method converts the Docx to HTML and saves it to the given file. The destinationFile parameter is a java.io.File object:

 

import sferyx.administration.docxhtmlconverter.*;

DocxHTMLConverter docxHTMLConverter = new DocxHTMLConverter();
docxHTMLConverter.generateHTMLFromDocxURL("http://your_url_here.docx", destinationFile);

or using the file name as a String:

 

import sferyx.administration.docxhtmlconverter.*;

DocxHTMLConverter docxHTMLConverter = new DocxHTMLConverter();
docxHTMLConverter.generateHTMLFromDocxURL("http://your_url_here.docx", "c:/docxgenerator-test1.html");
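
Because these methods take the source as a URL string, a local Docx file has to be expressed as a file: URL first. The sketch below shows one way to do that with the standard library. The helper names (DocxPaths, toUrlString, htmlDestinationFor) and the rule of placing the .html file next to the source are assumptions made for illustration.

```java
import java.io.File;

/** Helpers for turning local paths into converter arguments. */
final class DocxPaths {

    /** Converts a local path into an absolute file: URL string. */
    static String toUrlString(File docx) {
        return docx.toURI().toString();
    }

    /** Derives an .html destination next to the source (hypothetical naming rule). */
    static File htmlDestinationFor(File docx) {
        String name = docx.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return new File(docx.getParentFile(), base + ".html");
    }

    public static void main(String[] args) {
        File source = new File("docs", "report.docx");
        System.out.println(toUrlString(source));
        System.out.println(htmlDestinationFor(source));
        // With the library on the classpath, the two values could be passed on as:
        // docxHTMLConverter.generateHTMLFromDocxURL(toUrlString(source), htmlDestinationFor(source));
    }
}
```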

Convert Word Docx URL to HTML OutputStream

Converts the specified Docx document to HTML using a standard page format string such as "A4" or "Letter" and saves it to the specified OutputStream. This method automatically recognizes whether the document is a Docx file and converts it accordingly. To use this automatic conversion, the URL must end with the corresponding extension, such as docx.

 

docxHTMLConverter.generateHTMLFromDocxURL ("http://your_url_here/file.docx", destinationStream);
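
When the HTML is needed in memory rather than on disk, the destination stream can be a java.io.ByteArrayOutputStream. The sketch below is a rough illustration only: StreamConversion is a hypothetical stand-in for the converter call so that the example runs without the library, and the UTF-8 decoding is an assumption about the output encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Captures stream-based converter output as a String. */
final class StreamCapture {

    /** Hypothetical stand-in for a call that writes HTML to an OutputStream. */
    interface StreamConversion {
        void convert(String sourceUrl, OutputStream destination) throws IOException;
    }

    /** Runs the conversion against an in-memory stream and returns the text. */
    static String convertToString(StreamConversion conversion, String sourceUrl) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            conversion.convert(sourceUrl, buffer);
            // Assumes UTF-8 output; adjust if the converter uses another charset.
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Fake conversion so the sketch runs without the library on the classpath.
        StreamConversion fake = (url, out) ->
                out.write(("<html><body>" + url + "</body></html>").getBytes(StandardCharsets.UTF_8));
        System.out.println(convertToString(fake, "http://your_url_here/file.docx"));
    }
}
```

With the real component, the lambda would instead delegate to the stream-based method shown above. Whether that method declares checked exceptions is not stated here, so the wrapper may need adjusting.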


 

Convert Word Docx URL to HTML with different Page Format dialog options


Automatically converts the Docx URL to HTML and displays a File dialog for saving the generated file. This method automatically recognizes whether the document is a Docx file and converts it accordingly. To use this automatic conversion, the URL must end with the corresponding extension, such as docx.
 

docxHTMLConverter.generateHTMLFromURL("http://your_url_here/file.docx");
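
A dialog-based call needs a display. Standard AWT and Swing dialogs throw java.awt.HeadlessException on a headless server, and the sketch below assumes the converter's File dialog behaves the same way, which is not confirmed here. It picks between the dialog and an explicit destination path; DialogGuard and SaveMode are hypothetical names.

```java
import java.awt.GraphicsEnvironment;

/** Decides whether a dialog-based save can be used in this JVM. */
final class DialogGuard {

    enum SaveMode { FILE_DIALOG, EXPLICIT_PATH }

    /** Dialogs need a display, so headless JVMs fall back to an explicit path. */
    static SaveMode chooseSaveMode(boolean headless) {
        return headless ? SaveMode.EXPLICIT_PATH : SaveMode.FILE_DIALOG;
    }

    public static void main(String[] args) {
        SaveMode mode = chooseSaveMode(GraphicsEnvironment.isHeadless());
        // FILE_DIALOG: use the dialog-based call.
        // EXPLICIT_PATH: use an overload that takes a File, String or OutputStream.
        System.out.println("Save mode: " + mode);
    }
}
```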

Dynamically Generate HTML from Word Docx and convert multiple files in Java with the Sferyx DocxHTML Converter

You can generate even very complex HTML documents dynamically in your Java application by providing all the formatting in HTML and inserting page breaks where new pages are needed. The HTML Generator automatically takes care of the pagination of long formatted text spanning multiple pages, as well as tables, lists etc. This functionality is well suited to reports and other documents that need to be generated dynamically with rich text formatting. You can also insert Docx files, which are converted automatically to HTML, or add other HTML content alongside them, including images, which are embedded as Base64 encoded strings inside the HTML document. Everything is converted automatically and inserted as HTML in the same document.

 
import sferyx.administration.docxhtmlconverter.*;

DocxHTMLConverter docxHTMLConverter = new DocxHTMLConverter();
// Open the content buffer to insert the content - HTML, Docx etc. - everything can be merged together.
docxHTMLConverter.openContentBuffer();
// Append the content to the content buffer - you can insert styles, images and any kind of formatting.
docxHTMLConverter.appendHTMLContentToContentBuffer("<style>body{font-size:12pt;color:blue;} h1{background-color:yellow;}</style>");
docxHTMLConverter.appendHTMLContentToContentBuffer("<h1>This is H1 header</h1>Some other text <b>very important <i>stuff</i></b> with page break after");
// Insert a page break to create a new page - the HTMLGenerator automatically handles the pagination of long text, tables and everything else if more pages are needed.
docxHTMLConverter.addPageBreakToContentBuffer();
// Append the content for the new page.
docxHTMLConverter.appendHTMLContentToContentBuffer("<h2 style=\"background-color:green;border-bottom:1px solid red;color:white\">This is second H2 header</h2>Some other text <span style=\"color:orange\">extremely interesting <u>stuff</u></span><br>");
// Insert another page break...
docxHTMLConverter.addPageBreakToContentBuffer();
// ....
// Append an MS Word Docx file directly to the content buffer - it will be converted to HTML in the same document.
docxHTMLConverter.appendDocxToContentBuffer(new java.net.URL("file:///c:/test/demo.docx"));
// ...
// Append another MS Word Docx file directly to the content buffer - it will be converted to HTML in the same document.
docxHTMLConverter.appendDocxToContentBuffer(new java.net.URL("file:///c:/test/Sample06-1.docx"));
// .....
// Close the content buffer and create the HTML document - it can be written to a File, an OutputStream etc.
docxHTMLConverter.closeBufferAndGenerateHTML("c:/test/dynamic.html");
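
For report-style documents, the HTML passed to the content buffer is usually assembled from application data. The following sketch shows one way to build a two-column table with basic escaping, using only the standard library. ReportHtml, escape and table are hypothetical helper names, not part of the Sferyx API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Assembles simple report HTML from application data. */
final class ReportHtml {

    /** Escapes the characters that have special meaning in HTML. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Renders name/value pairs as a two-column HTML table, in insertion order. */
    static String table(Map<String, String> rows) {
        StringBuilder sb = new StringBuilder("<table border=\"1\">");
        for (Map.Entry<String, String> row : rows.entrySet()) {
            sb.append("<tr><td>").append(escape(row.getKey()))
              .append("</td><td>").append(escape(row.getValue()))
              .append("</td></tr>");
        }
        return sb.append("</table>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("Department", "R&D");
        rows.put("Open issues", "<5");
        String html = table(rows);
        System.out.println(html);
        // With the library on the classpath, the fragment could be appended with:
        // docxHTMLConverter.appendHTMLContentToContentBuffer(html);
    }
}
```

Escaping the data before it is placed into markup keeps characters such as & and < from being read as HTML.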

Command line arguments for the DocxHTMLConverter.jar file

You can easily execute the DocxHTMLConverter.jar from the command line and perform document conversions without writing code using the available command line arguments as follows:

Usage:
java -jar DocxHTMLConverter.jar absolute_url destination_file

Example:
C:\test>java -jar "C:\test\DocxHTMLConverter.jar" http://your_url_here c:/test/test-html.html
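
The same command line can be launched from another Java program with java.lang.ProcessBuilder. The sketch below only builds and prints the command, so it runs without the jar being present. ConverterCommand and build are hypothetical names, and the paths are the placeholders from the example above.

```java
import java.util.Arrays;
import java.util.List;

/** Builds the documented "java -jar" command line for the converter jar. */
final class ConverterCommand {

    /** Returns the argument list: java -jar jarPath sourceUrl destinationFile. */
    static List<String> build(String jarPath, String sourceUrl, String destinationFile) {
        return Arrays.asList("java", "-jar", jarPath, sourceUrl, destinationFile);
    }

    public static void main(String[] args) {
        List<String> command = build(
                "C:\\test\\DocxHTMLConverter.jar",
                "http://your_url_here",
                "c:/test/test-html.html");
        ProcessBuilder builder = new ProcessBuilder(command).inheritIO();
        // builder.start().waitFor() would run the conversion and wait for it;
        // it is left out here so the sketch runs without the jar.
        System.out.println(String.join(" ", builder.command()));
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids quoting problems with paths that contain spaces.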

Methods available in the sferyx.administration.docxhtmlconverter.DocxHTMLConverter class

Method Summary
 void addPageBreakToContentBuffer()
          Adds an HTML page break to the content buffer; all content appended after it will be on the next page when printed.
 void appendDocxToContentBuffer(java.io.File file)
          Appends the whole content of the Docx file from the File to the content buffer.
 void appendDocxToContentBuffer(java.net.URL file)
          Appends the whole content of the Docx file from the given URL to the content buffer.
 void appendHTMLContentToContentBuffer(java.lang.String content)
          Appends new HTML string to existing content buffer.
 void clearContentBuffer()
          Closes the content buffer and clears the content.
 String closeBufferAndGenerateHTML()
          Generates the HTML content automatically for the content buffer created previously by using the openContentBuffer() and appendContentXXX() methods.
 void closeBufferAndGenerateHTML(java.io.OutputStream destinationStream)
          Closes the existing content buffer and generates the resulting content from the DocxHTML Converter - it will be saved in the given OutputStream.
 void closeBufferAndGenerateHTML(java.lang.String destinationFile)
          Generates the HTML content automatically for the content buffer created previously by using the openContentBuffer() and appendContentXXX() methods.
 String generateHTMLFromContent(java.lang.String content)
          Generates HTML automatically for given image or HTML content.
 void generateHTMLFromContent(java.lang.String content, java.io.File destinationFile)
          Generates html automatically for given html content.
 void generateHTMLFromContent(java.lang.String content, java.io.OutputStream destinationStream)
          Generates HTML automatically for given image or html content.
 void generateHTMLFromContent(java.lang.String content, java.lang.String destinationFile)
          Generates the HTML automatically for given html content.
 String generateHTMLFromDocxURL(java.lang.String sourceURL)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 void generateHTMLFromDocxURL(java.lang.String sourceURL, java.io.File destinationFile)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 void generateHTMLFromDocxURL(java.lang.String sourceURL, java.lang.String destinationFile)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 String generateHTMLFromDocxURL(java.net.URL sourceURL)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 void generateHTMLFromDocxURL(java.net.URL sourceURL, java.io.File destinationFile)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 void generateHTMLFromDocxURL(java.net.URL sourceURL, java.io.OutputStream fos)
          Generates HTML automatically for given URL source containing a MS Word Docx file.
 void generateHTMLFromURL(java.lang.String sourceURL)
          Generates HTML automatically for given URL source.
 void generateHTMLFromURL(java.lang.String sourceURL, java.io.File destinationFile)
          Generates HTML automatically for given URL source and saves the result to destinationFile as string.
 void generateHTMLFromURL(java.lang.String sourceURL, java.io.OutputStream destinationStream)
          Generates HTML automatically for given URL source and saves the result to the given OutputStream as a string.
 void generateHTMLFromURL(java.lang.String sourceURL, java.lang.String destinationFile)
          Generates HTML automatically for given URL source and saves the result to destinationFile as a string.
 String generateHTMLFromURL(java.net.URL sourceURL)
          Generates HTML automatically for given URL source and returns the result as a String.
 void generateHTMLFromURL(java.net.URL sourceURL, java.io.OutputStream destinationStream)
          Generates HTML automatically for given URL source and saves the result to destinationStream as string.
 void openContentBuffer()
          Opens the new content buffer for inserting content to be used for dynamic HTML generation.
 

Customers

The Sferyx customer base counts more than 1000 corporate customers and institutions from over 40 countries and a range of industrial sectors, including media and publishing companies, Internet Service Providers, research labs, Fortune 500 companies, universities, colleges and schools, software developers, Content Management Systems developers and web design agencies.



Copyright © 2001-2023 Sferyx Srl. All rights reserved. Sferyx and the Sferyx logo are registered trademarks of Sferyx Srl. http://www.sferyx.com