2. Converting a DOCX file

[Important] About Evaluation Edition

Do not be surprised if XMLmind Word To XML Evaluation Edition generates output in which random words are replaced by the string "[XMLmind]". This does not happen with Professional Edition.

Figure 1. w2x-app main window

Procedure:

  1. In the "Input DOCX file" frame, click to select the ".docx" file which is to be converted to XML.

    This ".docx" file must have been created using MS-Word 2007+.

  2. In the "Conversion specification" frame,

    • Click "Convert to" to use a predefined conversion, then choose one from the combobox.

      Name and description of each predefined conversion:

      DITA bookmap

      A DITA bookmap containing possibly nested chapter and topicref elements, themselves referencing files containing topic elements.

      DITA concept

      A DITA concept possibly containing nested concept elements.

      DITA map

      A DITA map containing possibly nested topicref elements, themselves referencing files containing topic elements.

      DITA topic

      A DITA topic possibly containing nested topic elements.

      DocBook V4.5 article

      DocBook v4.5 article possibly containing CALS (not HTML) tables.

      DocBook V4.5 book

      DocBook v4.5 book possibly containing CALS (not HTML) tables.

      DocBook V5.0 article

      DocBook v5.0 article possibly containing CALS (not HTML) tables.

      DocBook V5.0 book

      DocBook v5.0 book possibly containing CALS (not HTML) tables.

      DocBook V5.1 assembly

      DocBook v5.1 assembly in which topics possibly contain CALS (not HTML) tables.

      EPUB 2.0 containing styled XHTML

      Similar to "Multi-page styled (X)HTML" except that the generated XHTML pages are packaged as an EPUB 2.0 book.

      Output file extension must be ".epub".

      EPUB 2.0 containing “semantic” XHTML 1.1

      Similar to "Multi-page “semantic” (X)HTML 1.1" except that the generated XHTML pages are packaged as an EPUB 2.0 book.

      Output file extension must be ".epub".

      Multi-page styled (X)HTML

      Similar to "Single-page styled (X)HTML" except that the source DOCX document is automatically split into parts.

      A new part is created each time a paragraph having an outline level less than or equal to the specified split-before-level parameter is found in the source. An outline level is an integer between 0 (e.g. style Heading 1) and 8 (e.g. style Heading 9). The default value of parameter split-before-level is 0, which means: for each Heading 1, create a new page starting with this Heading 1.

      The file specified in the "Output XML file" field will contain the generated frameset. Output file extension must be ".html".

      Although framesets are an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages.

      Multi-page “semantic” (X)HTML 1.0 Strict

      Similar to "Single-page “semantic” (X)HTML 1.0 Strict" except that the intermediate, automatically generated, single (X)HTML page is automatically split into multiple pages.

      A new page is created each time a heading (h1, h2, h3, h4, h5, h6) having a “level” less than or equal to the specified split-before-level parameter is found in the source. A “level” is an integer between 0 (corresponds to h1) and 5 (corresponds to h6). The default value of parameter split-before-level is 0, which means: for each h1, create a new page starting with this h1.

      The file specified in the "Output XML file" field will contain the generated frameset. Output file extension must be ".html".

      Although framesets are an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages.

      Multi-page “semantic” (X)HTML 1.0 Transitional

      Similar to "Single-page “semantic” (X)HTML 1.0 Transitional" except that the intermediate, automatically generated, single (X)HTML page is automatically split into multiple pages.

      A new page is created each time a heading (h1, h2, h3, h4, h5, h6) having a “level” less than or equal to the specified split-before-level parameter is found in the source. A “level” is an integer between 0 (corresponds to h1) and 5 (corresponds to h6). The default value of parameter split-before-level is 0, which means: for each h1, create a new page starting with this h1.

      The file specified in the "Output XML file" field will contain the generated frameset. Output file extension must be ".html".

      Although framesets are an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages.

      Multi-page “semantic” (X)HTML 1.1

      Similar to "Single-page “semantic” (X)HTML 1.1" except that the intermediate, automatically generated, single (X)HTML page is automatically split into multiple pages.

      A new page is created each time a heading (h1, h2, h3, h4, h5, h6) having a “level” less than or equal to the specified split-before-level parameter is found in the source. A “level” is an integer between 0 (corresponds to h1) and 5 (corresponds to h6). The default value of parameter split-before-level is 0, which means: for each h1, create a new page starting with this h1.

      The file specified in the "Output XML file" field will contain the generated frameset. Output file extension must be ".html".

      Although framesets are an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages.

      Multi-page “semantic” (X)HTML 5.0

      Similar to "Single-page “semantic” (X)HTML 5.0" except that the intermediate, automatically generated, single (X)HTML page is automatically split into multiple pages.

      A new page is created each time a heading (h1, h2, h3, h4, h5, h6) having a “level” less than or equal to the specified split-before-level parameter is found in the source. A “level” is an integer between 0 (corresponds to h1) and 5 (corresponds to h6). The default value of parameter split-before-level is 0, which means: for each h1, create a new page starting with this h1.

      The file specified in the "Output XML file" field will contain the generated frameset. Output file extension must be ".html".

      Although framesets are an obsolete HTML feature, a frameset makes it easy to browse these HTML pages. Moreover, the table of contents used as the left frame is a convenient way to programmatically list all the generated HTML pages.

      Styled (X)HTML

      Single-page styled XHTML 1.0 Transitional document, embedding a CSS stylesheet, looking very much like the input DOCX file.

      This document has a "text/html;charset=UTF-8" Content-Type meta in order to be treated as HTML (not XHTML) by Web browsers.

      Output file extension must be ".html".

      Single page “semantic” (X)HTML 5.0

      Semantic (non-styled) XHTML 5.0 document.

      This XHTML 5.0 document may contain nested section elements.

      This document has a "text/html;charset=UTF-8" Content-Type meta in order to be treated as HTML (not XHTML) by Web browsers.

      Output file extension must be ".html".

      Single page “semantic” XHTML 1.0 Strict

      Semantic (non-styled) XHTML 1.0 Strict document.

      Output file extension must be ".xhtml".

      Single page “semantic” XHTML 1.0 Transitional

      Semantic (non-styled) XHTML 1.0 Transitional document.

      Output file extension must be ".xhtml".

      Single page “semantic” XHTML 1.1

      Semantic (non-styled) XHTML 1.1 document.

      Output file extension must be ".xhtml".

      Web Help containing styled (X)HTML

      Similar to "Single-page styled (X)HTML" except that the generated (X)HTML pages are compiled into a Web Help. The Web Help compiler used to do this is the free, open source XMLmind Web Help Compiler.

      The file specified in the "Output XML file" field will contain the entry point of the Web Help. Output file extension must be ".html".

      Web Help containing “semantic” (X)HTML 1.0 Strict

      Similar to "Multi-page “semantic” (X)HTML 1.0 Strict" except that the generated (X)HTML pages are compiled into a Web Help. The Web Help compiler used to do this is the free, open source XMLmind Web Help Compiler.

      The file specified in the "Output XML file" field will contain the entry point of the Web Help. Output file extension must be ".html".

      Web Help containing “semantic” (X)HTML 1.0 Transitional

      Similar to "Multi-page “semantic” (X)HTML 1.0 Transitional" except that the generated (X)HTML pages are compiled into a Web Help. The Web Help compiler used to do this is the free, open source XMLmind Web Help Compiler.

      The file specified in the "Output XML file" field will contain the entry point of the Web Help. Output file extension must be ".html".

      Web Help containing “semantic” (X)HTML 1.1

      Similar to "Multi-page “semantic” (X)HTML 1.1" except that the generated (X)HTML pages are compiled into a Web Help. The Web Help compiler used to do this is the free, open source XMLmind Web Help Compiler.

      The file specified in the "Output XML file" field will contain the entry point of the Web Help. Output file extension must be ".html".

      Web Help containing “semantic” (X)HTML 5.0

      Similar to "Multi-page “semantic” (X)HTML 5.0" except that the generated (X)HTML pages are compiled into a Web Help. The Web Help compiler used to do this is the free, open source XMLmind Web Help Compiler.

      The file specified in the "Output XML file" field will contain the entry point of the Web Help. Output file extension must be ".html".

      The combobox may contain additional entries corresponding to custom conversions specified by means of plugins (if any).

    • Or click "Use text file containing w2x options" to use a custom conversion, then click to choose the ".txt" file containing the custom conversion specification.

      A custom conversion specification is simply a text file containing w2x command-line options. For more information, see Section 3, “Custom conversion specifications”.

  3. In the "Output XML file" frame, click to select the XML file which will contain the result of the conversion.

    The "Single-page styled (X)HTML" and "Web Help" conversions generate several files in the directory containing the file specified in the "Output XML file" field. Therefore it is recommended to specify a new output directory each time you use these conversions. This output directory is created on the fly if it does not exist. If, on the contrary, the output directory already exists, it is not automatically emptied; instead, a dialog box asks whether you want to empty it before proceeding with the conversion.

  4. Optionally check "Open output file in associated application"[2] if you want to preview the result of the conversion in an external application.

    The file extension of the output file must have been associated with an external application. For example, if the name of the output file ends with ".html", this external application will be the system's Web browser.

    [Tip]

    After performing the conversion, it's also possible to click to open the folder containing the output file in the standard file explorer of the operating system (e.g. the Finder on the Mac).

  5. Click Convert.

    During the file conversion, the Convert button becomes a Cancel button. Click Cancel to cancel the current file conversion. Click Cancel while pressing the Shift key if you want to forcibly cancel the current file conversion. Use Shift+click only when the file conversion seems to be blocked for a long time.
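
The output file extensions required by the predefined conversions listed in step 2 can be summarized in a small lookup. The following Python sketch is only an illustration of those rules: the function names required_extension and check_output_file are hypothetical helpers, not part of w2x or w2x-app, and the matching on conversion names is an assumption made for brevity. No extension is stated above for the DITA and DocBook conversions, so the sketch imposes none.

```python
from pathlib import Path
from typing import Optional


def required_extension(conversion: str) -> Optional[str]:
    """Return the output extension stated for a predefined conversion.

    Hypothetical helper: it only encodes the rules listed in step 2.
    Returns None when no extension is stated (DITA, DocBook).
    """
    name = conversion.lower()
    if name.startswith("epub"):
        return ".epub"
    # Single-page semantic XHTML 1.0 Strict/Transitional and 1.1
    if name.startswith("single page") and "xhtml 1." in name:
        return ".xhtml"
    # Styled, multi-page, XHTML 5.0 and Web Help conversions
    if "html" in name:
        return ".html"
    return None


def check_output_file(conversion: str, output_file: str) -> bool:
    """True if output_file has the extension stated for this conversion."""
    expected = required_extension(conversion)
    return expected is None or Path(output_file).suffix.lower() == expected
```

For example, check_output_file("Styled (X)HTML", "out/manual.html") is true, while check_output_file("Single page “semantic” XHTML 1.1", "manual.html") is false because ".xhtml" is expected.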



[2] This checkbox may be disabled on some platforms (e.g. Linux).
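
As an illustration of the split-before-level parameter described for the multi-page conversions in step 2, the following Python sketch groups a sequence of paragraphs into parts. It is not the actual w2x implementation: the (level, text) representation, the function name split_into_parts, and the choice to put any content found before the first splitting heading into a part of its own are all assumptions made for this example. Levels are counted from 0, as in the text (Heading 1 or h1 is level 0).

```python
from typing import List, Optional, Tuple

# (outline level, text); the level is None for ordinary body paragraphs.
Paragraph = Tuple[Optional[int], str]


def split_into_parts(paragraphs: List[Paragraph],
                     split_before_level: int = 0) -> List[List[str]]:
    """Start a new part before each heading whose level is less than or
    equal to split_before_level (hypothetical sketch of the rule)."""
    parts: List[List[str]] = []
    for level, text in paragraphs:
        starts_new_part = level is not None and level <= split_before_level
        # Assumption: leading content before any splitting heading
        # gets its own part.
        if starts_new_part or not parts:
            parts.append([])
        parts[-1].append(text)
    return parts


doc = [
    (None, "Preface text"),
    (0, "Chapter A"),      # Heading 1
    (None, "Body of A"),
    (1, "Section A.1"),    # Heading 2
    (0, "Chapter B"),      # Heading 1
]
parts_default = split_into_parts(doc)        # split before each Heading 1
parts_level_1 = split_into_parts(doc, 1)     # also split before Heading 2
```

With the default value 0, only "Chapter A" and "Chapter B" start new parts; with the value 1, "Section A.1" starts a new part as well.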