Importing from Microsoft Word is complex because Word is not a structured format, so the results depend heavily on the quality of the input. The most important consideration is usually that Word styles have been applied consistently.
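To get a rough idea of how consistently styles have been applied before you import, you could count how often each paragraph style is used in a document. The following is a minimal sketch, not a Paligo tool: it assumes a standard (transitional) .docx file, which is a zip archive containing word/document.xml, and the function name paragraph_style_counts is made up for this illustration. It reports style IDs as stored in the file, which can differ slightly from the names shown in Word.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by standard (transitional) .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_style_counts(docx_path):
    """Count paragraphs per style ID in the main body of a .docx file.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    counts = Counter()
    for para in root.iter(f"{{{W}}}p"):
        # The style, if set explicitly, is stored in <w:pPr><w:pStyle w:val="..."/>.
        style = para.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        if style is not None:
            counts[style.get(f"{{{W}}}val")] += 1
        else:
            # No explicit style: Word falls back to the default paragraph style.
            counts["(no explicit style)"] += 1
    return counts
```

A document where almost every paragraph has no explicit style, including the headings, has probably been formatted manually and is likely to need cleanup before import.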
There are three options for importing from Microsoft Word:
-
Use the Import Wizard to import the Microsoft Word content directly. It can handle most well-structured Word documents.
-
Convert to DocBook using the (optional) Oxygen XML Editor with the Paligo plugin.
-
Convert to XML with a purchased customization from support. This is used if your content is not easily imported directly. For very large documents or a large number of documents, this method is sometimes preferable.
The simplest method is to import the Microsoft Word files directly by using the Import Wizard. Paligo can handle most well-structured Word documents.
-
Zip each individual Word file. Do not include multiple Word files in one zip file.
-
Use the Import Wizard to import the zip files.
Select Word (.docx) as the type of file to import.
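If you have many Word files, zipping each one by hand gets tedious. The steps above can be sketched as a small script that creates one zip file per Word file. This is only an illustration under stated assumptions: the function name zip_each_docx is made up here, and the script assumes all your .docx files are in a single folder.

```python
import zipfile
from pathlib import Path


def zip_each_docx(folder):
    """Create one zip archive per .docx file in `folder`.

    Hypothetical helper for illustration only. Returns the archive paths.
    """
    archives = []
    for docx in sorted(Path(folder).glob("*.docx")):
        # Skip the temporary lock files Word creates while a document is open.
        if docx.name.startswith("~$"):
            continue
        archive = docx.with_suffix(".zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            # Store the Word file at the top level of the archive,
            # so each zip contains exactly one .docx file.
            zf.write(docx, arcname=docx.name)
        archives.append(archive)
    return archives
```

For example, calling zip_each_docx on a folder containing manual.docx would write manual.zip next to it, ready to be selected in the Import Wizard.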
By using Oxygen XML Editor, you can convert the content to a DocBook document before importing it.
-
In Oxygen, create a DocBook 5.1 Article by selecting File > New and then selecting the appropriate DocBook template.
-
Remove the first "sect1" element.
-
Insert a new section element by pressing Enter on your keyboard and selecting Section in the element list. This element will not actually be used and is removed at the end, but it is needed because of a quirk in Oxygen.
-
Place the cursor inside the section element.
-
Save the document under any name you choose.
It is important to save it with a name at this point, however; otherwise, images will not be saved properly.
-
Copy the text from your Word document. You should leave out any Table of Contents or similar, since it won't be needed.
-
Paste the content into the section element in Oxygen. You will get a warning saying that the content needs to be placed inside the closest Article element. Go ahead and accept that.
-
When you paste the content, it will be automatically converted to the proper XML elements.
-
Remove the empty section tag. It will be at the bottom or at the top of the document.
-
Save the document again.
-
Use the Import Wizard to import the resulting DocBook document.
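Before importing, you may want to confirm that the placeholder section really was removed. The sketch below counts section elements that look empty in the saved DocBook file. It rests on assumptions: the file uses the DocBook 5 namespace, and "empty" is taken to mean no text and nothing inside other than title or para elements. The function name count_empty_sections is hypothetical, and any result above zero is a prompt to check the file in Oxygen, not a definitive verdict.

```python
import xml.etree.ElementTree as ET

# Namespace used by DocBook 5.x documents.
DOCBOOK_NS = "http://docbook.org/ns/docbook"

# Assumption: an empty placeholder section contains at most these elements.
PLACEHOLDER_TAGS = {f"{{{DOCBOOK_NS}}}title", f"{{{DOCBOOK_NS}}}para"}


def count_empty_sections(docbook_path):
    """Count section elements with no text and only title/para children.

    Hypothetical helper for illustration only.
    """
    root = ET.parse(docbook_path).getroot()
    count = 0
    for sec in root.iter(f"{{{DOCBOOK_NS}}}section"):
        # Any non-whitespace text anywhere inside means the section is in use.
        has_text = "".join(sec.itertext()).strip() != ""
        # Images, tables and similar content also mean the section is in use.
        descendants = [el for el in sec.iter() if el is not sec]
        only_placeholders = all(el.tag in PLACEHOLDER_TAGS for el in descendants)
        if not has_text and only_placeholders:
            count += 1
    return count
```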
The third method involves a preconversion to XML using a script package that requires a purchased customization. It is slightly more complex but gives good results. It is especially suitable for very large documents or a large number of documents, because it is very fast and can be tuned further to fit your content.