To import content from Microsoft Word into Paligo, you need to apply certain styles to your content and make some adjustments. This is because MS Word content is unstructured, and it needs to be given some form of structure before Paligo can import it.
We recommend that you keep one copy of your Word file for use as a Word document and prepare a separate copy for the Paligo import.
To prepare your MS Word file for import, complete the following tasks.
We recommend that you give your MS Word document a title. To create the title, add your title text and apply the Title style to it. For more information on using styles, see the official Microsoft Word documentation.
If you try to import a Word document that does not have a title, Paligo will apply a default title, such as "Article d3e1".
In some cases, MS Word documents that have no title may cause the import process to fail. If this happens, you may see the following error message:
"The import file could not be created. Nothing to import. The intermediate transformation resulted in an empty document."
Paligo uses the style names from MS Word to map your Word content to the appropriate elements in Paligo. For example, Paligo uses the Heading 1 style in Word to determine which parts of a Word document should be a top-level topic in Paligo.
For the import and mapping to work, it is important that you use styles to format your MS Word content. You can create and apply styles from the Styles pane in Word.
To work with Paligo, the styles need to have specific names:
- Create a style called Title and apply it to the name of your document, for example, the title on the front cover. Make sure that the title is the first text that appears in the document.
- If your Word document has a subtitle, create a Subtitle style and apply it to the subtitle text.
- Make sure your headings use styles named Heading 1 through to Heading 6.
- If you have code examples, create a paragraph style called Source Code and apply it to the code text. This has to be a paragraph style and not an inline style that is only applied to words inside a sentence.
The names of the styles are important for the Paligo import, so please make sure they match the names given above.
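One way to check the style names before you import is to read them from the file itself. The following is a minimal Python sketch and is not part of Paligo. It assumes the document is saved as .docx, which is a zip archive that holds its style definitions in word/styles.xml. The function names and the list of required styles are hypothetical and only illustrate the idea. The comparison ignores case because Word can store built-in names such as Heading 1 in lowercase inside the file, so a match here does not confirm the exact capitalization of a custom style.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical checklist; extend it with Subtitle or Heading 2..6 as needed.
REQUIRED_STYLES = ["Title", "Heading 1", "Source Code"]


def style_names(styles_xml):
    """Return the set of style names declared in a word/styles.xml payload."""
    root = ET.fromstring(styles_xml)
    names = set()
    for style in root.iter(f"{{{W_NS}}}style"):
        name = style.find(f"{{{W_NS}}}name")
        if name is not None and name.get(f"{{{W_NS}}}val"):
            names.add(name.get(f"{{{W_NS}}}val"))
    return names


def missing_styles(styles_xml, required=REQUIRED_STYLES):
    """Return the required style names that the document does not define."""
    present = {name.lower() for name in style_names(styles_xml)}
    return [name for name in required if name.lower() not in present]


def missing_styles_in_docx(path, required=REQUIRED_STYLES):
    """Open a .docx file and report which required styles are missing."""
    with zipfile.ZipFile(path) as docx:
        return missing_styles(docx.read("word/styles.xml"), required)
```

Calling missing_styles_in_docx with the path to your import copy returns a list of the style names that still need to be created. A style that is defined but never applied to any text still counts as present, so this check complements a visual review in Word and does not replace it.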
Tip
To find out how to set up styles, see the Microsoft documentation.
Paligo uses the hierarchy of heading styles to determine which parts of your document should be converted into top-level topics, second-level topics, third-level topics, and so on. For this to work correctly, it is important that you use the heading levels as intended and consistently in your MS Word document.
- Use Heading 1 for chapter headings.
- Use Heading 2 for first-level subsections in chapters.
- Use Heading 3 for second-level subsections in chapters.
- Use Heading 4 for third-level subsections in chapters.
Always use the headings in this order and do not skip a heading level. For example, do not follow a Heading 1 directly with a Heading 3. The sequence of headings should always be Heading 1, Heading 2, Heading 3, and so on.
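The rule about not skipping levels can be expressed as a small check. This is a minimal sketch in Python, assuming you already have the heading levels of your document as a list of numbers in document order (1 for Heading 1, 2 for Heading 2, and so on). The function name is hypothetical, and treating the first heading as one that must be a Heading 1 is an assumption of this sketch.

```python
def skipped_heading_levels(levels):
    """Find headings that jump more than one level deeper than the previous one.

    Returns a list of (position, previous_level, level) tuples. Moving back up
    the hierarchy (for example, Heading 3 followed by Heading 1) is allowed.
    """
    problems = []
    previous = 0  # assumption: the first heading must be a Heading 1
    for position, level in enumerate(levels):
        if level > previous + 1:
            problems.append((position, previous, level))
        previous = level
    return problems
```

For example, the sequence 1, 2, 3, 2, 3, 4, 1 produces no problems, whereas the sequence 1, 3 reports the second heading because it goes from level 1 straight to level 3.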
Typically, technical communication is organized into three or four levels of content for ease of use. If you have more levels, you can use Heading 5 and Heading 6. However, needing Heading 5 and Heading 6 can suggest a problem with the information architecture, and it may be worth reorganizing the content into a maximum of three or four levels.
Your MS Word document most likely contains some features that are needed for Word, but they are either not needed in Paligo or they cannot be imported. These include page breaks, a table of contents, and review comments.
- Paligo cannot import text that is inside frames. If you have text in frames, you will need to move it into the regular content or it will be lost.
- If there are any open review comments or tracked changes, make sure the comments are resolved and the changes are accepted before you import.
- Remove the header and footer content.
- Remove all page breaks and section breaks.
- Remove the Word table of contents (TOC). Paligo will generate its own table of contents when you publish in Paligo.
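Some of these leftovers can be detected in the file before you import. The sketch below is a rough Python illustration and is not part of Paligo. It assumes a .docx file, in which the main text is stored as word/document.xml, and it counts only three kinds of leftover: tracked insertions and deletions, comment references, and explicit page breaks. Frames, headers, footers, section breaks, and the table of contents are not covered. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _tag(local_name):
    """Build the namespace-qualified name that ElementTree uses."""
    return f"{{{W_NS}}}{local_name}"


def leftover_counts(document_xml):
    """Count tracked changes, comment references, and explicit page breaks."""
    root = ET.fromstring(document_xml)
    counts = {"tracked_changes": 0, "comments": 0, "page_breaks": 0}
    for element in root.iter():
        if element.tag in (_tag("ins"), _tag("del")):
            counts["tracked_changes"] += 1
        elif element.tag == _tag("commentReference"):
            counts["comments"] += 1
        elif element.tag == _tag("br") and element.get(_tag("type")) == "page":
            counts["page_breaks"] += 1
    return counts
```

To use it on a real file, read the XML with zipfile.ZipFile(path).read("word/document.xml") from the Python standard library and pass the result to leftover_counts. Any count above zero points to something that still needs to be cleaned up in Word.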
In Microsoft Word, there are different ways to add content that is indented inside a list item (or is not indented, but really should be!). This can make it difficult to import in a consistent way.
For the cleanest import, we recommend that you format your indented list item content like this in MS Word:
- Create a new list item for the indented content.
- Add the text or image for the indented content in the new step.
- Position the cursor at the start of the new list item and press backspace. The list item formatting is removed so that the content is now indented as part of the previous step.
When you create indented content in MS Word in this way, Paligo imports the list and gives it the expected structure. Each list item has a listitem element, and the text for the list item is inside a para element. For example, if the third list item has an indented image, the image uses a mediaobject element in Paligo and that element is inside the listitem.
For indented paragraphs, the paragraph is inserted in a para element inside the list item.
You could have sublists indented in the same way. The important part is that the content is nested inside the listitem to which it relates.
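To make the nesting concrete, the following Python sketch builds a small list with the element names mentioned above (listitem, para, and mediaobject) and serializes it to XML. It is an illustration only and is not output from Paligo. The orderedlist wrapper element, the build_list function, and the sample step text are assumptions, and the contents of the mediaobject element are left empty.

```python
import xml.etree.ElementTree as ET


def build_list(steps):
    """Build a list in which indented content is nested inside its listitem.

    Each step is a (text, indented) pair. "indented" is a list of
    (element_name, text) pairs, where element_name is "para" or "mediaobject".
    """
    # Assumption: "orderedlist" is used as the wrapper element for the list.
    root = ET.Element("orderedlist")
    for text, indented in steps:
        item = ET.SubElement(root, "listitem")
        ET.SubElement(item, "para").text = text
        for element_name, extra_text in indented:
            child = ET.SubElement(item, element_name)
            if extra_text is not None:
                child.text = extra_text
    return root


# Hypothetical sample content: one plain step, one step with an indented
# paragraph, and one step with an indented image.
steps = [
    ("Open the cover.", []),
    ("Remove the filter.", [("para", "The filter may be hot.")]),
    ("Fit the new filter.", [("mediaobject", None)]),
]
xml_text = ET.tostring(build_list(steps), encoding="unicode")
```

In the resulting xml_text, the extra paragraph and the mediaobject element each sit inside the listitem of the step they belong to, rather than between two list items.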
Note
If you have content in a list and it was indented in a different way in MS Word, it may import as:
- Regular paragraphs that break the flow of the list.
- literallayout elements, which are valid Paligo XML, but are not the recommended way to structure lists. In the editor, the list item can look like code.
In both cases, you can use the Paligo editor to fix the issues. You can use the XML tree to indent the regular paragraphs, create new listitem elements for the literallayout content, and move the content into them. Or you may prefer to address the problems in the MS Word document and re-import the content, so that the issues are fixed at the source.
Paligo can import tables from Microsoft Word, but for best results we recommend that you:
- Avoid using large tables that cover multiple pages in MS Word.
  While large tables can import correctly, they may go beyond the boundaries of the Paligo editor, making them more difficult to edit (you will need to use the source code editor). It is better to break large tables down into several smaller tables. This makes them more usable in both Paligo and MS Word.
  When redesigning your tables, think about what information your users need from each table. Do they need to compare items? If yes, can those related items be organized into smaller groups, perhaps smaller tables organized by product type or product name? If no, think about making smaller tables for groups of related items rather than one oversized table.
- Try to use simple tables where possible.
  Tables with merged cells will sometimes import as two separate columns, depending on how the cells are defined in the underlying MS Word markup. The more complex the table, the greater the chance that you will need to manually edit the table in Paligo to get the result you want.
- Use tables appropriately.
  Tables are a good option for presenting data, but are sometimes used for information that would work better as a regular paragraph. For example, if you have a cell with many paragraphs in it, this could be a good candidate for being a section in its own right. The table cell would then just need a link to the appropriate section.
If you intend to publish to the web, it is also worth noting that tables can be more difficult to use on small devices such as smartphones. This is especially true of larger tables where vertical and horizontal scrolling is needed.
If you have used customized styles for code listings in your MS Word document, the styles must be renamed to Source Code before the document is imported into Paligo. Otherwise, the code listings that use the customized styles will not be imported.
Tip
For more information about styles in MS Word, see the Microsoft Support page.
In the following procedure, the customized style is called "Linux Commands".
- Open the document in MS Word.
- Open the Styles pane.
- Locate the customized commands style (in this example called "Linux Commands").
- In the context menu, choose Modify Style.
- Rename the style to "Source Code" (instead of "Linux Commands").
- Select OK.
- Save the document, zip it and import it to Paligo.
All content that uses the custom commands style should now be imported as programlisting elements, because "Source Code" is the style name that the Word conversion tool recognizes.
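If you prepare many documents, the zip step can be scripted. This is a minimal Python sketch that uses only the standard library. The function name is hypothetical, and placing the .docx file at the top level of the archive is an assumption, because the procedure above only says to zip the document.

```python
import zipfile
from pathlib import Path


def zip_for_import(docx_path):
    """Create a .zip file next to the Word document and return its path."""
    docx_path = Path(docx_path)
    zip_path = docx_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name, so the archive has no folder structure.
        archive.write(docx_path, arcname=docx_path.name)
    return zip_path
```

For example, calling zip_for_import with the path to a file named manual.docx creates manual.zip in the same folder, which you can then select in the Paligo import dialog.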
A cross-reference points to information in the same document, and a hyperlink refers to other documents.
Links in MS Word documents that point to a heading (created with Insert → Link → Link → This document and selecting a Heading) convert well when imported to Paligo.
The following limitations are known with MS Word links when imported to Paligo:
- Bookmarks (created with Insert → Link → Bookmark) do convert, but they are tricky because they use arbitrary anchors. This is a problem when migrating to a structured content form such as the one used in Paligo, where a cornerstone is that links need to point to a content element that makes sense. Bookmark links can break during import and will then require manual adjustment.
- Cross-references to an entity in the document (created with Insert → Link → Cross reference and selecting an entity in the document) do not convert in the standard process, because they use a linking mechanism that only exists in MS Word.