4 Using Writer2xhtml and Calc2xhtml

Writer2xhtml produces standards-compliant XHTML files; in particular, it can be used to put math on the web using the XHTML + MathML combination. Writer2xhtml can convert into any of these XHTML variants: XHTML 1.0 strict, XHTML 1.1 + MathML 2.0, and XHTML 1.1 + MathML 2.0 with an XSL transformation.

Note that the default file extension and the recommended MIME type vary with the output format:

Output format                                      Default file extension   MIME type
XHTML 1.0                                          .html                    text/html
XHTML 1.1 + MathML 2.0                             .xhtml                   application/xhtml+xml
XHTML 1.1 + MathML 2.0 (with xsl transformation)   .xml                     application/xml
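
If you publish the converted files yourself, the table above can be turned into a small lookup. The following Python sketch is only an illustration: the dictionary keys are labels taken from the table, and the names OUTPUT_FORMATS, EXTENSION_TO_MIME and mime_type_for are made up for this example and are not part of Writer2xhtml.

```python
import os

# Hypothetical lookup built from the table above. The keys are descriptive
# labels, not Writer2xhtml option names.
OUTPUT_FORMATS = {
    "XHTML 1.0": (".html", "text/html"),
    "XHTML 1.1 + MathML 2.0": (".xhtml", "application/xhtml+xml"),
    "XHTML 1.1 + MathML 2.0 (with xsl transformation)": (".xml", "application/xml"),
}

# Reverse view: default file extension -> recommended MIME type.
EXTENSION_TO_MIME = {ext: mime for ext, mime in OUTPUT_FORMATS.values()}


def mime_type_for(filename):
    """Return the recommended MIME type for a converted file, or None."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_MIME.get(extension)
```

A call such as mime_type_for("mydocument.xhtml") looks up the extension only; it does not inspect the file contents.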

Writer2xhtml is quite flexible, in particular with respect to the handling of formatting; see section 4.3 for the configuration options and style maps.

Calc2xhtml is a companion to Writer2xhtml that produces XHTML 1.0 strict from your Calc documents.

4.1 Converting to XHTML from the command line

To convert a file to XHTML, use the command line

w2l [options] <document/directory to convert> [<output path and/or file name>]

The available options are

This will produce an XHTML file with the specified name. If no output file is specified, Writer2xhtml will use the same name as the original document, but a different file extension.

Examples:

w2l -xhtml+mathml+xsl mydocument.sxw

or

w2l -xhtml -config myconfig.xml mydocument.sxw

The script w2l also provides a shorthand notation to use the sample configuration file included in writer2latex05.zip. The command line is

w2l -cleanxhtml <writer document to convert> [<output path and/or file name>]

This configuration file produces a "clean" XHTML file (see section 4.4), for example:

w2l -cleanxhtml mydocument.sxw mypath/myoutputdoc.html

It is recommended that you create scripts to support your own configuration files.
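
Such a script could look like the following Python sketch. It only assembles the command line shown earlier in this section (w2l -xhtml -config ...). The function names, the default file name myconfig.xml and the assumption that w2l can be found on your PATH are choices made for this example, not something Writer2xhtml prescribes.

```python
import subprocess


def build_w2l_command(document, output=None, config="myconfig.xml"):
    """Build the argument list: w2l -xhtml -config <config> <document> [<output>]."""
    command = ["w2l", "-xhtml", "-config", config, document]
    if output is not None:
        # The output path and/or file name is optional, as on the command line.
        command.append(output)
    return command


def convert(document, output=None, config="myconfig.xml"):
    """Run w2l (assumed to be on the PATH) and return its exit status."""
    return subprocess.call(build_w2l_command(document, output, config))
```

For example, convert("mydocument.sxw", "mypath/myoutputdoc.html") runs the same command you would otherwise type by hand with your own configuration file.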

4.2 Using Writer2xhtml as an export filter

If you choose File – Export in Writer, you should be able to choose XHTML 1.0 strict, XHTML 1.1 + MathML 2.0 or XHTML 1.1 + MathML 2.0 (xsl) as the file type. Using Calc2xhtml as an export filter is not yet supported.

Note: You have to use the export menu because Writer2xhtml does not provide an import filter for XHTML. You should always save in the native format of OOo as well!

4.3 Configuration

XHTML export can be configured with a configuration file. Where the configuration is read from depends on how you use Writer2xhtml:

If you use Writer2xhtml as an export filter in OOo, the configuration is handled as follows:

If, on the other hand, you use Writer2xhtml from the command line, you will have to specify on the command line which configuration file to use.

The configuration is a file in XML format. Here is a sample configuration file:

<?xml version="1.0" encoding="UTF-8"?>

<config>

  <option name="xhtml_custom_stylesheet" value="/mystyle.css" />

  <option name="xhtml_ignore_styles" value="false" />

  <option name="xhtml_use_dublin_core" value="true" />

  <option name="xhtml_convert_to_px" value="true" />

  <option name="xhtml_split_level" value="1" />

  <xhtml-style-map name="mystyle" class="paragraph" element="p" css="mycssstyle" />

</config>
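
A configuration file like the sample above can also be generated by a program, which may be convenient if you maintain several variants. The following Python sketch uses only the standard library; the function names build_config and write_config are invented for this example, and the option names are simply those shown in the sample.

```python
import xml.etree.ElementTree as ET


def build_config(options, style_maps=()):
    """Build a <config> element from option name/value pairs and style maps.

    options    -- dict mapping option names to string values
    style_maps -- iterable of dicts holding xhtml-style-map attributes
    """
    config = ET.Element("config")
    for name, value in options.items():
        ET.SubElement(config, "option", {"name": name, "value": value})
    for attributes in style_maps:
        ET.SubElement(config, "xhtml-style-map", dict(attributes))
    return config


def write_config(config, path):
    """Write the configuration to path as UTF-8 with an XML declaration."""
    ET.ElementTree(config).write(path, encoding="UTF-8", xml_declaration=True)
```

Calling build_config({"xhtml_split_level": "1"}, [{"name": "mystyle", "class": "paragraph", "element": "p", "css": "mycssstyle"}]) and passing the result to write_config produces a file with the same structure as the sample, though not the same line breaks.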

Options

Style maps

In addition to the options, you can specify that certain styles in Writer should be mapped to specific XHTML elements and CSS style classes. Here are some examples showing how to use some of the built-in Writer styles to create XHTML elements:

<?xml version="1.0" encoding="UTF-8"?>

<config>

  <!-- map OOo paragraph styles to xhtml elements -->

  <xhtml-style-map name="Text body" class="paragraph"   

           element="p" css="(none)" />  

  <xhtml-style-map name="Sender" class="paragraph"

           element="address" css="(none)" />

  <xhtml-style-map name="Quotations" class="paragraph"

           block-element="blockquote" block-css="(none)"

           element="p" css="(none)" />

  <!-- map OOo text styles to xhtml elements -->

  <xhtml-style-map name="Citation" class="text"

           element="cite" css="(none)" />

  <xhtml-style-map name="Emphasis" class="text"

           element="em" css="(none)" />

  

  <!-- map hard formatting attributes to xhtml elements -->

  <xhtml-style-map name="bold" class="attribute"

           element="b" css="(none)" />

  <xhtml-style-map name="italics" class="attribute"

           element="i" css="(none)" />

</config>

An extended version of this is distributed with Writer2LaTeX; please see the file cleanxhtml.xml.

The attributes of the xhtml-style-map element are used as follows:

For example, the rules above produce code like this:

<p>This paragraph is Text body</p>

<address>This paragraph is Sender</address>

<blockquote>

  <p>This paragraph is Quotations</p>

  <p>This paragraph is also Quotations</p>

</blockquote>

<p>This paragraph is also Text body and has some <em>text with emphasis style</em> and uses some <b>hard formatting</b>.</p>
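
The way consecutive paragraphs that share a block-element end up inside a single block can be modelled in a few lines. The following Python sketch is not Writer2xhtml's actual code; it is a simplified model that handles only the three paragraph rules from the example above, ignores the css attributes and text styles, and assumes that a paragraph with an unknown style falls back to a plain p element.

```python
from html import escape

# Paragraph rules copied from the example configuration above:
# Writer style name -> (block element or None, paragraph element)
PARAGRAPH_RULES = {
    "Text body": (None, "p"),
    "Sender": (None, "address"),
    "Quotations": ("blockquote", "p"),
}


def render(paragraphs):
    """Render (style name, text) pairs, wrapping runs that share a block element."""
    lines = []
    open_block = None
    for style, text in paragraphs:
        # Assumption for this sketch: unknown styles become plain <p> elements.
        block, element = PARAGRAPH_RULES.get(style, (None, "p"))
        if block != open_block:
            if open_block is not None:
                lines.append("</%s>" % open_block)
            if block is not None:
                lines.append("<%s>" % block)
            open_block = block
        indent = "  " if block else ""
        lines.append("%s<%s>%s</%s>" % (indent, element, escape(text), element))
    if open_block is not None:
        lines.append("</%s>" % open_block)
    return "\n".join(lines)
```

Two Quotations paragraphs in a row are therefore placed in one blockquote, while a Text body paragraph between them would close the first blockquote and start a second one later.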

You can use your own Writer styles together with your own CSS style sheet to create further style mappings, for example:

<xhtml-style-map name="Some OOo style" class="paragraph"

           block-element="div" block-css="block_style"

           element="p" css="par_style" />

to produce output like this:

<div class="block_style">

  <p class="par_style">Paragraph with Some OOo style</p>

  <p class="par_style">Yet another</p>

</div>

Note that the rules for hard formatting are only used when xhtml_ignore_styles is set to true. Relying on these rules is not recommended; using real text styles is preferable. The rules are included because hard character formatting is very common even in otherwise well-structured documents.

4.4 Using OpenOffice.org to create XHTML documents

The configuration file cleanxhtml.xml that is distributed with Writer2LaTeX can be used to create semantically rich XHTML content, which can be formatted with your own stylesheet (you should edit the file to add the URL of the stylesheet you want to use).

A subset of the built-in styles in Writer is mapped to XHTML elements (note that the style names are localized, so this is for the English version of OpenOffice.org):

OOo Writer style     OOo Writer class             XHTML element
Text body            paragraph style              p
Sender               paragraph style              address
Quotations           paragraph style              blockquote
Preformatted Text    paragraph style              pre
List Heading         paragraph style              dt (in dl)
List Contents        paragraph style              dd (in dl)
Horizontal Rule      paragraph style              hr
Citation             text style                   cite
Definition           text style                   dfn
Emphasis             text style                   em
Example              text style                   samp
Source Text          text style                   code
Strong Emphasis      text style                   strong
Teletype             text style                   tt
User entry           text style                   kbd
Variable             text style                   var
bold                 hard formatting attribute    b
italics              hard formatting attribute    i
fixed pitch font     hard formatting attribute    tt
superscript          hard formatting attribute    sup
subscript            hard formatting attribute    sub

By using only these styles, you will create well-structured XHTML documents. See the document sample-xhtml.sxw for an example of how to use this.

Warning: Some elements are not allowed inside pre, so this might in some cases lead to invalid documents. This will be fixed in a later version of Writer2xhtml.
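
Until this is fixed, you can check a converted file yourself. The following Python sketch reports elements that XHTML 1.0 prohibits inside pre (img, object, big, small, sub and sup, according to the element prohibitions in Appendix B of the XHTML 1.0 recommendation). The function names are invented for this example, and the sketch assumes the file is well-formed XML that uses no named entities beyond the predefined XML ones; which of these elements Writer2xhtml may actually emit is not stated here.

```python
import xml.etree.ElementTree as ET

# Elements that XHTML 1.0 (Appendix B, element prohibitions) forbids in pre.
PROHIBITED_IN_PRE = {"img", "object", "big", "small", "sub", "sup"}


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def prohibited_in_pre(xhtml_text):
    """Return the names of prohibited elements found inside any pre element."""
    root = ET.fromstring(xhtml_text)
    found = []
    for element in root.iter():
        if local_name(element.tag) != "pre":
            continue
        for descendant in element.iter():
            if descendant is element:
                continue
            name = local_name(descendant.tag)
            if name in PROHIBITED_IN_PRE:
                found.append(name)
    return found
```

An empty list means that no prohibited element was found inside pre; it does not mean that the document as a whole is valid.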

7 This and the following options replace the former option xhtml_ignore_styles.