

Using Universal Converter

To run Thessalonica’s Universal Converter, click the “Universal Converter” button on the “Thessalonica” toolbar. The following dialog box should be displayed:

The “Thessalonica — Universal Converter” dialog

Setting conversion type

In the “Conversion Type” frame you can specify whether the conversion should be performed from an 8-bit encoding to Unicode or from Unicode to an 8-bit encoding.
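
The two directions can be pictured as one lookup table that is read either forwards or inverted. The following Python sketch is purely illustrative: the table entries and the names TABLE, REVERSE, to_unicode and to_8bit are hypothetical and are not taken from any encoding shipped with Thessalonica.

```python
# Hypothetical 8-bit-to-Unicode table: byte value -> Unicode character.
# These entries are invented for illustration only.
TABLE = {
    0x61: "\u03b1",  # GREEK SMALL LETTER ALPHA
    0x62: "\u03b2",  # GREEK SMALL LETTER BETA
    0xE1: "\u1f00",  # GREEK SMALL LETTER ALPHA WITH PSILI
}

# The reverse direction uses the same table, inverted.
REVERSE = {char: byte for byte, char in TABLE.items()}


def to_unicode(data: bytes) -> str:
    """8-bit -> Unicode. Bytes missing from the table are kept as Latin-1."""
    return "".join(TABLE.get(byte, chr(byte)) for byte in data)


def to_8bit(text: str) -> bytes:
    """Unicode -> 8-bit. Characters that cannot be represented become '?'."""
    out = bytearray()
    for char in text:
        if char in REVERSE:
            out.append(REVERSE[char])
        elif ord(char) < 256:
            out.append(ord(char))
        else:
            out.append(ord("?"))
    return bytes(out)
```

Keeping a single table and deriving the reverse mapping from it means the two directions cannot drift apart, provided no two bytes map to the same character.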

Setting formatting attributes

For both 8-bit text and Unicode text you can specify several formatting attributes, namely font family, weight, shape, point size and language. Depending on the conversion type you have set, Thessalonica will search for occurrences of text with one set of formatting attributes and apply the other set to the converted text. There is no need to set all formatting attributes: if a specific attribute doesn’t matter for your conversion, simply do the following:

Selecting an encoding

The most important control in the dialog is the “8-bit encoding” list box: its value specifies which set of rules will be applied to the conversion from ANSI to Unicode or vice versa. Currently Thessalonica supports several 8-bit encodings for Polytonic Greek. It may also be used to fix a common problem with Cyrillic text, which OpenOffice.org misinterprets as a sequence of accented Latin characters when converting from various 8-bit formats (like Word 6.0/95). Note, however, that Thessalonica is designed as a converter of formatted text: if you just want to replace all Latin-1 characters with Cyrillic letters, consider using the Cyrillic Tools macro library (available from here) instead. Of course, you can also extend Thessalonica with your own conversion tables.
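
The Cyrillic problem mentioned above can be reproduced outside OpenOffice.org. The following Python sketch assumes that the original text was stored as Windows-1251 and was read as Latin-1; the real source encoding of a given document may differ, and the function name fix_cyrillic is hypothetical. It works on plain strings only and ignores formatting, which is exactly what Thessalonica is meant to take into account.

```python
def fix_cyrillic(garbled: str) -> str:
    """Reinterpret Latin-1 characters as Windows-1251 bytes.

    Assumes every character in `garbled` is in the Latin-1 range;
    otherwise encode() raises UnicodeEncodeError.
    """
    return garbled.encode("latin-1").decode("cp1251")


original = "Привет"
# Simulate the misinterpretation: Windows-1251 bytes read as Latin-1.
garbled = original.encode("cp1251").decode("latin-1")

print(garbled)                # Ïðèâåò
print(fix_cyrillic(garbled))  # Привет
```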

Handling symbol fonts

Some 8-bit fonts designed for non-standard character sets (for example, Polytonic Greek) may use the MS Symbol encoding. Unlike Unicode TrueType fonts, symbol fonts can contain only 256 characters, and in Unicode applications those characters are mapped to the Unicode range 0xF000–0xF0FF. These codes are used, e.g., when you insert characters via the “Special Characters” dialog box in OpenOffice.org. However, you can also type text formatted with a symbol font directly. In this case the normal Unicode representation is used for all characters produced by your keyboard, although they may look the same as those inserted via the dialog.

So if you want Thessalonica to search for characters from the symbol range or to use them in the replacement, enable the “MS Symbol Encoding” checkbox in the “Universal Converter” dialog. Note that in some situations it is hard to tell how characters formatted with a symbol font are actually represented. For example, documents converted from MS Word may use both representations. In this case you can simply perform the conversion twice: once with the “MS Symbol Encoding” option enabled and once with it disabled.

Also note that on Unix systems OpenOffice.org may handle 8-bit Type 1 fonts (i.e. those that have their character set built into the .pfb file) exactly like symbol TrueType fonts, so this option may also be useful for such fonts.
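
The two representations described in this section differ only by a fixed offset of 0xF000. The following Python sketch illustrates that relationship; it is not Thessalonica’s own code, and the names SYMBOL_BASE, to_symbol_range and from_symbol_range are hypothetical. Leaving control characters below 0x20 untouched is a choice made for this sketch.

```python
SYMBOL_BASE = 0xF000  # start of the range used for symbol fonts


def to_symbol_range(text: str) -> str:
    """Shift characters 0x20-0xFF into 0xF020-0xF0FF; leave the rest alone."""
    return "".join(
        chr(SYMBOL_BASE + ord(c)) if 0x20 <= ord(c) <= 0xFF else c
        for c in text
    )


def from_symbol_range(text: str) -> str:
    """Shift characters 0xF000-0xF0FF back to 0x00-0xFF; leave the rest alone."""
    return "".join(
        chr(ord(c) - SYMBOL_BASE) if 0xF000 <= ord(c) <= 0xF0FF else c
        for c in text
    )


# A fragment that mixes both representations, as may happen in a
# document converted from MS Word: a typed "a" and an inserted U+F062.
mixed = "a\uf062"
print(from_symbol_range(mixed))        # ab
print(to_symbol_range(mixed) == "\uf061\uf062")
```

Normalizing a fragment to one representation first, as from_symbol_range does here, is an alternative to running the conversion twice when you process text outside OpenOffice.org.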

Performing conversion

Once you have set all options in the dialog as desired, press the “Convert now” button to start the conversion. If you selected some text before running the converter, only that fragment will be converted; otherwise, substitutions will be performed in the whole document. While the conversion is running, Thessalonica displays messages and progress indicators on the taskbar, showing the current state of the conversion process.
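
The scoping rules described on this page (selection versus whole document, and attributes that “don’t matter”) can be summarized in a small model. The following Python sketch is only a sketch under stated assumptions: the Run class, the matches and convert_runs functions, and the reduction of formatting to font and size are all hypothetical and do not reflect how Thessalonica or OpenOffice.org represent text internally.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Run:
    """Hypothetical stretch of text with uniform formatting."""
    text: str
    font: Optional[str] = None
    size: Optional[float] = None
    selected: bool = False


def matches(run: Run, font: Optional[str] = None,
            size: Optional[float] = None) -> bool:
    """An attribute left as None means 'does not matter'."""
    return ((font is None or run.font == font)
            and (size is None or run.size == size))


def convert_runs(runs: List[Run], convert: Callable[[str], str],
                 font: Optional[str] = None, size: Optional[float] = None,
                 new_font: Optional[str] = None) -> List[Run]:
    """Convert matching runs; limit the scope to the selection if there is one."""
    any_selected = any(r.selected for r in runs)
    result = []
    for r in runs:
        in_scope = r.selected or not any_selected
        if in_scope and matches(r, font, size):
            # Apply the conversion and the replacement font, if one is given.
            result.append(Run(convert(r.text), new_font or r.font,
                              r.size, r.selected))
        else:
            result.append(r)
    return result
```

For example, convert_runs(runs, str.upper, font="OldGreek", new_font="NewGreek") would touch only runs set in the hypothetical font “OldGreek”, whatever their size, and only the selected ones if any run is selected.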

