31.7.6 Assigning CSS classes based on Unicode character ranges

Suppose your document is translated into a non-Western language such as Japanese. After translation, some words might remain in Latin characters: product names, feature names, and acronyms, for example. The Latin glyphs in common Unicode fonts that include Japanese characters (such as Mincho) might be unacceptably ugly. What you need is an automatic way to specify a different font for those characters.

DITA2Go provides settings that allow you to assign a CSS class to a range of Unicode characters. You can specify more than one class for a given element; the values are additive, and in case of conflict between rules of equal specificity, the latest rule in the CSS file overrides earlier rules. The order of names in the class attribute itself does not matter. The net effect is that you can use this feature without disturbing the display of elements for which you already have other CSS rules, which is what makes the feature safe to use.
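
As a minimal sketch of how two classes combine on one element, assume the output contains a span such as <span class="term latin">. The class name "term", the markup, and the font names are hypothetical placeholders for illustration, not values that DITA2Go is known to produce:

```css
/* Hypothetical markup: <span class="term latin">DITA2Go</span> */

/* Existing rule for a character format (assumed class name). */
.term {
  font-style: italic;
  font-family: "MS Mincho", serif;
}

/* Rule for the character-range class. It appears later in the file
   and has the same specificity, so its font-family wins the conflict.
   The font-style from .term still applies, because .latin does not
   set that property. */
.latin {
  font-family: Georgia, "Times New Roman", serif;
}
```

Writing the attribute as class="latin term" would give the same result; only the order of the rules in the CSS file decides the conflict.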

To activate assignment of classes to Unicode character ranges:

[CSS]
; UseCharRangeClasses = No (default); or Yes (to activate settings in
; [CharacterRangeClasses] for marking spans by Unicode char range)
UseCharRangeClasses = Yes

To specify a class to use for spans of characters:

[CharacterRangeClasses]
; starting U+ code point (four or five hex digits) = class name,
; - (exclude from all classes), or * (allow in any class).
xxxx = classname   optional comment here
yyyy = *           allow in all classes
zzzz = -           exclude from all classes

The named class applies to the code point specified, plus all following code points up to the next setting. Any text after the first term (class name or symbol) is a comment. The initial state is * (allow in any class); the last setting should specify - (exclude from all classes).

For example, to flag English and European-language text remaining in a Japanese translation:

[CharacterRangeClasses]
0021 = latin  common symbols
0030 = *      digits
003A = latin  alpha, some symbols
00A5 = *      Yen sign
00A6 = latin  Latin-1, diacritics
0342 = greek  Greek diacritics
0346 = latin  Latin diacritics
0374 = greek  Greek letters
03E2 = -      Coptic and many more
1E00 = latin  Latin extended
1F00 = greek  Greek extended
2000 = *      lots of punctuation
2E80 = -      rest of the world
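
The class names assigned above take effect only when your CSS file has rules for them. A sketch of such rules follows; the font stacks are assumptions chosen for illustration, so substitute fonts that are available to your readers:

```css
/* Latin-script runs left in the Japanese text:
   use a Western serif font instead of the Japanese font's Latin glyphs. */
.latin {
  font-family: "Times New Roman", Times, serif;
}

/* Greek runs: a font assumed to include Greek glyphs. */
.greek {
  font-family: "Palatino Linotype", Palatino, serif;
}
```

Characters in the * ranges (digits, the Yen sign, general punctuation) get no class of their own; they are allowed in whichever class surrounds them.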

To flag Cyrillic in an English document:

[CharacterRangeClasses]
0021 = -
0400 = Russian
0514 = -
2000 = *
3000 = -
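
A matching CSS rule for this example might look like the following sketch. The font names are placeholders. Note that class selectors are matched case-sensitively in standards-mode HTML, so the selector must be spelled exactly as the class name in the setting:

```css
/* Matches the class name "Russian" exactly as written in
   [CharacterRangeClasses]; ".russian" would not match in
   standards-mode HTML. */
.Russian {
  font-family: "PT Serif", "Times New Roman", serif;
}
```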
