HTML is based on Unicode. DITA2Go does not directly support non-Unicode double-byte languages (except for Asian and Cyrillic code pages for HTML Help), nor right-to-left languages such as Hebrew and Arabic.
Character encoding determines what method is used to represent double-byte characters in the <body> section of HTML output. To specify encoding or, alternatively, numeric references:
[HTMLOptions]
; Encoding = ISO-8859-1 (HTML default, numeric refs),
; or None (write 0x80-0xFF as single characters)
Encoding=ISO-8859-1
; QuotedEncoding = No (default, W3C usage, required for JavaHelp),
; or Yes (put encoding in meta tag in single quotes, needed by some
; older browsers)
QuotedEncoding=No
; NumericCharRefs = Yes (default, always use &#nnn;)
; or No (use UTF-8 for XML)
NumericCharRefs=Yes
For XHTML, the DITA2Go default is to claim UTF-8 as the encoding, but to use numeric references of the form &#nnn; for all characters that would have to be encoded; this satisfies all browsers. That is, DITA2Go does not actually produce any characters with values greater than 127 using the UTF-8 encoding; instead, DITA2Go writes numeric character references for such characters, which are plain ASCII and therefore readable under any eight-bit encoding scheme.
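As a rough illustration of the default behavior described above (not DITA2Go's actual code), the following Python sketch replaces every character above 127 with a decimal reference. The function name to_numeric_refs and the sample string are assumptions made for this example.

```python
def to_numeric_refs(text: str) -> str:
    """Replace each character above 127 with a decimal reference (&#nnn;)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# Hypothetical sample: "cafe" with an acute accent on the final letter (U+00E9),
# written as an escape so that this snippet itself stays pure ASCII.
sample = "caf\u00e9"
encoded = to_numeric_refs(sample)

print(encoded)            # caf&#233;
print(encoded.isascii())  # True
```

Because the result contains only ASCII characters, it reads the same whether the file is declared as UTF-8, ISO-8859-1, or any other eight-bit encoding.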
For XHTML, you can specify a value for XMLEncoding (see §23.2.3 Specifying character encoding for generic XML) other than the default UTF-8. If you set Encoding=UTF-8, you get real UTF-8 encoding (two or more bytes for each character above 127) in place of the numeric character references. However, you can still force use of numeric references by also setting NumericCharRefs=Yes.
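To make the difference concrete, this small Python sketch shows the byte sequences that real UTF-8 encoding produces for a few characters above 127. The sample characters and the dictionary names are chosen only for illustration.

```python
# A few characters above 127, written as escapes so the snippet stays ASCII.
samples = {
    "e-acute": "\u00e9",      # U+00E9
    "Cyrillic Pe": "\u041f",  # U+041F
    "euro sign": "\u20ac",    # U+20AC
}

# Encode each character as UTF-8 and keep the resulting bytes.
utf8_bytes = {name: ch.encode("utf-8") for name, ch in samples.items()}

for name, raw in utf8_bytes.items():
    print(f"{name}: {len(raw)} bytes, hex {raw.hex()}")
```

The first two samples take two bytes each and the euro sign takes three, which is why a file written this way must be read as UTF-8 to display correctly.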
While Encoding=None is not strictly compliant, this setting can be useful for languages such as Russian, where almost the entire text would otherwise consist of numeric character references. Writing each such character as a single byte instead of a six-character reference gives a 6:1 reduction in size.
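The 6:1 figure can be checked by counting characters. The short Python sketch below assumes references of the form &#nnn; for values 0x80 through 0xFF; the variable names are made up for this example.

```python
# Every value from 0x80 to 0xFF has three decimal digits (128..255),
# so its reference "&#nnn;" is always six characters long.
ref_lengths = {len(f"&#{value};") for value in range(0x80, 0x100)}

# Written directly, as with Encoding=None, the same value is one byte.
raw_length = len(bytes([0xE0]))

print(ref_lengths)  # {6}
print(raw_length)   # 1
```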
To direct DITA2Go to supply single quotes around the charset attribute value, specify QuotedEncoding=Yes:
<meta http-equiv="Content-type" content="text/html; charset='ISO-8859-1'">