<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- This manual is for GNU Texinfo (version 6.7, 23 September 2019),
a documentation system that can produce both online information and a
printed manual from a single source using semantic markup.

Copyright (C) 1988, 1990, 1991, 1992, 1993, 1995, 1996, 1997,
1998, 1999, 2001, 2001, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Free Software
Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts.  A copy of the license is included in the section entitled
"GNU Free Documentation License". -->
<!-- Created by GNU Texinfo 6.7, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>HTML Xref 8-bit Character Expansion (GNU Texinfo 6.7)</title>

<meta name="description" content="HTML Xref 8-bit Character Expansion (GNU Texinfo 6.7)">
<meta name="keywords" content="HTML Xref 8-bit Character Expansion (GNU Texinfo 6.7)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="texi2any">
<link href="index.html" rel="start" title="Top">
<link href="Command-and-Variable-Index.html" rel="index" title="Command and Variable Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="HTML-Xref.html" rel="up" title="HTML Xref">
<link href="HTML-Xref-Mismatch.html" rel="next" title="HTML Xref Mismatch">
<link href="HTML-Xref-Command-Expansion.html" rel="prev" title="HTML Xref Command Expansion">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<span id="HTML-Xref-8_002dbit-Character-Expansion"></span><div class="header">
<p>
Next: <a href="HTML-Xref-Mismatch.html" accesskey="n" rel="next">HTML Xref Mismatch</a>, Previous: <a href="HTML-Xref-Command-Expansion.html" accesskey="p" rel="prev">HTML Xref Command Expansion</a>, Up: <a href="HTML-Xref.html" accesskey="u" rel="up">HTML Xref</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Command-and-Variable-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="HTML-Cross_002dreference-8_002dbit-Character-Expansion"></span><h4 class="subsection">22.4.4 HTML Cross-reference 8-bit Character Expansion</h4>
<span id="index-HTML-cross_002dreference-8_002dbit-character-expansion"></span>
<span id="index-8_002dbit-characters_002c-in-HTML-cross_002dreferences"></span>
<span id="index-Expansion-of-8_002dbit-characters-in-HTML-cross_002dreferences"></span>
<span id="index-Transliteration-of-8_002dbit-characters-in-HTML-cross_002dreferences"></span>

<p>Usually, characters other than plain 7-bit ASCII are transformed into
the corresponding Unicode code point(s) in Normalization Form&nbsp;C,
which uses precomposed characters where available.  (This is the
normalization form recommended by the W3C and other bodies.)  This
holds when that code point is <code>0xffff</code> or less, as it almost
always is.
</p>
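<p>As a brief illustration of Normalization Form&nbsp;C, the following
Python sketch uses the standard <code>unicodedata</code> module (not
part of Texinfo; chosen here only for demonstration) to show one
sequence that has a precomposed form and one that does not.
</p>
<div class="example">
<pre class="example">
```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT has a precomposed
# form, U+00E9, so NFC yields a single code point.
composed = unicodedata.normalize("NFC", "e\u0301")
print([format(ord(ch), "04x") for ch in composed])    # ['00e9']

# "B" followed by U+0306 COMBINING BREVE has no precomposed form,
# so NFC leaves it as two code points.
unchanged = unicodedata.normalize("NFC", "B\u0306")
print([format(ord(ch), "04x") for ch in unchanged])   # ['0042', '0306']
```
</pre></div>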
<p>These will then be further transformed by the rules above into the
string &lsquo;<samp>_<var>hhhh</var></samp>&rsquo;, where <var>hhhh</var> is the code point in hex.
</p>
<p>For example, combining this rule and the previous section:
</p>
<div class="example">
<pre class="example">@node @b{A} @TeX{} @u{B} @point{}@enddots{}
&rArr; A-TeX-B_0306-_2605_002e_002e_002e
</pre></div>

<p>Notice: 1)&nbsp;<code>@enddots</code> expands to three periods, which in
turn expand to three &lsquo;<samp>_002e</samp>&rsquo;&rsquo;s; 2)&nbsp;<code>@u{B}</code> is a &lsquo;B&rsquo;
with a breve accent, which does not exist as a precomposed Unicode
character and therefore expands to &lsquo;<samp>B_0306</samp>&rsquo; (B with combining
breve).
</p>
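<p>The example can be reproduced with a minimal Python sketch.  It is
not the <code>makeinfo</code> implementation: the function name
<code>escape_node_text</code> is hypothetical, the input is assumed to
be the node text after @-commands have already been expanded by hand,
and every code point is assumed to be <code>0xffff</code> or less.  The
sketch collapses whitespace, turns spaces into hyphens, keeps ASCII
letters and digits, and escapes everything else.
</p>
<div class="example">
<pre class="example">
```python
import re
import unicodedata

def escape_node_text(text):
    """Sketch: turn command-expanded node text into an HTML xref name."""
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace into one space and trim both ends.
    text = re.sub(r"[ \t\n\r\f]+", " ", text).strip(" ")
    out = []
    for ch in text:
        if ch == " ":
            out.append("-")
        elif ch.isascii() and ch.isalnum():
            out.append(ch)          # ASCII letters and digits are kept
        elif ord(ch) > 0xFFFF:
            raise ValueError("code point above 0xffff is outside this sketch")
        else:
            out.append("_" + format(ord(ch), "04x"))
    return "".join(out)

# "A TeX B-with-combining-breve BLACK-STAR ..." by hand-expanding the commands.
print(escape_node_text("A TeX B\u0306 \u2605..."))
# A-TeX-B_0306-_2605_002e_002e_002e
```
</pre></div>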
<p>When the Unicode code point is above <code>0xffff</code>, the transformation
is &lsquo;<samp>__<var>xxxxxx</var></samp>&rsquo;, that is, two leading underscores followed by
six hex digits.  Since Unicode has declared that its highest code
point is <code>0x10ffff</code>, this is sufficient.  (We felt it was better
to define this extra escape than to always use six hex digits, since
the first two would nearly always be zeros.)
</p>
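<p>The choice between the two escape forms can be sketched in Python as
follows.  The function name <code>escape_code_point</code> is
hypothetical and is used only for illustration; it takes a code point
that has already been found to need escaping.
</p>
<div class="example">
<pre class="example">
```python
def escape_code_point(cp):
    """Sketch: pick the _hhhh or __xxxxxx form for one code point."""
    if cp not in range(0x110000):
        raise ValueError("not a Unicode code point")
    if cp > 0xFFFF:
        # Two underscores, then six lowercase hex digits.
        return "__" + format(cp, "06x")
    # One underscore, then four lowercase hex digits.
    return "_" + format(cp, "04x")

print(escape_code_point(0x2605))    # _2605
print(escape_code_point(0x1F600))   # __01f600
print(escape_code_point(0x10FFFF))  # __10ffff
```
</pre></div>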
<p>This method works fine if the node name consists mostly of ASCII
characters and contains only a few 8-bit ones.  But if the document is
written in a language whose script is not based on the Latin alphabet
(for example, Ukrainian), it will create file names consisting almost
entirely of &lsquo;<samp>_<var>xxxx</var></samp>&rsquo; notations, which is inconvenient and
all but unreadable.  To handle such cases, <code>makeinfo</code> offers
the <samp>--transliterate-file-names</samp> command line option.  This
option enables <em>transliteration</em> of node names into ASCII
characters for the purposes of file name creation and referencing.
The transliteration is based on phonetic principles, which makes the
generated file names more easily understandable.
</p>
<span id="index-Normalization-Form-C_002c-Unicode"></span>
<p>For the definition of Unicode Normalization Form&nbsp;C, see Unicode
report UAX#15, <a href="http://www.unicode.org/reports/tr15/">http://www.unicode.org/reports/tr15/</a>.  Many
related documents and implementations are available elsewhere on the
web.
</p>

<hr>
<div class="header">
<p>
Next: <a href="HTML-Xref-Mismatch.html" accesskey="n" rel="next">HTML Xref Mismatch</a>, Previous: <a href="HTML-Xref-Command-Expansion.html" accesskey="p" rel="prev">HTML Xref Command Expansion</a>, Up: <a href="HTML-Xref.html" accesskey="u" rel="up">HTML Xref</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Command-and-Variable-Index.html" title="Index" rel="index">Index</a>]</p>
</div>



</body>
</html>