jpayne@68
|
1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
|
jpayne@68
|
2 <html>
|
jpayne@68
|
3 <!-- Created on February, 21 2024 by texi2html 1.78a -->
|
jpayne@68
|
4 <!--
|
jpayne@68
|
5 Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
|
jpayne@68
|
6 Karl Berry <karl@freefriends.org>
|
jpayne@68
|
7 Olaf Bachmann <obachman@mathematik.uni-kl.de>
|
jpayne@68
|
8 and many others.
|
jpayne@68
|
9 Maintained by: Many creative people.
|
jpayne@68
|
10 Send bugs and suggestions to <texi2html-bug@nongnu.org>
|
jpayne@68
|
11
|
jpayne@68
|
12 -->
|
jpayne@68
|
13 <head>
|
jpayne@68
|
14 <title>GNU gettext utilities: 4. Preparing Program Sources</title>
|
jpayne@68
|
15
|
jpayne@68
|
16 <meta name="description" content="GNU gettext utilities: 4. Preparing Program Sources">
|
jpayne@68
|
17 <meta name="keywords" content="GNU gettext utilities: 4. Preparing Program Sources">
|
jpayne@68
|
18 <meta name="resource-type" content="document">
|
jpayne@68
|
19 <meta name="distribution" content="global">
|
jpayne@68
|
20 <meta name="Generator" content="texi2html 1.78a">
|
jpayne@68
|
21 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
|
jpayne@68
|
22 <style type="text/css">
|
jpayne@68
|
23 <!--
|
jpayne@68
|
24 a.summary-letter {text-decoration: none}
|
jpayne@68
|
25 pre.display {font-family: serif}
|
jpayne@68
|
26 pre.format {font-family: serif}
|
jpayne@68
|
27 pre.menu-comment {font-family: serif}
|
jpayne@68
|
28 pre.menu-preformatted {font-family: serif}
|
jpayne@68
|
29 pre.smalldisplay {font-family: serif; font-size: smaller}
|
jpayne@68
|
30 pre.smallexample {font-size: smaller}
|
jpayne@68
|
31 pre.smallformat {font-family: serif; font-size: smaller}
|
jpayne@68
|
32 pre.smalllisp {font-size: smaller}
|
jpayne@68
|
33 span.roman {font-family:serif; font-weight:normal;}
|
jpayne@68
|
34 span.sansserif {font-family:sans-serif; font-weight:normal;}
|
jpayne@68
|
35 ul.toc {list-style: none}
|
jpayne@68
|
36 -->
|
jpayne@68
|
37 </style>
|
jpayne@68
|
38
|
jpayne@68
|
39
|
jpayne@68
|
40 </head>
|
jpayne@68
|
41
|
jpayne@68
|
42 <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
|
jpayne@68
|
43
|
jpayne@68
|
44 <table cellpadding="1" cellspacing="1" border="0">
|
jpayne@68
|
45 <tr><td valign="middle" align="left">[<a href="gettext_3.html#SEC16" title="Beginning of this chapter or previous chapter"> &lt;&lt; </a>]</td>
|
jpayne@68
|
46 <td valign="middle" align="left">[<a href="gettext_5.html#SEC35" title="Next chapter"> &gt;&gt; </a>]</td>
|
jpayne@68
|
47 <td valign="middle" align="left"> </td>
|
jpayne@68
|
48 <td valign="middle" align="left"> </td>
|
jpayne@68
|
49 <td valign="middle" align="left"> </td>
|
jpayne@68
|
50 <td valign="middle" align="left"> </td>
|
jpayne@68
|
51 <td valign="middle" align="left"> </td>
|
jpayne@68
|
52 <td valign="middle" align="left">[<a href="gettext_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
|
jpayne@68
|
53 <td valign="middle" align="left">[<a href="gettext_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
|
jpayne@68
|
54 <td valign="middle" align="left">[<a href="gettext_21.html#SEC389" title="Index">Index</a>]</td>
|
jpayne@68
|
55 <td valign="middle" align="left">[<a href="gettext_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
|
jpayne@68
|
56 </tr></table>
|
jpayne@68
|
57
|
jpayne@68
|
58 <hr size="2">
|
jpayne@68
|
59 <a name="Sources"></a>
|
jpayne@68
|
60 <a name="SEC17"></a>
|
jpayne@68
|
61 <h1 class="chapter"> <a href="gettext_toc.html#TOC17">4. Preparing Program Sources</a> </h1>
|
jpayne@68
|
62
|
jpayne@68
|
63
|
jpayne@68
|
64 <p>For the programmer, changes to the C source code fall into three
|
jpayne@68
|
65 categories. First, you have to make the localization functions
|
jpayne@68
|
66 known to all modules needing message translation. Second, you should
|
jpayne@68
|
67 properly trigger the operation of GNU <code>gettext</code> when the program
|
jpayne@68
|
68 initializes, usually from the <code>main</code> function. Last, you should
|
jpayne@68
|
69 identify, adjust and mark all constant strings in your program
|
jpayne@68
|
70 needing translation.
|
jpayne@68
|
71 </p>
|
jpayne@68
|
72
|
jpayne@68
|
73
|
jpayne@68
|
74 <a name="Importing"></a>
|
jpayne@68
|
75 <a name="SEC18"></a>
|
jpayne@68
|
76 <h2 class="section"> <a href="gettext_toc.html#TOC18">4.1 Importing the <code>gettext</code> declaration</a> </h2>
|
jpayne@68
|
77
|
jpayne@68
|
78 <p>Presuming that your set of programs, or package, has been adjusted
|
jpayne@68
|
79 so all needed GNU <code>gettext</code> files are available, and your
|
jpayne@68
|
80 ‘<tt>Makefile</tt>’ files are adjusted (see section <a href="gettext_13.html#SEC230">The Maintainer's View</a>), each C module
|
jpayne@68
|
81 having translated C strings should contain the line:
|
jpayne@68
|
82 </p>
|
jpayne@68
|
83 <a name="IDX116"></a>
|
jpayne@68
|
84 <table><tr><td> </td><td><pre class="example">#include &lt;libintl.h&gt;
|
jpayne@68
|
85 </pre></td></tr></table>
|
jpayne@68
|
86
|
jpayne@68
|
87 <p>Similarly, each C module containing <code>printf()</code>/<code>fprintf()</code>/...
|
jpayne@68
|
88 calls with a format string that could be a translated C string (even if
|
jpayne@68
|
89 the C string comes from a different C module) should contain the line:
|
jpayne@68
|
90 </p>
|
jpayne@68
|
91 <table><tr><td> </td><td><pre class="example">#include &lt;libintl.h&gt;
|
jpayne@68
|
92 </pre></td></tr></table>
|
jpayne@68
|
93
|
jpayne@68
|
94
|
jpayne@68
|
95 <a name="Triggering"></a>
|
jpayne@68
|
96 <a name="SEC19"></a>
|
jpayne@68
|
97 <h2 class="section"> <a href="gettext_toc.html#TOC19">4.2 Triggering <code>gettext</code> Operations</a> </h2>
|
jpayne@68
|
98
|
jpayne@68
|
99 <p>The initialization of locale data should be done with more or less
|
jpayne@68
|
100 the same code in every program, as demonstrated below:
|
jpayne@68
|
101 </p>
|
jpayne@68
|
102 <table><tr><td> </td><td><pre class="example">int
|
jpayne@68
|
103 main (int argc, char *argv[])
|
jpayne@68
|
104 {
|
jpayne@68
|
105 …
|
jpayne@68
|
106 setlocale (LC_ALL, "");
|
jpayne@68
|
107 bindtextdomain (PACKAGE, LOCALEDIR);
|
jpayne@68
|
108 textdomain (PACKAGE);
|
jpayne@68
|
109 …
|
jpayne@68
|
110 }
|
jpayne@68
|
111 </pre></td></tr></table>
|
jpayne@68
|
112
|
jpayne@68
|
113 <p><var>PACKAGE</var> and <var>LOCALEDIR</var> should be provided either by
|
jpayne@68
|
114 ‘<tt>config.h</tt>’ or by the Makefile. For now consult the <code>gettext</code>
|
jpayne@68
|
115 or <code>hello</code> sources for more information.
|
jpayne@68
|
116 </p>
|
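<p>The initialization above can be wrapped in a small helper, sketched here
under a few assumptions: the function name <code>init_i18n</code> is made up for
this illustration, and the fallback values for <var>PACKAGE</var> and
<var>LOCALEDIR</var> are placeholders that a real package would replace with
the definitions from ‘<tt>config.h</tt>’ or the Makefile.
</p>

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* PACKAGE and LOCALEDIR normally come from config.h or the Makefile;
   the fallback values below are placeholders for illustration only.  */
#ifndef PACKAGE
# define PACKAGE "hello"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/share/locale"
#endif

/* Hypothetical helper: initialize locale data for the program.
   Returns 0 on success, -1 if the message domain could not be
   bound or selected.  */
int
init_i18n (void)
{
  /* A NULL result means the environment names a locale the system
     does not support; the program then keeps running in the "C"
     locale, so this is reported but not treated as fatal.  */
  if (setlocale (LC_ALL, "") == NULL)
    fprintf (stderr, "warning: locale not supported, using \"C\"\n");

  if (bindtextdomain (PACKAGE, LOCALEDIR) == NULL)
    return -1;
  if (textdomain (PACKAGE) == NULL)
    return -1;
  return 0;
}
```

<p>A program would call <code>init_i18n ()</code> once, early in
<code>main</code>, before the first translated message is produced.
</p>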
jpayne@68
|
117 <a name="IDX117"></a>
|
jpayne@68
|
118 <a name="IDX118"></a>
|
jpayne@68
|
119 <p>The use of <code>LC_ALL</code> might not be appropriate for you.
|
jpayne@68
|
120 <code>LC_ALL</code> includes all locale categories and especially
|
jpayne@68
|
121 <code>LC_CTYPE</code>. This latter category is responsible for determining
|
jpayne@68
|
122 character classes with the <code>isalnum</code> etc. functions from
|
jpayne@68
|
123 ‘<tt>ctype.h</tt>’, which can give wrong results, especially in programs that process some
|
jpayne@68
|
124 kind of input language. For example, this would mean that
|
jpayne@68
|
125 source code using the ç (c-cedilla character) is runnable in
|
jpayne@68
|
126 France but not in the U.S.
|
jpayne@68
|
127 </p>
|
jpayne@68
|
128 <p>Some systems also have problems with parsing numbers using the
|
jpayne@68
|
129 <code>scanf</code> functions if a locale category other than <code>LC_ALL</code> is
|
jpayne@68
|
130 used. The standards say that formats in addition to the one known in the
|
jpayne@68
|
131 <code>"C"</code> locale might be recognized. But some systems seem to reject
|
jpayne@68
|
132 numbers in the <code>"C"</code> locale format. In some situations, it might
|
jpayne@68
|
133 also be a problem with the notation itself which makes it impossible to
|
jpayne@68
|
134 recognize whether the number is in the <code>"C"</code> locale or the local
|
jpayne@68
|
135 format. This can happen if thousands separator characters are used.
|
jpayne@68
|
136 Some locales define this character, according to national
|
jpayne@68
|
137 conventions, as <code>'.'</code>, which is the same character used in the
|
jpayne@68
|
138 <code>"C"</code> locale to denote the decimal point.
|
jpayne@68
|
139 </p>
|
jpayne@68
|
140 <p>So it is sometimes necessary to replace the <code>LC_ALL</code> line in the
|
jpayne@68
|
141 code above by a sequence of <code>setlocale</code> lines:
|
jpayne@68
|
142 </p>
|
jpayne@68
|
143 <table><tr><td> </td><td><pre class="example">{
|
jpayne@68
|
144 …
|
jpayne@68
|
145 setlocale (LC_CTYPE, "");
|
jpayne@68
|
146 setlocale (LC_MESSAGES, "");
|
jpayne@68
|
147 …
|
jpayne@68
|
148 }
|
jpayne@68
|
149 </pre></td></tr></table>
|
jpayne@68
|
150
|
jpayne@68
|
151 <a name="IDX119"></a>
|
jpayne@68
|
152 <a name="IDX120"></a>
|
jpayne@68
|
153 <a name="IDX121"></a>
|
jpayne@68
|
154 <a name="IDX122"></a>
|
jpayne@68
|
155 <a name="IDX123"></a>
|
jpayne@68
|
156 <a name="IDX124"></a>
|
jpayne@68
|
157 <a name="IDX125"></a>
|
jpayne@68
|
158 <p>On all POSIX conformant systems the locale categories <code>LC_CTYPE</code>,
|
jpayne@68
|
159 <code>LC_MESSAGES</code>, <code>LC_COLLATE</code>, <code>LC_MONETARY</code>,
|
jpayne@68
|
160 <code>LC_NUMERIC</code>, and <code>LC_TIME</code> are available. On some systems
|
jpayne@68
|
161 which are only ISO C compliant, <code>LC_MESSAGES</code> is missing, but
|
jpayne@68
|
162 a substitute for it is defined in GNU gettext's <code>&lt;libintl.h&gt;</code> and
|
jpayne@68
|
163 in GNU gnulib's <code>&lt;locale.h&gt;</code>.
|
jpayne@68
|
164 </p>
|
jpayne@68
|
165 <p>Note that changing the <code>LC_CTYPE</code> also affects the functions
|
jpayne@68
|
166 declared in the <code>&lt;ctype.h&gt;</code> standard header and some functions
|
jpayne@68
|
167 declared in the <code>&lt;string.h&gt;</code> and <code>&lt;stdlib.h&gt;</code> standard headers.
|
jpayne@68
|
168 If this is not
|
jpayne@68
|
169 desirable in your application (for example in a compiler's parser),
|
jpayne@68
|
170 you can use a set of substitute functions which hardwire the C locale,
|
jpayne@68
|
171 such as found in the modules ‘<samp>c-ctype</samp>’, ‘<samp>c-strcase</samp>’,
|
jpayne@68
|
172 ‘<samp>c-strcasestr</samp>’, ‘<samp>c-strtod</samp>’, ‘<samp>c-strtold</samp>’ in the GNU gnulib
|
jpayne@68
|
173 source distribution.
|
jpayne@68
|
174 </p>
|
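<p>To illustrate what "hardwiring the C locale" means, here is a minimal
sketch of a locale-independent character test. The name
<code>ascii_isalnum</code> is invented for this example; it is not the gnulib
interface, and a real program would more likely use the gnulib
‘<samp>c-ctype</samp>’ module mentioned above.
</p>

```c
#include <stdbool.h>

/* Locale-independent variant of isalnum(): only the ASCII letters and
   digits qualify, whatever LC_CTYPE is set to.  The range comparisons
   assume an ASCII-compatible execution character set.  The function
   name is hypothetical, chosen for this sketch only.  */
bool
ascii_isalnum (int c)
{
  return (c >= '0' && c <= '9')
         || (c >= 'a' && c <= 'z')
         || (c >= 'A' && c <= 'Z');
}
```

<p>Because the function never consults the current locale, a parser that
uses it classifies a byte such as 0xE7 (ç in ISO-8859-1) the same way
on every system.
</p>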
jpayne@68
|
175 <p>It is also possible to switch the locale back and forth between the
|
jpayne@68
|
176 environment dependent locale and the C locale, but this approach is
|
jpayne@68
|
177 normally avoided because a <code>setlocale</code> call is expensive,
|
jpayne@68
|
178 because it is tedious to determine the places where a locale switch
|
jpayne@68
|
179 is needed in a large program's source, and because switching a locale
|
jpayne@68
|
180 is not multithread-safe.
|
jpayne@68
|
181 </p>
|
jpayne@68
|
182
|
jpayne@68
|
183 <a name="Preparing-Strings"></a>
|
jpayne@68
|
184 <a name="SEC20"></a>
|
jpayne@68
|
185 <h2 class="section"> <a href="gettext_toc.html#TOC20">4.3 Preparing Translatable Strings</a> </h2>
|
jpayne@68
|
186
|
jpayne@68
|
187 <p>Before strings can be marked for translation, they sometimes need to
|
jpayne@68
|
188 be adjusted. Usually preparing a string for translation is done right
|
jpayne@68
|
189 before marking it, during the marking phase which is described in the
|
jpayne@68
|
190 next sections. What you have to keep in mind while doing that is the
|
jpayne@68
|
191 following.
|
jpayne@68
|
192 </p>
|
jpayne@68
|
193 <ul>
|
jpayne@68
|
194 <li>
|
jpayne@68
|
195 Decent English style.
|
jpayne@68
|
196
|
jpayne@68
|
197 </li><li>
|
jpayne@68
|
198 Entire sentences.
|
jpayne@68
|
199
|
jpayne@68
|
200 </li><li>
|
jpayne@68
|
201 Split at paragraphs.
|
jpayne@68
|
202
|
jpayne@68
|
203 </li><li>
|
jpayne@68
|
204 Use format strings instead of string concatenation.
|
jpayne@68
|
205
|
jpayne@68
|
206 </li><li>
|
jpayne@68
|
207 Use placeholders in format strings instead of embedded URLs.
|
jpayne@68
|
208
|
jpayne@68
|
209 </li><li>
|
jpayne@68
|
210 Use placeholders in format strings instead of programmer-defined format
|
jpayne@68
|
211 string directives.
|
jpayne@68
|
212
|
jpayne@68
|
213 </li><li>
|
jpayne@68
|
214 Avoid unusual markup and unusual control characters.
|
jpayne@68
|
215 </li></ul>
|
jpayne@68
|
216
|
jpayne@68
|
217 <p>Let's look at some examples of these guidelines.
|
jpayne@68
|
218 </p>
|
jpayne@68
|
219 <a name="SEC21"></a>
|
jpayne@68
|
220 <h3 class="subheading"> Decent English style </h3>
|
jpayne@68
|
221
|
jpayne@68
|
222 <p>Translatable strings should be in good English style. If slang language
|
jpayne@68
|
223 with abbreviations and shortcuts is used, often translators will not
|
jpayne@68
|
224 understand the message and will produce very inappropriate translations.
|
jpayne@68
|
225 </p>
|
jpayne@68
|
226 <table><tr><td> </td><td><pre class="example">"%s: is parameter\n"
|
jpayne@68
|
227 </pre></td></tr></table>
|
jpayne@68
|
228
|
jpayne@68
|
229 <p>This is nearly untranslatable: Is the displayed item <em>a</em> parameter or
|
jpayne@68
|
230 <em>the</em> parameter?
|
jpayne@68
|
231 </p>
|
jpayne@68
|
232 <table><tr><td> </td><td><pre class="example">"No match"
|
jpayne@68
|
233 </pre></td></tr></table>
|
jpayne@68
|
234
|
jpayne@68
|
235 <p>The ambiguity in this message makes it unintelligible: Is the program
|
jpayne@68
|
236 attempting to set something on fire? Does it mean "The given object does
|
jpayne@68
|
237 not match the template"? Does it mean "The template does not fit for any
|
jpayne@68
|
238 of the objects"?
|
jpayne@68
|
239 </p>
|
jpayne@68
|
240 <a name="IDX126"></a>
|
jpayne@68
|
241 <p>In both cases, adding more words to the message will help both the
|
jpayne@68
|
242 translator and the English speaking user.
|
jpayne@68
|
243 </p>
|
jpayne@68
|
244 <a name="SEC22"></a>
|
jpayne@68
|
245 <h3 class="subheading"> Entire sentences </h3>
|
jpayne@68
|
246
|
jpayne@68
|
247 <p>Translatable strings should be entire sentences. It is often not possible
|
jpayne@68
|
248 to translate single verbs or adjectives in a substitutable way.
|
jpayne@68
|
249 </p>
|
jpayne@68
|
250 <table><tr><td> </td><td><pre class="example">printf ("File %s is %s protected", filename, rw ? "write" : "read");
|
jpayne@68
|
251 </pre></td></tr></table>
|
jpayne@68
|
252
|
jpayne@68
|
253 <p>Most translators will not look at the source and will thus only see the
|
jpayne@68
|
254 string <code>"File %s is %s protected"</code>, which is unintelligible. Change
|
jpayne@68
|
255 this to
|
jpayne@68
|
256 </p>
|
jpayne@68
|
257 <table><tr><td> </td><td><pre class="example">printf (rw ? "File %s is write protected" : "File %s is read protected",
|
jpayne@68
|
258 filename);
|
jpayne@68
|
259 </pre></td></tr></table>
|
jpayne@68
|
260
|
jpayne@68
|
261 <p>This way the translator will not only understand the message, she will
|
jpayne@68
|
262 also be able to find the appropriate grammatical construction. A French
|
jpayne@68
|
263 translator, for example, translates "write protected" as "protected
|
jpayne@68
|
264 against writing".
|
jpayne@68
|
265 </p>
|
jpayne@68
|
266 <p>Entire sentences are also important because in many languages, the
|
jpayne@68
|
267 declension of some word in a sentence depends on the gender or the
|
jpayne@68
|
268 number (singular/plural) of another part of the sentence. There are
|
jpayne@68
|
269 usually more interdependencies between words than in English. The
|
jpayne@68
|
270 consequence is that asking a translator to translate two half-sentences
|
jpayne@68
|
271 and then combining these two half-sentences through dumb string concatenation
|
jpayne@68
|
272 will not work for many languages, even though it would work for English.
|
jpayne@68
|
273 That's why translators need to handle entire sentences.
|
jpayne@68
|
274 </p>
|
jpayne@68
|
275 <p>Often sentences don't fit into a single line. If a sentence is output
|
jpayne@68
|
276 using two subsequent <code>printf</code> statements, like this
|
jpayne@68
|
277 </p>
|
jpayne@68
|
278 <table><tr><td> </td><td><pre class="example">printf ("Locale charset \"%s\" is different from\n", lcharset);
|
jpayne@68
|
279 printf ("input file charset \"%s\".\n", fcharset);
|
jpayne@68
|
280 </pre></td></tr></table>
|
jpayne@68
|
281
|
jpayne@68
|
282 <p>the translator would have to translate two half sentences, but nothing
|
jpayne@68
|
283 in the POT file would tell her that the two half sentences belong together.
|
jpayne@68
|
284 It is necessary to merge the two <code>printf</code> statements so that the
|
jpayne@68
|
285 translator can handle the entire sentence at once and decide at which
|
jpayne@68
|
286 place to insert a line break in the translation (if at all):
|
jpayne@68
|
287 </p>
|
jpayne@68
|
288 <table><tr><td> </td><td><pre class="example">printf ("Locale charset \"%s\" is different from\n\
|
jpayne@68
|
289 input file charset \"%s\".\n", lcharset, fcharset);
|
jpayne@68
|
290 </pre></td></tr></table>
|
jpayne@68
|
291
|
jpayne@68
|
292 <p>You may now ask: how about two or more adjacent sentences? Like in this case:
|
jpayne@68
|
293 </p>
|
jpayne@68
|
294 <table><tr><td> </td><td><pre class="example">puts ("Apollo 13 scenario: Stack overflow handling failed.");
|
jpayne@68
|
295 puts ("On the next stack overflow we will crash!!!");
|
jpayne@68
|
296 </pre></td></tr></table>
|
jpayne@68
|
297
|
jpayne@68
|
298 <p>Should these two statements be merged into a single one? I would recommend
|
jpayne@68
|
299 merging them if the two sentences are related to each other, because that
|
jpayne@68
|
300 makes it easier for the translator to understand and translate both. On
|
jpayne@68
|
301 the other hand, if one of the two messages is a stereotypic one, occurring
|
jpayne@68
|
302 in other places as well, you will do a favour to the translator by not
|
jpayne@68
|
303 merging the two. (Identical messages occurring in several places are
|
jpayne@68
|
304 combined by xgettext, so the translator has to handle them once only.)
|
jpayne@68
|
305 </p>
|
jpayne@68
|
306 <a name="SEC23"></a>
|
jpayne@68
|
307 <h3 class="subheading"> Split at paragraphs </h3>
|
jpayne@68
|
308
|
jpayne@68
|
309 <p>Translatable strings should be limited to one paragraph; don't let a
|
jpayne@68
|
310 single message be longer than ten lines. The reason is that when the
|
jpayne@68
|
311 translatable string changes, the translator is faced with the task of
|
jpayne@68
|
312 updating the entire translated string. Maybe only a single word will
|
jpayne@68
|
313 have changed in the English string, but the translator doesn't see that
|
jpayne@68
|
314 (with the current translation tools), therefore she has to proofread
|
jpayne@68
|
315 the entire message.
|
jpayne@68
|
316 </p>
|
jpayne@68
|
317 <a name="IDX127"></a>
|
jpayne@68
|
318 <p>Many GNU programs have a ‘<samp>--help</samp>’ output that extends over several
|
jpayne@68
|
319 screen pages. It is a courtesy towards the translators to split such a
|
jpayne@68
|
320 message into several ones of five to ten lines each. While doing that,
|
jpayne@68
|
321 you can also attempt to split the documented options into groups,
|
jpayne@68
|
322 such as the input options, the output options, and the informative
|
jpayne@68
|
323 output options. This will help every user to find the option he is
|
jpayne@68
|
324 looking for.
|
jpayne@68
|
325 </p>
|
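<p>A help routine split along these lines might look like the following
sketch. The program name, the options and their descriptions are invented
for illustration, and the <code>_</code> macro is only a stand-in so that
the example compiles on its own; in a real program it would map to
<code>gettext</code>.
</p>

```c
#include <stdio.h>

/* Stand-in for the gettext marking macro, so that this sketch compiles
   without a message catalog.  A real program would define it in terms
   of gettext().  */
#define _(msgid) (msgid)

/* Print the help text as several short messages, one per option group,
   instead of one screen-filling string.  All options shown here are
   hypothetical.  */
void
print_help (FILE *stream)
{
  fputs (_("Usage: frob [OPTION]... FILE...\n"), stream);

  fputs (_("\
Input options:\n\
  -f, --from=CHARSET   character set of the input files\n\
  -r, --recursive      descend into directories\n"), stream);

  fputs (_("\
Output options:\n\
  -o, --output=FILE    write the result to FILE\n\
  -s, --sort           sort the output\n"), stream);

  fputs (_("\
Informative output:\n\
  -h, --help           display this help and exit\n\
  -V, --version        output version information and exit\n"), stream);
}
```

<p>When one option description changes, only the message of its group has
to be revisited by the translator; the other groups stay untouched.
</p>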
jpayne@68
|
326 <a name="SEC24"></a>
|
jpayne@68
|
327 <h3 class="subheading"> No string concatenation </h3>
|
jpayne@68
|
328
|
jpayne@68
|
329 <p>Hardcoded string concatenation is sometimes used to construct English
|
jpayne@68
|
330 strings:
|
jpayne@68
|
331 </p>
|
jpayne@68
|
332 <table><tr><td> </td><td><pre class="example">strcpy (s, "Replace ");
|
jpayne@68
|
333 strcat (s, object1);
|
jpayne@68
|
334 strcat (s, " with ");
|
jpayne@68
|
335 strcat (s, object2);
|
jpayne@68
|
336 strcat (s, "?");
|
jpayne@68
|
337 </pre></td></tr></table>
|
jpayne@68
|
338
|
jpayne@68
|
339 <p>In order to present to the translator only entire sentences, and also
|
jpayne@68
|
340 because in some languages the translator might want to swap the order
|
jpayne@68
|
341 of <code>object1</code> and <code>object2</code>, it is necessary to change this
|
jpayne@68
|
342 to use a format string:
|
jpayne@68
|
343 </p>
|
jpayne@68
|
344 <table><tr><td> </td><td><pre class="example">sprintf (s, "Replace %s with %s?", object1, object2);
|
jpayne@68
|
345 </pre></td></tr></table>
|
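<p>The following sketch shows why the format string helps with word order.
The helper name <code>format_replace_question</code> is made up for this
example, and the format is passed in as a parameter only so that the effect
of a translated format can be shown without a message catalog; in a real
program it would be the result of <code>gettext</code>. The sketch also uses
<code>snprintf</code> rather than <code>sprintf</code> so that the output
cannot overrun the buffer.
</p>

```c
#include <stdio.h>

/* Hypothetical helper: build the question from a format string that
   takes two %s arguments.  Returns what snprintf returns, that is, the
   length the complete result needs (excluding the terminating NUL).  */
int
format_replace_question (char *buf, size_t size, const char *format,
                         const char *object1, const char *object2)
{
  return snprintf (buf, size, format, object1, object2);
}
```

<p>With the English format <code>"Replace %s with %s?"</code> and the
arguments <code>"a"</code> and <code>"b"</code>, the buffer receives
<code>"Replace a with b?"</code>. A translation that needs the opposite order
can refer to the arguments by position, for instance with directives such as
<code>%2$s</code> and <code>%1$s</code>; these positional directives are a
POSIX extension rather than part of ISO C.
</p>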
jpayne@68
|
346
|
jpayne@68
|
347 <a name="IDX128"></a>
|
jpayne@68
|
348 <p>A similar case is compile time concatenation of strings. The ISO C 99
|
jpayne@68
|
349 include file <code>&lt;inttypes.h&gt;</code> contains a macro <code>PRId64</code> that
|
jpayne@68
|
350 can be used as a formatting directive for outputting an ‘<samp>int64_t</samp>’
|
jpayne@68
|
351 integer through <code>printf</code>. It expands to a constant string, usually
|
jpayne@68
|
352 "d" or "ld" or "lld" or something like this, depending on the platform.
|
jpayne@68
|
353 Assume you have code like
|
jpayne@68
|
354 </p>
|
jpayne@68
|
355 <table><tr><td> </td><td><pre class="example">printf ("The amount is %0" PRId64 "\n", number);
|
jpayne@68
|
356 </pre></td></tr></table>
|
jpayne@68
|
357
|
jpayne@68
|
358 <p>The <code>gettext</code> tools and library have special support for these
|
jpayne@68
|
359 <code>&lt;inttypes.h&gt;</code> macros. You can therefore simply write
|
jpayne@68
|
360 </p>
|
jpayne@68
|
361 <table><tr><td> </td><td><pre class="example">printf (gettext ("The amount is %0" PRId64 "\n"), number);
|
jpayne@68
|
362 </pre></td></tr></table>
|
jpayne@68
|
363
|
jpayne@68
|
364 <p>The PO file will contain the string "The amount is %0&lt;PRId64&gt;\n".
|
jpayne@68
|
365 The translators will provide a translation containing "%0&lt;PRId64&gt;"
|
jpayne@68
|
366 as well, and at runtime the <code>gettext</code> function's result will
|
jpayne@68
|
367 contain the appropriate constant string, "d" or "ld" or "lld".
|
jpayne@68
|
368 </p>
|
jpayne@68
|
369 <p>This works only for the predefined <code>&lt;inttypes.h&gt;</code> macros. If
|
jpayne@68
|
370 you have defined your own similar macros, let's say ‘<samp>MYPRId64</samp>’,
|
jpayne@68
|
371 that are not known to <code>xgettext</code>, the solution for this problem
|
jpayne@68
|
372 is to change the code like this:
|
jpayne@68
|
373 </p>
|
jpayne@68
|
374 <table><tr><td> </td><td><pre class="example">char buf1[100];
|
jpayne@68
|
375 sprintf (buf1, "%0" MYPRId64, number);
|
jpayne@68
|
376 printf (gettext ("The amount is %s\n"), buf1);
|
jpayne@68
|
377 </pre></td></tr></table>
|
jpayne@68
|
378
|
jpayne@68
|
379 <p>This means that you put the platform-dependent code in one statement, and the
|
jpayne@68
|
380 internationalization code in a different statement. Note that a buffer length
|
jpayne@68
|
381 of 100 is safe, because all available hardware integer types are limited to
|
jpayne@68
|
382 128 bits, and to print a 128 bit integer one needs at most 54 characters,
|
jpayne@68
|
383 regardless of whether it is printed in decimal, octal or hexadecimal.
|
jpayne@68
|
384 </p>
|
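<p>Put together as a compilable unit, the two-statement approach could look
like this sketch. <code>MYPRId64</code> is defined here in terms of
<code>PRId64</code> purely as an example of a project-specific macro, the
function name <code>format_amount</code> is invented, and <code>_</code> is a
stand-in for the <code>gettext</code> call.
</p>

```c
#include <inttypes.h>
#include <stdio.h>

/* Example of a programmer-defined macro that xgettext does not know.  */
#define MYPRId64 PRId64

/* Stand-in for the gettext call, so that the sketch needs no catalog.  */
#define _(msgid) (msgid)

/* Hypothetical helper: format NUMBER into OUT (of SIZE bytes).
   Returns the length of the complete message, as snprintf does.  */
int
format_amount (char *out, size_t size, int64_t number)
{
  char buf1[100];

  /* Platform-dependent statement: never seen by translators.  */
  snprintf (buf1, sizeof buf1, "%0" MYPRId64, number);

  /* Internationalized statement: the translatable string only
     contains the portable %s directive.  */
  return snprintf (out, size, _("The amount is %s\n"), buf1);
}
```

<p>The translator sees only <code>"The amount is %s\n"</code>, which
<code>msgfmt -c</code> can check like any other C format string.
</p>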
jpayne@68
|
385 <a name="IDX129"></a>
|
jpayne@68
|
386 <a name="IDX130"></a>
|
jpayne@68
|
387 <p>All this applies to other programming languages as well. For example, in
|
jpayne@68
|
388 Java and C#, string concatenation is very frequently used, because it is a
|
jpayne@68
|
389 compiler built-in operator. As in C, in Java you would change
|
jpayne@68
|
390 </p>
|
jpayne@68
|
391 <table><tr><td> </td><td><pre class="example">System.out.println("Replace "+object1+" with "+object2+"?");
|
jpayne@68
|
392 </pre></td></tr></table>
|
jpayne@68
|
393
|
jpayne@68
|
394 <p>into a statement involving a format string:
|
jpayne@68
|
395 </p>
|
jpayne@68
|
396 <table><tr><td> </td><td><pre class="example">System.out.println(
|
jpayne@68
|
397 MessageFormat.format("Replace {0} with {1}?",
|
jpayne@68
|
398 new Object[] { object1, object2 }));
|
jpayne@68
|
399 </pre></td></tr></table>
|
jpayne@68
|
400
|
jpayne@68
|
401 <p>Similarly, in C#, you would change
|
jpayne@68
|
402 </p>
|
jpayne@68
|
403 <table><tr><td> </td><td><pre class="example">Console.WriteLine("Replace "+object1+" with "+object2+"?");
|
jpayne@68
|
404 </pre></td></tr></table>
|
jpayne@68
|
405
|
jpayne@68
|
406 <p>into a statement involving a format string:
|
jpayne@68
|
407 </p>
|
jpayne@68
|
408 <table><tr><td> </td><td><pre class="example">Console.WriteLine(
|
jpayne@68
|
409 String.Format("Replace {0} with {1}?", object1, object2));
|
jpayne@68
|
410 </pre></td></tr></table>
|
jpayne@68
|
411
|
jpayne@68
|
412 <a name="SEC25"></a>
|
jpayne@68
|
413 <h3 class="subheading"> No embedded URLs </h3>
|
jpayne@68
|
414
|
jpayne@68
|
415 <p>It is good practice not to embed URLs in translatable strings, for several reasons:
|
jpayne@68
|
416 </p><ul>
|
jpayne@68
|
417 <li>
|
jpayne@68
|
418 It avoids possible mistakes during copy and paste.
|
jpayne@68
|
419 </li><li>
|
jpayne@68
|
420 Translators cannot translate the URLs or, by mistake, use the URLs from
|
jpayne@68
|
421 other packages that are present in their compendium.
|
jpayne@68
|
422 </li><li>
|
jpayne@68
|
423 When the URLs change, translators don't need to revisit the translation
|
jpayne@68
|
424 of the string.
|
jpayne@68
|
425 </li></ul>
|
jpayne@68
|
426
|
jpayne@68
|
427 <p>The same holds for email addresses.
|
jpayne@68
|
428 </p>
|
jpayne@68
|
429 <p>So, you would change
|
jpayne@68
|
430 </p>
|
jpayne@68
|
431 <table><tr><td> </td><td><pre class="smallexample">fputs (_("GNU GPL version 3 &lt;https://gnu.org/licenses/gpl.html&gt;\n"),
|
jpayne@68
|
432 stream);
|
jpayne@68
|
433 </pre></td></tr></table>
|
jpayne@68
|
434
|
jpayne@68
|
435 <p>to
|
jpayne@68
|
436 </p>
|
jpayne@68
|
437 <table><tr><td> </td><td><pre class="smallexample">fprintf (stream, _("GNU GPL version 3 &lt;%s&gt;\n"),
|
jpayne@68
|
438 "https://gnu.org/licenses/gpl.html");
|
jpayne@68
|
439 </pre></td></tr></table>
|
jpayne@68
|
440
|
jpayne@68
|
441 <a name="SEC26"></a>
|
jpayne@68
|
442 <h3 class="subheading"> No programmer-defined format string directives </h3>
|
jpayne@68
|
443
|
jpayne@68
|
444 <p>The GNU C Library's <code>&lt;printf.h&gt;</code> facility and the C++ standard library's <code>&lt;format&gt;</code> header file make it possible for the programmer to define their own format string directives. However, such format directives cannot be used in translatable strings, for two reasons:
|
jpayne@68
|
445 </p><ul>
|
jpayne@68
|
446 <li>
|
jpayne@68
|
447 There is no reference documentation for format strings with such directives that the translators could consult. They would therefore have to guess where the directive starts and where it ends.
|
jpayne@68
|
448 </li><li>
|
jpayne@68
|
449 An ‘<samp>msgfmt -c</samp>’ invocation cannot check whether the translator has produced a compatible translation of the format string. As a consequence, when a format string contains a programmer-defined directive, the program may crash at runtime when it uses the translated format string.
|
jpayne@68
|
450 </li></ul>
|
jpayne@68
|
451
|
jpayne@68
|
452 <p>To avoid this situation, you need to move the formatting with the custom directive into a format string that does not get translated.
|
jpayne@68
|
453 </p>
|
jpayne@68
|
454 <p>For example, assuming code that makes use of a <code>%r</code> directive:
|
jpayne@68
|
455 </p>
|
jpayne@68
|
456 <table><tr><td> </td><td><pre class="smallexample">fprintf (stream, _("The contents is: %r"), data);
|
jpayne@68
|
457 </pre></td></tr></table>
|
jpayne@68
|
458
|
jpayne@68
|
459 <p>you would rewrite it to:
|
jpayne@68
|
460 </p>
|
jpayne@68
|
461 <table><tr><td> </td><td><pre class="smallexample">char *tmp;
|
jpayne@68
|
462 if (asprintf (&amp;tmp, "%r", data) &lt; 0)
|
jpayne@68
|
463   error (...);
|
jpayne@68
|
464 fprintf (stream, _("The contents is: %s"), tmp);
|
jpayne@68
|
465 free (tmp);
|
jpayne@68
|
466 </pre></td></tr></table>
|
jpayne@68
|
467
|
jpayne@68
|
468 <p>Similarly, in C++, assuming you have defined a custom <code>formatter</code> for the type of <code>data</code>, the code
|
jpayne@68
|
469 </p>
|
jpayne@68
|
470 <table><tr><td> </td><td><pre class="smallexample">cout &lt;&lt; format (_("The contents is: {:#$#}"), data);
|
jpayne@68
|
471 </pre></td></tr></table>
|
jpayne@68
|
472
|
jpayne@68
|
473 <p>should be rewritten to:
|
jpayne@68
|
474 </p>
|
jpayne@68
|
475 <table><tr><td> </td><td><pre class="smallexample">string tmp = format ("{:#$#}", data);
|
jpayne@68
|
476 cout &lt;&lt; format (_("The contents is: {}"), tmp);
|
jpayne@68
|
477 </pre></td></tr></table>
|
jpayne@68
|
478
|
jpayne@68
|
479 <a name="SEC27"></a>
|
jpayne@68
|
480 <h3 class="subheading"> No unusual markup </h3>
|
jpayne@68
|
481
|
jpayne@68
|
482 <p>Unusual markup or control characters should not be used in translatable
|
jpayne@68
|
483 strings. Translators will likely not understand the particular meaning
|
jpayne@68
|
484 of the markup or control characters.
|
jpayne@68
|
485 </p>
|
jpayne@68
|
486 <p>For example, if you have a convention that ‘<samp>|</samp>’ delimits the
|
jpayne@68
|
487 left-hand and right-hand part of some GUI elements, translators will
|
jpayne@68
|
488 often not understand it without specific comments. It might be
|
jpayne@68
|
489 better to have the translator translate the left-hand and right-hand
|
jpayne@68
|
490 part separately.
|
jpayne@68
|
491 </p>
|
jpayne@68
|
492 <p>Another example is the ‘<samp>argp</samp>’ convention to use a single ‘<samp>\v</samp>’
|
jpayne@68
|
493 (vertical tab) control character to delimit two sections inside a
|
jpayne@68
|
494 string. This is flawed. Some translators may convert it to a simple
|
jpayne@68
|
495 newline, some to blank lines. With some PO file editors it may not be
|
jpayne@68
|
496 easy to even enter a vertical tab control character. So, you cannot
|
jpayne@68
|
497 be sure that the translation will contain a ‘<samp>\v</samp>’ character, at the
|
jpayne@68
|
498 corresponding position. The solution is, again, to let the translator
|
jpayne@68
|
499 translate two separate strings and combine at run-time the two translated
|
jpayne@68
|
500 strings with the ‘<samp>\v</samp>’ required by the convention.
|
jpayne@68
|
501 </p>
|
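<p>A minimal sketch of that run-time combination follows. The two messages
and the function name <code>build_argp_doc</code> are invented for this
example, and <code>_</code> is again a stand-in for the <code>gettext</code>
call.
</p>

```c
#include <stdio.h>

/* Stand-in for the gettext call, so that the sketch needs no catalog.  */
#define _(msgid) (msgid)

/* Hypothetical helper: join two separately translated sections with
   the vertical tab that the argp convention expects.  The control
   character lives in the untranslated format string only.  Returns
   the length of the complete result, as snprintf does.  */
int
build_argp_doc (char *buf, size_t size)
{
  return snprintf (buf, size, "%s\v%s",
                   _("Frobnicate the given files."),
                   _("Report bugs to the maintainer."));
}
```

<p>Since neither translatable string contains the ‘<samp>\v</samp>’, no
translation can lose or misplace it.
</p>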
jpayne@68
|
502 <p>HTML markup, however, is common enough that it's probably OK to use in
|
jpayne@68
|
503 translatable strings. But please bear in mind that the GNU gettext tools
|
jpayne@68
|
504 don't verify that the translations are well-formed HTML.
|
jpayne@68
|
505 </p>
|
jpayne@68
|
506
|
jpayne@68
|
507 <a name="Mark-Keywords"></a>
|
jpayne@68
|
508 <a name="SEC28"></a>
|
jpayne@68
|
509 <h2 class="section"> <a href="gettext_toc.html#TOC21">4.4 How Marks Appear in Sources</a> </h2>
|
jpayne@68
|
510
|
jpayne@68
|
511 <p>All strings requiring translation should be marked in the C sources. Marking
|
jpayne@68
|
512 is done in such a way that each translatable string appears to be
|
jpayne@68
|
513 the sole argument of some function or preprocessor macro. There are
|
jpayne@68
|
514 only a few such possible functions or macros meant for translation,
|
jpayne@68
|
515 and their names are said to be marking keywords. The marking is
|
jpayne@68
|
516 attached to strings themselves, rather than to what we do with them.
|
jpayne@68
|
517 This approach has more uses. A blatant example is an error message
|
jpayne@68
|
518 produced by formatting. The format string needs translation, as
|
jpayne@68
|
519 well as some strings inserted through some ‘<samp>%s</samp>’ specification
|
jpayne@68
|
520 in the format, while the result from <code>sprintf</code> may have so many
|
jpayne@68
|
521 different instances that it is impractical to list them all in some
|
jpayne@68
|
522 ‘<samp>error_string_out()</samp>’ routine, say.
|
jpayne@68
|
523 </p>
|
jpayne@68
|
524 <p>This marking operation has two goals. The first goal of marking
|
jpayne@68
|
525 is for triggering the retrieval of the translation, at run time.
|
jpayne@68
|
526 The keyword is possibly resolved into a routine able to dynamically
|
jpayne@68
|
527 return the proper translation, as far as possible or wanted, for the
|
jpayne@68
|
528 argument string. Most localizable strings are found in executable
|
jpayne@68
|
529 positions, that is, attached to variables or given as parameters to
|
jpayne@68
|
530 functions. But this is not universal usage, and some translatable
|
jpayne@68
|
531 strings appear in structured initializations. See section <a href="#SEC31">Special Cases of Translatable Strings</a>.
|
jpayne@68
|
532 </p>
|
jpayne@68
|
533 <p>The second goal of the marking operation is to help <code>xgettext</code>
|
jpayne@68
|
534 properly extract all translatable strings when it scans a set
|
jpayne@68
|
535 of program sources and produces PO file templates.
|
jpayne@68
|
536 </p>
|
jpayne@68
|
537 <p>The canonical keyword for marking translatable strings is
|
jpayne@68
|
538 ‘<samp>gettext</samp>’; it gave its name to the whole GNU <code>gettext</code>
|
jpayne@68
|
539 package. For packages making only light use of the ‘<samp>gettext</samp>’
|
jpayne@68
|
540 keyword, macro or function, it is easily used <em>as is</em>. However,
|
jpayne@68
|
541 for packages using the <code>gettext</code> interface more heavily, it
|
jpayne@68
|
542 is usually more convenient to give the main keyword a shorter, less
|
jpayne@68
|
543 obtrusive name. Indeed, the keyword might appear on a lot of strings
|
jpayne@68
|
544 all over the package, and programmers usually do not want nor need
|
jpayne@68
|
545 their program sources to remind them forcefully, all the time, that they
|
jpayne@68
|
546 are internationalized. Further, a long keyword has the disadvantage
|
jpayne@68
|
547 of using more horizontal space, forcing more indentation work on
|
jpayne@68
|
548 sources for those trying to keep them within 79 or 80 columns.
|
jpayne@68
|
549 </p>
|
jpayne@68
|
550 <a name="IDX131"></a>
|
jpayne@68
|
551 <p>Many packages use ‘<samp>_</samp>’ (a simple underline) as a keyword,
|
jpayne@68
|
552 and write ‘<samp>_("Translatable string")</samp>’ instead of ‘<samp>gettext
|
jpayne@68
|
553 ("Translatable string")</samp>’. Further, the GNU coding standards rule that
|
jpayne@68
|
554 requires a space between the keyword and the opening
|
jpayne@68
|
555 parenthesis is relaxed, in practice, for this particular usage.
|
jpayne@68
|
556 So, the textual overhead per translatable string is reduced to
|
jpayne@68
|
557 only three characters: the underline and the two parentheses.
|
jpayne@68
|
558 However, even if GNU <code>gettext</code> uses this convention internally,
|
jpayne@68
|
559 it does not offer it officially. The real, genuine keyword is truly
|
jpayne@68
|
560 ‘<samp>gettext</samp>’ indeed. It is fairly easy for those wanting to use
|
jpayne@68
|
561 ‘<samp>_</samp>’ instead of ‘<samp>gettext</samp>’ to declare:
|
jpayne@68
|
562 </p>
|
jpayne@68
|
563 <table><tr><td> </td><td><pre class="example">#include <libintl.h>
|
jpayne@68
|
564 #define _(String) gettext (String)
|
jpayne@68
|
565 </pre></td></tr></table>
|
jpayne@68
|
566
|
jpayne@68
|
567 <p>instead of merely using ‘<samp>#include <libintl.h></samp>’.
|
jpayne@68
|
568 </p>
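<p>A short sketch of how the shorthand might then be used in a source file
follows. The function <code>greeting</code> is hypothetical, and the comment
assumes <code>xgettext</code> is told about the extra keyword through its
‘<samp>--keyword</samp>’ option.
</p>

```c
#include <libintl.h>

/* Shorthand keyword.  xgettext recognizes _ in C sources only when it
   is told to, for example with --keyword=_.  */
#define _(String) gettext (String)

/* greeting is a hypothetical function used only for illustration.  */
const char *
greeting (void)
{
  return _("Hello, world!");
}
```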
|
jpayne@68
|
569 <p>The marking keywords ‘<samp>gettext</samp>’ and ‘<samp>_</samp>’ take the translatable
|
jpayne@68
|
570 string as sole argument. It is also possible to define marking functions
|
jpayne@68
|
571 that take it at another argument position. It is even possible to make
|
jpayne@68
|
572 the marked argument position depend on the total number of arguments of
|
jpayne@68
|
573 the function call; this is useful in C++. All this is achieved using
|
jpayne@68
|
574 <code>xgettext</code>'s ‘<samp>--keyword</samp>’ option. How to pass such an option
|
jpayne@68
|
575 to <code>xgettext</code>, assuming that <code>gettextize</code> is used, is described
|
jpayne@68
|
576 in <a href="gettext_13.html#SEC237">‘<tt>Makevars</tt>’ in ‘<tt>po/</tt>’</a> and <a href="gettext_13.html#SEC252">AM_XGETTEXT_OPTION in ‘<tt>po.m4</tt>’</a>.
|
jpayne@68
|
577 </p>
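<p>For illustration, here is a sketch of a marking function that takes the
translatable string at the second argument position. The name
<code>report_problem</code> is an assumption invented for this example; a
keyword specification such as ‘<samp>--keyword=report_problem:2</samp>’
would be one way to tell <code>xgettext</code> which argument to extract.
</p>

```c
#include <libintl.h>
#include <stdio.h>

/* report_problem is a hypothetical function whose translatable string
   is its second argument.  xgettext could be told about it with
   --keyword=report_problem:2.  */
int
report_problem (FILE *stream, const char *message)
{
  /* Translate at run time, at the single point where the message
     is written out.  */
  return fputs (gettext (message), stream);
}
```

<p>A call such as <code>report_problem (stderr, "disk is full\n")</code>
then needs no further mark at the call site.
</p>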
|
jpayne@68
|
578 <p>Note also that long strings can be split across lines, into multiple
|
jpayne@68
|
579 adjacent string tokens. Automatic string concatenation is performed
|
jpayne@68
|
580 at compile time according to ISO C and ISO C++; <code>xgettext</code> also
|
jpayne@68
|
581 supports this syntax.
|
jpayne@68
|
582 </p>
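<p>A small sketch of such a split string is shown here; the function
<code>usage_text</code> and its message are made up for the example.
</p>

```c
#include <libintl.h>

/* The adjacent string tokens below are concatenated by the compiler
   into a single string, and xgettext extracts them as one message.
   usage_text is a hypothetical function.  */
const char *
usage_text (void)
{
  return gettext ("Usage: hello [OPTION]...\n"
                  "Print a friendly greeting.\n");
}
```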
|
jpayne@68
|
583 <p>In C++, marking a C++ format string requires a small code change,
|
jpayne@68
|
584 because the first argument to <code>std::format</code> must be a constant
|
jpayne@68
|
585 expression.
|
jpayne@68
|
586 For example,
|
jpayne@68
|
587 </p><table><tr><td> </td><td><pre class="smallexample">std::format ("{} {}!", "Hello", "world")
|
jpayne@68
|
588 </pre></td></tr></table>
|
jpayne@68
|
589 <p>needs to be changed to
|
jpayne@68
|
590 </p><table><tr><td> </td><td><pre class="smallexample">std::vformat (gettext ("{} {}!"), std::make_format_args("Hello", "world"))
|
jpayne@68
|
591 </pre></td></tr></table>
|
jpayne@68
|
592
|
jpayne@68
|
593 <p>Later on, the maintenance is relatively easy. If, as a programmer,
|
jpayne@68
|
594 you add or modify a string, you will have to ask yourself if the
|
jpayne@68
|
595 new or altered string requires translation, and include it within
|
jpayne@68
|
596 ‘<samp>_()</samp>’ if you think it should be translated. For example, ‘<samp>"%s"</samp>’
|
jpayne@68
|
597 is a string <em>not</em> requiring translation. But
|
jpayne@68
|
598 ‘<samp>"%s: %d"</samp>’ <em>does</em> require translation, because in French, unlike
|
jpayne@68
|
599 in English, it's customary to put a space before a colon.
|
jpayne@68
|
600 </p>
|
jpayne@68
|
601
|
jpayne@68
|
602 <a name="Marking"></a>
|
jpayne@68
|
603 <a name="SEC29"></a>
|
jpayne@68
|
604 <h2 class="section"> <a href="gettext_toc.html#TOC22">4.5 Marking Translatable Strings</a> </h2>
|
jpayne@68
|
605
|
jpayne@68
|
606 <p>In PO mode, one set of features is meant more for the programmer than
|
jpayne@68
|
607 for the translator, and allows him to interactively mark which strings,
|
jpayne@68
|
608 in a set of program sources, are translatable, and which are not.
|
jpayne@68
|
609 Even if it is a fairly easy job for a programmer to find and mark
|
jpayne@68
|
610 such strings by other means, using any editor of his choice, PO mode
|
jpayne@68
|
611 makes this work more comfortable. Further, this gives translators
|
jpayne@68
|
612 who feel a little like programmers, or programmers who feel a little
|
jpayne@68
|
613 like translators, a tool letting them work at marking translatable
|
jpayne@68
|
614 strings in the program sources, while simultaneously producing a set of
|
jpayne@68
|
615 translations in some language, for the package being internationalized.
|
jpayne@68
|
616 </p>
|
jpayne@68
|
617 <a name="IDX132"></a>
|
jpayne@68
|
618 <p>The set of program sources, targeted by the PO mode commands described
|
jpayne@68
|
619 here, should have an Emacs tags table constructed for your project,
|
jpayne@68
|
620 prior to using these PO file commands. This is easy to do. In any
|
jpayne@68
|
621 shell window, change the directory to the root of your project, then
|
jpayne@68
|
622 execute a command resembling:
|
jpayne@68
|
623 </p>
|
jpayne@68
|
624 <table><tr><td> </td><td><pre class="example">etags src/*.[hc] lib/*.[hc]
|
jpayne@68
|
625 </pre></td></tr></table>
|
jpayne@68
|
626
|
jpayne@68
|
627 <p>presuming here you want to process all ‘<tt>.h</tt>’ and ‘<tt>.c</tt>’ files
|
jpayne@68
|
628 from the ‘<tt>src/</tt>’ and ‘<tt>lib/</tt>’ directories. This command will
|
jpayne@68
|
629 explore all said files and create a ‘<tt>TAGS</tt>’ file in your root
|
jpayne@68
|
630 directory, somewhat summarizing the contents using a special file
|
jpayne@68
|
631 format Emacs can understand.
|
jpayne@68
|
632 </p>
|
jpayne@68
|
633 <a name="IDX133"></a>
|
jpayne@68
|
634 <p>For packages following the GNU coding standards, there is
|
jpayne@68
|
635 a make goal <code>tags</code> or <code>TAGS</code> which constructs the tag files in
|
jpayne@68
|
636 all directories and for all files containing source code.
|
jpayne@68
|
637 </p>
|
jpayne@68
|
638 <p>Once your ‘<tt>TAGS</tt>’ file is ready, the following commands assist
|
jpayne@68
|
639 the programmer in marking translatable strings in his set of sources.
|
jpayne@68
|
640 But these commands are necessarily driven from within a PO file
|
jpayne@68
|
641 window, and it is likely that you do not even have such a PO file yet.
|
jpayne@68
|
642 This is not a problem at all, as you may safely open a new, empty PO
|
jpayne@68
|
643 file, mainly for using these commands. This empty PO file will slowly
|
jpayne@68
|
644 fill in while you mark strings as translatable in your program sources.
|
jpayne@68
|
645 </p>
|
jpayne@68
|
646 <dl compact="compact">
|
jpayne@68
|
647 <dt> <kbd>,</kbd></dt>
|
jpayne@68
|
648 <dd><a name="IDX134"></a>
|
jpayne@68
|
649 <p>Search through program sources for a string which looks like a
|
jpayne@68
|
650 candidate for translation (<code>po-tags-search</code>).
|
jpayne@68
|
651 </p>
|
jpayne@68
|
652 </dd>
|
jpayne@68
|
653 <dt> <kbd>M-,</kbd></dt>
|
jpayne@68
|
654 <dd><a name="IDX135"></a>
|
jpayne@68
|
655 <p>Mark the last string found with ‘<samp>_()</samp>’ (<code>po-mark-translatable</code>).
|
jpayne@68
|
656 </p>
|
jpayne@68
|
657 </dd>
|
jpayne@68
|
658 <dt> <kbd>M-.</kbd></dt>
|
jpayne@68
|
659 <dd><a name="IDX136"></a>
|
jpayne@68
|
660 <p>Mark the last string found with a keyword taken from a set of possible
|
jpayne@68
|
661 keywords. This command with a prefix allows some management of these
|
jpayne@68
|
662 keywords (<code>po-select-mark-and-mark</code>).
|
jpayne@68
|
663 </p>
|
jpayne@68
|
664 </dd>
|
jpayne@68
|
665 </dl>
|
jpayne@68
|
666
|
jpayne@68
|
667 <a name="IDX137"></a>
|
jpayne@68
|
668 <p>The <kbd>,</kbd> (<code>po-tags-search</code>) command searches for the next
|
jpayne@68
|
669 occurrence of a string which looks like a possible candidate for
|
jpayne@68
|
670 translation, and displays the program source in another Emacs window,
|
jpayne@68
|
671 positioned in such a way that the string is near the top of this other
|
jpayne@68
|
672 window. If the string is too big to fit whole in this window, it is
|
jpayne@68
|
673 positioned so only its end is shown. In any case, the cursor
|
jpayne@68
|
674 is left in the PO file window. If the shown string would be better
|
jpayne@68
|
675 presented differently in different native languages, you may mark it
|
jpayne@68
|
676 using <kbd>M-,</kbd> or <kbd>M-.</kbd>. Otherwise, you might rather ignore it
|
jpayne@68
|
677 and skip to the next string by merely repeating the <kbd>,</kbd> command.
|
jpayne@68
|
678 </p>
|
jpayne@68
|
679 <p>A string is a good candidate for translation if it contains a sequence
|
jpayne@68
|
680 of three or more letters. A string containing at most two letters in
|
jpayne@68
|
681 a row will be considered as a candidate if it has more letters than
|
jpayne@68
|
682 non-letters. The command disregards strings containing no letters,
|
jpayne@68
|
683 or isolated letters only. It also disregards strings within comments,
|
jpayne@68
|
684 or strings already marked with some keyword PO mode knows (see below).
|
jpayne@68
|
685 </p>
|
jpayne@68
|
686 <p>If you have never told Emacs about some ‘<tt>TAGS</tt>’ file to use, the
|
jpayne@68
|
687 command will request that you specify one from the minibuffer, the
|
jpayne@68
|
688 first time you use the command. You may later change your ‘<tt>TAGS</tt>’
|
jpayne@68
|
689 file by using the regular Emacs command <kbd>M-x visit-tags-table</kbd>,
|
jpayne@68
|
690 which will ask you to name the precise ‘<tt>TAGS</tt>’ file you want
|
jpayne@68
|
691 to use. See <a href="../emacs/Tags.html#Tags">(emacs)Tags</a> section `Tag Tables' in <cite>The Emacs Editor</cite>.
|
jpayne@68
|
692 </p>
|
jpayne@68
|
693 <p>Each time you use the <kbd>,</kbd> command, the search resumes from where it was
|
jpayne@68
|
694 left by the previous search, and goes through all program sources,
|
jpayne@68
|
695 obeying the ‘<tt>TAGS</tt>’ file, until all sources have been processed.
|
jpayne@68
|
696 However, by giving a prefix argument to the command (<kbd>C-u
|
jpayne@68
|
697 ,</kbd>), you may request that the search be restarted all over again
|
jpayne@68
|
698 from the first program source; but in this case, strings that you
|
jpayne@68
|
699 recently marked as translatable will be automatically skipped.
|
jpayne@68
|
700 </p>
|
jpayne@68
|
701 <p>Using this <kbd>,</kbd> command does not prevent the use of other regular
|
jpayne@68
|
702 Emacs tags commands. For example, regular <code>tags-search</code> or
|
jpayne@68
|
703 <code>tags-query-replace</code> commands may be used without disrupting the
|
jpayne@68
|
704 independent <kbd>,</kbd> search sequence. However, as implemented, the
|
jpayne@68
|
705 <em>initial</em> <kbd>,</kbd> command (or the <kbd>,</kbd> command used with a
|
jpayne@68
|
706 prefix) might also reinitialize the regular Emacs tags searching to the
|
jpayne@68
|
707 first tags file; this reinitialization might be considered spurious.
|
jpayne@68
|
708 </p>
|
jpayne@68
|
709 <a name="IDX138"></a>
|
jpayne@68
|
710 <a name="IDX139"></a>
|
jpayne@68
|
711 <p>The <kbd>M-,</kbd> (<code>po-mark-translatable</code>) command will mark the
|
jpayne@68
|
712 recently found string with the ‘<samp>_</samp>’ keyword. The <kbd>M-.</kbd>
|
jpayne@68
|
713 (<code>po-select-mark-and-mark</code>) command will request that you type
|
jpayne@68
|
714 one keyword from the minibuffer and use that keyword for marking
|
jpayne@68
|
715 the string. Both commands will automatically create a new PO file
|
jpayne@68
|
716 untranslated entry for the string being marked, and make it the
|
jpayne@68
|
717 current entry (making it easy for you to immediately proceed to its
|
jpayne@68
|
718 translation, if you feel like doing it right away). It is possible
|
jpayne@68
|
719 that the modifications made to the program source by <kbd>M-,</kbd> or
|
jpayne@68
|
720 <kbd>M-.</kbd> render some source line longer than 80 columns, forcing you
|
jpayne@68
|
721 to break and re-indent this line differently. You may use the <kbd>O</kbd>
|
jpayne@68
|
722 command from PO mode, or any other window changing command from
|
jpayne@68
|
723 Emacs, to break out into the program source window, and do any
|
jpayne@68
|
724 needed adjustments. You will have to use some regular Emacs command
|
jpayne@68
|
725 to return the cursor to the PO file window, if you want command
|
jpayne@68
|
726 <kbd>,</kbd> for the next string, say.
|
jpayne@68
|
727 </p>
|
jpayne@68
|
728 <p>The <kbd>M-.</kbd> command has a few built-in speedups, so you do not
|
jpayne@68
|
729 have to explicitly type all keywords all the time. The first such
|
jpayne@68
|
730 speedup is that you are presented with a <em>preferred</em> keyword,
|
jpayne@68
|
731 which you may accept by merely typing <kbd><RET></kbd> at the prompt.
|
jpayne@68
|
732 The second speedup is that you may type any non-ambiguous prefix of the
|
jpayne@68
|
733 keyword you really mean, and the command will complete it automatically
|
jpayne@68
|
734 for you. This also means that PO mode has to <em>know</em> all
|
jpayne@68
|
735 your possible keywords, and that it will not accept mistyped keywords.
|
jpayne@68
|
736 </p>
|
jpayne@68
|
737 <p>If you reply <kbd>?</kbd> to the keyword request, the command gives a
|
jpayne@68
|
738 list of all known keywords, from which you may choose. When the
|
jpayne@68
|
739 command is prefixed by an argument (<kbd>C-u M-.</kbd>), it inhibits
|
jpayne@68
|
740 updating any program source or PO file buffer, and does some simple
|
jpayne@68
|
741 keyword management instead. In this case, the command asks for a
|
jpayne@68
|
742 keyword, written in full, which becomes a new allowed keyword for
|
jpayne@68
|
743 later <kbd>M-.</kbd> commands. Moreover, this new keyword automatically
|
jpayne@68
|
744 becomes the <em>preferred</em> keyword for later commands. By typing
|
jpayne@68
|
745 an already known keyword in response to <kbd>C-u M-.</kbd>, one merely
|
jpayne@68
|
746 changes the <em>preferred</em> keyword and does nothing more.
|
jpayne@68
|
747 </p>
|
jpayne@68
|
748 <p>All keywords known for <kbd>M-.</kbd> are recognized by the <kbd>,</kbd> command
|
jpayne@68
|
749 when scanning for strings, and strings already marked by any of those
|
jpayne@68
|
750 known keywords are automatically skipped. If many PO files are opened
|
jpayne@68
|
751 simultaneously, each one has its own independent set of known keywords.
|
jpayne@68
|
752 There is no provision in PO mode, currently, for deleting a known
|
jpayne@68
|
753 keyword; you have to quit the file (maybe using <kbd>q</kbd>) and reopen
|
jpayne@68
|
754 it afresh. When a PO file is newly brought up in an Emacs window, only
|
jpayne@68
|
755 ‘<samp>gettext</samp>’ and ‘<samp>_</samp>’ are known as keywords, and ‘<samp>gettext</samp>’
|
jpayne@68
|
756 is preferred for the <kbd>M-.</kbd> command. In fact, it is not useful to
|
jpayne@68
|
757 prefer ‘<samp>_</samp>’, as this one is already built into the <kbd>M-,</kbd> command.
|
jpayne@68
|
758 </p>
|
jpayne@68
|
759
|
jpayne@68
|
760 <a name="c_002dformat-Flag"></a>
|
jpayne@68
|
761 <a name="SEC30"></a>
|
jpayne@68
|
762 <h2 class="section"> <a href="gettext_toc.html#TOC23">4.6 Special Comments preceding Keywords</a> </h2>
|
jpayne@68
|
763
|
jpayne@68
|
764
|
jpayne@68
|
765 <p>In C programs strings are often used within calls of functions from the
|
jpayne@68
|
766 <code>printf</code> family. The special thing about these format strings is
|
jpayne@68
|
767 that they can contain format specifiers introduced with <kbd>%</kbd>. Assume
|
jpayne@68
|
768 we have the code
|
jpayne@68
|
769 </p>
|
jpayne@68
|
770 <table><tr><td> </td><td><pre class="example">printf (gettext ("String `%s' has %d characters\n"), s, (int) strlen (s));
|
jpayne@68
|
771 </pre></td></tr></table>
|
jpayne@68
|
772
|
jpayne@68
|
773 <p>A possible German translation for the above string might be:
|
jpayne@68
|
774 </p>
|
jpayne@68
|
775 <table><tr><td> </td><td><pre class="example">"%d Zeichen lang ist die Zeichenkette `%s'"
|
jpayne@68
|
776 </pre></td></tr></table>
|
jpayne@68
|
777
|
jpayne@68
|
778 <p>A C programmer, even if he cannot speak German, will recognize that
|
jpayne@68
|
779 there is something wrong here. The order of the two format specifiers
|
jpayne@68
|
780 is changed but of course the order of the arguments in the <code>printf</code> call is not.
|
jpayne@68
|
781 This will most probably lead to problems because now the length of the
|
jpayne@68
|
782 string is interpreted as the address of a string.
|
jpayne@68
|
783 </p>
|
jpayne@68
|
784 <p>To prevent errors at runtime caused by translations, the <code>msgfmt</code>
|
jpayne@68
|
785 tool can check statically whether the arguments in the original and the
|
jpayne@68
|
786 translation string match in type and number. If this is not the case
|
jpayne@68
|
787 and the ‘<samp>-c</samp>’ option has been passed to <code>msgfmt</code>, <code>msgfmt</code>
|
jpayne@68
|
788 will give an error and refuse to produce a MO file. Thus consistent
|
jpayne@68
|
789 use of ‘<samp>msgfmt -c</samp>’ will catch the error, so that it cannot cause
|
jpayne@68
|
790 problems at runtime.
|
jpayne@68
|
791 </p>
|
jpayne@68
|
792 <p>For the word order in the above German translation to be correct, one
|
jpayne@68
|
793 would have to write
|
jpayne@68
|
794 </p>
|
jpayne@68
|
795 <table><tr><td> </td><td><pre class="example">"%2$d Zeichen lang ist die Zeichenkette `%1$s'"
|
jpayne@68
|
796 </pre></td></tr></table>
|
jpayne@68
|
797
|
jpayne@68
|
798 <p>The routines in <code>msgfmt</code> know about this special notation.
|
jpayne@68
|
799 </p>
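<p>The effect of the positional notation can be sketched as follows. The
translated string is written out literally as a stand-in for what
<code>gettext</code> would return under a German locale, and
<code>describe_length</code> is a hypothetical helper. Note that the
‘<samp>%N$</samp>’ notation is a POSIX extension, so this assumes a
<code>printf</code> implementation that supports it.
</p>

```c
#include <stdio.h>

/* Format with a translation that reorders the arguments.  The
   positional notation %N$ is a POSIX extension to ISO C printf.
   describe_length is a hypothetical helper.  */
int
describe_length (char *buf, size_t size, const char *s, int length)
{
  /* Stand-in for what gettext would return under a German locale.  */
  const char *translated = "%2$d Zeichen lang ist die Zeichenkette `%1$s'";

  /* The arguments keep their original order; the format string
     selects them by position.  */
  return snprintf (buf, size, translated, s, length);
}
```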
|
jpayne@68
|
800 <p>Because not all strings in a program will be format strings, it is not
|
jpayne@68
|
801 useful for <code>msgfmt</code> to test all the strings in the ‘<tt>.po</tt>’ file.
|
jpayne@68
|
802 This might cause problems because the string might contain what looks
|
jpayne@68
|
803 like a format specifier, but the string is not used in <code>printf</code>.
|
jpayne@68
|
804 </p>
|
jpayne@68
|
805 <p>Therefore <code>xgettext</code> adds a special tag to those messages it
|
jpayne@68
|
806 thinks might be a format string. There is no absolute rule for this,
|
jpayne@68
|
807 only a heuristic. In the ‘<tt>.po</tt>’ file the entry is marked using the
|
jpayne@68
|
808 <code>c-format</code> flag in the <code>#,</code> comment line (see section <a href="gettext_3.html#SEC16">The Format of PO Files</a>).
|
jpayne@68
|
809 </p>
|
jpayne@68
|
810 <a name="IDX140"></a>
|
jpayne@68
|
811 <a name="IDX141"></a>
|
jpayne@68
|
812 <p>The careful reader now might say that this again can cause problems.
|
jpayne@68
|
813 The heuristic might guess it wrong. This is true and therefore
|
jpayne@68
|
814 <code>xgettext</code> knows about a special kind of comment which lets
|
jpayne@68
|
815 the programmer take over the decision. If, on the same line as the
|
jpayne@68
|
816 <code>gettext</code> keyword or on the line immediately preceding it,
|
jpayne@68
|
817 the <code>xgettext</code> program finds a comment containing the words
|
jpayne@68
|
818 <code>xgettext:c-format</code>, it will mark the string in any case with
|
jpayne@68
|
819 the <code>c-format</code> flag. This kind of comment should be used when
|
jpayne@68
|
820 <code>xgettext</code> does not recognize the string as a format string but
|
jpayne@68
|
821 it really is one and it should be tested. Please note that when the
|
jpayne@68
|
822 comment is in the same line as the <code>gettext</code> keyword, it must be
|
jpayne@68
|
823 before the string to be translated. Also note that a comment such as
|
jpayne@68
|
824 <code>xgettext:c-format</code> applies only to the first string in the same
|
jpayne@68
|
825 or the next line, not to multiple strings.
|
jpayne@68
|
826 </p>
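<p>As a sketch of where such a comment might go, consider the following.
The function <code>print_notice</code> and its message are assumptions made
up for this example.
</p>

```c
#include <libintl.h>
#include <stdio.h>

/* print_notice is a hypothetical function.  The message contains no
   format specifier, so the heuristic might not flag it; the comment
   on the line before the gettext keyword requests the c-format flag,
   so that msgfmt -c checks the translations.  */
int
print_notice (void)
{
  /* xgettext:c-format */
  return printf (gettext ("Nothing to do.\n"));
}
```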
|
jpayne@68
|
827 <p>This situation happens quite often. The <code>printf</code> function is often
|
jpayne@68
|
828 called with strings which do not contain a format specifier. Of course
|
jpayne@68
|
829 one would normally use <code>fputs</code> but it does happen. In this case
|
jpayne@68
|
830 <code>xgettext</code> does not recognize this as a format string but what
|
jpayne@68
|
831 happens if the translation introduces a valid format specifier? The
|
jpayne@68
|
832 <code>printf</code> function will try to access one of the parameters but none
|
jpayne@68
|
833 exists because the original code does not pass any parameters.
|
jpayne@68
|
834 </p>
|
jpayne@68
|
835 <p><code>xgettext</code> of course could make a wrong decision the other way
|
jpayne@68
|
836 round, i.e. a string marked as a format string actually is not a format
|
jpayne@68
|
837 string. In this case <code>msgfmt</code> might give spurious warnings and
|
jpayne@68
|
838 could prevent the ‘<tt>.po</tt>’ file from being compiled. The method to prevent
|
jpayne@68
|
839 this wrong decision is similar to the one used above, only the comment
|
jpayne@68
|
840 to use must contain the string <code>xgettext:no-c-format</code>.
|
jpayne@68
|
841 </p>
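<p>A sketch of the opposite case follows, assuming a message that contains
a percent sign but is never passed to a <code>printf</code>-like function.
The function <code>print_status</code> and its message are hypothetical.
</p>

```c
#include <libintl.h>
#include <stdio.h>

/* print_status is a hypothetical function.  The string contains a
   percent sign but is written with fputs, so it is not a format
   string; the comment asks xgettext not to flag it as c-format.  */
int
print_status (void)
{
  /* xgettext:no-c-format */
  return fputs (gettext ("Disk usage is above 90% of quota.\n"), stdout);
}
```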
|
jpayne@68
|
842 <p>If a string is marked with <code>c-format</code> and this is not correct the
|
jpayne@68
|
843 user can find out who is responsible for the decision. See
|
jpayne@68
|
844 <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a> to see how the <code>--debug</code> option can be
|
jpayne@68
|
845 used for solving this problem.
|
jpayne@68
|
846 </p>
|
jpayne@68
|
847
|
jpayne@68
|
848 <a name="Special-cases"></a>
|
jpayne@68
|
849 <a name="SEC31"></a>
|
jpayne@68
|
850 <h2 class="section"> <a href="gettext_toc.html#TOC24">4.7 Special Cases of Translatable Strings</a> </h2>
|
jpayne@68
|
851
|
jpayne@68
|
852 <p>The attentive reader might now point out that it is not always possible
|
jpayne@68
|
853 to mark translatable strings with <code>gettext</code> or something like this.
|
jpayne@68
|
854 Consider the following case:
|
jpayne@68
|
855 </p>
|
jpayne@68
|
856 <table><tr><td> </td><td><pre class="example">{
|
jpayne@68
|
857 static const char *messages[] = {
|
jpayne@68
|
858 "some very meaningful message",
|
jpayne@68
|
859 "and another one"
|
jpayne@68
|
860 };
|
jpayne@68
|
861 const char *string;
|
jpayne@68
|
862 …
|
jpayne@68
|
863 string
|
jpayne@68
|
864 = index > 1 ? "a default message" : messages[index];
|
jpayne@68
|
865
|
jpayne@68
|
866 fputs (string, stdout);
|
jpayne@68
|
867 …
|
jpayne@68
|
868 }
|
jpayne@68
|
869 </pre></td></tr></table>
|
jpayne@68
|
870
|
jpayne@68
|
871 <p>While it is no problem to mark the string <code>"a default message"</code> it
|
jpayne@68
|
872 is not possible to mark the string initializers for <code>messages</code>.
|
jpayne@68
|
873 What is to be done? We have to fulfill two tasks. First we have to mark the
|
jpayne@68
|
874 strings so that the <code>xgettext</code> program (see section <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a>)
|
jpayne@68
|
875 can find them, and second we have to translate the strings at runtime
|
jpayne@68
|
876 before printing them.
|
jpayne@68
|
877 </p>
|
jpayne@68
|
878 <p>The first task can be fulfilled by creating a new keyword, which names a
|
jpayne@68
|
879 no-op. For the second we have to mark all access points to a string
|
jpayne@68
|
880 from the array. So one solution can look like this:
|
jpayne@68
|
881 </p>
|
jpayne@68
|
882 <table><tr><td> </td><td><pre class="example">#define gettext_noop(String) String
|
jpayne@68
|
883
|
jpayne@68
|
884 {
|
jpayne@68
|
885 static const char *messages[] = {
|
jpayne@68
|
886 gettext_noop ("some very meaningful message"),
|
jpayne@68
|
887 gettext_noop ("and another one")
|
jpayne@68
|
888 };
|
jpayne@68
|
889 const char *string;
|
jpayne@68
|
890 …
|
jpayne@68
|
891 string
|
jpayne@68
|
892 = index > 1 ? gettext ("a default message") : gettext (messages[index]);
|
jpayne@68
|
893
|
jpayne@68
|
894 fputs (string, stdout);
|
jpayne@68
|
895 …
|
jpayne@68
|
896 }
|
jpayne@68
|
897 </pre></td></tr></table>
|
jpayne@68
|
898
|
jpayne@68
|
899 <p>Please convince yourself that the string which is written by
|
jpayne@68
|
900 <code>fputs</code> is translated in any case. How to make <code>xgettext</code> aware of
|
jpayne@68
|
901 the additional keyword <code>gettext_noop</code> is explained in <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a>.
|
jpayne@68
|
902 </p>
|
jpayne@68
|
903 <p>The above is of course not the only solution. You could also come up
|
jpayne@68
|
904 with the following one:
|
jpayne@68
|
905 </p>
|
jpayne@68
|
906 <table><tr><td> </td><td><pre class="example">#define gettext_noop(String) String
|
jpayne@68
|
907
|
jpayne@68
|
908 {
|
jpayne@68
|
909 static const char *messages[] = {
|
jpayne@68
|
910 gettext_noop ("some very meaningful message"),
|
jpayne@68
|
911 gettext_noop ("and another one")
|
jpayne@68
|
912 };
|
jpayne@68
|
913 const char *string;
|
jpayne@68
|
914 …
|
jpayne@68
|
915 string
|
jpayne@68
|
916 = index > 1 ? gettext_noop ("a default message") : messages[index];
|
jpayne@68
|
917
|
jpayne@68
|
918 fputs (gettext (string), stdout);
|
jpayne@68
|
919 …
|
jpayne@68
|
920 }
|
jpayne@68
|
921 </pre></td></tr></table>
|
jpayne@68
|
922
|
jpayne@68
|
923 <p>But this has a drawback. The programmer has to take care that
|
jpayne@68
|
924 he uses <code>gettext_noop</code> for the string <code>"a default message"</code>.
|
jpayne@68
|
925 A use of <code>gettext</code> could have in rare cases unpredictable results.
|
jpayne@68
|
926 </p>
|
jpayne@68
|
927 <p>One advantage is that you need not perform control flow analysis to make
|
jpayne@68
|
928 sure the output is really translated in any case. But this analysis is
|
jpayne@68
|
929 generally not very difficult. Where it does turn out to be difficult, you can
|
jpayne@68
|
930 use this second method.
|
jpayne@68
|
931 </p>
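<p>The same idea carries over to structured initializations. The sketch
below assumes a hypothetical table of weekday names: the table
<code>days</code>, the type <code>struct day_entry</code>, and the function
<code>day_name</code> are all invented for this illustration. The strings
are only marked in the initializer and are translated at the single access
point.
</p>

```c
#include <libintl.h>
#include <stddef.h>

#define gettext_noop(String) String

/* A hypothetical table of weekday names, initialized statically.
   gettext cannot be called in a static initializer, so the strings
   are only marked here.  */
struct day_entry
{
  int number;
  const char *name;
};

static const struct day_entry days[] = {
  { 0, gettext_noop ("Sunday") },
  { 1, gettext_noop ("Monday") }
};

/* Translate at the access point.  Returns NULL for numbers that are
   not in the table.  */
const char *
day_name (int number)
{
  size_t i;

  for (i = 0; i < sizeof days / sizeof days[0]; i++)
    if (days[i].number == number)
      return gettext (days[i].name);
  return NULL;
}
```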
|
jpayne@68
|
932
|
jpayne@68
|
933 <a name="Bug-Report-Address"></a>
|
jpayne@68
|
934 <a name="SEC32"></a>
|
jpayne@68
|
935 <h2 class="section"> <a href="gettext_toc.html#TOC25">4.8 Letting Users Report Translation Bugs</a> </h2>
|
jpayne@68
|
936
|
jpayne@68
|
937 <p>Code sometimes has bugs, but translations sometimes have bugs too. The
|
jpayne@68
|
938 users need to be able to report them. Reporting translation bugs to the
|
jpayne@68
|
939 programmer or maintainer of a package is not very useful, since the
|
jpayne@68
|
940 maintainer must never change a translation, except on behalf of the
|
jpayne@68
|
941 translator. Hence the translation bugs must be reported to the
|
jpayne@68
|
942 translators.
|
jpayne@68
|
943 </p>
|
jpayne@68
|
944 <p>Here is a way to organize this so that the maintainer does not need to
|
jpayne@68
|
945 forward translation bug reports, nor even keep a list of the addresses of
|
jpayne@68
|
946 the translators or their translation teams.
|
jpayne@68
|
947 </p>
|
jpayne@68
|
948 <p>Every program has a place where it shows the bug report address. For
|
jpayne@68
|
949 GNU programs, it is the code which handles the “--help” option,
|
jpayne@68
|
950 typically in a function called “usage”. In this place, instruct the
|
jpayne@68
|
951 translator to add her own bug reporting address. For example, if that
|
jpayne@68
|
952 code has a statement
|
jpayne@68
|
953 </p>
|
jpayne@68
|
954 <table><tr><td> </td><td><pre class="example">printf (_("Report bugs to <%s>.\n"), PACKAGE_BUGREPORT);
|
jpayne@68
|
955 </pre></td></tr></table>
|
jpayne@68
|
956
|
jpayne@68
|
957 <p>you can add some translator instructions like this:
|
jpayne@68
|
958 </p>
|
jpayne@68
|
959 <table><tr><td> </td><td><pre class="example">/* TRANSLATORS: The placeholder indicates the bug-reporting address
|
jpayne@68
|
960 for this package. Please add _another line_ saying
|
jpayne@68
|
961 "Report translation bugs to <...>\n" with the address for translation
|
jpayne@68
|
962 bugs (typically your translation team's web or email address). */
|
jpayne@68
|
963 printf (_("Report bugs to <%s>.\n"), PACKAGE_BUGREPORT);
|
jpayne@68
|
964 </pre></td></tr></table>
|
jpayne@68
|
965
|
jpayne@68
|
966 <p>These will be extracted by ‘<samp>xgettext</samp>’, leading to a .pot file that
|
jpayne@68
|
967 contains this:
|
jpayne@68
|
968 </p>
|
jpayne@68
|
969 <table><tr><td> </td><td><pre class="example">#. TRANSLATORS: The placeholder indicates the bug-reporting address
|
jpayne@68
|
970 #. for this package. Please add _another line_ saying
|
jpayne@68
|
971 #. "Report translation bugs to <...>\n" with the address for translation
|
jpayne@68
|
972 #. bugs (typically your translation team's web or email address).
|
jpayne@68
|
973 #: src/hello.c:178
|
jpayne@68
|
974 #, c-format
|
jpayne@68
|
975 msgid "Report bugs to <%s>.\n"
|
jpayne@68
|
976 msgstr ""
|
jpayne@68
|
977 </pre></td></tr></table>
|
jpayne@68
|
978
|
jpayne@68
|
979
|
jpayne@68
|
980 <a name="Names"></a>
|
jpayne@68
|
981 <a name="SEC33"></a>
|
jpayne@68
|
982 <h2 class="section"> <a href="gettext_toc.html#TOC26">4.9 Marking Proper Names for Translation</a> </h2>
|
jpayne@68
|
983
|
jpayne@68
|
984 <p>Should names of persons, cities, locations etc. be marked for translation
|
jpayne@68
|
985 or not? People who only know languages that can be written with Latin
|
jpayne@68
|
986 letters (English, Spanish, French, German, etc.) are tempted to say “no”,
|
jpayne@68
|
987 because names usually do not change when transported between these languages.
|
jpayne@68
|
988 However, in general when translating from one script to another, names
|
jpayne@68
|
989 are translated too, usually phonetically or by transliteration. For
|
jpayne@68
|
990 example, Russian or Greek names are converted to the Latin alphabet when
|
jpayne@68
|
991 being translated to English, and English or French names are converted
|
jpayne@68
|
992 to the Katakana script when being translated to Japanese. This is
|
jpayne@68
|
993 necessary because the speakers of the target language in general cannot
|
jpayne@68
|
994 read the script the name is originally written in.
|
jpayne@68
|
995 </p>
|
jpayne@68
|
996 <p>As a programmer, you should therefore make sure that names are marked
|
jpayne@68
|
997 for translation, with a special comment telling the translators that it
|
jpayne@68
|
998 is a proper name and how to pronounce it. In its simple form, it looks
|
jpayne@68
|
999 like this:
|
jpayne@68
|
1000 </p>
|
jpayne@68
|
1001 <table><tr><td> </td><td><pre class="example">printf (_("Written by %s.\n"),
|
jpayne@68
|
1002 /* TRANSLATORS: This is a proper name. See the gettext
|
jpayne@68
|
1003 manual, section Names. Note this is actually a non-ASCII
|
jpayne@68
|
1004 name: The first name is (with Unicode escapes)
|
jpayne@68
|
1005 "Fran\u00e7ois" or (with HTML entities) "Fran&ccedil;ois".
|
jpayne@68
|
1006 Pronunciation is like "fraa-swa pee-nar". */
|
jpayne@68
|
1007 _("Francois Pinard"));
|
jpayne@68
|
1008 </pre></td></tr></table>
|
jpayne@68
|
1009
|
jpayne@68
|
1010 <p>The GNU gnulib library offers a module ‘<samp>propername</samp>’
|
jpayne@68
|
1011 (<a href="https://www.gnu.org/software/gnulib/MODULES.html#module=propername">https://www.gnu.org/software/gnulib/MODULES.html#module=propername</a>)
|
jpayne@68
|
1012 which automatically appends the original name, in parentheses,
|
jpayne@68
|
1013 to the translated name. For names that cannot be written in ASCII, it
|
jpayne@68
|
1014 also frees the translator from the task of entering the appropriate non-ASCII
|
jpayne@68
|
1015 characters if no script change is needed. In this more comfortable form,
|
jpayne@68
|
1016 it looks like this:
|
jpayne@68
|
1017 </p>
|
jpayne@68
|
1018 <table><tr><td> </td><td><pre class="example">printf (_("Written by %s and %s.\n"),
|
jpayne@68
|
1019 proper_name ("Ulrich Drepper"),
|
jpayne@68
|
1020 /* TRANSLATORS: This is a proper name. See the gettext
|
jpayne@68
|
1021 manual, section Names. Note this is actually a non-ASCII
|
jpayne@68
|
1022 name: The first name is (with Unicode escapes)
|
jpayne@68
|
1023 "Fran\u00e7ois" or (with HTML entities) "Fran&ccedil;ois".
|
jpayne@68
|
1024 Pronunciation is like "fraa-swa pee-nar". */
|
jpayne@68
|
1025 proper_name_utf8 ("Francois Pinard", "Fran\303\247ois Pinard"));
|
jpayne@68
|
1026 </pre></td></tr></table>
|
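<p>To make the behaviour described above concrete, here is a minimal sketch of
the "append the original name in parentheses" rule. It is not the gnulib
implementation: the helper name <code>sketch_proper_name</code> and its
buffer-based interface are assumptions made for illustration, and the
translated name is passed in explicitly rather than looked up through
<code>gettext</code>.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT the gnulib code: combine a translated name
   with the original one, in the spirit of the propername module.
   The result is written into BUF (of size BUFLEN) and BUF is returned.  */
static const char *
sketch_proper_name (const char *original, const char *translation,
                    char *buf, size_t buflen)
{
  assert (original != NULL && translation != NULL);
  assert (buf != NULL && buflen > 0);

  if (strstr (translation, original) != NULL)
    /* Either the name was left untranslated, or the translator already
       included the original writing: use the translation unchanged.  */
    snprintf (buf, buflen, "%s", translation);
  else
    /* The writing changed: append the original name in parentheses.  */
    snprintf (buf, buflen, "%s (%s)", translation, original);

  return buf;
}
```

<p>With such a rule in the program, a translator who transliterates the name
does not have to repeat the original writing by hand, and a translator who
leaves the name alone does not get it printed twice.
</p>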
jpayne@68
|
1027
|
jpayne@68
|
1028 <p>You can also write the original name directly in Unicode (rather than with
|
jpayne@68
|
1029 Unicode escapes or HTML entities) and denote the pronunciation using the
|
jpayne@68
|
1030 International Phonetic Alphabet (see
|
jpayne@68
|
1031 <a href="https://en.wikipedia.org/wiki/International_Phonetic_Alphabet">https://en.wikipedia.org/wiki/International_Phonetic_Alphabet</a>).
|
jpayne@68
|
1032 </p>
|
jpayne@68
|
1033 <p>As a translator, you should use some care when translating names, because
|
jpayne@68
|
1034 it is frustrating if people see their names mutilated or distorted.
|
jpayne@68
|
1035 </p>
|
jpayne@68
|
1036 <p>If your language uses the Latin script, all you need to do is reproduce
|
jpayne@68
|
1037 the name as perfectly as you can within the usual character set of your
|
jpayne@68
|
1038 language. In this particular case, this means providing a translation
|
jpayne@68
|
1039 containing the c-cedilla character. If your language uses a different
|
jpayne@68
|
1040 script and the people speaking it don't usually read Latin words, it means
|
jpayne@68
|
1041 transliteration. If the programmer used the simple case, you should still
|
jpayne@68
|
1042 give, in parentheses, the original writing of the name – for the sake of
|
jpayne@68
|
1043 the people that do read the Latin script. If the programmer used the
|
jpayne@68
|
1044 ‘<samp>propername</samp>’ module mentioned above, you don't need to give the original
|
jpayne@68
|
1045 writing of the name in parentheses, because the program will already do so.
|
jpayne@68
|
1046 Here is an example, using Greek as the target script:
|
jpayne@68
|
1047 </p>
|
jpayne@68
|
1048 <table><tr><td> </td><td><pre class="example">#. This is a proper name. See the gettext
|
jpayne@68
|
1049 #. manual, section Names. Note this is actually a non-ASCII
|
jpayne@68
|
1050 #. name: The first name is (with Unicode escapes)
|
jpayne@68
|
1051 #. "Fran\u00e7ois" or (with HTML entities) "Fran&ccedil;ois".
|
jpayne@68
|
1052 #. Pronunciation is like "fraa-swa pee-nar".
|
jpayne@68
|
1053 msgid "Francois Pinard"
|
jpayne@68
|
1054 msgstr "φρασοα πιναρ"
|
jpayne@68
|
1055 " (Francois Pinard)"
|
jpayne@68
|
1056 </pre></td></tr></table>
|
jpayne@68
|
1057
|
jpayne@68
|
1058 <p>Because translation of names is such a sensitive domain, it is a good
|
jpayne@68
|
1059 idea to test your translation before submitting it.
|
jpayne@68
|
1060 </p>
|
jpayne@68
|
1061
|
jpayne@68
|
1062 <a name="Libraries"></a>
|
jpayne@68
|
1063 <a name="SEC34"></a>
|
jpayne@68
|
1064 <h2 class="section"> <a href="gettext_toc.html#TOC27">4.10 Preparing Library Sources</a> </h2>
|
jpayne@68
|
1065
|
jpayne@68
|
1066 <p>When you are preparing a library, not a program, for the use of
|
jpayne@68
|
1067 <code>gettext</code>, only a few details are different. Here we assume that
|
jpayne@68
|
1068 the library has a translation domain and a POT file of its own. (If
|
jpayne@68
|
1069 it uses the translation domain and POT file of the main program, then
|
jpayne@68
|
1070 the previous sections apply without changes.)
|
jpayne@68
|
1071 </p>
|
jpayne@68
|
1072 <ol>
|
jpayne@68
|
1073 <li>
|
jpayne@68
|
1074 The library code doesn't call <code>setlocale (LC_ALL, "")</code>. It's the
|
jpayne@68
|
1075 responsibility of the main program to set the locale. The library's
|
jpayne@68
|
1076 documentation should mention this fact, so that developers of programs
|
jpayne@68
|
1077 using the library are aware of it.
|
jpayne@68
|
1078
|
jpayne@68
|
1079 </li><li>
|
jpayne@68
|
1080 The library code doesn't call <code>textdomain (PACKAGE)</code>, because it
|
jpayne@68
|
1081 would interfere with the text domain set by the main program.
|
jpayne@68
|
1082
|
jpayne@68
|
1083 </li><li>
|
jpayne@68
|
1084 The initialization code for a program was
|
jpayne@68
|
1085
|
jpayne@68
|
1086 <table><tr><td> </td><td><pre class="smallexample"> setlocale (LC_ALL, "");
|
jpayne@68
|
1087 bindtextdomain (PACKAGE, LOCALEDIR);
|
jpayne@68
|
1088 textdomain (PACKAGE);
|
jpayne@68
|
1089 </pre></td></tr></table>
|
jpayne@68
|
1090
|
jpayne@68
|
1091 <p>For a library it is reduced to
|
jpayne@68
|
1092 </p>
|
jpayne@68
|
1093 <table><tr><td> </td><td><pre class="smallexample"> bindtextdomain (PACKAGE, LOCALEDIR);
|
jpayne@68
|
1094 </pre></td></tr></table>
|
jpayne@68
|
1095
|
jpayne@68
|
1096 <p>If your library's API doesn't already have an initialization function,
|
jpayne@68
|
1097 you need to create one, containing at least the <code>bindtextdomain</code>
|
jpayne@68
|
1098 invocation. However, you usually don't need to export and document this
|
jpayne@68
|
1099 initialization function: It is sufficient that all entry points of the
|
jpayne@68
|
1100 library call the initialization function if it hasn't been called before.
|
jpayne@68
|
1101 The typical idiom used to achieve this is a static boolean variable that
|
jpayne@68
|
1102 indicates whether the initialization function has been called. If the
|
jpayne@68
|
1103 library is meant to be used in multithreaded applications, this variable
|
jpayne@68
|
1104 needs to be marked <code>volatile</code>, so that its value gets propagated
|
jpayne@68
|
1105 between threads. Like this:
|
jpayne@68
|
1106 </p>
|
jpayne@68
|
1107 <table><tr><td> </td><td><pre class="example">static volatile bool libfoo_initialized;
|
jpayne@68
|
1108
|
jpayne@68
|
1109 static void
|
jpayne@68
|
1110 libfoo_initialize (void)
|
jpayne@68
|
1111 {
|
jpayne@68
|
1112 bindtextdomain (PACKAGE, LOCALEDIR);
|
jpayne@68
|
1113 libfoo_initialized = true;
|
jpayne@68
|
1114 }
|
jpayne@68
|
1115
|
jpayne@68
|
1116 /* This function is part of the exported API. */
|
jpayne@68
|
1117 struct foo *
|
jpayne@68
|
1118 create_foo (...)
|
jpayne@68
|
1119 {
|
jpayne@68
|
1120 /* Must ensure the initialization is performed. */
|
jpayne@68
|
1121 if (!libfoo_initialized)
|
jpayne@68
|
1122 libfoo_initialize ();
|
jpayne@68
|
1123 ...
|
jpayne@68
|
1124 }
|
jpayne@68
|
1125
|
jpayne@68
|
1126 /* This function is part of the exported API. The argument must be
|
jpayne@68
|
1127 non-NULL and have been created through create_foo(). */
|
jpayne@68
|
1128 int
|
jpayne@68
|
1129 foo_refcount (struct foo *argument)
|
jpayne@68
|
1130 {
|
jpayne@68
|
1131 /* No need to invoke the initialization function here, because
|
jpayne@68
|
1132 create_foo() must already have been called before. */
|
jpayne@68
|
1133 ...
|
jpayne@68
|
1134 }
|
jpayne@68
|
1135 </pre></td></tr></table>
|
jpayne@68
|
1136
|
jpayne@68
|
1137 <p>The more general solution for initialization functions, POSIX
|
jpayne@68
|
1138 <code>pthread_once</code>, is not needed in this case.
|
jpayne@68
|
1139 </p>
|
jpayne@68
|
1140 </li><li>
|
jpayne@68
|
1141 The usual declaration of the ‘<samp>_</samp>’ macro in each source file was
|
jpayne@68
|
1142
|
jpayne@68
|
1143 <table><tr><td> </td><td><pre class="smallexample">#include <libintl.h>
|
jpayne@68
|
1144 #define _(String) gettext (String)
|
jpayne@68
|
1145 </pre></td></tr></table>
|
jpayne@68
|
1146
|
jpayne@68
|
1147 <p>for a program. For a library, which has its own translation domain,
|
jpayne@68
|
1148 it reads like this:
|
jpayne@68
|
1149 </p>
|
jpayne@68
|
1150 <table><tr><td> </td><td><pre class="smallexample">#include <libintl.h>
|
jpayne@68
|
1151 #define _(String) dgettext (PACKAGE, String)
|
jpayne@68
|
1152 </pre></td></tr></table>
|
jpayne@68
|
1153
|
jpayne@68
|
1154 <p>In other words, <code>dgettext</code> is used instead of <code>gettext</code>.
|
jpayne@68
|
1155 Similarly, the <code>dngettext</code> function should be used in place of the
|
jpayne@68
|
1156 <code>ngettext</code> function.
|
jpayne@68
|
1157 </p></li></ol>
|
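<p>Putting the four points above together, a library source file might look
roughly like the following sketch. The entry point
<code>foo_describe_count</code> and the fallback values of <code>PACKAGE</code>
and <code>LOCALEDIR</code> are assumptions made for illustration; in a real
package these two macros normally come from the build system.
</p>

```c
#include <assert.h>
#include <libintl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Placeholder values, assumed for this sketch only.  */
#ifndef PACKAGE
# define PACKAGE "libfoo"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/share/locale"
#endif

/* Library variant of the '_' macro: always name the domain explicitly.  */
#define _(String) dgettext (PACKAGE, String)

static volatile bool libfoo_initialized;

/* No setlocale() and no textdomain() here: both belong to the program.  */
static void
libfoo_initialize (void)
{
  bindtextdomain (PACKAGE, LOCALEDIR);
  libfoo_initialized = true;
}

/* Hypothetical exported entry point: describe a number of items.
   The result is written into BUF (of size BUFLEN) and BUF is returned.  */
const char *
foo_describe_count (unsigned long n, char *buf, size_t buflen)
{
  assert (buf != NULL && buflen > 0);

  /* Must ensure the initialization is performed.  */
  if (!libfoo_initialized)
    libfoo_initialize ();

  /* dngettext instead of ngettext, for the same reason as dgettext.  */
  snprintf (buf, buflen, dngettext (PACKAGE, "%lu item", "%lu items", n), n);
  return buf;
}
```

<p>When no message catalog for the library's domain is installed, the
<code>dgettext</code> and <code>dngettext</code> calls simply return the
untranslated English strings, so the library remains usable either way.
</p>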
jpayne@68
|
1158
|
jpayne@68
|
1159
|
jpayne@68
|
1173 <p>
|
jpayne@68
|
1174 <font size="-1">
|
jpayne@68
|
1175 This document was generated by <em>Bruno Haible</em> on <em>February, 21 2024</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>.
|
jpayne@68
|
1176 </font>
|
jpayne@68
|
1177 <br>
|
jpayne@68
|
1178
|
jpayne@68
|
1179 </p>
|
jpayne@68
|
1180 </body>
|
jpayne@68
|
1181 </html>
|