comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/gettext/gettext_4.html @ 68:5028fdace37b

planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author jpayne
date Tue, 18 Mar 2025 16:23:26 -0400
67:0e9998148a16 68:5028fdace37b
1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
2 <html>
3 <!-- Created on February, 21 2024 by texi2html 1.78a -->
4 <!--
5 Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
6 Karl Berry <karl@freefriends.org>
7 Olaf Bachmann <obachman@mathematik.uni-kl.de>
8 and many others.
9 Maintained by: Many creative people.
10 Send bugs and suggestions to <texi2html-bug@nongnu.org>
11
12 -->
13 <head>
14 <title>GNU gettext utilities: 4. Preparing Program Sources</title>
15
16 <meta name="description" content="GNU gettext utilities: 4. Preparing Program Sources">
17 <meta name="keywords" content="GNU gettext utilities: 4. Preparing Program Sources">
18 <meta name="resource-type" content="document">
19 <meta name="distribution" content="global">
20 <meta name="Generator" content="texi2html 1.78a">
21 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
22 <style type="text/css">
23 <!--
24 a.summary-letter {text-decoration: none}
25 pre.display {font-family: serif}
26 pre.format {font-family: serif}
27 pre.menu-comment {font-family: serif}
28 pre.menu-preformatted {font-family: serif}
29 pre.smalldisplay {font-family: serif; font-size: smaller}
30 pre.smallexample {font-size: smaller}
31 pre.smallformat {font-family: serif; font-size: smaller}
32 pre.smalllisp {font-size: smaller}
33 span.roman {font-family:serif; font-weight:normal;}
34 span.sansserif {font-family:sans-serif; font-weight:normal;}
35 ul.toc {list-style: none}
36 -->
37 </style>
38
39
40 </head>
41
42 <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
43
44 <table cellpadding="1" cellspacing="1" border="0">
45 <tr><td valign="middle" align="left">[<a href="gettext_3.html#SEC16" title="Beginning of this chapter or previous chapter"> &lt;&lt; </a>]</td>
46 <td valign="middle" align="left">[<a href="gettext_5.html#SEC35" title="Next chapter"> &gt;&gt; </a>]</td>
47 <td valign="middle" align="left"> &nbsp; </td>
48 <td valign="middle" align="left"> &nbsp; </td>
49 <td valign="middle" align="left"> &nbsp; </td>
50 <td valign="middle" align="left"> &nbsp; </td>
51 <td valign="middle" align="left"> &nbsp; </td>
52 <td valign="middle" align="left">[<a href="gettext_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
53 <td valign="middle" align="left">[<a href="gettext_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
54 <td valign="middle" align="left">[<a href="gettext_21.html#SEC389" title="Index">Index</a>]</td>
55 <td valign="middle" align="left">[<a href="gettext_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
56 </tr></table>
57
58 <hr size="2">
59 <a name="Sources"></a>
60 <a name="SEC17"></a>
61 <h1 class="chapter"> <a href="gettext_toc.html#TOC17">4. Preparing Program Sources</a> </h1>
62
63
64 <p>For the programmer, changes to the C source code fall into three
65 categories. First, you have to make the localization functions
66 known to all modules needing message translation. Second, you should
67 properly trigger the operation of GNU <code>gettext</code> when the program
68 initializes, usually from the <code>main</code> function. Last, you should
69 identify, adjust and mark all constant strings in your program
70 needing translation.
71 </p>
72
73
74 <a name="Importing"></a>
75 <a name="SEC18"></a>
76 <h2 class="section"> <a href="gettext_toc.html#TOC18">4.1 Importing the <code>gettext</code> declaration</a> </h2>
77
78 <p>Presuming that your set of programs, or package, has been adjusted
79 so all needed GNU <code>gettext</code> files are available, and your
80 &lsquo;<tt>Makefile</tt>&rsquo; files are adjusted (see section <a href="gettext_13.html#SEC230">The Maintainer's View</a>), each C module
81 having translated C strings should contain the line:
82 </p>
83 <a name="IDX116"></a>
84 <table><tr><td>&nbsp;</td><td><pre class="example">#include &lt;libintl.h&gt;
85 </pre></td></tr></table>
86
87 <p>Similarly, each C module containing <code>printf()</code>/<code>fprintf()</code>/...
88 calls with a format string that could be a translated C string (even if
89 the C string comes from a different C module) should contain the line:
90 </p>
91 <table><tr><td>&nbsp;</td><td><pre class="example">#include &lt;libintl.h&gt;
92 </pre></td></tr></table>
93
94
95 <a name="Triggering"></a>
96 <a name="SEC19"></a>
97 <h2 class="section"> <a href="gettext_toc.html#TOC19">4.2 Triggering <code>gettext</code> Operations</a> </h2>
98
99 <p>The initialization of locale data should be done with more or less
100 the same code in every program, as demonstrated below:
101 </p>
102 <table><tr><td>&nbsp;</td><td><pre class="example">int
103 main (int argc, char *argv[])
104 {
105 &hellip;
106 setlocale (LC_ALL, &quot;&quot;);
107 bindtextdomain (PACKAGE, LOCALEDIR);
108 textdomain (PACKAGE);
109 &hellip;
110 }
111 </pre></td></tr></table>
112
113 <p><var>PACKAGE</var> and <var>LOCALEDIR</var> should be provided either by
114 &lsquo;<tt>config.h</tt>&rsquo; or by the Makefile. For now consult the <code>gettext</code>
115 or <code>hello</code> sources for more information.
116 </p>
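<p>As a minimal sketch of where <var>PACKAGE</var> and <var>LOCALEDIR</var> might come from,
the initialization above can be wrapped in a helper with fallback definitions.
The helper name <code>init_i18n</code> and the fallback values <code>"hello"</code> and
<code>"/usr/local/share/locale"</code> are assumptions made for illustration; a real
package would take both values from &lsquo;<tt>config.h</tt>&rsquo; or from the Makefile.
</p>

```c
#include <libintl.h>
#include <locale.h>
#include <stddef.h>

/* Normally provided by config.h or by -D flags from the Makefile.
   The fallback values below are placeholders for illustration only.  */
#ifndef PACKAGE
# define PACKAGE "hello"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/local/share/locale"
#endif

/* Hypothetical helper: performs the three initialization calls.
   Returns 0 on success, -1 if a gettext call failed.  */
static int
init_i18n (void)
{
  /* A NULL result here only means that the environment names an
     unknown locale; the program then keeps running in the "C" locale.  */
  setlocale (LC_ALL, "");

  if (bindtextdomain (PACKAGE, LOCALEDIR) == NULL)
    return -1;
  if (textdomain (PACKAGE) == NULL)
    return -1;
  return 0;
}
```

<p>Calling <code>init_i18n ()</code> once, early in <code>main</code>, would then replace the
three separate calls.
</p>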
117 <a name="IDX117"></a>
118 <a name="IDX118"></a>
119 <p>The use of <code>LC_ALL</code> might not be appropriate for you.
120 <code>LC_ALL</code> includes all locale categories and especially
121 <code>LC_CTYPE</code>. This latter category is responsible for determining
122 character classes with the <code>isalnum</code> etc. functions from
123 &lsquo;<tt>ctype.h</tt>&rsquo;, which could be wrong, especially for programs that
124 process some kind of input language. For example, this would mean that
125 source code using the &ccedil; (c-cedilla character) is runnable in
126 France but not in the U.S.
127 </p>
128 <p>Some systems also have problems with parsing numbers using the
129 <code>scanf</code> functions if a locale other than <code>&quot;C&quot;</code> is selected
130 through <code>LC_ALL</code>. The standards say that formats in addition to the one
131 known in the <code>&quot;C&quot;</code> locale might be recognized. But some systems seem
132 to reject numbers in the <code>&quot;C&quot;</code> locale format. In some situations, it might
133 also be a problem with the notation itself which makes it impossible to
134 recognize whether the number is in the <code>&quot;C&quot;</code> locale or the local
135 format. This can happen if thousands separator characters are used.
136 Some locales define this character according to the national
137 conventions to <code>'.'</code> which is the same character used in the
138 <code>&quot;C&quot;</code> locale to denote the decimal point.
139 </p>
140 <p>So it is sometimes necessary to replace the <code>LC_ALL</code> line in the
141 code above by a sequence of <code>setlocale</code> lines
142 </p>
143 <table><tr><td>&nbsp;</td><td><pre class="example">{
144 &hellip;
145 setlocale (LC_CTYPE, &quot;&quot;);
146 setlocale (LC_MESSAGES, &quot;&quot;);
147 &hellip;
148 }
149 </pre></td></tr></table>
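<p>The effect of selecting individual categories can be sketched as follows.
The helper name <code>init_locale_categories</code> is hypothetical. Because every C
program starts in the <code>&quot;C&quot;</code> locale, the categories that are not mentioned,
such as <code>LC_NUMERIC</code>, keep their <code>&quot;C&quot;</code> behaviour.
</p>

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: enable only the categories needed for character
   handling and message lookup.  LC_NUMERIC and the other categories are
   never touched, so they stay in the "C" locale that every C program
   starts in.  Returns 0 if all calls succeeded, -1 otherwise.  */
static int
init_locale_categories (void)
{
  int ok = 1;

  if (setlocale (LC_CTYPE, "") == NULL)
    ok = 0;
#ifdef LC_MESSAGES
  /* LC_MESSAGES is a POSIX addition and may be missing on systems
     that are only ISO C compliant.  */
  if (setlocale (LC_MESSAGES, "") == NULL)
    ok = 0;
#endif
  return ok ? 0 : -1;
}
```

<p>After this call, <code>setlocale (LC_NUMERIC, NULL)</code> still reports <code>&quot;C&quot;</code>, so
number parsing with <code>scanf</code> is not affected by the user's environment.
</p>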
150
151 <a name="IDX119"></a>
152 <a name="IDX120"></a>
153 <a name="IDX121"></a>
154 <a name="IDX122"></a>
155 <a name="IDX123"></a>
156 <a name="IDX124"></a>
157 <a name="IDX125"></a>
158 <p>On all POSIX conformant systems the locale categories <code>LC_CTYPE</code>,
159 <code>LC_MESSAGES</code>, <code>LC_COLLATE</code>, <code>LC_MONETARY</code>,
160 <code>LC_NUMERIC</code>, and <code>LC_TIME</code> are available. On some systems
161 which are only ISO C compliant, <code>LC_MESSAGES</code> is missing, but
162 a substitute for it is defined in GNU gettext's <code>&lt;libintl.h&gt;</code> and
163 in GNU gnulib's <code>&lt;locale.h&gt;</code>.
164 </p>
165 <p>Note that changing the <code>LC_CTYPE</code> also affects the functions
166 declared in the <code>&lt;ctype.h&gt;</code> standard header and some functions
167 declared in the <code>&lt;string.h&gt;</code> and <code>&lt;stdlib.h&gt;</code> standard headers.
168 If this is not
169 desirable in your application (for example in a compiler's parser),
170 you can use a set of substitute functions which hardwire the C locale,
171 such as found in the modules &lsquo;<samp>c-ctype</samp>&rsquo;, &lsquo;<samp>c-strcase</samp>&rsquo;,
172 &lsquo;<samp>c-strcasestr</samp>&rsquo;, &lsquo;<samp>c-strtod</samp>&rsquo;, &lsquo;<samp>c-strtold</samp>&rsquo; in the GNU gnulib
173 source distribution.
174 </p>
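<p>To illustrate what &quot;hardwiring the C locale&quot; means, here is a small
hand-written classifier. It is not the gnulib code; the name
<code>ascii_isalnum</code> is made up for this sketch, and it assumes an
ASCII-compatible execution character set.
</p>

```c
#include <stdbool.h>

/* Stand-in for the idea behind a locale-independent ctype function:
   classify with fixed ASCII ranges and never consult LC_CTYPE.
   Assumes an ASCII-compatible execution character set.  */
static bool
ascii_isalnum (int c)
{
  return (c >= '0' && c <= '9')
         || (c >= 'A' && c <= 'Z')
         || (c >= 'a' && c <= 'z');
}
```

<p>A parser built on such a function gives the same answer for the byte
0xE7 (&ccedil; in ISO-8859-1) whatever locale the user has selected.
</p>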
175 <p>It is also possible to switch the locale back and forth between the
176 environment dependent locale and the C locale, but this approach is
177 normally avoided because a <code>setlocale</code> call is expensive,
178 because it is tedious to determine the places where a locale switch
179 is needed in a large program's source, and because switching a locale
180 is not multithread-safe.
181 </p>
182
183 <a name="Preparing-Strings"></a>
184 <a name="SEC20"></a>
185 <h2 class="section"> <a href="gettext_toc.html#TOC20">4.3 Preparing Translatable Strings</a> </h2>
186
187 <p>Before strings can be marked for translations, they sometimes need to
188 be adjusted. Usually preparing a string for translation is done right
189 before marking it, during the marking phase which is described in the
190 next sections. What you have to keep in mind while doing that is the
191 following.
192 </p>
193 <ul>
194 <li>
195 Decent English style.
196
197 </li><li>
198 Entire sentences.
199
200 </li><li>
201 Split at paragraphs.
202
203 </li><li>
204 Use format strings instead of string concatenation.
205
206 </li><li>
207 Use placeholders in format strings instead of embedded URLs.
208
209 </li><li>
210 Use placeholders in format strings instead of programmer-defined format
211 string directives.
212
213 </li><li>
214 Avoid unusual markup and unusual control characters.
215 </li></ul>
216
217 <p>Let's look at some examples of these guidelines.
218 </p>
219 <a name="SEC21"></a>
220 <h3 class="subheading"> Decent English style </h3>
221
222 <p>Translatable strings should be in good English style. If slang language
223 with abbreviations and shortcuts is used, often translators will not
224 understand the message and will produce very inappropriate translations.
225 </p>
226 <table><tr><td>&nbsp;</td><td><pre class="example">&quot;%s: is parameter\n&quot;
227 </pre></td></tr></table>
228
229 <p>This is nearly untranslatable: Is the displayed item <em>a</em> parameter or
230 <em>the</em> parameter?
231 </p>
232 <table><tr><td>&nbsp;</td><td><pre class="example">&quot;No match&quot;
233 </pre></td></tr></table>
234
235 <p>The ambiguity in this message makes it unintelligible: Is the program
236 attempting to set something on fire? Does it mean &quot;The given object does
237 not match the template&quot;? Does it mean &quot;The template does not fit for any
238 of the objects&quot;?
239 </p>
240 <a name="IDX126"></a>
241 <p>In both cases, adding more words to the message will help both the
242 translator and the English speaking user.
243 </p>
244 <a name="SEC22"></a>
245 <h3 class="subheading"> Entire sentences </h3>
246
247 <p>Translatable strings should be entire sentences. It is often not possible
248 to translate single verbs or adjectives in a substitutable way.
249 </p>
250 <table><tr><td>&nbsp;</td><td><pre class="example">printf (&quot;File %s is %s protected&quot;, filename, rw ? &quot;write&quot; : &quot;read&quot;);
251 </pre></td></tr></table>
252
253 <p>Most translators will not look at the source and will thus only see the
254 string <code>&quot;File %s is %s protected&quot;</code>, which is unintelligible. Change
255 this to
256 </p>
257 <table><tr><td>&nbsp;</td><td><pre class="example">printf (rw ? &quot;File %s is write protected&quot; : &quot;File %s is read protected&quot;,
258 filename);
259 </pre></td></tr></table>
260
261 <p>This way the translator will not only understand the message, she will
262 also be able to find the appropriate grammatical construction. A French
263 translator for example translates &quot;write protected&quot; like &quot;protected
264 against writing&quot;.
265 </p>
266 <p>Entire sentences are also important because in many languages, the
267 declension of some word in a sentence depends on the gender or the
268 number (singular/plural) of another part of the sentence. There are
269 usually more interdependencies between words than in English. The
270 consequence is that asking a translator to translate two half-sentences
271 and then combining these two half-sentences through dumb string concatenation
272 will not work, for many languages, even though it would work for English.
273 That's why translators need to handle entire sentences.
274 </p>
275 <p>Often sentences don't fit into a single line. If a sentence is output
276 using two subsequent <code>printf</code> statements, like this
277 </p>
278 <table><tr><td>&nbsp;</td><td><pre class="example">printf (&quot;Locale charset \&quot;%s\&quot; is different from\n&quot;, lcharset);
279 printf (&quot;input file charset \&quot;%s\&quot;.\n&quot;, fcharset);
280 </pre></td></tr></table>
281
282 <p>the translator would have to translate two half sentences, but nothing
283 in the POT file would tell her that the two half sentences belong together.
284 It is necessary to merge the two <code>printf</code> statements so that the
285 translator can handle the entire sentence at once and decide at which
286 place to insert a line break in the translation (if at all):
287 </p>
288 <table><tr><td>&nbsp;</td><td><pre class="example">printf (&quot;Locale charset \&quot;%s\&quot; is different from\n\
289 input file charset \&quot;%s\&quot;.\n&quot;, lcharset, fcharset);
290 </pre></td></tr></table>
291
292 <p>You may now ask: how about two or more adjacent sentences? Like in this case:
293 </p>
294 <table><tr><td>&nbsp;</td><td><pre class="example">puts (&quot;Apollo 13 scenario: Stack overflow handling failed.&quot;);
295 puts (&quot;On the next stack overflow we will crash!!!&quot;);
296 </pre></td></tr></table>
297
298 <p>Should these two statements be merged into a single one? I would recommend
299 merging them if the two sentences are related to each other, because then it
300 makes it easier for the translator to understand and translate both. On
301 the other hand, if one of the two messages is a stereotypic one, occurring
302 in other places as well, you will do a favour to the translator by not
303 merging the two. (Identical messages occurring in several places are
304 combined by xgettext, so the translator has to handle them once only.)
305 </p>
306 <a name="SEC23"></a>
307 <h3 class="subheading"> Split at paragraphs </h3>
308
309 <p>Translatable strings should be limited to one paragraph; don't let a
310 single message be longer than ten lines. The reason is that when the
311 translatable string changes, the translator is faced with the task of
312 updating the entire translated string. Maybe only a single word will
313 have changed in the English string, but the translator doesn't see that
314 (with the current translation tools), therefore she has to proofread
315 the entire message.
316 </p>
317 <a name="IDX127"></a>
318 <p>Many GNU programs have a &lsquo;<samp>--help</samp>&rsquo; output that extends over several
319 screen pages. It is a courtesy towards the translators to split such a
320 message into several ones of five to ten lines each. While doing that,
321 you can also attempt to split the documented options into groups,
322 such as the input options, the output options, and the informative
323 output options. This will help every user to find the option he is
324 looking for.
325 </p>
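<p>A split &lsquo;<samp>--help</samp>&rsquo; output might look like the following sketch. The
function name <code>print_help</code>, the program name and the options are invented
for illustration; the point is only that each group of options is a separate
message.
</p>

```c
#include <libintl.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical --help printer: each gettext call holds one short group
   of options, so a change to one option invalidates only one message.
   Returns the number of messages written, or -1 on an output error.  */
static int
print_help (FILE *out)
{
  const char *groups[] = {
    gettext ("Usage: prog [OPTION]... FILE...\n"),
    gettext ("Input options:\n"
             "  -i, --input=FILE    read input from FILE\n"),
    gettext ("Output options:\n"
             "  -o, --output=FILE   write output to FILE\n"),
  };
  size_t n = sizeof groups / sizeof groups[0];

  for (size_t i = 0; i < n; i++)
    if (fputs (groups[i], out) == EOF)
      return -1;
  return (int) n;
}
```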
326 <a name="SEC24"></a>
327 <h3 class="subheading"> No string concatenation </h3>
328
329 <p>Hardcoded string concatenation is sometimes used to construct English
330 strings:
331 </p>
332 <table><tr><td>&nbsp;</td><td><pre class="example">strcpy (s, &quot;Replace &quot;);
333 strcat (s, object1);
334 strcat (s, &quot; with &quot;);
335 strcat (s, object2);
336 strcat (s, &quot;?&quot;);
337 </pre></td></tr></table>
338
339 <p>In order to present to the translator only entire sentences, and also
340 because in some languages the translator might want to swap the order
341 of <code>object1</code> and <code>object2</code>, it is necessary to change this
342 to use a format string:
343 </p>
344 <table><tr><td>&nbsp;</td><td><pre class="example">sprintf (s, &quot;Replace %s with %s?&quot;, object1, object2);
345 </pre></td></tr></table>
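<p>Combined with a <code>gettext</code> call, the format string version could be
sketched like this. The helper name <code>format_replace_prompt</code> is
hypothetical, and <code>snprintf</code> is used here instead of <code>sprintf</code> only
to bound the output size.
</p>

```c
#include <libintl.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: builds the prompt from one translatable format
   string, so the translator sees the whole sentence.  snprintf keeps
   long object names from overflowing the buffer.
   Returns the value returned by snprintf.  */
static int
format_replace_prompt (char *buf, size_t size,
                       const char *object1, const char *object2)
{
  return snprintf (buf, size, gettext ("Replace %s with %s?"),
                   object1, object2);
}
```

<p>Without a translation catalog the English format string is used unchanged,
so <code>format_replace_prompt (buf, sizeof buf, "a.txt", "b.txt")</code> stores
&quot;Replace a.txt with b.txt?&quot; in <code>buf</code>.
</p>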
346
347 <a name="IDX128"></a>
348 <p>A similar case is compile time concatenation of strings. The ISO C 99
349 include file <code>&lt;inttypes.h&gt;</code> contains a macro <code>PRId64</code> that
350 can be used as a formatting directive for outputting an &lsquo;<samp>int64_t</samp>&rsquo;
351 integer through <code>printf</code>. It expands to a constant string, usually
352 &quot;d&quot; or &quot;ld&quot; or &quot;lld&quot; or something like this, depending on the platform.
353 Assume you have code like
354 </p>
355 <table><tr><td>&nbsp;</td><td><pre class="example">printf (&quot;The amount is %0&quot; PRId64 &quot;\n&quot;, number);
356 </pre></td></tr></table>
357
358 <p>The <code>gettext</code> tools and library have special support for these
359 <code>&lt;inttypes.h&gt;</code> macros. You can therefore simply write
360 </p>
361 <table><tr><td>&nbsp;</td><td><pre class="example">printf (gettext (&quot;The amount is %0&quot; PRId64 &quot;\n&quot;), number);
362 </pre></td></tr></table>
363
364 <p>The PO file will contain the string &quot;The amount is %0&lt;PRId64&gt;\n&quot;.
365 The translators will provide a translation containing &quot;%0&lt;PRId64&gt;&quot;
366 as well, and at runtime the <code>gettext</code> function's result will
367 contain the appropriate constant string, &quot;d&quot; or &quot;ld&quot; or &quot;lld&quot;.
368 </p>
369 <p>This works only for the predefined <code>&lt;inttypes.h&gt;</code> macros. If
370 you have defined your own similar macros, let's say &lsquo;<samp>MYPRId64</samp>&rsquo;,
371 that are not known to <code>xgettext</code>, the solution for this problem
372 is to change the code like this:
373 </p>
374 <table><tr><td>&nbsp;</td><td><pre class="example">char buf1[100];
375 sprintf (buf1, &quot;%0&quot; MYPRId64, number);
376 printf (gettext (&quot;The amount is %s\n&quot;), buf1);
377 </pre></td></tr></table>
378
379 <p>This means, you put the platform dependent code in one statement, and the
380 internationalization code in a different statement. Note that a buffer length
381 of 100 is safe, because all available hardware integer types are limited to
382 128 bits, and to print a 128 bit integer one needs at most 54 characters,
383 regardless of whether in decimal, octal or hexadecimal.
384 </p>
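<p>A self-contained variant of this two-step approach is sketched below. Here
<code>MYPRId64</code> is simply defined in terms of <code>PRId64</code> to stand for a
project-specific macro, and the helper name <code>format_amount</code> is an
assumption of this sketch.
</p>

```c
#include <inttypes.h>
#include <libintl.h>
#include <stddef.h>
#include <stdio.h>

/* MYPRId64 stands for a project-specific macro unknown to xgettext;
   it is defined in terms of PRId64 here for illustration only.  */
#define MYPRId64 PRId64

/* Hypothetical helper: formats NUMBER into OUT in two steps.  */
static int
format_amount (char *out, size_t size, int64_t number)
{
  char digits[100];

  /* Platform-dependent directive: not marked for translation.  */
  snprintf (digits, sizeof digits, "%" MYPRId64, number);

  /* Translatable message: contains only the portable %s directive.  */
  return snprintf (out, size, gettext ("The amount is %s\n"), digits);
}
```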
385 <a name="IDX129"></a>
386 <a name="IDX130"></a>
387 <p>All this applies to other programming languages as well. For example, in
388 Java and C#, string concatenation is very frequently used, because it is a
389 compiler built-in operator. Like in C, in Java, you would change
390 </p>
391 <table><tr><td>&nbsp;</td><td><pre class="example">System.out.println(&quot;Replace &quot;+object1+&quot; with &quot;+object2+&quot;?&quot;);
392 </pre></td></tr></table>
393
394 <p>into a statement involving a format string:
395 </p>
396 <table><tr><td>&nbsp;</td><td><pre class="example">System.out.println(
397 MessageFormat.format(&quot;Replace {0} with {1}?&quot;,
398 new Object[] { object1, object2 }));
399 </pre></td></tr></table>
400
401 <p>Similarly, in C#, you would change
402 </p>
403 <table><tr><td>&nbsp;</td><td><pre class="example">Console.WriteLine(&quot;Replace &quot;+object1+&quot; with &quot;+object2+&quot;?&quot;);
404 </pre></td></tr></table>
405
406 <p>into a statement involving a format string:
407 </p>
408 <table><tr><td>&nbsp;</td><td><pre class="example">Console.WriteLine(
409 String.Format(&quot;Replace {0} with {1}?&quot;, object1, object2));
410 </pre></td></tr></table>
411
412 <a name="SEC25"></a>
413 <h3 class="subheading"> No embedded URLs </h3>
414
415 <p>It is good to not embed URLs in translatable strings, for several reasons:
416 </p><ul>
417 <li>
418 It avoids possible mistakes during copy and paste.
419 </li><li>
420 Translators cannot translate the URLs or, by mistake, use the URLs from
421 other packages that are present in their compendium.
422 </li><li>
423 When the URLs change, translators don't need to revisit the translation
424 of the string.
425 </li></ul>
426
427 <p>The same holds for email addresses.
428 </p>
429 <p>So, you would change
430 </p>
431 <table><tr><td>&nbsp;</td><td><pre class="smallexample">fputs (_(&quot;GNU GPL version 3 &lt;https://gnu.org/licenses/gpl.html&gt;\n&quot;),
432 stream);
433 </pre></td></tr></table>
434
435 <p>to
436 </p>
437 <table><tr><td>&nbsp;</td><td><pre class="smallexample">fprintf (stream, _(&quot;GNU GPL version 3 &lt;%s&gt;\n&quot;),
438 &quot;https://gnu.org/licenses/gpl.html&quot;);
439 </pre></td></tr></table>
440
441 <a name="SEC26"></a>
442 <h3 class="subheading"> No programmer-defined format string directives </h3>
443
444 <p>The GNU C Library's <code>&lt;printf.h&gt;</code> facility and the C++ standard library's <code>&lt;format&gt;</code> header file make it possible for the programmer to define their own format string directives. However, such format directives cannot be used in translatable strings, for two reasons:
445 </p><ul>
446 <li>
447 There is no reference documentation for format strings with such directives, that the translators could consult. They would therefore have to guess where the directive starts and where it ends.
448 </li><li>
449 An &lsquo;<samp>msgfmt -c</samp>&rsquo; invocation cannot check whether the translator has produced a compatible translation of the format string. As a consequence, when a format string contains a programmer-defined directive, the program may crash at runtime when it uses the translated format string.
450 </li></ul>
451
452 <p>To avoid this situation, you need to move the formatting with the custom directive into a format string that does not get translated.
453 </p>
454 <p>For example, assuming code that makes use of a <code>%r</code> directive:
455 </p>
456 <table><tr><td>&nbsp;</td><td><pre class="smallexample">fprintf (stream, _(&quot;The contents is: %r&quot;), data);
457 </pre></td></tr></table>
458
459 <p>you would rewrite it to:
460 </p>
461 <table><tr><td>&nbsp;</td><td><pre class="smallexample">char *tmp;
462 if (asprintf (&amp;tmp, &quot;%r&quot;, data) &lt; 0)
463 error (...);
464 fprintf (stream, _(&quot;The contents is: %s&quot;), tmp);
465 free (tmp);
466 </pre></td></tr></table>
467
468 <p>Similarly, in C++, assuming you have defined a custom <code>formatter</code> for the type of <code>data</code>, the code
469 </p>
470 <table><tr><td>&nbsp;</td><td><pre class="smallexample">cout &lt;&lt; format (_(&quot;The contents is: {:#$#}&quot;), data);
471 </pre></td></tr></table>
472
473 <p>should be rewritten to:
474 </p>
475 <table><tr><td>&nbsp;</td><td><pre class="smallexample">string tmp = format (&quot;{:#$#}&quot;, data);
476 cout &lt;&lt; format (_(&quot;The contents is: {}&quot;), tmp);
477 </pre></td></tr></table>
478
479 <a name="SEC27"></a>
480 <h3 class="subheading"> No unusual markup </h3>
481
482 <p>Unusual markup or control characters should not be used in translatable
483 strings. Translators will likely not understand the particular meaning
484 of the markup or control characters.
485 </p>
486 <p>For example, if you have a convention that &lsquo;<samp>|</samp>&rsquo; delimits the
487 left-hand and right-hand part of some GUI elements, translators will
488 often not understand it without specific comments. It might be
489 better to have the translator translate the left-hand and right-hand
490 part separately.
491 </p>
492 <p>Another example is the &lsquo;<samp>argp</samp>&rsquo; convention to use a single &lsquo;<samp>\v</samp>&rsquo;
493 (vertical tab) control character to delimit two sections inside a
494 string. This is flawed. Some translators may convert it to a simple
495 newline, some to blank lines. With some PO file editors it may not be
496 easy to even enter a vertical tab control character. So, you cannot
497 be sure that the translation will contain a &lsquo;<samp>\v</samp>&rsquo; character, at the
498 corresponding position. The solution is, again, to let the translator
499 translate two separate strings and combine at run-time the two translated
500 strings with the &lsquo;<samp>\v</samp>&rsquo; required by the convention.
501 </p>
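<p>The run-time combination described above can be sketched as follows. The
helper name <code>build_argp_doc</code> and the two message texts are made up for
this example.
</p>

```c
#include <libintl.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: the two halves are translated separately, and
   the '\v' separator required by the convention is added by the
   program, never by the translator.
   Returns the value returned by snprintf.  */
static int
build_argp_doc (char *buf, size_t size)
{
  return snprintf (buf, size, "%s\v%s",
                   gettext ("Frobnicate the input files."),
                   gettext ("Report bugs to the maintainers."));
}
```

<p>The resulting buffer always contains exactly one vertical tab, whatever the
translator's PO file editor did with control characters.
</p>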
502 <p>HTML markup, however, is common enough that it's probably ok to use in
503 translatable strings. But please bear in mind that the GNU gettext tools
504 don't verify that the translations are well-formed HTML.
505 </p>
506
507 <a name="Mark-Keywords"></a>
508 <a name="SEC28"></a>
509 <h2 class="section"> <a href="gettext_toc.html#TOC21">4.4 How Marks Appear in Sources</a> </h2>
510
511 <p>All strings requiring translation should be marked in the C sources. Marking
512 is done in such a way that each translatable string appears to be
513 the sole argument of some function or preprocessor macro. There are
514 only a few such possible functions or macros meant for translation,
515 and their names are said to be marking keywords. The marking is
516 attached to strings themselves, rather than to what we do with them.
517 This approach has more uses. A blatant example is an error message
518 produced by formatting. The format string needs translation, as
519 well as some strings inserted through some &lsquo;<samp>%s</samp>&rsquo; specification
520 in the format, while the result from <code>sprintf</code> may have so many
521 different instances that it is impractical to list them all in some
522 &lsquo;<samp>error_string_out()</samp>&rsquo; routine, say.
523 </p>
524 <p>This marking operation has two goals. The first goal of marking
525 is for triggering the retrieval of the translation, at run time.
526 The keyword is possibly resolved into a routine able to dynamically
527 return the proper translation, as far as possible or wanted, for the
528 argument string. Most localizable strings are found in executable
529 positions, that is, attached to variables or given as parameters to
530 functions. But this is not universal usage, and some translatable
531 strings appear in structured initializations. See section <a href="#SEC31">Special Cases of Translatable Strings</a>.
532 </p>
533 <p>The second goal of the marking operation is to help <code>xgettext</code>
534 at properly extracting all translatable strings when it scans a set
535 of program sources and produces PO file templates.
536 </p>
537 <p>The canonical keyword for marking translatable strings is
538 &lsquo;<samp>gettext</samp>&rsquo;, it gave its name to the whole GNU <code>gettext</code>
539 package. For packages making only light use of the &lsquo;<samp>gettext</samp>&rsquo;
540 keyword, macro or function, it is easily used <em>as is</em>. However,
541 for packages using the <code>gettext</code> interface more heavily, it
542 is usually more convenient to give the main keyword a shorter, less
543 obtrusive name. Indeed, the keyword might appear on a lot of strings
544 all over the package, and programmers usually do not want nor need
545 their program sources to remind them forcefully, all the time, that they
546 are internationalized. Further, a long keyword has the disadvantage
547 of using more horizontal space, forcing more indentation work on
548 sources for those trying to keep them within 79 or 80 columns.
549 </p>
550 <a name="IDX131"></a>
551 <p>Many packages use &lsquo;<samp>_</samp>&rsquo; (a simple underline) as a keyword,
552 and write &lsquo;<samp>_(&quot;Translatable string&quot;)</samp>&rsquo; instead of &lsquo;<samp>gettext
553 (&quot;Translatable string&quot;)</samp>&rsquo;. Further, the coding rule, from GNU standards,
554 wanting that there is a space between the keyword and the opening
555 parenthesis is relaxed, in practice, for this particular usage.
556 So, the textual overhead per translatable string is reduced to
557 only three characters: the underline and the two parentheses.
558 However, even if GNU <code>gettext</code> uses this convention internally,
559 it does not offer it officially. The real, genuine keyword is truly
560 &lsquo;<samp>gettext</samp>&rsquo; indeed. It is fairly easy for those wanting to use
561 &lsquo;<samp>_</samp>&rsquo; instead of &lsquo;<samp>gettext</samp>&rsquo; to declare:
562 </p>
563 <table><tr><td>&nbsp;</td><td><pre class="example">#include &lt;libintl.h&gt;
564 #define _(String) gettext (String)
565 </pre></td></tr></table>
566
567 <p>instead of merely using &lsquo;<samp>#include &lt;libintl.h&gt;</samp>&rsquo;.
568 </p>
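<p>In use, the short keyword might look like the sketch below; the function
<code>greeting</code> is invented for illustration. Since &lsquo;<samp>_</samp>&rsquo; is not one of
the keywords <code>xgettext</code> looks for by default in C sources, the extraction
step would also need &lsquo;<samp>--keyword=_</samp>&rsquo;.
</p>

```c
#include <libintl.h>

#define _(String) gettext (String)

/* Hypothetical function showing the short keyword in use.
   Extraction would be done with:  xgettext --keyword=_ ...  */
static const char *
greeting (void)
{
  return _("Hello, world!");
}
```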
569 <p>The marking keywords &lsquo;<samp>gettext</samp>&rsquo; and &lsquo;<samp>_</samp>&rsquo; take the translatable
570 string as sole argument. It is also possible to define marking functions
571 that take it at another argument position. It is even possible to make
572 the marked argument position depend on the total number of arguments of
573 the function call; this is useful in C++. All this is achieved using
574 <code>xgettext</code>'s &lsquo;<samp>--keyword</samp>&rsquo; option. How to pass such an option
575 to <code>xgettext</code>, assuming that <code>gettextize</code> is used, is described
576 in <a href="gettext_13.html#SEC237">&lsquo;<tt>Makevars</tt>&rsquo; in &lsquo;<tt>po/</tt>&rsquo;</a> and <a href="gettext_13.html#SEC252">AM_XGETTEXT_OPTION in &lsquo;<tt>po.m4</tt>&rsquo;</a>.
577 </p>
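<p>As an illustration of a marking function that takes the translatable string
at another argument position, consider the following sketch. The function
<code>log_message</code> is hypothetical; its format string is the second argument,
which would be declared to <code>xgettext</code> as
&lsquo;<samp>--keyword=log_message:2</samp>&rsquo;.
</p>

```c
#include <libintl.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical marking function: the translatable format string is the
   second argument, declared to xgettext with --keyword=log_message:2.
   The function looks up the translation itself, so callers pass the
   plain English string.  Returns the value returned by vfprintf.  */
static int
log_message (FILE *out, const char *format, ...)
{
  va_list args;
  int written;

  va_start (args, format);
  written = vfprintf (out, gettext (format), args);
  va_end (args);
  return written;
}
```

<p>A call such as <code>log_message (stderr, "Opening %s\n", name)</code> then needs no
explicit &lsquo;<samp>_()</samp>&rsquo; around the format string.
</p>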
578 <p>Note also that long strings can be split across lines, into multiple
579 adjacent string tokens. Automatic string concatenation is performed
580 at compile time according to ISO C and ISO C++; <code>xgettext</code> also
581 supports this syntax.
582 </p>
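<p>For instance, a long message can be written as two adjacent string tokens
inside a single marker. The function name <code>charset_warning_format</code> is made
up for this sketch; the message is the one used earlier in this chapter.
</p>

```c
#include <libintl.h>

#define _(String) gettext (String)

/* Hypothetical accessor for a long format string.  The two adjacent
   string tokens are joined by the compiler into one literal, so the
   translator sees a single message.  */
static const char *
charset_warning_format (void)
{
  return _("Locale charset \"%s\" is different from\n"
           "input file charset \"%s\".\n");
}
```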
583 <p>In C++, marking a C++ format string requires a small code change,
584 because the first argument to <code>std::format</code> must be a constant
585 expression.
586 For example,
587 </p><table><tr><td>&nbsp;</td><td><pre class="smallexample">std::format (&quot;{} {}!&quot;, &quot;Hello&quot;, &quot;world&quot;)
588 </pre></td></tr></table>
589 <p>needs to be changed to
590 </p><table><tr><td>&nbsp;</td><td><pre class="smallexample">std::vformat (gettext (&quot;{} {}!&quot;), std::make_format_args(&quot;Hello&quot;, &quot;world&quot;))
591 </pre></td></tr></table>
592
593 <p>Later on, the maintenance is relatively easy. If, as a programmer,
594 you add or modify a string, you will have to ask yourself if the
595 new or altered string requires translation, and include it within
596 &lsquo;<samp>_()</samp>&rsquo; if you think it should be translated. For example, &lsquo;<samp>&quot;%s&quot;</samp>&rsquo;
597 is an example of string <em>not</em> requiring translation. But
598 &lsquo;<samp>&quot;%s: %d&quot;</samp>&rsquo; <em>does</em> require translation, because in French, unlike
599 in English, it's customary to put a space before a colon.
600 </p>
601
602 <a name="Marking"></a>
603 <a name="SEC29"></a>
604 <h2 class="section"> <a href="gettext_toc.html#TOC22">4.5 Marking Translatable Strings</a> </h2>
605
606 <p>In PO mode, one set of features is meant more for the programmer than
607 for the translator, and allows him to interactively mark which strings,
608 in a set of program sources, are translatable, and which are not.
609 Even if it is a fairly easy job for a programmer to find and mark
610 such strings by other means, using any editor of his choice, PO mode
611 makes this work more comfortable. Further, this gives translators
612 who feel a little like programmers, or programmers who feel a little
613 like translators, a tool letting them work at marking translatable
614 strings in the program sources, while simultaneously producing a set of
615 translations in some language, for the package being internationalized.
616 </p>
617 <a name="IDX132"></a>
618 <p>The set of program sources, targeted by the PO mode commands described
619 here, should have an Emacs tags table constructed for your project,
620 prior to using these PO file commands. This is easy to do. In any
621 shell window, change the directory to the root of your project, then
622 execute a command resembling:
623 </p>
624 <table><tr><td>&nbsp;</td><td><pre class="example">etags src/*.[hc] lib/*.[hc]
625 </pre></td></tr></table>
626
627 <p>presuming here you want to process all &lsquo;<tt>.h</tt>&rsquo; and &lsquo;<tt>.c</tt>&rsquo; files
628 from the &lsquo;<tt>src/</tt>&rsquo; and &lsquo;<tt>lib/</tt>&rsquo; directories. This command will
629 explore all said files and create a &lsquo;<tt>TAGS</tt>&rsquo; file in your root
630 directory, somewhat summarizing the contents using a special file
631 format Emacs can understand.
632 </p>
633 <a name="IDX133"></a>
634 <p>For packages following the GNU coding standards, there is
635 a make goal <code>tags</code> or <code>TAGS</code> which constructs the tag files in
636 all directories and for all files containing source code.
637 </p>
638 <p>Once your &lsquo;<tt>TAGS</tt>&rsquo; file is ready, the following commands assist
639 the programmer in marking translatable strings in his set of sources.
640 But these commands are necessarily driven from within a PO file
641 window, and it is likely that you do not even have such a PO file yet.
642 This is not a problem at all, as you may safely open a new, empty PO
643 file, mainly for using these commands. This empty PO file will slowly
644 fill in while you mark strings as translatable in your program sources.
645 </p>
646 <dl compact="compact">
647 <dt> <kbd>,</kbd></dt>
648 <dd><a name="IDX134"></a>
649 <p>Search through program sources for a string which looks like a
650 candidate for translation (<code>po-tags-search</code>).
651 </p>
652 </dd>
653 <dt> <kbd>M-,</kbd></dt>
654 <dd><a name="IDX135"></a>
655 <p>Mark the last string found with &lsquo;<samp>_()</samp>&rsquo; (<code>po-mark-translatable</code>).
656 </p>
657 </dd>
658 <dt> <kbd>M-.</kbd></dt>
659 <dd><a name="IDX136"></a>
660 <p>Mark the last string found with a keyword taken from a set of possible
661 keywords. This command with a prefix allows some management of these
662 keywords (<code>po-select-mark-and-mark</code>).
663 </p>
664 </dd>
665 </dl>
666
667 <a name="IDX137"></a>
668 <p>The <kbd>,</kbd> (<code>po-tags-search</code>) command searches for the next
669 occurrence of a string which looks like a possible candidate for
670 translation, and displays the program source in another Emacs window,
671 positioned in such a way that the string is near the top of this other
672 window. If the string is too big to fit whole in this window, it is
673 positioned so only its end is shown. In any case, the cursor
674 is left in the PO file window. If the shown string would be better
675 presented differently in different native languages, you may mark it
676 using <kbd>M-,</kbd> or <kbd>M-.</kbd>. Otherwise, you might rather ignore it
677 and skip to the next string by merely repeating the <kbd>,</kbd> command.
678 </p>
679 <p>A string is a good candidate for translation if it contains a sequence
680 of three or more letters. A string containing at most two letters in
681 a row will be considered as a candidate if it has more letters than
682 non-letters. The command disregards strings containing no letters,
683 or isolated letters only. It also disregards strings within comments,
684 or strings already marked with some keyword PO mode knows (see below).
685 </p>
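<p>The rule stated above can be sketched in C. This is an illustration of the
prose only, not the code PO mode actually runs (PO mode is written in Emacs
Lisp); the function name <code>looks_translatable</code> is hypothetical, and
a &ldquo;letter&rdquo; is assumed to be whatever <code>isalpha</code> accepts.
Comments and already marked strings are not handled here.
</p>

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if STRING looks like a candidate
   for translation according to the rule described in the text.  */
bool
looks_translatable (const char *string)
{
  size_t letters = 0;      /* total number of letters */
  size_t non_letters = 0;  /* total number of other characters */
  size_t run = 0;          /* length of the current run of letters */
  size_t longest_run = 0;  /* length of the longest run seen so far */

  for (const char *p = string; *p != '\0'; p++)
    {
      if (isalpha ((unsigned char) *p))
        {
          letters++;
          run++;
          if (run > longest_run)
            longest_run = run;
        }
      else
        {
          non_letters++;
          run = 0;
        }
    }

  if (longest_run >= 3)
    return true;                   /* three or more letters in a row */
  if (longest_run == 2)
    return letters > non_letters;  /* short runs: letters must dominate */
  return false;                    /* no letters, or isolated letters only */
}
```

<p>With this sketch, <code>"File not found"</code> and <code>"ok"</code> are
accepted, while <code>"%s"</code> and <code>"a-b"</code> are rejected.
</p>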
686 <p>If you have never told Emacs about some &lsquo;<tt>TAGS</tt>&rsquo; file to use, the
687 command will request that you specify one from the minibuffer, the
688 first time you use the command. You may later change your &lsquo;<tt>TAGS</tt>&rsquo;
689 file by using the regular Emacs command <kbd>M-x visit-tags-table</kbd>,
690 which will ask you to name the precise &lsquo;<tt>TAGS</tt>&rsquo; file you want
691 to use. See <a href="../emacs/Tags.html#Tags">(emacs)Tags</a> section `Tag Tables' in <cite>The Emacs Editor</cite>.
692 </p>
693 <p>Each time you use the <kbd>,</kbd> command, the search resumes from where it was
694 left by the previous search, and goes through all program sources,
695 obeying the &lsquo;<tt>TAGS</tt>&rsquo; file, until all sources have been processed.
696 However, by giving a prefix argument to the command (<kbd>C-u
697 ,</kbd>), you may request that the search be restarted all over again
698 from the first program source; but in this case, strings that you
699 recently marked as translatable will be automatically skipped.
700 </p>
701 <p>Using this <kbd>,</kbd> command does not prevent the use of other regular
702 Emacs tags commands. For example, regular <code>tags-search</code> or
703 <code>tags-query-replace</code> commands may be used without disrupting the
704 independent <kbd>,</kbd> search sequence. However, as implemented, the
705 <em>initial</em> <kbd>,</kbd> command (or the <kbd>,</kbd> command used with a
706 prefix) might also reinitialize the regular Emacs tags searching to the
707 first tags file; this reinitialization might be considered spurious.
708 </p>
709 <a name="IDX138"></a>
710 <a name="IDX139"></a>
711 <p>The <kbd>M-,</kbd> (<code>po-mark-translatable</code>) command will mark the
712 recently found string with the &lsquo;<samp>_</samp>&rsquo; keyword. The <kbd>M-.</kbd>
713 (<code>po-select-mark-and-mark</code>) command will request that you type
714 one keyword from the minibuffer and use that keyword for marking
715 the string. Both commands will automatically create a new PO file
716 untranslated entry for the string being marked, and make it the
717 current entry (making it easy for you to immediately proceed to its
718 translation, if you feel like doing it right away). It is possible
719 that the modifications made to the program source by <kbd>M-,</kbd> or
720 <kbd>M-.</kbd> render some source line longer than 80 columns, forcing you
721 to break and re-indent this line differently. You may use the <kbd>O</kbd>
722 command from PO mode, or any other window changing command from
723 Emacs, to break out into the program source window, and do any
724 needed adjustments. You will have to use some regular Emacs command
725 to return the cursor to the PO file window if you want to use the
726 <kbd>,</kbd> command on the next string, say.
727 </p>
728 <p>The <kbd>M-.</kbd> command has a few built-in speedups, so you do not
729 have to explicitly type all keywords all the time. The first such
730 speedup is that you are presented with a <em>preferred</em> keyword,
731 which you may accept by merely typing <kbd>&lt;RET&gt;</kbd> at the prompt.
732 The second speedup is that you may type any non-ambiguous prefix of the
733 keyword you really mean, and the command will complete it automatically
734 for you. This also means that PO mode has to <em>know</em> all
735 your possible keywords, and that it will not accept mistyped keywords.
736 </p>
737 <p>If you reply <kbd>?</kbd> to the keyword request, the command gives a
738 list of all known keywords, from which you may choose. When the
739 command is prefixed by an argument (<kbd>C-u M-.</kbd>), it inhibits
740 updating any program source or PO file buffer, and does some simple
741 keyword management instead. In this case, the command asks for a
742 keyword, written in full, which becomes a new allowed keyword for
743 later <kbd>M-.</kbd> commands. Moreover, this new keyword automatically
744 becomes the <em>preferred</em> keyword for later commands. By typing
745 an already known keyword in response to <kbd>C-u M-.</kbd>, one merely
746 changes the <em>preferred</em> keyword and does nothing more.
747 </p>
748 <p>All keywords known for <kbd>M-.</kbd> are recognized by the <kbd>,</kbd> command
749 when scanning for strings, and strings already marked by any of those
750 known keywords are automatically skipped. If many PO files are opened
751 simultaneously, each one has its own independent set of known keywords.
752 There is no provision in PO mode, currently, for deleting a known
753 keyword; you have to quit the file (maybe using <kbd>q</kbd>) and reopen
754 it afresh. When a PO file is newly brought up in an Emacs window, only
755 &lsquo;<samp>gettext</samp>&rsquo; and &lsquo;<samp>_</samp>&rsquo; are known as keywords, and &lsquo;<samp>gettext</samp>&rsquo;
756 is preferred for the <kbd>M-.</kbd> command. In fact, it is not useful to
757 prefer &lsquo;<samp>_</samp>&rsquo;, as this one is already built into the <kbd>M-,</kbd> command.
758 </p>
759
760 <a name="c_002dformat-Flag"></a>
761 <a name="SEC30"></a>
762 <h2 class="section"> <a href="gettext_toc.html#TOC23">4.6 Special Comments preceding Keywords</a> </h2>
763
764
765 <p>In C programs strings are often used within calls of functions from the
766 <code>printf</code> family. The special thing about these format strings is
767 that they can contain format specifiers introduced with <kbd>%</kbd>. Assume
768 we have the code
769 </p>
770 <table><tr><td>&nbsp;</td><td><pre class="example">printf (gettext (&quot;String `%s' has %d characters\n&quot;), s, (int) strlen (s));
771 </pre></td></tr></table>
772
773 <p>A possible German translation for the above string might be:
774 </p>
775 <table><tr><td>&nbsp;</td><td><pre class="example">&quot;%d Zeichen lang ist die Zeichenkette `%s'&quot;
776 </pre></td></tr></table>
777
778 <p>A C programmer, even if he cannot speak German, will recognize that
779 there is something wrong here. The order of the two format specifiers
780 has changed, but of course the arguments passed to <code>printf</code> have not.
781 This will most probably lead to problems because now the length of the
782 string is regarded as the address.
783 </p>
784 <p>To prevent errors at runtime caused by translations, the <code>msgfmt</code>
785 tool can check statically whether the arguments in the original and the
786 translation string match in type and number. If this is not the case
787 and the &lsquo;<samp>-c</samp>&rsquo; option has been passed to <code>msgfmt</code>, <code>msgfmt</code>
788 will give an error and refuse to produce a MO file. Thus consistent
789 use of &lsquo;<samp>msgfmt -c</samp>&rsquo; will catch the error, so that it cannot cause
790 problems at runtime.
791 </p>
792 <p>For the word order in the above German translation to work correctly, one
793 would have to write
794 </p>
795 <table><tr><td>&nbsp;</td><td><pre class="example">&quot;%2$d Zeichen lang ist die Zeichenkette `%1$s'&quot;
796 </pre></td></tr></table>
797
798 <p>The routines in <code>msgfmt</code> know about this special notation.
799 </p>
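<p>The effect of the numbered notation can be tried directly. The following is
a minimal sketch, assuming a C library that supports POSIX numbered argument
specifiers (glibc does); the function name <code>describe_german</code> is
hypothetical, and the translated format string is hard-coded here instead of
being looked up through <code>gettext</code>.
</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the German sentence into BUF.  The
   arguments are passed in the order used by the English original
   (string first, length second); the numbered specifiers select them
   in the order the German sentence needs.  */
int
describe_german (char *buf, size_t size, const char *s)
{
  return snprintf (buf, size,
                   "%2$d Zeichen lang ist die Zeichenkette `%1$s'",
                   s, (int) strlen (s));
}
```

<p>The calling code is unchanged with respect to the English version; only the
format string differs.
</p>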
800 <p>Because not all strings in a program will be format strings, it is not
801 useful for <code>msgfmt</code> to test all the strings in the &lsquo;<tt>.po</tt>&rsquo; file.
802 This might cause problems because the string might contain what looks
803 like a format specifier, but the string is not used in <code>printf</code>.
804 </p>
805 <p>Therefore <code>xgettext</code> adds a special tag to those messages it
806 thinks might be a format string. There is no absolute rule for this,
807 only a heuristic. In the &lsquo;<tt>.po</tt>&rsquo; file the entry is marked using the
808 <code>c-format</code> flag in the <code>#,</code> comment line (see section <a href="gettext_3.html#SEC16">The Format of PO Files</a>).
809 </p>
810 <a name="IDX140"></a>
811 <a name="IDX141"></a>
812 <p>The careful reader now might say that this again can cause problems.
813 The heuristic might guess it wrong. This is true and therefore
814 <code>xgettext</code> knows about a special kind of comment which lets
815 the programmer take over the decision. If, on the same line as the
816 <code>gettext</code> keyword or on the line immediately preceding it,
817 the <code>xgettext</code> program finds a comment containing the words
818 <code>xgettext:c-format</code>, it will mark the string in any case with
819 the <code>c-format</code> flag. This kind of comment should be used when
820 <code>xgettext</code> does not recognize the string as a format string but
821 it really is one and it should be tested. Please note that when the
822 comment is in the same line as the <code>gettext</code> keyword, it must be
823 before the string to be translated. Also note that a comment such as
824 <code>xgettext:c-format</code> applies only to the first string in the same
825 or the next line, not to multiple strings.
826 </p>
827 <p>This situation happens quite often. The <code>printf</code> function is often
828 called with strings which do not contain a format specifier. Of course
829 one would normally use <code>fputs</code> but it does happen. In this case
830 <code>xgettext</code> does not recognize this as a format string but what
831 happens if the translation introduces a valid format specifier? The
832 <code>printf</code> function will try to access one of the parameters but none
833 exists because the original code does not pass any parameters.
834 </p>
835 <p><code>xgettext</code> of course could make a wrong decision the other way
836 round, i.e. a string marked as a format string actually is not a format
837 string. In this case <code>msgfmt</code> might give too many warnings and
838 would prevent translating the &lsquo;<tt>.po</tt>&rsquo; file. The method to prevent
839 this wrong decision is similar to the one used above, only the comment
840 to use must contain the string <code>xgettext:no-c-format</code>.
841 </p>
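<p>The two comment forms might be used as follows. This is a sketch: the
function names and messages are hypothetical, and the comments only influence
the flags <code>xgettext</code> writes to the PO file, not the behavior of the
compiled code.
</p>

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper.  The message contains no format specifier, yet
   it is passed to printf.  The comment forces the c-format flag, so
   that a translation which introduces a specifier can be detected.  */
int
report_done (void)
{
  /* xgettext:c-format */
  return printf (gettext ("Processing complete\n"));
}

/* Hypothetical helper.  The message contains a percent sign that could
   be mistaken for a format directive, but it is never used as a format
   string.  The comment keeps the c-format flag off.  */
int
report_discount (void)
{
  /* xgettext:no-c-format */
  return fputs (gettext ("Save 50% on all items\n"), stdout);
}
```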
842 <p>If a string is marked with <code>c-format</code> and this is not correct the
843 user can find out who is responsible for the decision. See
844 <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a> to see how the <code>--debug</code> option can be
845 used for solving this problem.
846 </p>
847
848 <a name="Special-cases"></a>
849 <a name="SEC31"></a>
850 <h2 class="section"> <a href="gettext_toc.html#TOC24">4.7 Special Cases of Translatable Strings</a> </h2>
851
852 <p>The attentive reader might now point out that it is not always possible
853 to mark translatable strings with <code>gettext</code> or something like this.
854 Consider the following case:
855 </p>
856 <table><tr><td>&nbsp;</td><td><pre class="example">{
857 static const char *messages[] = {
858 &quot;some very meaningful message&quot;,
859 &quot;and another one&quot;
860 };
861 const char *string;
862 &hellip;
863 string
864 = index &gt; 1 ? &quot;a default message&quot; : messages[index];
865
866 fputs (string, stdout);
867 &hellip;
868 }
869 </pre></td></tr></table>
870
871 <p>While it is no problem to mark the string <code>&quot;a default message&quot;</code> it
872 is not possible to mark the string initializers for <code>messages</code>.
873 What is to be done? We have to fulfill two tasks. First we have to mark the
874 strings so that the <code>xgettext</code> program (see section <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a>)
875 can find them, and second we have to translate the strings at runtime
876 before printing them.
877 </p>
878 <p>The first task can be fulfilled by creating a new keyword, which names a
879 no-op. For the second we have to mark all access points to a string
880 from the array. So one solution can look like this:
881 </p>
882 <table><tr><td>&nbsp;</td><td><pre class="example">#define gettext_noop(String) String
883
884 {
885 static const char *messages[] = {
886 gettext_noop (&quot;some very meaningful message&quot;),
887 gettext_noop (&quot;and another one&quot;)
888 };
889 const char *string;
890 &hellip;
891 string
892 = index &gt; 1 ? gettext (&quot;a default message&quot;) : gettext (messages[index]);
893
894 fputs (string, stdout);
895 &hellip;
896 }
897 </pre></td></tr></table>
898
899 <p>Please convince yourself that the string which is written by
900 <code>fputs</code> is translated in any case. How to make <code>xgettext</code> recognize
901 the additional keyword <code>gettext_noop</code> is explained in <a href="gettext_5.html#SEC36">Invoking the <code>xgettext</code> Program</a>.
902 </p>
903 <p>The above is of course not the only solution. You could also come up
904 with the following one:
905 </p>
906 <table><tr><td>&nbsp;</td><td><pre class="example">#define gettext_noop(String) String
907
908 {
909 static const char *messages[] = {
910 gettext_noop (&quot;some very meaningful message&quot;),
911 gettext_noop (&quot;and another one&quot;)
912 };
913 const char *string;
914 &hellip;
915 string
916 = index &gt; 1 ? gettext_noop (&quot;a default message&quot;) : messages[index];
917
918 fputs (gettext (string), stdout);
919 &hellip;
920 }
921 </pre></td></tr></table>
922
923 <p>But this has a drawback. The programmer has to take care that
924 he uses <code>gettext_noop</code> for the string <code>&quot;a default message&quot;</code>.
925 A use of <code>gettext</code> could have in rare cases unpredictable results.
926 </p>
927 <p>One advantage is that you need not perform control flow analysis to make
928 sure the output is really translated in any case. But this analysis is
929 generally not very difficult. If it does become difficult in some situation,
930 you can use this second method there.
931 </p>
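<p>For reference, the first solution can be written out as a complete
function. This is a sketch under assumptions: the function name
<code>select_message</code> is hypothetical, and the index is taken as a
parameter and checked against the array size instead of being compared with a
constant.
</p>

```c
#include <libintl.h>
#include <stddef.h>

/* Marks a string for extraction by xgettext without translating it.  */
#define gettext_noop(String) String

static const char *const messages[] = {
  gettext_noop ("some very meaningful message"),
  gettext_noop ("and another one")
};

/* Hypothetical helper: returns the translated message for INDEX, or a
   translated default when INDEX is out of range.  Every path goes
   through gettext, so the caller always receives a translation.  */
const char *
select_message (size_t index)
{
  if (index >= sizeof messages / sizeof messages[0])
    return gettext ("a default message");
  return gettext (messages[index]);
}
```

<p>A caller would then simply write
<code>fputs (select_message (i), stdout)</code>.
</p>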
932
933 <a name="Bug-Report-Address"></a>
934 <a name="SEC32"></a>
935 <h2 class="section"> <a href="gettext_toc.html#TOC25">4.8 Letting Users Report Translation Bugs</a> </h2>
936
937 <p>Code sometimes has bugs, but translations sometimes have bugs too. The
938 users need to be able to report them. Reporting translation bugs to the
939 programmer or maintainer of a package is not very useful, since the
940 maintainer must never change a translation, except on behalf of the
941 translator. Hence the translation bugs must be reported to the
942 translators.
943 </p>
944 <p>Here is a way to organize this so that the maintainer does not need to
945 forward translation bug reports, nor even keep a list of the addresses of
946 the translators or their translation teams.
947 </p>
948 <p>Every program has a place where it shows the bug report address. For
949 GNU programs, it is the code which handles the &ldquo;--help&rdquo; option,
950 typically in a function called &ldquo;usage&rdquo;. In this place, instruct the
951 translator to add her own bug reporting address. For example, if that
952 code has a statement
953 </p>
954 <table><tr><td>&nbsp;</td><td><pre class="example">printf (_(&quot;Report bugs to &lt;%s&gt;.\n&quot;), PACKAGE_BUGREPORT);
955 </pre></td></tr></table>
956
957 <p>you can add some translator instructions like this:
958 </p>
959 <table><tr><td>&nbsp;</td><td><pre class="example">/* TRANSLATORS: The placeholder indicates the bug-reporting address
960 for this package. Please add _another line_ saying
961 &quot;Report translation bugs to &lt;...&gt;\n&quot; with the address for translation
962 bugs (typically your translation team's web or email address). */
963 printf (_(&quot;Report bugs to &lt;%s&gt;.\n&quot;), PACKAGE_BUGREPORT);
964 </pre></td></tr></table>
965
966 <p>These will be extracted by &lsquo;<samp>xgettext</samp>&rsquo;, leading to a .pot file that
967 contains this:
968 </p>
969 <table><tr><td>&nbsp;</td><td><pre class="example">#. TRANSLATORS: The placeholder indicates the bug-reporting address
970 #. for this package. Please add _another line_ saying
971 #. &quot;Report translation bugs to &lt;...&gt;\n&quot; with the address for translation
972 #. bugs (typically your translation team's web or email address).
973 #: src/hello.c:178
974 #, c-format
975 msgid &quot;Report bugs to &lt;%s&gt;.\n&quot;
976 msgstr &quot;&quot;
977 </pre></td></tr></table>
978
979
980 <a name="Names"></a>
981 <a name="SEC33"></a>
982 <h2 class="section"> <a href="gettext_toc.html#TOC26">4.9 Marking Proper Names for Translation</a> </h2>
983
984 <p>Should names of persons, cities, locations etc. be marked for translation
985 or not? People who only know languages that can be written with Latin
986 letters (English, Spanish, French, German, etc.) are tempted to say &ldquo;no&rdquo;,
987 because names usually do not change when transported between these languages.
988 However, in general when translating from one script to another, names
989 are translated too, usually phonetically or by transliteration. For
990 example, Russian or Greek names are converted to the Latin alphabet when
991 being translated to English, and English or French names are converted
992 to the Katakana script when being translated to Japanese. This is
993 necessary because the speakers of the target language in general cannot
994 read the script the name is originally written in.
995 </p>
996 <p>As a programmer, you should therefore make sure that names are marked
997 for translation, with a special comment telling the translators that it
998 is a proper name and how to pronounce it. In its simple form, it looks
999 like this:
1000 </p>
1001 <table><tr><td>&nbsp;</td><td><pre class="example">printf (_(&quot;Written by %s.\n&quot;),
1002 /* TRANSLATORS: This is a proper name. See the gettext
1003 manual, section Names. Note this is actually a non-ASCII
1004 name: The first name is (with Unicode escapes)
1005 &quot;Fran\u00e7ois&quot; or (with HTML entities) &quot;Fran&amp;ccedil;ois&quot;.
1006 Pronunciation is like &quot;fraa-swa pee-nar&quot;. */
1007 _(&quot;Francois Pinard&quot;));
1008 </pre></td></tr></table>
1009
1010 <p>The GNU gnulib library offers a module &lsquo;<samp>propername</samp>&rsquo;
1011 (<a href="https://www.gnu.org/software/gnulib/MODULES.html#module=propername">https://www.gnu.org/software/gnulib/MODULES.html#module=propername</a>)
1012 which takes care to automatically append the original name, in parentheses,
1013 to the translated name. For names that cannot be written in ASCII, it
1014 also frees the translator from the task of entering the appropriate non-ASCII
1015 characters if no script change is needed. In this more comfortable form,
1016 it looks like this:
1017 </p>
1018 <table><tr><td>&nbsp;</td><td><pre class="example">printf (_(&quot;Written by %s and %s.\n&quot;),
1019 proper_name (&quot;Ulrich Drepper&quot;),
1020 /* TRANSLATORS: This is a proper name. See the gettext
1021 manual, section Names. Note this is actually a non-ASCII
1022 name: The first name is (with Unicode escapes)
1023 &quot;Fran\u00e7ois&quot; or (with HTML entities) &quot;Fran&amp;ccedil;ois&quot;.
1024 Pronunciation is like &quot;fraa-swa pee-nar&quot;. */
1025 proper_name_utf8 (&quot;Francois Pinard&quot;, &quot;Fran\303\247ois Pinard&quot;));
1026 </pre></td></tr></table>
1027
1028 <p>You can also write the original name directly in Unicode (rather than with
1029 Unicode escapes or HTML entities) and denote the pronunciation using the
1030 International Phonetic Alphabet (see
1031 <a href="https://en.wikipedia.org/wiki/International_Phonetic_Alphabet">https://en.wikipedia.org/wiki/International_Phonetic_Alphabet</a>).
1032 </p>
1033 <p>As a translator, you should use some care when translating names, because
1034 it is frustrating if people see their names mutilated or distorted.
1035 </p>
1036 <p>If your language uses the Latin script, all you need to do is to reproduce
1037 the name as perfectly as you can within the usual character set of your
1038 language. In this particular case, this means providing a translation
1039 containing the c-cedilla character. If your language uses a different
1040 script and the people speaking it don't usually read Latin words, it means
1041 transliteration. If the programmer used the simple case, you should still
1042 give, in parentheses, the original writing of the name &ndash; for the sake of
1043 the people that do read the Latin script. If the programmer used the
1044 &lsquo;<samp>propername</samp>&rsquo; module mentioned above, you don't need to give the original
1045 writing of the name in parentheses, because the program will already do so.
1046 Here is an example, using Greek as the target script:
1047 </p>
1048 <table><tr><td>&nbsp;</td><td><pre class="example">#. This is a proper name. See the gettext
1049 #. manual, section Names. Note this is actually a non-ASCII
1050 #. name: The first name is (with Unicode escapes)
1051 #. &quot;Fran\u00e7ois&quot; or (with HTML entities) &quot;Fran&amp;ccedil;ois&quot;.
1052 #. Pronunciation is like &quot;fraa-swa pee-nar&quot;.
1053 msgid &quot;Francois Pinard&quot;
1054 msgstr &quot;\phi\rho\alpha\sigma\omicron\alpha \pi\iota\nu\alpha\rho&quot;
1055 &quot; (Francois Pinard)&quot;
1056 </pre></td></tr></table>
1057
1058 <p>Because translation of names is such a sensitive domain, it is a good
1059 idea to test your translation before submitting it.
1060 </p>
1061
1062 <a name="Libraries"></a>
1063 <a name="SEC34"></a>
1064 <h2 class="section"> <a href="gettext_toc.html#TOC27">4.10 Preparing Library Sources</a> </h2>
1065
1066 <p>When you are preparing a library, not a program, for the use of
1067 <code>gettext</code>, only a few details are different. Here we assume that
1068 the library has a translation domain and a POT file of its own. (If
1069 it uses the translation domain and POT file of the main program, then
1070 the previous sections apply without changes.)
1071 </p>
1072 <ol>
1073 <li>
1074 The library code doesn't call <code>setlocale (LC_ALL, &quot;&quot;)</code>. It's the
1075 responsibility of the main program to set the locale. The library's
1076 documentation should mention this fact, so that developers of programs
1077 using the library are aware of it.
1078
1079 </li><li>
1080 The library code doesn't call <code>textdomain (PACKAGE)</code>, because it
1081 would interfere with the text domain set by the main program.
1082
1083 </li><li>
1084 The initialization code for a program was
1085
1086 <table><tr><td>&nbsp;</td><td><pre class="smallexample"> setlocale (LC_ALL, &quot;&quot;);
1087 bindtextdomain (PACKAGE, LOCALEDIR);
1088 textdomain (PACKAGE);
1089 </pre></td></tr></table>
1090
1091 <p>For a library it is reduced to
1092 </p>
1093 <table><tr><td>&nbsp;</td><td><pre class="smallexample"> bindtextdomain (PACKAGE, LOCALEDIR);
1094 </pre></td></tr></table>
1095
1096 <p>If your library's API doesn't already have an initialization function,
1097 you need to create one, containing at least the <code>bindtextdomain</code>
1098 invocation. However, you usually don't need to export and document this
1099 initialization function: It is sufficient that all entry points of the
1100 library call the initialization function if it hasn't been called before.
1101 The typical idiom used to achieve this is a static boolean variable that
1102 indicates whether the initialization function has been called. If the
1103 library is meant to be used in multithreaded applications, this variable
1104 needs to be marked <code>volatile</code>, so that its value gets propagated
1105 between threads. Like this:
1106 </p>
1107 <table><tr><td>&nbsp;</td><td><pre class="example">static volatile bool libfoo_initialized;
1108
1109 static void
1110 libfoo_initialize (void)
1111 {
1112 bindtextdomain (PACKAGE, LOCALEDIR);
1113 libfoo_initialized = true;
1114 }
1115
1116 /* This function is part of the exported API. */
1117 struct foo *
1118 create_foo (...)
1119 {
1120 /* Must ensure the initialization is performed. */
1121 if (!libfoo_initialized)
1122 libfoo_initialize ();
1123 ...
1124 }
1125
1126 /* This function is part of the exported API. The argument must be
1127 non-NULL and have been created through create_foo(). */
1128 int
1129 foo_refcount (struct foo *argument)
1130 {
1131 /* No need to invoke the initialization function here, because
1132 create_foo() must already have been called before. */
1133 ...
1134 }
1135 </pre></td></tr></table>
1136
1137 <p>The more general solution for initialization functions, POSIX
1138 <code>pthread_once</code>, is not needed in this case.
1139 </p>
1140 </li><li>
1141 The usual declaration of the &lsquo;<samp>_</samp>&rsquo; macro in each source file was
1142
1143 <table><tr><td>&nbsp;</td><td><pre class="smallexample">#include &lt;libintl.h&gt;
1144 #define _(String) gettext (String)
1145 </pre></td></tr></table>
1146
1147 <p>for a program. For a library, which has its own translation domain,
1148 it reads like this:
1149 </p>
1150 <table><tr><td>&nbsp;</td><td><pre class="smallexample">#include &lt;libintl.h&gt;
1151 #define _(String) dgettext (PACKAGE, String)
1152 </pre></td></tr></table>
1153
1154 <p>In other words, <code>dgettext</code> is used instead of <code>gettext</code>.
1155 Similarly, the <code>dngettext</code> function should be used in place of the
1156 <code>ngettext</code> function.
1157 </p></li></ol>
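<p>Items 3 and 4 of the list above can be combined into one compilable sketch.
It uses POSIX <code>pthread_once</code> in place of the flag variable, purely
to illustrate the more general alternative mentioned in item 3. The names
<code>libfoo_describe_error</code> and
<code>libfoo_initialization_count</code> are hypothetical, and the values given
to <code>PACKAGE</code> and <code>LOCALEDIR</code> are placeholders that a real
build system would define.
</p>

```c
#include <libintl.h>
#include <pthread.h>

/* Placeholder values; normally supplied by the build system.  */
#ifndef PACKAGE
# define PACKAGE "libfoo"
#endif
#ifndef LOCALEDIR
# define LOCALEDIR "/usr/share/locale"
#endif

/* The library looks up messages in its own translation domain.  */
#define _(String) dgettext (PACKAGE, String)

static pthread_once_t libfoo_once = PTHREAD_ONCE_INIT;
static int libfoo_init_count;  /* only here to make the effect observable */

static void
libfoo_initialize (void)
{
  /* No setlocale and no textdomain call: those belong to the main
     program.  */
  bindtextdomain (PACKAGE, LOCALEDIR);
  libfoo_init_count++;
}

/* Hypothetical exported entry point.  */
const char *
libfoo_describe_error (int code)
{
  /* Runs libfoo_initialize exactly once, however many threads or
     calls reach this point.  */
  pthread_once (&libfoo_once, libfoo_initialize);
  return code == 0 ? _("no error") : _("unknown error");
}

/* Hypothetical accessor, for demonstration only.  */
int
libfoo_initialization_count (void)
{
  return libfoo_init_count;
}
```

<p>On older systems this may need to be linked with the threads library (for
example with the compiler's <code>-pthread</code> option).
</p>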
1158
1159
1173 <p>
1174 <font size="-1">
1175 This document was generated by <em>Bruno Haible</em> on <em>February, 21 2024</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>.
1176 </font>
1177 <br>
1178
1179 </p>
1180 </body>
1181 </html>