comparison CSP2/CSP2_env/env-d9b9114564458d9d-741b3de822f2aaca6c6caa4325c4afce/share/doc/gettext/tutorial.html @ 68:5028fdace37b
planemo upload commit 2e9511a184a1ca667c7be0c6321a36dc4e3d116d
author | jpayne |
---|---|
date | Tue, 18 Mar 2025 16:23:26 -0400 |
parents | |
children |
67:0e9998148a16 | 68:5028fdace37b |
---|---|
1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML i18n//EN"> | |
2 <!-- | |
3 Copyright (C) 2004-2005, 2012 Gora Mohanty. | |
4 Written by Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004. | |
5 | |
6 This manual is covered by the GNU FDL. Permission is granted to copy, | |
7 distribute and/or modify this document under the terms of the | |
8 GNU Free Documentation License (FDL), version 1.2. | |
9 A copy of the license is at | |
10 <https://www.gnu.org/licenses/old-licenses/fdl-1.2>. | |
11 --> | |
12 | |
13 <!--Converted with jLaTeX2HTML 2002-2-1 (1.70) JA patch-1.4 | |
14 patched version by: Kenshi Muto, Debian Project. | |
15 LaTeX2HTML 2002-2-1 (1.70), | |
16 original version by: Nikos Drakos, CBLU, University of Leeds | |
17 * revised and updated by: Marcus Hennecke, Ross Moore, Herb Swan | |
18 * with significant contributions from: | |
19 Jens Lippmann, Marek Rouchal, Martin Wilck and others --> | |
20 <HTML> | |
21 <HEAD> | |
22 <TITLE>A tutorial on Native Language Support using GNU gettext</TITLE> | |
23 <META NAME="description" CONTENT="A tutorial on Native Language Support using GNU gettext"> | |
24 <META NAME="keywords" CONTENT="memo"> | |
25 <META NAME="resource-type" CONTENT="document"> | |
26 <META NAME="distribution" CONTENT="global"> | |
27 | |
28 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8"> | |
29 <META NAME="Generator" CONTENT="jLaTeX2HTML v2002-2-1 JA patch-1.4"> | |
30 <META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css"> | |
31 | |
32 <!-- | |
33 <LINK REL="STYLESHEET" HREF="memo.css"> | |
34 --> | |
35 | |
36 </HEAD> | |
37 | |
38 <BODY > | |
39 | |
40 <!--Navigation Panel | |
41 <DIV CLASS="navigation"> | |
42 <IMG WIDTH="81" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="next_inactive" | |
43 SRC="file:/usr/share/latex2html/icons/nx_grp_g.png"> | |
44 <IMG WIDTH="26" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="up" | |
45 SRC="file:/usr/share/latex2html/icons/up_g.png"> | |
46 <IMG WIDTH="63" HEIGHT="24" ALIGN="BOTTOM" BORDER="0" ALT="previous" | |
47 SRC="file:/usr/share/latex2html/icons/prev_g.png"> | |
48 <BR> | |
49 <BR><BR></DIV> | |
50 End of Navigation Panel--> | |
51 | |
52 <H1 ALIGN="CENTER">A tutorial on Native Language Support using GNU gettext</H1><DIV CLASS="author_info"> | |
53 | |
54 <P ALIGN="CENTER"><STRONG>G. Mohanty</STRONG></P> | |
55 <P ALIGN="CENTER"><STRONG>Revision 0.3: 24 July 2004</STRONG></P> | |
56 </DIV> | |
57 | |
58 <H3>Abstract:</H3> | |
59 <DIV CLASS="ABSTRACT"> | |
60 The use of the GNU <TT>gettext</TT> utilities to implement support for native | |
61 languages is described here. Though the language to be supported is | |
62 taken to be Oriya, the method is generally applicable. Likewise, while | |
63 Linux was used as the platform here, any system using GNU <TT>gettext</TT> should work | |
64 in a similar fashion. | |
65 | |
66 <P> | |
67 We go through a step-by-step description of how to make on-screen messages | |
68 from a toy program appear in Oriya instead of English, starting from the | |
69 programming and ending with the user's viewpoint. The task of translation | |
70 itself is also discussed briefly. | |
71 </DIV> | |
72 <P> | |
73 <H1><A NAME="SECTION00010000000000000000"> | |
74 Introduction</A> | |
75 </H1> | |
76 Currently, both commercial and free computer software is typically written and | |
77 documented in English. Until recently, little effort was expended towards | |
78 allowing it to interact with the user in languages other than English, thus | |
79 leaving the non-English speaking world at a disadvantage. However, that | |
80 changed with the release of the GNU <TT>gettext</TT> utilities, and nowadays most GNU | |
81 programs are written within a framework that allows easy translation of the | |
82 program messages to languages other than English. Provided that translations | |
83 are available, the language used by the program to interact with the user can | |
84 be set at the time of running it. <TT>gettext</TT> manages to achieve this seemingly | |
85 miraculous task in a manner that simplifies the work of both the programmer | |
86 and the translator, and, more importantly, allows them to work independently | |
87 of each other. | |
88 | |
89 <P> | |
90 This article describes how to support native languages under a system using | |
91 the GNU <TT>gettext</TT> utilities. While it should be applicable to other versions of | |
92 <TT>gettext</TT>, the one actually used for the examples here is version | |
93 0.12.1. Another system, called <TT>catgets</TT>, described in the X/Open | |
94 Portability Guide, is also in use, but we shall not discuss that here. | |
95 | |
96 <P> | |
97 | |
98 <H1><A NAME="SECTION00020000000000000000"> | |
99 A simple example</A> | |
100 </H1> | |
101 <A NAME="sec:simple"></A>Our first example of using <TT>gettext</TT> will be the good old Hello World program, | |
102 whose sole function is to print the phrase “Hello, world!” to the terminal. | |
103 The internationalized version of this program might be saved in hello.c as: | |
104 <PRE> | |
105 1 #include <libintl.h> | |
106 2 #include <locale.h> | |
107 3 #include <stdio.h> | |
108 4 #include <stdlib.h> | |
109 5 int main(void) | |
110 6 { | |
111 7 setlocale( LC_ALL, "" ); | |
112 8 bindtextdomain( "hello", "/usr/share/locale" ); | |
113 9 textdomain( "hello" ); | |
114 10 printf( gettext( "Hello, world!\n" ) ); | |
115 11 exit(0); | |
116 12 } | |
117 </PRE> | |
118 Of course, a real program would check the return values of the functions and | |
119 try to deal with any errors, but we have omitted that part of the code for | |
120 clarity. Compile as usual with <TT>gcc -o hello hello.c</TT>. The program should | |
121 be linked to the GNU libintl library, but as this is part of the GNU C | |
122 library, this is done automatically for you under Linux, and other systems | |
123 using glibc. | |
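<P>
The error checking omitted above could be sketched roughly as follows. This is only an illustration, not part of the original program: the helper name <TT>init_i18n</TT> is hypothetical, and treating a failed <TT>setlocale</TT> as a warning rather than a fatal error is an assumption about what a program might reasonably want.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not in the original hello.c): performs the three
 * setup calls and reports failures on stderr.
 * Returns 0 on success, -1 if the domain could not be bound or selected. */
int init_i18n(const char *domain, const char *localedir)
{
    /* setlocale returns NULL if the locale requested through the
     * environment is not available; the locale is then left unchanged
     * (initially the "C" locale). */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "warning: requested locale is not supported\n");

    /* Both calls return NULL on failure, e.g., when memory runs out. */
    if (bindtextdomain(domain, localedir) == NULL) {
        perror("bindtextdomain");
        return -1;
    }
    if (textdomain(domain) == NULL) {
        perror("textdomain");
        return -1;
    }
    return 0;
}
```

A <TT>main</TT> function could then call <TT>init_i18n( "hello", "/usr/share/locale" )</TT> once at start-up and decide for itself whether to continue in English when the call fails.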
124 | |
125 <H2><A NAME="SECTION00021000000000000000"> | |
126 The programmer's viewpoint</A> | |
127 </H2> | |
128 As expected, when the <TT>hello</TT> executable is run under the default locale | |
129 (usually the C locale) it prints “Hello, world!” in the terminal. Besides | |
130 some initial setup work, the only additional burden faced by the programmer is | |
131 to replace any string to be printed with <TT>gettext(string)</TT>, i.e., to | |
132 instead pass the string as an argument to the <TT>gettext</TT> function. For lazy | |
133 people like myself, the amount of extra typing can be reduced even further by | |
134 a CPP macro, e.g., put this at the beginning of the source code file, | |
135 <PRE> | |
136 #define _(STRING) gettext(STRING) | |
137 </PRE> | |
138 and then use <TT>_(string)</TT> instead of <TT>gettext(string)</TT>. | |
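<P>
As a small sketch of the macro in use (the function name <TT>print_greeting</TT> is made up for this illustration), the translated string can be written with <TT>fputs</TT>. Unlike <TT>printf</TT>, <TT>fputs</TT> does not interpret any '%' characters that a translation might happen to contain.

```c
#include <libintl.h>
#include <stdio.h>

/* Shorthand for gettext, as suggested in the text. */
#define _(STRING) gettext(STRING)

/* Hypothetical helper: prints the (possibly translated) greeting and
 * returns the result of fputs: non-negative on success, EOF on error. */
int print_greeting(void)
{
    /* fputs writes the string verbatim; it is not used as a format. */
    return fputs(_("Hello, world!\n"), stdout);
}
```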
139 | |
140 <P> | |
141 Let us dissect the program line-by-line. | |
142 | |
143 <OL> | |
144 <LI><TT>locale.h</TT> defines C data structures used to hold locale | |
145 information, and is needed by the <TT>setlocale</TT> function. <TT>libintl.h</TT> | |
146 prototypes the GNU <TT>gettext</TT> functions, and is needed here by | |
147 <TT>bindtextdomain</TT>, <TT>gettext</TT>, and <TT>textdomain</TT>. | |
148 </LI> | |
149 <LI>The call to <TT>setlocale</TT> on line 7, with LC_ALL as the first argument | |
150 and an empty string as the second one, initializes the entire current locale | |
151 of the program as per environment variables set by the user. In other words, | |
152 the program locale is initialized to match that of the user. For details see | |
153 “man <TT>setlocale</TT>.” | |
154 </LI> | |
155 <LI>The <TT>bindtextdomain</TT> function on line 8 sets the base directory for the | |
156 message catalogs for a given message domain. A message domain is a set of | |
157 translatable messages, with every software package typically having its own | |
158 domain. Here, we have used “hello” as the name of the message domain for | |
159 our toy program. As the second argument, /usr/share/locale, is the default | |
160 system location for message catalogs, what we are saying here is that we are | |
161 going to place the message catalog in the default system directory. Thus, we | |
162 could have dispensed with the call to <TT>bindtextdomain</TT> here, and this | |
163 function is useful only if the message catalogs are installed in a | |
164 non-standard place, e.g., a packaged software distribution might have | |
165 the catalogs under a po/ directory under its own main directory. See “man | |
166 <TT>bindtextdomain</TT>” for details. | |
167 </LI> | |
168 <LI>The <TT>textdomain</TT> call on line 9 sets the message domain of the current | |
169 program to “hello,” i.e., the name that we are using for our example | |
170 program. “man textdomain” will give usage details for the function. | |
171 </LI> | |
172 <LI>Finally, on line 10, we have replaced what would normally have been, | |
173 <PRE> | |
174 printf( "Hello, world!\n" ); | |
175 </PRE> | |
176 with, | |
177 <PRE> | |
178 printf( gettext( "Hello, world!\n" ) ); | |
179 </PRE> | |
180 (If you are unfamiliar with C, the <!-- MATH | |
181 $\backslash$ | |
182 --> | |
183 <SPAN CLASS="MATH">\</SPAN>n at the end of the string | |
184 produces a newline at the end of the output.) This simple modification to all | |
185 translatable strings allows the translator to work independently from the | |
186 programmer. <TT>gettextize</TT> eases the task of the programmer in adapting a | |
187 package to use GNU <TT>gettext</TT> for the first time, or to upgrade to a newer | |
188 version of <TT>gettext</TT>. | |
189 </LI> | |
190 </OL> | |
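<P>
For the non-standard catalog location mentioned in the discussion of <TT>bindtextdomain</TT>, a minimal sketch might look like the following. The helper name <TT>use_private_catalogs</TT> and any directory passed to it are assumptions made for illustration only. The sketch relies on the documented behaviour that a NULL second argument makes <TT>bindtextdomain</TT> return the current binding without changing it.

```c
#include <libintl.h>
#include <stddef.h>

/* Hypothetical helper: binds `domain` to a catalog directory other than
 * the system default and returns the directory now in effect, or NULL
 * if the binding could not be made. */
const char *use_private_catalogs(const char *domain, const char *dir)
{
    if (bindtextdomain(domain, dir) == NULL)
        return NULL;

    /* A NULL directory queries the current binding for the domain. */
    return bindtextdomain(domain, NULL);
}
```

An absolute path is the safer choice for <TT>dir</TT>, since a relative one would depend on the directory the program is started from.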
191 | |
192 <H2><A NAME="SECTION00022000000000000000"> | |
193 Extracting translatable strings</A> | |
194 </H2> | |
195 Now, it is time to extract the strings to be translated from the program | |
196 source code. This is achieved with <TT>xgettext</TT>, which can be invoked as follows: | |
197 <PRE><FONT color="red"> | |
198 xgettext -d hello -o hello.pot hello.c | |
199 </FONT></PRE> | |
200 This processes the source code in hello.c, saving the output in hello.pot (the | |
201 argument to the -o option). | |
202 The message domain for the program should be specified as the argument | |
203 to the -d option, and should match the domain specified in the call to | |
204 <TT>textdomain</TT> (on line 9 of the program source). Other details on how to use | |
205 <TT>xgettext</TT> can be found from “man xgettext.” | |
206 | |
207 <P> | |
208 A .pot (portable object template) file is used as the basis for translating | |
209 program messages into any language. To start translation, one can simply copy | |
210 hello.pot to oriya.po (this preserves the template file for later translation | |
211 into a different language). However, the preferred way to do this is by | |
212 use of the <TT>msginit</TT> program, which takes care of correctly setting up some | |
213 default values, | |
214 <PRE><FONT color="red"> | |
215 msginit -l or_IN -o oriya.po -i hello.pot | |
216 </FONT></PRE> | |
217 Here, the -l option defines the locale (an Oriya locale should have been | |
218 installed on your system), and the -i and -o options define the input and | |
219 output files, respectively. If there is only a single .pot file in the | |
220 directory, it will be used as the input file, and the -i option can be | |
221 omitted. For me, the oriya.po file produced by <TT>msginit</TT> would look like: | |
222 <PRE> | |
223 # Oriya translations for PACKAGE package. | |
224 # Copyright (C) 2004 THE PACKAGE'S COPYRIGHT HOLDER | |
225 # This file is distributed under the same license as the PACKAGE package. | |
226 # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004. | |
227 # | |
228 msgid "" | |
229 msgstr "" | |
230 "Project-Id-Version: PACKAGE VERSION\n" | |
231 "Report-Msgid-Bugs-To: \n" | |
232 "POT-Creation-Date: 2004-06-22 02:22+0530\n" | |
233 "PO-Revision-Date: 2004-06-22 02:38+0530\n" | |
234 "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n" | |
235 "Language-Team: Oriya\n" | |
236 "MIME-Version: 1.0\n" | |
237 "Content-Type: text/plain; charset=UTF-8\n" | |
238 "Content-Transfer-Encoding: 8bit\n" | |
239 | |
240 #: hello.c:10 | |
241 msgid "Hello, world!\n" | |
242 msgstr "" | |
243 </PRE> | |
244 <TT>msginit</TT> prompted for my email address, and probably obtained my real name | |
245 from the system password file. It also filled in values such as the revision | |
246 date, language, character set, presumably using information from the or_IN | |
247 locale. | |
248 | |
249 <P> | |
250 It is important to respect the format of the entries in the .po (portable | |
251 object) file. Each entry has the following structure: | |
252 <PRE> | |
253 WHITE-SPACE | |
254 # TRANSLATOR-COMMENTS | |
255 #. AUTOMATIC-COMMENTS | |
256 #: REFERENCE... | |
257 #, FLAG... | |
258 msgid UNTRANSLATED-STRING | |
259 msgstr TRANSLATED-STRING | |
260 </PRE> | |
261 where the initial white-space (spaces, tabs, newlines,...) and all | |
262 comments might or might not exist for a particular entry. Comment lines start | |
263 with a '#' as the first character, and there are two kinds: (i) manually | |
264 added translator comments, which have some white-space immediately following the | |
265 '#', and (ii) automatic comments added and maintained by the <TT>gettext</TT> tools, | |
266 with a non-white-space character after the '#'. The <TT>msgid</TT> line contains | |
267 the untranslated (English) string, if there is one for that PO file entry, and | |
268 the <TT>msgstr</TT> line is where the translated string is to be entered. More on | |
269 this later. For details on the format of PO files see gettext::Basics::PO | |
270 Files:: in the Emacs info-browser (see Appdx. <A HREF="#sec:emacs-info">A</A> for an | |
271 introduction to using the info-browser in Emacs). | |
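<P>
The entry structure described above can be made concrete with a small line classifier. This is a sketch for illustration only and is not how the <TT>gettext</TT> tools parse PO files; the enum and function names are hypothetical, and continuation lines (quoted strings that carry on a <TT>msgid</TT> or <TT>msgstr</TT>) are simply reported as "other."

```c
#include <string.h>

/* Hypothetical classification of a single line of a PO file. */
enum po_line_kind {
    PO_BLANK,               /* white-space only                  */
    PO_TRANSLATOR_COMMENT,  /* "# ..."  written by the translator */
    PO_AUTOMATIC_COMMENT,   /* "#. ..." extracted from the source */
    PO_REFERENCE,           /* "#: file:line"                     */
    PO_FLAG,                /* "#, flag"                          */
    PO_OTHER_COMMENT,       /* any other "#x" marker              */
    PO_MSGID,
    PO_MSGSTR,
    PO_OTHER                /* e.g., a continued quoted string    */
};

/* Returns 1 if `line` starts with `word` followed by a space or tab. */
static int starts_with_keyword(const char *line, const char *word)
{
    size_t n = strlen(word);
    return strncmp(line, word, n) == 0 && (line[n] == ' ' || line[n] == '\t');
}

enum po_line_kind classify_po_line(const char *line)
{
    /* Lines consisting only of white-space separate entries. */
    if (line[strspn(line, " \t\r\n")] == '\0')
        return PO_BLANK;

    if (line[0] == '#') {
        switch (line[1]) {
        case '\0': case ' ': case '\t': case '\n':
            return PO_TRANSLATOR_COMMENT;
        case '.': return PO_AUTOMATIC_COMMENT;
        case ':': return PO_REFERENCE;
        case ',': return PO_FLAG;
        default:  return PO_OTHER_COMMENT;
        }
    }
    if (starts_with_keyword(line, "msgid"))
        return PO_MSGID;
    if (starts_with_keyword(line, "msgstr"))
        return PO_MSGSTR;
    return PO_OTHER;
}
```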
272 | |
273 <H2><A NAME="SECTION00023000000000000000"> | |
274 Making translations</A> | |
275 </H2> | |
276 The oriya.po file can then be edited to add the translated Oriya | |
277 strings. While the editing can be carried out in any editor if one is careful | |
278 to follow the PO file format, there are several editors that ease the task of | |
279 editing PO files, among them being po-mode in Emacs, <TT>kbabel</TT>, gtranslator, | |
280 poedit, etc. Appdx. <A HREF="#sec:pofile-editors">B</A> describes features of some of | |
281 these editors. | |
282 | |
283 <P> | |
284 The first thing to do is fill in the comments at the beginning and the header | |
285 entry, parts of which have already been filled in by <TT>msginit</TT>. The lines in | |
286 the header entry are pretty much self-explanatory, and details can be found in | |
287 the gettext::Creating::Header Entry:: info node. After that, the remaining | |
288 work consists of typing the Oriya text that is to serve as translations for | |
289 the corresponding English string. For the <TT>msgstr</TT> line in each of the | |
290 remaining entries, add the translated Oriya text between the double quotes; | |
291 the translation corresponding to the English phrase in the <TT>msgid</TT> string | |
292 for the entry. For example, for the phrase “Hello, world!<!-- MATH | |
293 $\backslash$ | |
294 --> | |
295 <SPAN CLASS="MATH">\</SPAN>n” in | |
296 oriya.po, we could enter “ନମସ୍କାର<!-- MATH | |
297 $\backslash$ | |
298 --> | |
299 <SPAN CLASS="MATH">\</SPAN>n”. The final | |
300 oriya.po file might look like: | |
301 <PRE> | |
302 # Oriya translations for hello example package. | |
303 # Copyright (C) 2004 Gora Mohanty | |
304 # This file is distributed under the same license as the hello example package. | |
305 # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004. | |
306 # | |
307 msgid "" | |
308 msgstr "" | |
309 "Project-Id-Version: oriya\n" | |
310 "Report-Msgid-Bugs-To: \n" | |
311 "POT-Creation-Date: 2004-06-22 02:22+0530\n" | |
312 "PO-Revision-Date: 2004-06-22 10:54+0530\n" | |
313 "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n" | |
314 "Language-Team: Oriya\n" | |
315 "MIME-Version: 1.0\n" | |
316 "Content-Type: text/plain; charset=UTF-8\n" | |
317 "Content-Transfer-Encoding: 8bit\n" | |
318 "X-Generator: KBabel 1.3\n" | |
319 | |
320 #: hello.c:10 | |
321 msgid "Hello, world!\n" | |
322 msgstr "ନମସ୍କାର\n" | |
323 </PRE> | |
324 | |
325 <P> | |
326 For editing PO files, I have found the <TT>kbabel</TT> editor suits me the best. The | |
327 only problem is that while Oriya text can be entered directly into <TT>kbabel</TT> | |
328 using the xkb Oriya keyboard layouts [<A | |
329 HREF="memo.html#xkb-oriya-layout">1</A>] and the entries | |
330 are saved properly, the text is not displayed correctly in the <TT>kbabel</TT> window | |
331 if it includes conjuncts. Emacs po-mode is a little restrictive, but strictly | |
332 enforces conformance with the PO file format. The main problem with it is that | |
333 it does not seem currently possible to edit Oriya text in Emacs. <TT>yudit</TT> | |
334 is the best at editing Oriya text, but does not ensure that the PO file format | |
335 is followed. You can play around a bit with these editors to find one that | |
336 suits your personal preferences. One possibility might be to first edit the | |
337 header entry with <TT>kbabel</TT> or Emacs po-mode, and then use <TT>yudit</TT> to enter | |
338 the Oriya text on the <TT>msgstr</TT> lines. | |
339 | |
340 <H2><A NAME="SECTION00024000000000000000"> | |
341 Message catalogs</A> | |
342 </H2> | |
343 <A NAME="sec:catalog"></A>After completing the translations in the oriya.po file, it must be compiled to | |
344 a binary format that can be quickly loaded by the <TT>gettext</TT> tools. To do that, | |
345 use: | |
346 <PRE><FONT color="red"> | |
347 msgfmt -c -v -o hello.mo oriya.po | |
348 </FONT></PRE> | |
349 The -c option does detailed checking of the PO file format, -v makes the | |
350 program verbose, and the output filename is given by the argument to the -o | |
351 option. Note that the base of the output filename should match the message | |
352 domain given in the first arguments to <TT>bindtextdomain</TT> and <TT>textdomain</TT> on | |
353 lines 8 and 9 of the example program in Sec. <A HREF="#sec:simple">2</A>. The .mo | |
354 (machine object) file should be stored in the location whose base directory is | |
355 given by the second argument to <TT>bindtextdomain</TT>. The final location of the | |
356 file will be in the sub-directory LL/LC_MESSAGES or LL_CC/LC_MESSAGES under | |
357 the base directory, where LL stands for a language, and CC for a country. For | |
358 example, as we have chosen the standard location, /usr/share/locale, for our | |
359 base directory, and for us the language and country strings are “or” and | |
360 “IN,” respectively, we will place hello.mo in /usr/share/locale/or_IN/LC_MESSAGES. Note | |
361 that you will need super-user privilege to copy hello.mo to this system | |
362 directory. Thus, | |
363 <PRE><FONT color="red"> | |
364 mkdir -p /usr/share/locale/or_IN/LC_MESSAGES | |
365 cp hello.mo /usr/share/locale/or_IN/LC_MESSAGES | |
366 </FONT></PRE> | |
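<P>
The layout of the catalog path can be summarized in a few lines of C. The helper below is hypothetical and only assembles the LL_CC form of the path described above; it does not reproduce the fallback from LL_CC to LL, or any other search that <TT>gettext</TT> itself may perform.

```c
#include <stdio.h>

/* Hypothetical helper: writes BASE/LOCALE/LC_MESSAGES/DOMAIN.mo into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
int catalog_path(char *buf, size_t size, const char *base,
                 const char *locale, const char *domain)
{
    int n = snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo",
                     base, locale, domain);

    /* snprintf returns the length it wanted to write; anything at or
     * beyond `size` means the result was truncated. */
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With the values used in this tutorial ("/usr/share/locale", "or_IN", and "hello"), the buffer would hold /usr/share/locale/or_IN/LC_MESSAGES/hello.mo.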
367 | |
368 <H2><A NAME="SECTION00025000000000000000"> | |
369 The user's viewpoint</A> | |
370 </H2> | |
371 Once the message catalogs have been properly installed, any user on the system | |
372 can use the Oriya version of the Hello World program, provided an Oriya locale | |
373 is available. First, change your locale with, | |
374 <PRE><FONT color="red"> | |
375 echo $LANG | |
376 export LANG=or_IN | |
377 </FONT></PRE> | |
378 The first statement shows you the current setting of your locale (this is | |
379 usually en_US, and you will need it to reset the default locale at the end), | |
380 while the second one sets it to an Oriya locale. | |
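<P>
If the translated output does not appear, it can help to check which locale the program itself ends up with for messages. The following is a sketch under the assumption that the platform defines LC_MESSAGES in <TT>locale.h</TT> (as POSIX systems do); the function name is hypothetical.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: adopts the locale from the environment and
 * returns the name now in effect for the LC_MESSAGES category. */
const char *effective_messages_locale(void)
{
    /* If the requested locale is not installed, this call fails and
     * the program's locale is left unchanged. */
    setlocale(LC_ALL, "");

    /* A NULL second argument queries the category without changing it. */
    return setlocale(LC_MESSAGES, NULL);
}
```

Printing the returned string shows whether the Oriya locale requested through LANG was actually accepted; a result of "C" would suggest that it was not.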
381 | |
382 <P> | |
383 A Unicode-capable terminal emulator is needed to view Oriya output | |
384 directly. The new versions of both gnome-terminal and konsole (the KDE | |
385 terminal emulator) are Unicode-aware. I will focus on gnome-terminal as it | |
386 seems to have better support for internationalization. gnome-terminal needs to | |
387 be told that the bytes arriving are UTF-8 encoded multibyte sequences. This | |
388 can be done by (a) choosing Terminal <TT>-></TT> Character Coding <TT>-></TT> | |
389 Unicode (UTF-8), or (b) typing “/bin/echo -n -e | |
390 '<!-- MATH | |
391 $\backslash$ | |
392 --> | |
393 <SPAN CLASS="MATH">\</SPAN>033%<!-- MATH | |
394 $\backslash$ | |
395 --> | |
396 <SPAN CLASS="MATH">\</SPAN>G'” in the terminal, or (c) by running | |
397 /bin/unicode_start. Likewise, you can revert to the default character coding by (a) | |
398 choosing Terminal <TT>-></TT> Character Coding <TT>-></TT> Current Locale | |
399 (ISO-8859-1), or (b) “/bin/echo -n -e '<!-- MATH | |
400 $\backslash$ | |
401 --> | |
402 <SPAN CLASS="MATH">\</SPAN>033%<!-- MATH | |
403 $\backslash$ | |
404 --> | |
405 <SPAN CLASS="MATH">\</SPAN>@',” or | |
406 (c) by running /bin/unicode_stop. Now, running the example program (after | |
407 compiling with gcc as described in Sec. <A HREF="#sec:simple">2</A>) with, | |
408 <PRE><FONT color="red"> | |
409 ./hello | |
410 </FONT></PRE> | |
411 should give you output in Oriya. Please note that conjuncts will most likely | |
412 be displayed with a “halant” as the terminal probably does not render Indian | |
413 language fonts correctly. Also, as most terminal emulators assume fixed-width | |
414 fonts, the results are hardly likely to be aesthetically appealing. | |
415 | |
416 <P> | |
417 An alternative is to save the program output in a file, and view it with | |
418 <TT>yudit</TT> which will render the glyphs correctly. Thus, | |
419 <PRE><FONT color="red"> | |
420 ./hello > junk | |
421 yudit junk | |
422 </FONT></PRE> | |
423 Do not forget to reset the locale before resuming usual work in the | |
424 terminal. Else, your English characters might look funny. | |
425 | |
426 <P> | |
427 While all this should give the average user some pleasure in being able to see | |
428 Oriya output from a program without a whole lot of work, it should be kept in | |
429 mind that we are still far from our desired goal. Hopefully, one day the | |
430 situation will be such that rather than deriving special pleasure from it, | |
431 users take it for granted that Oriya should be available and are upset | |
432 otherwise. | |
433 | |
434 <P> | |
435 | |
436 <H1><A NAME="SECTION00030000000000000000"> | |
437 Adding complications: program upgrade</A> | |
438 </H1> | |
439 The previous section presented a simple example of how Oriya language support | |
440 could be added to a C program. Like all programs, we might now wish to further | |
441 enhance it. For example, we could include a greeting to the user by adding | |
442 another <TT>printf</TT> statement after the first one. Our new hello.c source | |
443 code might look like this: | |
444 <PRE> | |
445 1 #include <libintl.h> | |
446 2 #include <locale.h> | |
447 3 #include <stdio.h> | |
448 4 #include <stdlib.h> | |
449 5 int main(void) | |
450 6 { | |
451 7 setlocale( LC_ALL, "" ); | |
452 8 bindtextdomain( "hello", "/usr/share/locale" ); | |
453 9 textdomain( "hello" ); | |
454 10 printf( gettext( "Hello, world!\n" ) ); | |
455 11 printf( gettext( "How are you?\n" ) ); | |
456 12 exit(0); | |
457 13 } | |
458 </PRE> | |
459 For such a small change, it would be simple enough to just repeat the above | |
460 cycle of extracting the relevant English text, translating it to Oriya, and | |
461 preparing a new message catalog. We can even simplify the work by cutting and | |
462 pasting most of the old oriya.po file into the new one. However, real programs | |
463 will have thousands of such strings, and we would like to be able to translate | |
464 only the changed strings, and have the <TT>gettext</TT> utilities handle the drudgery | |
465 of combining the new translations with the old ones. This is indeed possible. | |
466 | |
467 <H2><A NAME="SECTION00031000000000000000"> | |
468 Merging old and new translations</A> | |
469 </H2> | |
470 As before, extract the translatable strings from hello.c to a new portable | |
471 object template file, hello-new.pot, using <TT>xgettext</TT>, | |
472 <PRE><FONT color="red"> | |
473 xgettext -d hello -o hello-new.pot hello.c | |
474 </FONT></PRE> | |
475 Now, we use a new program, <TT>msgmerge</TT>, to merge the existing .po file with | |
476 translations into the new template file, viz., | |
477 <PRE><FONT color="red"> | |
478 msgmerge -U oriya.po hello-new.pot | |
479 </FONT></PRE> | |
480 The -U option updates the existing | |
481 .po file, oriya.po. We could have chosen to instead create a new .po file by | |
482 using “-o <SPAN CLASS="MATH"><</SPAN>filename<SPAN CLASS="MATH">></SPAN>” instead of -U. The updated .po file will still | |
483 have the old translations embedded in it, and new entries with untranslated | |
484 <TT>msgid</TT> lines. For us, the new lines in oriya.po will look like, | |
485 <PRE> | |
486 #: hello.c:11 | |
487 msgid "How are you?\n" | |
488 msgstr "" | |
489 </PRE> | |
490 For the new translation, we could use, “ଆପଣ | |
491 କିପରି ଅଛନ୍ତି?” in | |
492 place of the English phrase “How are you?” The updated oriya.po file, | |
493 including the translation might look like: | |
494 <PRE> | |
495 # Oriya translations for hello example package. | |
496 # Copyright (C) 2004 Gora Mohanty | |
497 # This file is distributed under the same license as the hello example package. | |
498 # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004. | |
499 # | |
500 msgid "" | |
501 msgstr "" | |
502 "Project-Id-Version: oriya\n" | |
503 "Report-Msgid-Bugs-To: \n" | |
504 "POT-Creation-Date: 2004-06-23 14:30+0530\n" | |
505 "PO-Revision-Date: 2004-06-22 10:54+0530\n" | |
506 "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n" | |
507 "Language-Team: Oriya\n" | |
508 "MIME-Version: 1.0\n" | |
509 "Content-Type: text/plain; charset=UTF-8\n" | |
510 "Content-Transfer-Encoding: 8bit\n" | |
511 "X-Generator: KBabel 1.3\n" | |
512 | |
513 #: hello.c:10 | |
514 msgid "Hello, world!\n" | |
515 msgstr "ନମସ୍କାର\n" | |
516 | |
517 #: hello.c:11 | |
518 msgid "How are you?\n" | |
519 msgstr "ଆପଣ କିପରି ଅଛନ୍ତି?\n" | |
520 </PRE> | |
521 | |
522 <P> | |
523 Compile oriya.po to a machine object file, and install in the appropriate | |
524 place as in Sec. <A HREF="#sec:catalog">2.4</A>. Thus, | |
525 <PRE><FONT color="red"> | |
526 msgfmt -c -v -o hello.mo oriya.po | |
527 mkdir -p /usr/share/locale/or_IN/LC_MESSAGES | |
528 cp hello.mo /usr/share/locale/or_IN/LC_MESSAGES | |
529 </FONT></PRE> | |
530 You can test the Oriya output as above, after recompiling hello.c and running | |
531 it in an Oriya locale. | |
532 | |
533 <P> | |
534 | |
535 <H1><A NAME="SECTION00040000000000000000"> | |
536 More about <TT>gettext</TT> </A> | |
537 </H1> | |
538 The GNU <TT>gettext</TT> info pages provide a well-organized and complete description | |
539 of the <TT>gettext</TT> utilities and their usage for enabling Native Language | |
540 Support. One should, at the very least, read the introductory material at | |
541 gettext::Introduction::, and the suggested references in | |
542 gettext::Conclusion::References::. Besides the <TT>gettext</TT> utilities described in | |
543 this document, various other programs to manipulate .po files are discussed in | |
544 gettext::Manipulating::. Finally, support for programming languages other than | |
545 C/C++ is discussed in gettext::Programming Languages::. | |
546 | |
547 <P> | |
548 | |
549 <H1><A NAME="SECTION00050000000000000000"> | |
550 The work of translation</A> | |
551 </H1> | |
552 Besides the obvious program message strings that have been the sole focus of | |
553 our discussion here, there are many other things that require translation, | |
554 including GUI messages, command-line option strings, configuration files, | |
555 program documentation, etc. In addition, there are a | |
556 significant number of programs and/or scripts that are automatically generated | |
557 by other programs. These generated programs might also themselves require | |
558 translation. So, in any effort to provide support for a given native language, | |
559 carrying out the translation and keeping up with program updates becomes a | |
560 major part of the undertaking, requiring a continuing commitment from the | |
561 language team. A plan has been outlined for the Oriya localization | |
562 project [<A | |
563 HREF="memo.html#url:oriya-trans-plan">2</A>]. | |
564 | |
565 <P> | |
566 | |
567 <H1><A NAME="SECTION00060000000000000000"> | |
568 Acknowledgments</A> | |
569 </H1> | |
570 Extensive use has obviously been made of the GNU <TT>gettext</TT> manual in preparing | |
571 this document. I have also been helped by an article in the Linux | |
572 Journal [<A | |
573 HREF="memo.html#url:lj-translation">3</A>]. | |
574 | |
575 <P> | |
576 This work is part of the project for enabling the use of Oriya under Linux. I | |
577 thank my uncle, N. M. Pattnaik, for conceiving of the project. We have all | |
578 benefited from the discussions amidst the group of people working on this | |
579 project. On the particular issue of translation, the help of H. R. Pansari, | |
580 A. Nayak, and M. Chand is much appreciated. | |
581 | |
582 <H1><A NAME="SECTION00070000000000000000"> | |
583 The Emacs info browser</A> | |
584 </H1> | |
585 <A NAME="sec:emacs-info"></A>You can start up Emacs from the command-line by typing “emacs,” or “emacs | |
586 <SPAN CLASS="MATH"><</SPAN>filename<SPAN CLASS="MATH">></SPAN>.” It can be started from the menu in some desktops, e.g., on | |
587 my GNOME desktop, it is under Main Menu <TT>-></TT> Programming <TT>-></TT> | |
588 Emacs. If you are unfamiliar with Emacs, a tutorial can be started by typing | |
589 “C-h t” in an Emacs window, or from the Help item in the menubar at the | |
590 top. Emacs makes extensive use of the Control (sometimes labelled as “CTRL” | |
591 or “CTL”) and Meta (sometimes labelled as “Edit” or “Alt”) keys. In | |
592 Emacs parlance, a hyphenated sequence, such as “C-h” means to press the | |
593 Control and ‘h’ keys simultaneously, while “C-h t” would mean to press the | |
594 Control and ‘h’ keys together, release them, and press the ‘t’ key. Similarly, | |
595 “M-x” is used to indicate that the Meta and ‘x’ keys should be pressed at | |
596 the same time. | |
597 | |
598 <P> | |
599 The info browser can be started by typing “C-h i” in Emacs. The first time | |
600 you do this, it will briefly list some commands available inside the info | |
601 browser, and present you with a menu of major topics. Each menu item, or | |
602 cross-reference is hyperlinked to the appropriate node, and you can visit that | |
603 node either by moving the cursor to the item and pressing Enter, or by | |
604 clicking on it with the middle mouse button. To get to the <TT>gettext</TT> menu items, | |
605 you can either scroll down to the line, | |
606 <PRE> | |
607 * gettext: (gettext). GNU gettext utilities. | |
608 </PRE> | |
609 and visit that node. Or, as it is several pages down, you can locate it using | |
610 “I-search.” Type “C-s” to enter “I-search” which will then prompt you | |
611 for a string in the mini-buffer at the bottom of the window. This is an | |
612 incremental search, so that Emacs will keep moving you forward through the | |
613 buffer as you are entering your search string. If you have reached the last | |
614 occurrence of the search string in the current buffer, you will get a message | |
615 saying “Failing I-search: ...” on pressing “C-s.” At that point, press | |
616 “C-s” again to resume the search at the beginning of the buffer. Likewise, | |
617 “C-r” incrementally searches backwards from the present location. | |
618 | |
619 <P> | |
620 Info nodes are listed in this document with a “::” separator, so | |
621 that one can reach the gettext::Creating::Header Entry:: node by visiting the | |
622 “gettext” node from the main info menu, navigating to the “Creating” | |
623 node, and following that to the “Header Entry” node. | |
624 | |
625 <P> | |
626 A stand-alone info browser, independent of Emacs, is also available on many | |
627 systems. Thus, the <TT>gettext</TT> info page can also be accessed by typing | |
628 “info gettext” in a terminal. <TT>xinfo</TT> is an X application serving as an | |
629 info browser, so that if it is installed, typing “xinfo gettext” from the | |
630 command line will open a new browser window with the <TT>gettext</TT> info page. | |
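<P>
As a minimal sketch, assuming the stand-alone GNU <TT>info</TT> reader is
installed and the <TT>gettext</TT> manual is registered in its directory, the
“::” node notation used above maps onto command-line arguments roughly as
follows. The node names are the ones used in this tutorial, and may differ in
your version of the manual.
<PRE>
info gettext
info gettext Creating 'Header Entry'
info --file=gettext --node='Header Entry'
</PRE>
The first command opens the top node of the manual, the second follows the
menu items “Creating” and then “Header Entry” from there, and the third
names the target node directly.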
631 | |
632 <P> | |
633 | |
634 <H1><A NAME="SECTION00080000000000000000"> | |
635 PO file editors</A> | |
636 </H1> | |
637 <A NAME="sec:pofile-editors"></A>The <TT>yudit</TT> editor is adequate for our present purposes, and we are | |
638 planning on using it, as it is platform-independent and currently the best | |
639 at rendering Oriya. This section nevertheless describes some features of editors that | |
640 are specialized for editing PO files under Linux. This is still work in | |
641 progress, as I am in the process of trying out different editors before | |
642 settling on one. The ones considered here are: Emacs in po-mode, <TT>poedit</TT>, | |
643 <TT>kbabel</TT>, and <TT>gtranslator</TT>. | |
644 | |
645 <H2><A NAME="SECTION00081000000000000000"> | |
646 Emacs PO mode</A> | |
647 </H2> | |
648 Emacs should automatically enter po-mode when you load a .po file, as | |
649 indicated by “PO” in the modeline at the bottom. The window is made | |
650 read-only, so that you can edit the .po file only through special commands. A | |
651 description of Emacs po-mode can be found under the gettext::Basics info node, | |
652 or type ‘h’ or ‘?’ in a po-mode window for a list of available commands. While | |
653 I find Emacs po-mode quite restrictive, this is probably due to unfamiliarity | |
654 with it. Its main advantage is that it imposes rigid conformance to the PO | |
655 file format, and checks the file format when closing the .po file | |
656 buffer. Emacs po-mode is not useful for Oriya translation, as I know of no way | |
657 to directly enter Oriya text under Emacs. | |
658 | |
659 <H2><A NAME="SECTION00082000000000000000"> | |
660 poedit</A> | |
661 </H2> | |
662 XXX: in preparation. | |
663 | |
664 <H2><A NAME="SECTION00083000000000000000"> | |
665 KDE: the kbabel editor</A> | |
666 </H2> | |
667 <TT>kbabel</TT> [<A | |
668 HREF="memo.html#url:kbabel">4</A>] is a more user-friendly and configurable editor than | |
669 either Emacs po-mode or <TT>poedit</TT>. It is integrated into KDE, and offers | |
670 extensive contextual help. Besides support for various PO file features, it | |
671 has a plugin framework for dictionaries that allows consistency checks and | |
672 translation suggestions. | |
673 | |
674 <H2><A NAME="SECTION00084000000000000000"> | |
675 GNOME: the gtranslator editor</A> | |
676 </H2> | |
677 XXX: in preparation. | |
678 | |
679 <H2><A NAME="SECTION00090000000000000000"> | |
680 Bibliography</A> | |
681 </H2><DL COMPACT><DD><P></P><DT><A NAME="xkb-oriya-layout">1</A> | |
682 <DD> | |
683 G. Mohanty, | |
684 <BR>A practical primer for using Oriya under Linux, v0.3, | |
685 <BR><TT><A NAME="tex2html1" | |
686 HREF="http://oriya.sarovar.org/docs/getting_started/index.html">http://oriya.sarovar.org/docs/getting_started/index.html</A></TT>, 2004, | |
687 <BR>Sec. 6.2 describes the xkb layouts for Oriya. | |
688 | |
689 <P></P><DT><A NAME="url:oriya-trans-plan">2</A> | |
690 <DD> | |
691 G. Mohanty, | |
692 <BR>A plan for Oriya localization, v0.1, | |
693 <BR><TT><A NAME="tex2html2" | |
694 HREF="http://oriya.sarovar.org/docs/translation_plan/index.html">http://oriya.sarovar.org/docs/translation_plan/index.html</A></TT>, | |
695 2004. | |
696 | |
697 <P></P><DT><A NAME="url:lj-translation">3</A> | |
698 <DD> | |
699 Linux Journal article on internationalization, | |
700 <BR><TT><A NAME="tex2html3" | |
701 HREF="https://www.linuxjournal.com/article/3023">https://www.linuxjournal.com/article/3023</A></TT>. | |
702 | |
703 <P></P><DT><A NAME="url:kbabel">4</A> | |
704 <DD> | |
705 Features of the kbabel editor, | |
706 <BR><TT><A NAME="tex2html4" | |
707 HREF="http://i18n.kde.org/tools/kbabel/features.html">http://i18n.kde.org/tools/kbabel/features.html</A></TT>. | |
708 </DL> | |
709 | |
710 <H1><A NAME="SECTION000100000000000000000"> | |
711 About this document ...</A> | |
712 </H1> | |
713 <STRONG>A tutorial on Native Language Support using GNU gettext</STRONG><P> | |
714 This document was generated using the | |
715 <A HREF="http://www.latex2html.org/"><STRONG>LaTeX</STRONG>2<tt>HTML</tt></A> translator Version 2002-2-1 (1.70) | |
716 <P> | |
717 Copyright © 1993, 1994, 1995, 1996, | |
718 <A HREF="http://cbl.leeds.ac.uk/nikos/personal.html">Nikos Drakos</A>, | |
719 Computer Based Learning Unit, University of Leeds. | |
720 <BR>Copyright © 1997, 1998, 1999, | |
721 <A HREF="http://www.maths.mq.edu.au/~ross/">Ross Moore</A>, | |
722 Mathematics Department, Macquarie University, Sydney. | |
723 <P> | |
724 The command line arguments were: <BR> | |
725 <STRONG>latex2html</STRONG> <TT>-no_math -html_version 4.0,math,unicode,i18n,tables -split 0 memo</TT> | |
726 <P> | |
727 The translation was initiated by Gora Mohanty on 2004-07-24 | |
739 | |
740 <ADDRESS> | |
741 Gora Mohanty | |
742 2004-07-24 | |
743 </ADDRESS> | |
744 </BODY> | |
745 </HTML> |