We give a step-by-step description of how to make the on-screen messages of a toy program appear in Oriya instead of English, starting from the programming and ending with the user's viewpoint. We also discuss how to go about the task of translation.
This article describes how to support native languages on a system using the GNU gettext utilities. While it should be applicable to other versions of gettext, the one actually used for the examples here is version 0.12.1. Another system, called catgets, described in the X/Open Portability Guide, is also in use, but we shall not discuss it here.
     1  #include <libintl.h>
     2  #include <locale.h>
     3  #include <stdio.h>
     4  #include <stdlib.h>
     5  int main(void)
     6  {
     7    setlocale( LC_ALL, "" );
     8    bindtextdomain( "hello", "/usr/share/locale" );
     9    textdomain( "hello" );
    10    printf( gettext( "Hello, world!\n" ) );
    11    exit(0);
    12  }

Of course, a real program would check the return values of the functions and try to deal with any errors, but we have omitted that part of the code for clarity. Compile as usual with gcc -o hello hello.c. The program should be linked to the GNU libintl library, but as this is part of the GNU C library, this is done automatically for you under Linux, and other systems using glibc.
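The error checking omitted above could look roughly like the following. This is only a sketch: the helper name init_i18n is hypothetical and not part of gettext, and the decision to treat a failed setlocale as non-fatal (the program then simply keeps its English messages) is our assumption rather than anything the example prescribes.

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not part of gettext): prepare message translation
 * for DOMAIN, with catalogs stored under LOCALEDIR.
 * Returns 0 on success, -1 if the domain could not be bound or selected. */
int init_i18n(const char *domain, const char *localedir)
{
    assert(domain != NULL && localedir != NULL);

    /* setlocale returns NULL when the locale requested through the
     * environment is not available. Assumption: we only warn, because
     * gettext then returns the untranslated strings. */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "warning: locale from the environment is not supported\n");

    /* Both calls return NULL on failure (for example, out of memory). */
    if (bindtextdomain(domain, localedir) == NULL) {
        perror("bindtextdomain");
        return -1;
    }
    if (textdomain(domain) == NULL) {
        perror("textdomain");
        return -1;
    }
    return 0;
}
```

In the example program, lines 7 to 9 would then be replaced by a single call such as init_i18n( "hello", "/usr/share/locale" ), with the program exiting if it returns -1.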
    #define _(STRING) gettext(STRING)

and then use _(string) instead of gettext(string).
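A minimal sketch of the shorthand in use is given below. The helper print_greeting is hypothetical, and writing the string with fputs rather than printf is our own choice, made so that a stray '%' in a translation is never interpreted as a format directive.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>

/* Conventional shorthand: _("text") expands to gettext("text"). */
#define _(STRING) gettext(STRING)

/* Hypothetical helper: write the (possibly translated) greeting to OUT.
 * Returns a non-negative value on success and EOF on error, as fputs does. */
int print_greeting(FILE *out)
{
    assert(out != NULL);
    return fputs(_("Hello, world!\n"), out);
}
```

Note that xgettext does not treat _ as a keyword by default, so when the shorthand is used the strings are extracted only if xgettext is told about it, for instance with the --keyword=_ option.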
Let us dissect the program line-by-line.

    printf( "Hello, world!\n" );

with,

    printf( gettext( "Hello, world!\n" ) );

(If you are unfamiliar with C, the \n at the end of the string produces a newline at the end of the output.) This simple modification to all translatable strings allows the translator to work independently from the programmer. gettextize eases the task of the programmer in adapting a package to use GNU gettext for the first time, or to upgrade to a newer version of gettext.
    xgettext -d hello -o hello.pot hello.c

This processes the source code in hello.c, saving the output in hello.pot (the argument to the -o option). The message domain for the program should be specified as the argument to the -d option, and should match the domain specified in the call to textdomain (on line 9 of the program source). Other details on how to use gettext can be found from “man gettext.”
A .pot (portable object template) file is used as the basis for translating program messages into any language. To start translation, one can simply copy hello.pot to oriya.po (this preserves the template file for later translation into a different language). However, the preferred way to do this is by use of the msginit program, which takes care of correctly setting up some default values,

    msginit -l or_IN -o oriya.po -i hello.pot

Here, the -l option defines the locale (an Oriya locale should have been installed on your system), and the -i and -o options define the input and output files, respectively. If there is only a single .pot file in the directory, it will be used as the input file, and the -i option can be omitted. For me, the oriya.po file produced by msginit would look like:
    # Oriya translations for PACKAGE package.
    # Copyright (C) 2004 THE PACKAGE'S COPYRIGHT HOLDER
    # This file is distributed under the same license as the PACKAGE package.
    # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: PACKAGE VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2004-06-22 02:22+0530\n"
    "PO-Revision-Date: 2004-06-22 02:38+0530\n"
    "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n"
    "Language-Team: Oriya\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"

    #: hello.c:10
    msgid "Hello, world!\n"
    msgstr ""

msginit prompted for my email address, and probably obtained my real name from the system password file. It also filled in values such as the revision date, language, and character set, presumably using information from the or_IN locale.
It is important to respect the format of the entries in the .po (portable object) file. Each entry has the following structure:

    WHITE-SPACE
    #  TRANSLATOR-COMMENTS
    #. AUTOMATIC-COMMENTS
    #: REFERENCE...
    #, FLAG...
    msgid UNTRANSLATED-STRING
    msgstr TRANSLATED-STRING

where the initial white-space (spaces, tabs, newlines, ...) and all comments might or might not exist for a particular entry. Comment lines start with a '#' as the first character, and there are two kinds: (i) manually added translator comments, which have some white-space immediately following the '#', and (ii) automatic comments added and maintained by the gettext tools, which have a non-white-space character after the '#'. The msgid line contains the untranslated (English) string, if there is one for that PO file entry, and the msgstr line is where the translated string is to be entered. More on this later. For details on the format of PO files see gettext::Basics::PO Files:: in the Emacs info-browser (see Appdx. A for an introduction to using the info-browser in Emacs).
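To make the entry structure concrete, here is a small sketch that classifies a single line of a PO file by its leading characters. The enum and the function classify_po_line are hypothetical names introduced for illustration; this is a deliberate simplification and not how the gettext tools themselves parse PO files (continuation strings, plural forms and other marker types all end up in the catch-all category).

```c
#include <assert.h>
#include <string.h>

/* Kinds of lines that can appear in a PO file entry. */
enum po_line_kind {
    PO_BLANK,              /* white-space separating entries        */
    PO_TRANSLATOR_COMMENT, /* "# ..."  added by the translator      */
    PO_AUTOMATIC_COMMENT,  /* "#. ..." maintained by the tools      */
    PO_REFERENCE,          /* "#: ..." source file and line         */
    PO_FLAG,               /* "#, ..." flags                        */
    PO_MSGID,              /* untranslated string                   */
    PO_MSGSTR,             /* translated string                     */
    PO_OTHER               /* continuation strings, anything else   */
};

/* Hypothetical helper: classify one line of a PO file. */
enum po_line_kind classify_po_line(const char *line)
{
    assert(line != NULL);

    /* Skip leading white-space; an all-blank line separates entries. */
    line += strspn(line, " \t\r\n");
    if (*line == '\0')
        return PO_BLANK;

    if (line[0] == '#') {
        switch (line[1]) {
        case '.': return PO_AUTOMATIC_COMMENT;
        case ':': return PO_REFERENCE;
        case ',': return PO_FLAG;
        case '\0': case ' ': case '\t': case '\r': case '\n':
            return PO_TRANSLATOR_COMMENT;
        default:  return PO_OTHER;
        }
    }

    /* A keyword must be followed by white-space before the quoted string. */
    if (strncmp(line, "msgid", 5) == 0 && (line[5] == ' ' || line[5] == '\t'))
        return PO_MSGID;
    if (strncmp(line, "msgstr", 6) == 0 && (line[6] == ' ' || line[6] == '\t'))
        return PO_MSGSTR;
    return PO_OTHER;
}
```

For instance, the line "#: hello.c:10" from oriya.po is classified as PO_REFERENCE, and the header lines such as "Language-Team: Oriya\n" fall under PO_OTHER because they continue the preceding msgstr.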
The first thing to do is fill in the comments at the beginning and the header entry, parts of which have already been filled in by msginit. The lines in the header entry are pretty much self-explanatory, and details can be found in the gettext::Creating::Header Entry:: info node. After that, the remaining work consists of typing the Oriya text that is to serve as the translation of each English string. For the msgstr line in each of the remaining entries, add the translated Oriya text between the double quotes; this is the translation corresponding to the English phrase in the msgid string for the entry. For example, for the phrase “Hello, world!\n” in oriya.po, we could enter “ନମସ୍କାର\n”. The final oriya.po file might look like:
    # Oriya translations for hello example package.
    # Copyright (C) 2004 Gora Mohanty
    # This file is distributed under the same license as the hello example package.
    # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: oriya\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2004-06-22 02:22+0530\n"
    "PO-Revision-Date: 2004-06-22 10:54+0530\n"
    "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n"
    "Language-Team: Oriya\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "X-Generator: KBabel 1.3\n"

    #: hello.c:10
    msgid "Hello, world!\n"
    msgstr "ନମସ୍କାର\n"
For editing PO files, I have found that the kbabel editor suits me best. The only problem is that while Oriya text can be entered directly into kbabel using the xkb Oriya keyboard layouts [1] and the entries are saved properly, the text is not displayed correctly in the kbabel window if it includes conjuncts. Emacs po-mode is a little restrictive, but strictly enforces conformance with the PO file format. The main problem with it is that it does not currently seem possible to edit Oriya text in Emacs. yudit is the best at editing Oriya text, but does not ensure that the PO file format is followed. You can play around a bit with these editors to find one that suits your personal preferences. One possibility might be to first edit the header entry with kbabel or Emacs po-mode, and then use yudit to enter the Oriya text on the msgstr lines.
    msgfmt -c -v -o hello.mo oriya.po

The -c option does detailed checking of the PO file format, -v makes the program verbose, and the output filename is given by the argument to the -o option. Note that the base of the output filename should match the message domain given in the first arguments to bindtextdomain and textdomain on lines 8 and 9 of the example program in Sec. 2. The .mo (machine object) file should be stored in the location whose base directory is given by the second argument to bindtextdomain. The final location of the file will be in the sub-directory LL/LC_MESSAGES or LL_CC/LC_MESSAGES under the base directory, where LL stands for a language, and CC for a country. For example, as we have chosen the standard location, /usr/share/locale, for our base directory, and for us the language and country strings are “or” and “IN,” respectively, we will place hello.mo in /usr/share/locale/or_IN/LC_MESSAGES. Note that you will need super-user privilege to copy hello.mo to this system directory. Thus,
    mkdir -p /usr/share/locale/or_IN/LC_MESSAGES
    cp hello.mo /usr/share/locale/or_IN/LC_MESSAGES
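The naming rule for the installed catalog can be written down as a short sketch. The function catalog_path is a hypothetical helper of our own; gettext performs this lookup internally, so a program never needs to build the path itself, and the sketch ignores the fallback from LL_CC to LL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write BASE/LOCALE/LC_MESSAGES/DOMAIN.mo into BUF.
 * Returns 0 on success, -1 if DOMAIN contains a '/' or the result does
 * not fit into SIZE bytes. */
int catalog_path(char *buf, size_t size, const char *base,
                 const char *locale, const char *domain)
{
    int n;

    assert(buf != NULL && base != NULL && locale != NULL && domain != NULL);

    /* The domain is the base name of the .mo file, not a path. */
    if (strchr(domain, '/') != NULL)
        return -1;

    n = snprintf(buf, size, "%s/%s/LC_MESSAGES/%s.mo", base, locale, domain);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With the values used in this article (base directory /usr/share/locale, locale or_IN, domain hello) the helper produces /usr/share/locale/or_IN/LC_MESSAGES/hello.mo, which is where the cp command above places the file.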
    echo $LANG
    export LANG=or_IN

The first statement shows you the current setting of your locale (this is usually en_US, and you will need it to reset the default locale at the end), while the second one sets it to an Oriya locale.
A Unicode-capable terminal emulator is needed to view Oriya output directly. The new versions of both gnome-terminal and konsole (the KDE terminal emulator) are Unicode-aware. I will focus on gnome-terminal as it seems to have better support for internationalization. gnome-terminal needs to be told that the bytes arriving are UTF-8 encoded multibyte sequences. This can be done by (a) choosing Terminal -> Character Coding -> Unicode (UTF-8), or (b) typing “/bin/echo -n -e '\033%G'” in the terminal, or (c) by running /bin/unicode_start. Likewise, you can revert to the default encoding by (a) choosing Terminal -> Character Coding -> Current Locale (ISO-8859-1), or (b) typing “/bin/echo -n -e '\033%@'”, or (c) by running /bin/unicode_stop. Now, running the example program (after compiling with gcc as described in Sec. 2) with,
    ./hello

should give you output in Oriya. Please note that conjuncts will most likely be displayed with a “halant” as the terminal probably does not render Indian language fonts correctly. Also, as most terminal emulators assume fixed-width fonts, the results are hardly likely to be aesthetically appealing.
An alternative is to save the program output in a file, and view it with yudit, which will render the glyphs correctly. Thus,

    ./hello > junk
    yudit junk

Do not forget to reset the locale before resuming usual work in the terminal. Otherwise, your English characters might look funny.
While all this should give the average user some pleasure in being able to see Oriya output from a program without a whole lot of work, it should be kept in mind that we are still far from our desired goal. Hopefully, one day the situation will be such that rather than deriving special pleasure from it, users take it for granted that Oriya should be available and are upset otherwise.
     1  #include <libintl.h>
     2  #include <locale.h>
     3  #include <stdio.h>
     4  #include <stdlib.h>
     5  int main(void)
     6  {
     7    setlocale( LC_ALL, "" );
     8    bindtextdomain( "hello", "/usr/share/locale" );
     9    textdomain( "hello" );
    10    printf( gettext( "Hello, world!\n" ) );
    11    printf( gettext( "How are you?\n" ) );
    12    exit(0);
    13  }

For such a small change, it would be simple enough to just repeat the above cycle of extracting the relevant English text, translating it to Oriya, and preparing a new message catalog. We can even simplify the work by cutting and pasting most of the old oriya.po file into the new one. However, real programs will have thousands of such strings, and we would like to be able to translate only the changed strings, and have the gettext utilities handle the drudgery of combining the new translations with the old ones. This is indeed possible.
    xgettext -d hello -o hello-new.pot hello.c

Now, we use a new program, msgmerge, to merge the existing .po file with translations into the new template file, viz.,
    msgmerge -U oriya.po hello-new.pot

The -U option updates the existing .po file, oriya.po. We could instead have chosen to create a new .po file by using “-o <filename>” in place of -U. The updated .po file will still have the old translations embedded in it, together with new entries that have untranslated msgid lines. For us, the new lines in oriya.po will look like,
    #: hello.c:11
    msgid "How are you?\n"
    msgstr ""

For the new translation, we could use “ଆପଣ କିପରି ଅଛନ୍ତି?” in place of the English phrase “How are you?” The updated oriya.po file, including the translation, might look like:
    # Oriya translations for hello example package.
    # Copyright (C) 2004 Gora Mohanty
    # This file is distributed under the same license as the hello example package.
    # Gora Mohanty <gora_mohanty@yahoo.co.in>, 2004.
    #
    msgid ""
    msgstr ""
    "Project-Id-Version: oriya\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2004-06-23 14:30+0530\n"
    "PO-Revision-Date: 2004-06-22 10:54+0530\n"
    "Last-Translator: Gora Mohanty <gora_mohanty@yahoo.co.in>\n"
    "Language-Team: Oriya\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "X-Generator: KBabel 1.3\n"

    #: hello.c:10
    msgid "Hello, world!\n"
    msgstr "ନମସ୍କାର\n"

    #: hello.c:11
    msgid "How are you?\n"
    msgstr "ଆପଣ କିପରି ଅଛନ୍ତି?\n"
Compile oriya.po to a machine object file, and install it in the appropriate place as in Sec. 2.4. Thus,

    msgfmt -c -v -o hello.mo oriya.po
    mkdir -p /usr/share/locale/or_IN/LC_MESSAGES
    cp hello.mo /usr/share/locale/or_IN/LC_MESSAGES

You can test the Oriya output as above, after recompiling hello.c and running it in an Oriya locale.
This work is part of the project for enabling the use of Oriya under Linux. I thank my uncle, N. M. Pattnaik, for conceiving of the project. We have all benefited from the discussions amidst the group of people working on this project. On the particular issue of translation, the help of H. R. Pansari, A. Nayak, and M. Chand is much appreciated.
The info browser can be started by typing “C-h i” in Emacs. The first time you do this, it will briefly list some commands available inside the info browser, and present you with a menu of major topics. Each menu item or cross-reference is hyperlinked to the appropriate node, and you can visit that node either by moving the cursor to the item and pressing Enter, or by clicking on it with the middle mouse button. To get to the gettext menu items, you can either scroll down to the line,

    * gettext: (gettext).                          GNU gettext utilities.

and visit that node. Or, as it is several pages down, you can locate it using “I-search.” Type “C-s” to enter “I-search,” which will then prompt you for a string in the mini-buffer at the bottom of the window. This is an incremental search, so that Emacs will keep moving you forward through the buffer as you are entering your search string. If you have reached the last occurrence of the search string in the current buffer, you will get a message saying “Failing I-search: ...” on pressing “C-s.” At that point, press “C-s” again to resume the search at the beginning of the buffer. Likewise, “C-r” incrementally searches backwards from the present location.
Info nodes are listed in this document with a “::” separator, so that one can go to the gettext::Creating::Header Entry:: by visiting the “gettext” node from the main info menu, navigating to the “Creating” node, and following that to the “Header Entry” node.
A stand-alone info browser, independent of Emacs, is also available on many systems. Thus, the gettext info page can also be accessed by typing “info gettext” in a terminal. xinfo is an X application serving as an info browser, so that if it is installed, typing “xinfo gettext” from the command line will open a new browser window with the gettext info page.
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.70)
Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.
The command line arguments were: latex2html -no_math -html_version 4.0,math,unicode,i18n,tables -split 0 memo
The translation was initiated by Gora Mohanty on 2004-07-24