This chapter explains the goals sought in the creation of GNU gettext and the free Translation Project. Then, it explains a few broad concepts around Native Language Support, and positions message translation with regard to other aspects of national and cultural variance, as they apply to programs. It also surveys those files used to convey the translations. It explains how the various tools interact in the initial generation of these files, and later, how the maintenance cycle should usually operate.

In this manual, we use "he" when speaking of the programmer or maintainer, "she" when speaking of the translator, and "they" when speaking of the installers or end users of the translated program. This is only a convenience for clarifying the documentation. It is absolutely not meant to imply that some roles are more appropriate to males or females. Besides, as you might guess, GNU gettext is meant to be useful for people using computers, whatever their sex, race, religion or nationality!

Please submit suggestions and corrections to bug-gettext@gnu.org. Please include the manual's edition number and update date in your messages.

Usually, programs are written and documented in English, and use English at execution time to interact with users. This is true not only of GNU software, but also of a great deal of proprietary and free software. Using a common language is quite handy for communication between developers, maintainers and users from all countries. On the other hand, most people are less comfortable with English than with their own native language, and would prefer to use their mother tongue for day-to-day work, as far as possible. Many would simply love to see their computer screen showing a lot less English, and far more of their own language.

However, to many people, this dream might appear so far-fetched that they may believe it is not even worth spending time thinking about it. They have no confidence at all that the dream might ever come true. Yet some have not lost hope, and have organized themselves. The Translation Project is a formalization of this hope into a workable structure, which has a good chance to get all of us nearer the achievement of a truly multi-lingual set of programs.

GNU gettext is an important step for the Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators and even users, a well integrated set of tools and documentation. Specifically, the GNU gettext utilities are a set of tools that provides a framework within which other free packages may produce multi-lingual messages. These tools include a runtime library for retrieving translated messages, stand-alone programs such as xgettext, msgmerge and msgfmt for handling sets of translatable and translated strings, and a PO mode for Emacs.

GNU gettext is designed to minimize the impact of internationalization on program sources, keeping this impact as small and hardly noticeable as possible. Internationalization has better chances of succeeding if it is very lightweight, or at least appears to be so, when looking at program sources.

The Translation Project also uses the GNU gettext distribution as a vehicle for documenting its structure and methods. This goes beyond the strict technicalities of documenting GNU gettext proper. By so doing, translators will find in a single place, as far as possible, all they need to know for properly doing their translating work. This supplemental documentation might also help programmers, and even curious users, to understand how GNU gettext is related to the remainder of the Translation Project, and consequently, to have a glimpse at the big picture.

Two long words appear all the time when we discuss support of native language in programs, and these words have a precise meaning, worth being explained here, once and for all in this document. The words are internationalization and localization. Many people, tired of writing these long words over and over again, have taken the habit of writing i18n and l10n instead, quoting the first and last letter of each word, and replacing the run of intermediate letters by a number merely telling how many such letters there are. But in this manual, for the sake of clarity, we will patiently write the names in full, each time…

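The abbreviation rule described above is simple arithmetic, and can be sketched in a few lines of C. The function name `numeronym` is our own choice for illustration and is not part of GNU gettext.

```c
#include <stdio.h>
#include <string.h>

/* Abbreviate a word as: first letter, count of inner letters, last letter.
   Returns 0 on success, -1 if the word is shorter than three letters
   or the output buffer is too small. */
int numeronym(const char *word, char *out, size_t out_size)
{
    size_t len = strlen(word);
    if (len < 3)
        return -1;
    int written = snprintf(out, out_size, "%c%zu%c",
                           word[0], len - 2, word[len - 1]);
    return (written < 0 || (size_t) written >= out_size) ? -1 : 0;
}
```

Applied to "internationalization" (20 letters) this yields "i18n", and applied to "localization" (12 letters) it yields "l10n".
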
By internationalization, one refers to the operation by which a program, or a set of programs turned into a package, is made aware of and able to support multiple languages. This is a generalization process, by which the programs are untied from calling only English strings or other English specific habits, and connected to generic ways of doing the same, instead. Program developers may use various techniques to internationalize their programs. Some of these have been standardized. GNU gettext offers one of these standards. See section The Programmer's View.

By localization, one means the operation by which, in a set of programs already internationalized, one gives the program all needed information so that it can adapt itself to handle its input and output in a fashion which is correct for some native language and cultural habits. This is a particularisation process, by which generic methods already implemented in an internationalized program are used in specific ways. The programming environment puts several functions at the programmer's disposal which allow this runtime configuration. The formal description of a specific set of cultural habits for some country, together with all associated translations targeted to the same native language, is called the locale for this language or country. Users achieve localization of programs by setting proper values to special environment variables, prior to executing those programs, identifying which locale should be used.

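As a minimal sketch of this runtime configuration, the ISO C function `setlocale` lets a program adopt the locale that the user selected through the environment. The two wrapper names below are ours, chosen only for illustration.

```c
#include <locale.h>
#include <stddef.h>

/* Adopt the locale named by the user's environment variables.
   Returns the name of the locale now in effect, or NULL if the request
   could not be honored (the previous locale then stays active). */
const char *use_environment_locale(void)
{
    return setlocale(LC_ALL, "");
}

/* Select an explicitly named locale; the "C" locale is always available. */
const char *use_named_locale(const char *name)
{
    return setlocale(LC_ALL, name);
}
```

A program that never calls `setlocale` runs in the "C" locale, whatever the environment says. Which other locale names are accepted depends on what is installed on the system.
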
In fact, locale message support is only one component of the cultural data that makes up a particular locale. There is a whole host of routines and functions provided to aid programmers in developing internationalized software and which allow them to access the data stored in a particular locale. When someone refers to a particular locale, they are referring to the data stored within that particular locale. Similarly, if a programmer is referring to “accessing the locale routines”, they are referring to the complete suite of routines that access all of the locale's information.

One uses the expression Native Language Support, or merely NLS, for speaking of the overall activity or feature encompassing both internationalization and localization, allowing for multi-lingual interactions in a program. In a nutshell, one could say that internationalization is the operation by which further localizations are made possible.

Also, very roughly said, when it comes to multi-lingual messages, internationalization is usually taken care of by programmers, and localization is usually taken care of by translators.

For a totally multi-lingual distribution, there are many things to translate beyond output messages.

GNU gettext offers a complete toolset for translating messages output by C programs. Perl scripts and shell scripts will also need to be translated. Even if there are today some hooks by which this can be done, these hooks are not integrated as well as they should be.

Some programs, like autoconf or bison, are able to produce other programs (or scripts). Even if the generating programs themselves are internationalized, the generated programs they produce may need internationalization on their own, and this indirect internationalization could be automated right from the generating program. In fact, quite usually, generating and generated programs could be internationalized independently, as the effort needed is fairly orthogonal.

Some programs include text taken from other documents. For example, the recode program carries descriptions, drawn from an RFC, which it is able to reconstruct at execution. Since these descriptions are extracted from the RFC by mechanical means, translating them properly would require a prior translation of the RFC itself.

Some programs might need a localized input syntax. For example, one might want gcc to allow diacriticized characters in identifiers or use translated keywords; ‘rm -i’ might accept something other than ‘y’ or ‘n’ for replies, etc. Even if the program will eventually make most of its output in the foreign languages, one has to decide whether the input syntax, option values, etc., are to be localized or not.

As we already stressed, translation is only one aspect of locales. Other internationalization aspects are system services and are handled in GNU libc. There are many attributes that are needed to define a country's cultural conventions. These attributes include, besides the country's native language, the formatting of the date and time, the representation of numbers, the symbols for currency, etc. These local rules are termed the country's locale. The locale represents the knowledge needed to support the country's native attributes.

There are a few major areas which may vary between countries and hence, define what a locale must describe. The following list helps put multi-lingual messages into the proper context of other tasks related to locales. See the GNU libc manual for details.

The codeset most commonly used throughout the USA and most English speaking parts of the world is the ASCII codeset. However, there are many characters needed by various locales that are not found within this codeset. The 8-bit ISO 8859-1 codeset has most of the special characters needed to handle the major European languages. However, in many cases, choosing ISO 8859-1 is nevertheless not adequate: it doesn't even handle the major European currency. Hence each locale will need to specify which codeset it needs to use and will need to have the appropriate character handling routines to cope with the codeset.

Currency symbols vary from country to country, as does the position of the symbol. Software needs to be able to transparently display currency figures in the native mode for each locale.

The format of dates varies between locales. For example, Christmas day in 1994 is written as 12/25/94 in the USA and as 25/12/94 in Australia. Other countries might use ISO 8601 dates, etc.

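The three notations mentioned above can be reproduced with the ISO C function `strftime`. The sketch below passes the pattern explicitly so that the result does not depend on which locales are installed; the helper name `format_date` is ours. A locale-aware program would rather call `setlocale` first and use the `%x` conversion, which lets the current locale pick the notation.

```c
#include <stddef.h>
#include <time.h>

/* Format a calendar date with an explicit strftime pattern.
   Returns the number of characters written, or 0 if buf is too small. */
size_t format_date(char *buf, size_t size, const char *pattern,
                   int year, int month, int day)
{
    struct tm date = {0};
    date.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    date.tm_mon  = month - 1;    /* and months from 0 */
    date.tm_mday = day;
    return strftime(buf, size, pattern, &date);
}
```

With the date 1994-12-25, the pattern `"%m/%d/%y"` gives the USA notation, `"%d/%m/%y"` the Australian one, and `"%Y-%m-%d"` the ISO 8601 one.
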
Time of the day may be noted as hh:mm, hh.mm, or otherwise. Some locales require time to be specified in 24-hour mode rather than as AM or PM. Further, the nature and yearly extent of the Daylight Saving correction vary widely between countries.

Numbers can be represented differently in different locales. For example, the following numbers are all written correctly for their respective locales:

    12,345.67       English
    12.345,67       German
     12345,67       French
    1,2345.67       Asia

Some programs could go further and use different unit systems, like English units or Metric units, or even take into account variants about how numbers are spelled in full.

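To make the differences in the table concrete, here is a small sketch that takes the grouping separator, the group size and the decimal separator as parameters. The function `format_number` is hypothetical and hard-codes nothing about any locale; a real program would obtain these values from the current locale, for instance through the ISO C function `localeconv`, instead of passing them by hand.

```c
#include <stdio.h>
#include <string.h>

/* Write `whole` with `group_sep` inserted every `group_size` digits,
   followed by `dec_sep` and a two-digit fraction.  Pass group_sep = '\0'
   to disable grouping.  Returns 0 on success, -1 if buf is too small. */
int format_number(char *buf, size_t size, unsigned long whole, unsigned frac,
                  char group_sep, int group_size, char dec_sep)
{
    char digits[32];
    int n = snprintf(digits, sizeof digits, "%lu", whole);
    size_t pos = 0;

    for (int i = 0; i < n; i++) {
        int remaining = n - i;  /* digits left, including this one */
        if (i > 0 && group_sep != '\0' && group_size > 0
            && remaining % group_size == 0) {
            if (pos + 1 >= size)
                return -1;
            buf[pos++] = group_sep;
        }
        if (pos + 1 >= size)
            return -1;
        buf[pos++] = digits[i];
    }
    int w = snprintf(buf + pos, size - pos, "%c%02u", dec_sep, frac % 100);
    return (w < 0 || (size_t) w >= size - pos) ? -1 : 0;
}
```

Calling it with `(',', 3, '.')`, `('.', 3, ',')`, `('\0', 0, ',')` and `(',', 4, '.')` for the value 12345 and fraction 67 reproduces the four rows of the table.
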
The most obvious area is the language support within a locale. This is where GNU gettext provides the means for developers and users to easily change the language that the software uses to communicate to the user.

These areas of cultural conventions are called locale categories. It is an unfortunate term; locale aspects or locale feature categories would be a better term, because each “locale category” describes an area or task that requires localization. The concrete data that describes the cultural conventions for such an area and for a particular culture is also called a locale category. In this sense, a locale is composed of several locale categories: the locale category describing the codeset, the locale category describing the formatting of numbers, the locale category containing the translated messages, and so on.

Components of locale outside of message handling are standardized in the ISO C standard and the POSIX:2001 standard (also known as the SUSv3 specification). GNU libc fully implements this, and most other modern systems provide a more or less reasonable support for at least some of the missing components.

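The categories standardized by ISO C can be inspected one by one: calling `setlocale` with a NULL locale name queries a category without changing it. The following sketch lists the current setting of each ISO C category; the function name `describe_locale` is ours. The category holding translated messages, `LC_MESSAGES`, is left out here because it comes from POSIX rather than ISO C.

```c
#include <locale.h>
#include <stdio.h>

/* Write "NAME=value" for each ISO C locale category into buf.
   Returns 0 on success, -1 if buf is too small. */
int describe_locale(char *buf, size_t size)
{
    static const struct { int id; const char *name; } categories[] = {
        { LC_CTYPE,    "LC_CTYPE"    },  /* character classification, codeset */
        { LC_NUMERIC,  "LC_NUMERIC"  },  /* decimal point, digit grouping */
        { LC_TIME,     "LC_TIME"     },  /* date and time formats */
        { LC_MONETARY, "LC_MONETARY" },  /* currency symbol and layout */
        { LC_COLLATE,  "LC_COLLATE"  },  /* string sorting order */
    };
    size_t pos = 0;

    for (size_t i = 0; i < sizeof categories / sizeof categories[0]; i++) {
        const char *value = setlocale(categories[i].id, NULL);  /* query only */
        int w = snprintf(buf + pos, size - pos, "%s%s=%s",
                         i ? " " : "", categories[i].name,
                         value ? value : "?");
        if (w < 0 || (size_t) w >= size - pos)
            return -1;
        pos += (size_t) w;
    }
    return 0;
}
```

Because each category is set independently, a user may, for example, keep messages in one language while using the date and number conventions of another country.
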
The letters PO in ‘.po’ files mean Portable Object, to distinguish them from ‘.mo’ files, where MO stands for Machine Object. This paradigm, as well as the PO file format, is inspired by the NLS standard developed by Uniforum, and first implemented by Sun in their Solaris system.

PO files are meant to be read and edited by humans, and associate each original, translatable string of a given package with its translation in a particular target language. A single PO file is dedicated to a single target language. If a package supports many languages, there is one such PO file per language supported, and each package has its own set of PO files. These PO files are best created by the xgettext program, and later updated or refreshed through the msgmerge program. Program xgettext extracts all marked messages from a set of C files and initializes a PO file with empty translations. Program msgmerge takes care of adjusting PO files between releases of the corresponding sources, commenting obsolete entries, initializing new ones, and updating all source line references. Files ending with ‘.pot’ are a kind of base translation file found in distributions, in PO file format.

MO files are meant to be read by programs, and are binary in nature. A few systems already offer tools for creating and handling MO files as part of the Native Language Support coming with the system, but the format of these MO files is often different from system to system, and non-portable. The tools already provided with these systems don't support all the features of GNU gettext. Therefore GNU gettext uses its own format for MO files. Files ending with ‘.gmo’ are really MO files, when it is known that these files use the GNU format.

The following diagram summarizes the relation between the files handled by GNU gettext and the tools acting on these files. It is followed by somewhat detailed explanations, which you should read while keeping an eye on the diagram. Having a clear understanding of these interrelations will surely help programmers, translators and maintainers.

Original C Sources ───> Preparation ───> Marked C Sources ───╮
                                                             │
              ╭─────────<─── GNU gettext Library             │
╭─── make <───┤                                              │
│             ╰─────────<────────────────────┬───────────────╯
│                                            │
│   ╭─────<─── PACKAGE.pot <─── xgettext <───╯   ╭───<─── PO Compendium
│   │                                            │              ↑
│   │                                            ╰───╮          │
│   ╰───╮                                            ├───> PO editor ───╮
│       ├────> msgmerge ──────> LANG.po ────>────────╯                  │
│   ╭───╯                                                               │
│   │                                                                   │
│   ╰─────────────<───────────────╮                                     │
│                                 ├─── New LANG.po <────────────────────╯
│   ╭─── LANG.gmo <─── msgfmt <───╯
│   │
│   ╰───> install ───> /.../LANG/PACKAGE.mo ───╮
│                                              ├───> "Hello world!"
╰───────> install ───> /.../bin/PROGRAM ───────╯

As a programmer, the first step to bringing GNU gettext into your package is identifying, right in the C sources, those strings which are meant to be translatable, and those which are untranslatable. This tedious job can be done a little more comfortably using emacs PO mode, but you can use any means familiar to you for modifying your C sources. Besides this, some other simple, standard changes are needed to properly initialize the translation library. See section Preparing Program Sources, for more information about all this.

For newly written software the strings of course can and should be marked while writing it. The gettext approach makes this very easy. Simply put the following lines at the beginning of each file or in a central header file:

    #define _(String) (String)
    #define N_(String) String
    #define textdomain(Domain)
    #define bindtextdomain(Package, Directory)

Doing this allows you to prepare the sources for internationalization. Later when you feel ready for the step to use the gettext library simply replace these definitions by the following:

    #include <libintl.h>
    #define _(String) gettext (String)
    #define gettext_noop(String) String
    #define N_(String) gettext_noop (String)

and link against ‘libintl.a’ or ‘libintl.so’. Note that on GNU systems, you don't need to link with libintl because the gettext library functions are already contained in GNU libc. That is all you have to change.

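The two sets of definitions above can live side by side behind a build-time switch, so that the same marked sources compile with or without the library. The sketch below assumes such a switch named `ENABLE_NLS`; the domain name `"hello"`, the directory `"/usr/share/locale"` and the function names are hypothetical, chosen only for illustration. It also shows why both `_` and `N_` exist: `N_` only marks a string where no function call is allowed, such as a static initializer, and `_` performs the lookup at the point of use.

```c
#include <locale.h>
#include <stddef.h>

#ifdef ENABLE_NLS                 /* assumed build-time switch */
# include <libintl.h>
# define _(String) gettext (String)
#else
# define _(String) (String)
# define textdomain(Domain)
# define bindtextdomain(Package, Directory)
#endif
#define N_(String) String         /* mark only; translate later with _() */

/* Hypothetical names used for illustration. */
#define PACKAGE   "hello"
#define LOCALEDIR "/usr/share/locale"

/* N_() marks strings in a static initializer, where gettext cannot be called. */
static const char *const day_names[] = { N_("Monday"), N_("Tuesday") };

/* Call once at program start, before any message is looked up. */
void init_messages(void)
{
    setlocale(LC_ALL, "");
    bindtextdomain(PACKAGE, LOCALEDIR);
    textdomain(PACKAGE);
}

/* Translate a marked string at the point of use. */
const char *day_name(size_t index)
{
    return _(day_names[index]);
}

const char *greeting(void)
{
    return _("Hello world!");
}
```

Without `ENABLE_NLS`, or when no translation is installed for the user's language, the functions simply return the original English strings.
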
Once the C sources have been modified, the xgettext program is used to find and extract all translatable strings, and create a PO template file out of all these. This ‘package.pot’ file contains all original program strings. It has sets of pointers to exactly where in C sources each string is used. All translations are set to empty. The letter t in ‘.pot’ marks this as a Template PO file, not yet oriented towards any particular language. See section Invoking the xgettext Program, for more details about how one calls the xgettext program. If you are really lazy, you might be interested in working a lot more right away, and preparing the whole distribution setup (see section The Maintainer's View). By doing so, you spare yourself typing the xgettext command, as make should now generate the proper things automatically for you!

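To give an idea of what such a template contains, the following sketch formats a single entry: a `#:` comment pointing at the source location, the original string as `msgid`, and an empty `msgstr`. The function `format_pot_entry` is our own illustration and not how xgettext is implemented; a real template also starts with a header entry and may split long strings over several lines.

```c
#include <stdio.h>

/* Format one untranslated entry the way a PO template lists it:
   a "#:" source reference, the original string, and an empty translation.
   Returns 0 on success, -1 if buf is too small. */
int format_pot_entry(char *buf, size_t size,
                     const char *file, int line, const char *msgid)
{
    int w = snprintf(buf, size, "#: %s:%d\nmsgid \"", file, line);
    if (w < 0 || (size_t) w >= size)
        return -1;
    size_t pos = (size_t) w;

    for (const char *p = msgid; *p != '\0'; p++) {
        const char *escaped = NULL;
        switch (*p) {              /* characters that need C-style escapes */
        case '"':  escaped = "\\\""; break;
        case '\\': escaped = "\\\\"; break;
        case '\n': escaped = "\\n";  break;
        case '\t': escaped = "\\t";  break;
        default:   break;
        }
        if (escaped != NULL)
            w = snprintf(buf + pos, size - pos, "%s", escaped);
        else
            w = snprintf(buf + pos, size - pos, "%c", *p);
        if (w < 0 || (size_t) w >= size - pos)
            return -1;
        pos += (size_t) w;
    }
    w = snprintf(buf + pos, size - pos, "\"\nmsgstr \"\"\n");
    return (w < 0 || (size_t) w >= size - pos) ? -1 : 0;
}
```

The translator's job is then to fill in each empty `msgstr` in the language-specific copy of this file.
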
The first time through, there is no ‘lang.po’ yet, so the msgmerge step may be skipped and replaced by a mere copy of ‘package.pot’ to ‘lang.po’, where lang represents the target language. See Creating a New PO File for details.

Then comes the initial translation of messages. Translation in itself is a whole matter, still exclusively meant for humans, and whose complexity far overwhelms the level of this manual. Nevertheless, a few hints are given in some other chapter of this manual (see section The Translator's View). You will also find there indications about how to contact translating teams, or becoming part of them, for sharing your translating concerns with others who target the same native language.

While adding the translated messages into the ‘lang.po’ PO file, if you are not using one of the dedicated PO file editors (see section Editing PO Files), you are on your own for ensuring that your efforts fully respect the PO file format, and quoting conventions (see section The Format of PO Files). This is surely not an impossible task, as this is the way many people have handled PO files around 1995. On the other hand, by using a PO file editor, most details of PO file format are taken care of for you, but you have to acquire some familiarity with the PO file editor itself.

If some common translations have already been saved into a compendium PO file, translators may use PO mode for initializing untranslated entries from the compendium, and also save selected translations into the compendium, updating it (see section Using Translation Compendia). Compendium files are meant to be exchanged between members of a given translation team.

Programs, or packages of programs, are dynamic in nature: users write bug reports and suggestions for improvement, maintainers react by modifying programs in various ways. The fact that a package has already been internationalized should not make maintainers shy of adding new strings, or modifying strings already translated. They just do their job the best they can. For the Translation Project to work smoothly, it is important that maintainers do not carry translation concerns on their already loaded shoulders, and that translators be kept as free as possible of programming concerns.

The only concern maintainers should have is carefully marking new strings as translatable, when they should be; they should not otherwise worry about them being translated, as this will come in proper time. Consequently, when programs and their strings are adjusted in various ways by maintainers, and for matters usually unrelated to translation, xgettext constructs ‘package.pot’ files which evolve over time, so the translations carried by ‘lang.po’ slowly fall out of date.

It is important for translators (and even maintainers) to understand that package translation is a continuous process in the lifetime of a package, and not something which is done once and for all at the start. After an initial burst of translation activity for a given package, interventions are needed once in a while, because here and there, translated entries become obsolete, and new untranslated entries appear, needing translation.

The msgmerge program has the purpose of refreshing an already existing ‘lang.po’ file, by comparing it with a newer ‘package.pot’ template file, extracted by xgettext out of recent C sources. The refreshing operation adjusts all references to C source locations for strings, since these strings move as programs are modified. Also, msgmerge comments out as obsolete, in ‘lang.po’, those already translated entries which are no longer used in the program sources (see section Obsolete Entries). It finally discovers new strings and inserts them in the resulting PO file as untranslated entries (see section Untranslated Entries). See section Invoking the msgmerge Program, for more information about what msgmerge really does.

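The bookkeeping described above can be illustrated with a deliberately simplified sketch. It is not how msgmerge is implemented: it matches strings only when they are exactly equal, ignores source references, and merely counts obsolete entries instead of commenting them out, whereas the real program can also propose fuzzy matches for strings that changed slightly. The types `PoEntry` and `MergeCounts` and the function `merge_catalog` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *msgid;   /* original string */
    const char *msgstr;  /* translation, "" if none yet */
} PoEntry;

typedef struct {
    size_t kept;      /* in both files: translation carried over */
    size_t added;     /* only in the template: new, untranslated */
    size_t obsolete;  /* only in the old PO file: no longer used */
} MergeCounts;

static const PoEntry *find_entry(const PoEntry *entries, size_t count,
                                 const char *msgid)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(entries[i].msgid, msgid) == 0)
            return &entries[i];
    return NULL;
}

/* Build the refreshed catalog in template order.  `merged` must have room
   for template_count entries.  Returns how many entries fell in each class. */
MergeCounts merge_catalog(const PoEntry *old, size_t old_count,
                          const char *const *template_ids,
                          size_t template_count, PoEntry *merged)
{
    MergeCounts counts = {0, 0, 0};

    for (size_t i = 0; i < template_count; i++) {
        const PoEntry *match = find_entry(old, old_count, template_ids[i]);
        merged[i].msgid = template_ids[i];
        merged[i].msgstr = match ? match->msgstr : "";
        if (match)
            counts.kept++;
        else
            counts.added++;
    }
    for (size_t i = 0; i < old_count; i++) {
        size_t j = 0;
        while (j < template_count
               && strcmp(template_ids[j], old[i].msgid) != 0)
            j++;
        if (j == template_count)
            counts.obsolete++;  /* string vanished from the sources */
    }
    return counts;
}
```

For example, merging an old catalog containing "Hello" and "Bye" with a template containing "Hello" and "Welcome" keeps one translation, adds one untranslated entry, and finds one obsolete entry.
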
Whatever route or means taken, the goal is to obtain an updated ‘lang.po’ file offering translations for all strings.

The temporal mobility, or fluidity of PO files, is an integral part of the translation game, and should be well understood, and accepted. People resisting it will have a hard time participating in the Translation Project, or will give a hard time to other participants! In particular, maintainers should relax and include all available official PO files in their distributions, even if these have not recently been updated, without exerting pressure on the translator teams to get the job done. The pressure should rather come from the community of users speaking a particular language, and maintainers should consider themselves fairly relieved of any concern about the adequacy of translation files. On the other hand, translators should reasonably try updating the PO files they are responsible for, while the package is undergoing pretest, prior to an official distribution.

Once the PO file is complete and dependable, the msgfmt program is used for turning the PO file into a machine-oriented format, which may yield efficient retrieval of translations by the programs of the package, whenever needed at runtime (see section The Format of GNU MO Files). See section Invoking the msgfmt Program, for more information about all modes of execution for the msgfmt program.

Finally, the modified and marked C sources are compiled and linked with the GNU gettext library, usually through the operation of make, provided a suitable ‘Makefile’ exists for the project, and the resulting executable is installed somewhere users will find it. The MO files themselves should also be properly installed. Provided the appropriate environment variables are set (see section Setting the Locale through Environment Variables), the program should localize itself automatically, whenever it executes.

The remainder of this manual has the purpose of explaining in depth the various steps outlined above.

This document was generated by Bruno Haible on February 21, 2024 using texi2html 1.78a.