GNU gettext utilities: 6. Creating a New PO File

6. Creating a New PO File

When starting a new translation, the translator creates a file called ‘LANG.po’, as a copy of the ‘package.pot’ template file with modifications in the initial comments (at the beginning of the file) and in the header entry (the first entry, near the beginning of the file).

The easiest way to do so is by use of the ‘msginit’ program. For example:

$ cd PACKAGE-VERSION
$ cd po
$ msginit

The alternative way is to do the copy and modifications by hand. To do so, the translator copies ‘package.pot’ to ‘LANG.po’. Then she modifies the initial comments and the header entry of this file.
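
The manual copy step could look roughly like the following sketch. The helper name `start_po_by_hand_sketch` and the choice to refuse overwriting an existing file are assumptions made for illustration; they are not part of gettext.

```shell
# Sketch: start LANG.po by hand as a plain copy of the POT template.
# start_po_by_hand_sketch is a hypothetical helper name, not a gettext tool.
start_po_by_hand_sketch() {
    pot=$1     # path to the template, e.g. po/package.pot
    lang=$2    # language code, e.g. de
    out="$(dirname "$pot")/$lang.po"

    # Refuse to overwrite an existing translation (an assumed safety check).
    if [ -e "$out" ]; then
        echo "$out already exists" >&2
        return 1
    fi
    cp "$pot" "$out"
}
```

After a call such as `start_po_by_hand_sketch po/package.pot de`, the initial comments and the header entry of the new file still have to be edited.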

6.1 Invoking the `msginit` Program

msginit [option]

The msginit program creates a new PO file, initializing the meta information with values from the user's environment.

Here are more details. The following header fields of a PO file are automatically filled, when possible.

‘Project-Id-Version’

The value is guessed from the configure script or any other files in the current directory.

‘PO-Revision-Date’

The value is taken from the POT-Creation-Date in the input POT file, or the current date is used.

‘Last-Translator’

The value is taken from the user's password file entry and the mailer configuration files.

‘Language-Team, Language’

These values are set according to the current locale and the predefined list of translation teams.

‘MIME-Version, Content-Type, Content-Transfer-Encoding’

These values are set according to the content of the POT file and the current locale. If the POT file contains charset=UTF-8, it means that the POT file contains non-ASCII characters, and we keep the UTF-8 encoding. Otherwise, when the POT file is plain ASCII, we use the locale's encoding.

‘Plural-Forms’

The value is first looked up from the embedded table.

As an experimental feature, you can instruct msginit to use the information from Unicode CLDR, by setting the GETTEXTCLDRDIR environment variable. The program will look for a file named common/supplemental/plurals.xml under that directory. You can get the CLDR data from http://cldr.unicode.org/.
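
As a small sketch of how the environment variable might be used, the helper below checks that a directory contains the file named above before GETTEXTCLDRDIR is pointed at it. The helper name and the placeholder path are assumptions for illustration only.

```shell
# Sketch: check that a CLDR directory contains the file msginit looks for.
# cldr_plurals_available_sketch is a hypothetical helper name.
cldr_plurals_available_sketch() {
    [ -f "$1/common/supplemental/plurals.xml" ]
}

# Possible usage (the CLDR path is a placeholder):
#   cldr=/path/to/cldr
#   if cldr_plurals_available_sketch "$cldr"; then
#       GETTEXTCLDRDIR=$cldr msginit
#   fi
```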

6.1.1 Input file location

‘-i inputfile’
‘--input=inputfile’:
Input POT file.

If no inputfile is given, the current directory is searched for the POT file. If it is ‘-’, standard input is read.

6.1.2 Output file location

‘-o file’
‘--output-file=file’:
Write output to the specified PO file.

If no output file is given, the output file name depends on the ‘--locale’ option or the user's locale setting. If it is ‘-’, the results are written to standard output.

6.1.3 Input file syntax

‘-P’
‘--properties-input’:
Assume the input file is a Java ResourceBundle in Java .properties syntax, not in PO file syntax.

‘--stringtable-input’:
Assume the input file is a NeXTstep/GNUstep localized resource file in .strings syntax, not in PO file syntax.

6.1.4 Output details

‘-l ll_CC[.encoding]’
‘--locale=ll_CC[.encoding]’:
Set target locale. ll should be a language code, and CC should be a country code. The optional part .encoding specifies the encoding of the locale; most often this part is .UTF-8. The command ‘locale -a’ can be used to output a list of all installed locales. The default is the user's locale setting.

‘--no-translator’:
Declares that the PO file will not have a human translator and is instead automatically generated.

‘--color’
‘--color=when’:
Specify whether or when to use colors and other text attributes. See The --color option for details.

‘--style=style_file’:
Specify the CSS style rule file to use for --color. See The --style option for details.

‘-p’
‘--properties-output’:
Write out a Java ResourceBundle in Java .properties syntax. Note that this file format doesn't support plural forms and silently drops obsolete messages.

‘--stringtable-output’:
Write out a NeXTstep/GNUstep localized resource file in .strings syntax. Note that this file format doesn't support plural forms.

‘-w number’
‘--width=number’:
Set the output page width. Long strings in the output files will be split across multiple lines in order to ensure that each line's width (= number of screen columns) is less than or equal to the given number.

‘--no-wrap’:
Do not break long message lines. Message lines whose width exceeds the output page width will not be split into several lines. Only file reference lines which are wider than the output page width will be split.
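
Several of the options above can be combined in a single non-interactive call. The following is a minimal sketch; the wrapper name `new_po_sketch` and the file names in the usage note are assumptions, while the options themselves are the ones documented in this section.

```shell
# Sketch: create a PO file for a given locale from a POT template.
# new_po_sketch is a hypothetical wrapper name, not part of gettext.
new_po_sketch() {
    pot=$1      # input template, e.g. package.pot
    locale=$2   # target locale, e.g. de_DE.UTF-8
    out=$3      # output file, e.g. de.po

    # --no-translator marks the file as automatically generated.
    msginit --input="$pot" --locale="$locale" --output-file="$out" \
            --no-translator
}
```

A call such as `new_po_sketch package.pot de_DE.UTF-8 de.po` would then write the new file to de.po. For a file maintained by a human translator, ‘--no-translator’ would be dropped.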

6.1.5 Informative output

‘-h’
‘--help’:
Display this help and exit.

‘-V’
‘--version’:
Output version information and exit.

6.2 Filling in the Header Entry

The initial comments "SOME DESCRIPTIVE TITLE", "YEAR" and "FIRST AUTHOR <EMAIL@ADDRESS>, YEAR" ought to be replaced by sensible information. This can be done in any text editor; if Emacs is used and it switched to PO mode automatically (because it has recognized the file's suffix), you can disable it by typing M-x fundamental-mode.

Modifying the header entry can already be done using PO mode: in Emacs, type M-x po-mode RET and then RET again to start editing the entry. You should fill in the following fields.

Project-Id-Version

This is the name and version of the package. Fill it in if it has not already been filled in by xgettext.

Report-Msgid-Bugs-To

This has already been filled in by xgettext. It contains an email address or URL where you can report bugs in the untranslated strings:

- Strings which are not entire sentences, see the maintainer guidelines in Preparing Translatable Strings.
- Strings which use unclear terms or require additional context to be understood.
- Strings which make invalid assumptions about notation of date, time or money.
- Pluralisation problems.
- Incorrect English spelling.
- Incorrect formatting.

POT-Creation-Date

This has already been filled in by xgettext.

PO-Revision-Date

You don't need to fill this in. It will be filled by the PO file editor when you save the file.

Last-Translator

Fill in your name and email address (without double quotes).

Language-Team

Fill in the English name of the language, and the email address or homepage URL of the language team you are part of.

Before starting a translation, it is a good idea to get in touch with your translation team, not only to make sure you don't do duplicated work, but also to coordinate difficult linguistic issues.

In the Free Translation Project, each translation team has its own mailing list. The up-to-date list of teams can be found at the Free Translation Project's homepage, https://translationproject.org/, in the "Teams" area.

Language

Fill in the language code of the language. This can be in one of three forms:

- ‘ll’, an ISO 639 two-letter language code (lowercase). See Language Codes for the list of codes.
- ‘ll_CC’, where ‘ll’ is an ISO 639 two-letter language code (lowercase) and ‘CC’ is an ISO 3166 two-letter country code (uppercase). The country code specification is not redundant: Some languages have dialects in different countries. For example, ‘de_AT’ is used for Austria, and ‘pt_BR’ for Brazil. The country code serves to distinguish the dialects. See Language Codes and Country Codes for the lists of codes.
- ‘ll_CC@variant’, where ‘ll’ is an ISO 639 two-letter language code (lowercase), ‘CC’ is an ISO 3166 two-letter country code (uppercase), and ‘variant’ is a variant designator. The variant designator (lowercase) can be a script designator, such as ‘latin’ or ‘cyrillic’.

The naming convention ‘ll_CC’ is also the way locales are named on systems based on GNU libc. But there are three important differences:

- In this PO file field, but not in locale names, ‘ll_CC’ combinations denoting a language's main dialect are abbreviated as ‘ll’. For example, ‘de’ is equivalent to ‘de_DE’ (German as spoken in Germany), and ‘pt’ to ‘pt_PT’ (Portuguese as spoken in Portugal) in this context.
- In this PO file field, suffixes like ‘.encoding’ are not used.
- In this PO file field, variant designators that are not relevant to message translation, such as ‘@euro’, are not used.

So, if your locale name is ‘de_DE.UTF-8’, the language specification in PO files is just ‘de’.
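
The three differences can be sketched as a small shell function that turns a locale name into a value for this field. This is only an illustration: the function name is hypothetical, and the main-dialect table covers just the two pairs named in the text, whereas a real table would need an entry for every language.

```shell
# Sketch: derive a PO 'Language' value from a locale name.
# po_language_sketch is a hypothetical helper name.
po_language_sketch() {
    name=$1

    # Drop the '.encoding' suffix, keeping a trailing '@variant' if present.
    case $name in
        *.*@*) name="${name%%.*}@${name#*@}" ;;
        *.*)   name=${name%%.*} ;;
    esac

    # Drop a variant that is not relevant to message translation.
    name=${name%@euro}

    # Abbreviate main-dialect combinations (only the pairs from the text).
    case $name in
        de_DE) name=de ;;
        pt_PT) name=pt ;;
    esac

    printf '%s\n' "$name"
}
```

With this sketch, `po_language_sketch de_DE.UTF-8` prints `de`, while `po_language_sketch pt_BR.UTF-8` prints `pt_BR` because the country code distinguishes the dialect.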

Content-Type

Replace ‘CHARSET’ with the character encoding used for your language, in your locale, or UTF-8. This field is needed for correct operation of the msgmerge and msgfmt programs, as well as for users whose locale's character encoding differs from yours (see How to specify the output character set gettext uses).

You get the character encoding of your locale by running the shell command ‘locale charmap’. If the result is ‘C’ or ‘ANSI_X3.4-1968’, which is equivalent to ‘ASCII’ (= ‘US-ASCII’), it means that your locale is not correctly configured. In this case, ask your translation team which charset to use. ‘ASCII’ is not usable for any language except Latin.
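
The check described above could be scripted along these lines. The function name is hypothetical, and treating the literal names ‘ASCII’ and ‘US-ASCII’ the same way as the two results mentioned in the text is an assumption of this sketch.

```shell
# Sketch: decide whether a 'locale charmap' result is usable as CHARSET.
# charmap_is_usable_sketch is a hypothetical helper name.
charmap_is_usable_sketch() {
    case $1 in
        # Results that indicate an unconfigured (plain ASCII) locale.
        C|ANSI_X3.4-1968|ASCII|US-ASCII) return 1 ;;
        *) return 0 ;;
    esac
}

# Possible usage:
#   charmap_is_usable_sketch "$(locale charmap)" ||
#       echo "locale not configured; ask your translation team" >&2
```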

Because the PO files must be portable to operating systems with less advanced internationalization facilities, the character encodings that can be used are limited to those supported by both GNU libc and GNU libiconv. These are: ASCII, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-13, ISO-8859-14, ISO-8859-15, KOI8-R, KOI8-U, KOI8-T, CP850, CP866, CP874, CP932, CP949, CP950, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255, CP1256, CP1257, GB2312, EUC-JP, EUC-KR, EUC-TW, BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS, JOHAB, TIS-620, VISCII, GEORGIAN-PS, UTF-8.

In the GNU system, the following encodings are frequently used for the corresponding languages.

ISO-8859-1 for Afrikaans, Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, German, Greenlandic, Icelandic, Indonesian, Irish, Italian, Malay, Manx, Norwegian, Occitan, Portuguese, Spanish, Swedish, Tagalog, Uzbek, Walloon,
ISO-8859-2 for Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak, Slovenian,
ISO-8859-3 for Maltese,
ISO-8859-5 for Macedonian, Serbian,
ISO-8859-6 for Arabic,
ISO-8859-7 for Greek,
ISO-8859-8 for Hebrew,
ISO-8859-9 for Turkish,
ISO-8859-13 for Latvian, Lithuanian, Maori,
ISO-8859-14 for Welsh,
ISO-8859-15 for Basque, Catalan, Dutch, English, Finnish, French, Galician, German, Irish, Italian, Portuguese, Spanish, Swedish, Walloon,
KOI8-R for Russian,
KOI8-U for Ukrainian,
KOI8-T for Tajik,
CP1251 for Bulgarian, Belarusian,
GB2312, GBK, GB18030 for simplified writing of Chinese,
BIG5, BIG5-HKSCS for traditional writing of Chinese,
EUC-JP for Japanese,
EUC-KR for Korean,
TIS-620 for Thai,
GEORGIAN-PS for Georgian,
UTF-8 for any language, including those listed above.

When single quote characters or double quote characters are used in translations for your language, and your locale's encoding is one of the ISO-8859-* charsets, it is best if you create your PO files in UTF-8 encoding, instead of your locale's encoding. This is because in UTF-8 the real quote characters can be represented (single quote characters: U+2018, U+2019, double quote characters: U+201C, U+201D), whereas none of the ISO-8859-* charsets has them all. Users in UTF-8 locales will see the real quote characters, whereas users in ISO-8859-* locales will see the vertical apostrophe and the vertical double quote instead (because that's what the character set conversion will transliterate them to).

To enter such quote characters under X11, you can change your keyboard mapping using the xmodmap program. The X11 names of the quote characters are "leftsinglequotemark", "rightsinglequotemark", "leftdoublequotemark", "rightdoublequotemark", "singlelowquotemark", "doublelowquotemark".

Note that only recent versions of GNU Emacs support the UTF-8 encoding: Emacs 20 with Mule-UCS, and Emacs 21. As of January 2001, XEmacs doesn't support the UTF-8 encoding.

The character encoding name can be written in either upper or lower case. Usually upper case is preferred.
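
Replacing the ‘CHARSET’ placeholder can also be scripted. The sketch below assumes the header still contains the literal text `charset=CHARSET` from the template and that the chosen encoding name contains no characters special to sed (none of the names listed above do). The function name is hypothetical.

```shell
# Sketch: replace the CHARSET placeholder in a PO file's Content-Type field.
# set_po_charset_sketch is a hypothetical helper name.
set_po_charset_sketch() {
    file=$1      # PO file to modify, e.g. de.po
    charset=$2   # encoding name, e.g. UTF-8

    tmp=$(mktemp) || return 1
    # Write the result to a temporary file, then copy it back so that
    # the original file keeps its permissions.
    sed "s/charset=CHARSET/charset=$charset/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, `set_po_charset_sketch de.po UTF-8` turns `charset=CHARSET` into `charset=UTF-8`. It does not convert the file contents; it only declares the encoding.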

Content-Transfer-Encoding

Set this to 8bit.

Plural-Forms

This field is optional. It is only needed if the PO file has plural forms. You can find them by searching for the ‘msgid_plural’ keyword. The format of the plural forms field is described in Additional functions for plural forms and Translating plural forms.
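
The search for the ‘msgid_plural’ keyword can be done with grep. In this sketch the function name is hypothetical, and the pattern assumes the keyword starts at the beginning of a line, so obsolete entries (which are commented out) are not counted.

```shell
# Sketch: succeed if a PO file contains at least one plural entry.
# needs_plural_forms_sketch is a hypothetical helper name.
needs_plural_forms_sketch() {
    grep -q '^msgid_plural' "$1"
}

# Possible usage:
#   needs_plural_forms_sketch de.po && echo "fill in Plural-Forms"
```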

This document was generated by Bruno Haible on February, 21 2024 using texi2html 1.78a.