The GNU gettext toolset helps programmers and translators produce, update, and use translation files, mainly PO files, which are textual, editable files. This chapter explains the format of PO files.

A PO file is made up of many entries, each entry holding the relation between an original untranslated string and its corresponding translation. All entries in a given PO file usually pertain to a single project, and all translations are expressed in a single target language. One PO file entry has the following schematic structure:

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgid previous-untranslated-string
    msgid untranslated-string
    msgstr translated-string

The general structure of a PO file should be well understood by the translator. When using PO mode, very little has to be known about the format details, as PO mode takes care of them for her.

A simple entry can look like this:

    #: lib/error.c:116
    msgid "Unknown system error"
    msgstr "Error desconegut del sistema"

Entries begin with some optional white space. Usually, when generated through GNU gettext tools, there is exactly one blank line between entries. Then comments follow, on lines all starting with the character ‘#’. There are two kinds of comments: translator comments, which have some white space immediately following the ‘#’ and are created and maintained exclusively by the translator, and automatic comments, which have some non-white character just after the ‘#’ and are created and maintained automatically by GNU gettext tools. Comment lines starting with ‘#.’ contain comments given by the programmer, directed at the translator; these comments are called extracted comments because the xgettext program extracts them from the program's source code. Comment lines starting with ‘#:’ contain references to the program's source code. Comment lines starting with ‘#,’ contain flags; more about these below. Comment lines starting with ‘#|’ contain the previous untranslated string for which the translator gave a translation.

All comments, of either kind, are optional.

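The comment kinds described above can be told apart by the first two characters of the line. The following is a minimal Python sketch of that rule, not code from the gettext tools; the function name `classify_comment` and the returned labels are hypothetical and chosen only for illustration.

```python
def classify_comment(line: str) -> str:
    """Return a label for a PO comment line, based on its leading characters.

    The labels are illustrative names, not part of the PO format.
    """
    if not line.startswith("#"):
        raise ValueError("not a comment line")
    # Automatic comments with a dedicated two-character prefix.
    prefixes = {
        "#.": "extracted",
        "#:": "reference",
        "#,": "flag",
        "#|": "previous",
    }
    kind = prefixes.get(line[:2])
    if kind is not None:
        return kind
    # A bare "#" or "#" followed by white space is a translator comment.
    if len(line) == 1 or line[1].isspace():
        return "translator"
    # Any other non-white character after "#" is some other automatic comment.
    return "other-automatic"


print(classify_comment("#: lib/error.c:116"))
print(classify_comment("# checked by the translator"))
```

Treating a bare ‘#’ as a translator comment is an assumption of this sketch; the text only covers lines where something follows the ‘#’.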
References to the program's source code, in lines that start with ‘#:’, are of the form file_name:line_number or just file_name. If the file_name contains spaces, it is enclosed within Unicode characters U+2068 and U+2069.

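As an illustration of the reference syntax described above, here is a hedged Python sketch that splits a ‘#:’ line into (file_name, line_number) pairs. The function name `parse_references` is hypothetical. The sketch assumes that only the file name sits between U+2068 and U+2069, with the optional `:line_number` following the closing character, and that references are separated by white space.

```python
import re

# A token is either an isolated file name (U+2068 ... U+2069, may contain
# spaces) with an optional ":<digits>" suffix, or a run of non-space characters.
_TOKEN = re.compile(r"\u2068[^\u2069]*\u2069(?::\d+)?|\S+")


def parse_references(line: str) -> list:
    """Parse a '#:' comment line into (file_name, line_number or None) pairs."""
    if not line.startswith("#:"):
        raise ValueError("not a reference comment")
    references = []
    for token in _TOKEN.findall(line[2:]):
        name, sep, number = token.rpartition(":")
        if sep and number.isdigit():
            file_name, line_number = name, int(number)
        else:
            # No trailing ":<digits>", so the whole token is the file name.
            file_name, line_number = token, None
        # Drop the enclosing isolate characters, if present.
        references.append((file_name.strip("\u2068\u2069"), line_number))
    return references


print(parse_references("#: src/msgcmp.c:338 src/po-lex.c:699"))
```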
After white space and comments, entries show two strings: first the untranslated string as it appears in the original program sources, and then the translation of this string. The original string is introduced by the keyword msgid, and the translation by msgstr. The two strings, untranslated and translated, are quoted in various ways in the PO file, using ‘"’ delimiters and ‘\’ escapes, but the translator does not really have to pay attention to the precise quoting format, as PO mode fully takes care of quoting for her.

The msgid strings, as well as automatic comments, are produced and managed by other GNU gettext tools, and PO mode does not provide means for the translator to alter these. The most she can do is delete them, and only by deleting the whole entry. On the other hand, the msgstr string, as well as translator comments, are really meant for the translator, and PO mode gives her the full control she needs.

The comment lines beginning with ‘#,’ are special because they are not completely ignored by the programs as comments generally are. The comma-separated list of flags is used by the msgfmt program to give the user some better diagnostic messages. Currently there are two forms of flags defined:

fuzzy
This flag can be generated by the msgmerge program or it can be inserted by the translator herself. It shows that the msgstr string might not be a correct translation (anymore). Only the translator can judge if the translation requires further modification, or is acceptable as is. Once satisfied with the translation, she then removes this fuzzy attribute. The msgmerge program inserts this flag when it has combined the msgid and msgstr entries after fuzzy search only. See section Fuzzy Entries.

c-format
no-c-format
These flags should not be added by a human. Instead, only the xgettext program adds them. In an automated PO file processing system as proposed here, the user's changes would be thrown away again as soon as the xgettext program generates a new template file.

The c-format flag indicates that the untranslated string and the translation are supposed to be C format strings. The no-c-format flag indicates that they are not C format strings, even though the untranslated string happens to look like a C format string (with ‘%’ directives).

When the c-format flag is given for a string, the msgfmt program does some more tests to check the validity of the translation. See sections Invoking the msgfmt Program, Special Comments preceding Keywords, and C Format Strings.

objc-format
no-objc-format
Likewise for Objective C, see Objective C Format Strings.

c++-format
no-c++-format
Likewise for C++, see C++ Format Strings.

python-format
no-python-format
Likewise for Python, see Python Format Strings.

python-brace-format
no-python-brace-format
Likewise for Python brace, see Python Format Strings.

java-format
no-java-format
Likewise for Java MessageFormat format strings, see Java Format Strings.

java-printf-format
no-java-printf-format
Likewise for Java printf format strings, see Java Format Strings.

csharp-format
no-csharp-format
Likewise for C#, see C# Format Strings.

javascript-format
no-javascript-format
Likewise for JavaScript, see JavaScript Format Strings.

scheme-format
no-scheme-format
Likewise for Scheme, see Scheme Format Strings.

lisp-format
no-lisp-format
Likewise for Lisp, see Lisp Format Strings.

elisp-format
no-elisp-format
Likewise for Emacs Lisp, see Emacs Lisp Format Strings.

librep-format
no-librep-format
Likewise for librep, see librep Format Strings.

ruby-format
no-ruby-format
Likewise for Ruby, see Ruby Format Strings.

sh-format
no-sh-format
Likewise for Shell, see Shell Format Strings.

awk-format
no-awk-format
Likewise for awk, see awk Format Strings.

lua-format
no-lua-format
Likewise for Lua, see Lua Format Strings.

object-pascal-format
no-object-pascal-format
Likewise for Object Pascal, see Object Pascal Format Strings.

smalltalk-format
no-smalltalk-format
Likewise for Smalltalk, see Smalltalk Format Strings.

qt-format
no-qt-format
Likewise for Qt, see Qt Format Strings.

qt-plural-format
no-qt-plural-format
Likewise for Qt plural forms, see Qt Format Strings.

kde-format
no-kde-format
Likewise for KDE, see KDE Format Strings.

boost-format
no-boost-format
Likewise for Boost, see Boost Format Strings.

tcl-format
no-tcl-format
Likewise for Tcl, see Tcl Format Strings.

perl-format
no-perl-format
Likewise for Perl, see Perl Format Strings.

perl-brace-format
no-perl-brace-format
Likewise for Perl brace, see Perl Format Strings.

php-format
no-php-format
Likewise for PHP, see PHP Format Strings.

gcc-internal-format
no-gcc-internal-format
Likewise for the GCC sources, see GCC internal Format Strings.

gfc-internal-format
no-gfc-internal-format
Likewise for the GNU Fortran Compiler sources, see GFC internal Format Strings.

ycp-format
no-ycp-format
Likewise for YCP, see YCP Format Strings.

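To make the flag list above concrete, here is a small Python sketch that reads a ‘#,’ line and answers two questions: is the entry fuzzy, and what does it say about a given format language? The helper names `parse_flags` and `format_flag` are hypothetical; the sketch assumes flags are separated by commas with optional surrounding white space, and it is not how msgfmt itself is implemented.

```python
def parse_flags(line: str) -> list:
    """Split a '#,' comment line into its individual flags."""
    if not line.startswith("#,"):
        raise ValueError("not a flag comment")
    return [flag.strip() for flag in line[2:].split(",") if flag.strip()]


def format_flag(flags: list, language: str):
    """Return True for '<language>-format', False for 'no-<language>-format',
    and None when neither flag is present."""
    if f"{language}-format" in flags:
        return True
    if f"no-{language}-format" in flags:
        return False
    return None


flags = parse_flags("#, fuzzy, c-format")
print("fuzzy" in flags)            # whether the translation needs review
print(format_flag(flags, "c"))     # whether the strings are C format strings
```

Comparing whole flags rather than substrings matters here: a substring test for `c-format` would also match `objc-format`.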
It is also possible to have entries with a context specifier. They look like this:

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgctxt previous-context
    #| msgid previous-untranslated-string
    msgctxt context
    msgid untranslated-string
    msgstr translated-string

The context serves to disambiguate messages with the same untranslated-string. It is possible to have several entries with the same untranslated-string in a PO file, provided that they each have a different context. Note that an empty context string and an absent msgctxt line do not mean the same thing.

A different kind of entry is used for translations which involve plural forms.

    white-space
    #  translator-comments
    #. extracted-comments
    #: reference…
    #, flag…
    #| msgid previous-untranslated-string-singular
    #| msgid_plural previous-untranslated-string-plural
    msgid untranslated-string-singular
    msgid_plural untranslated-string-plural
    msgstr[0] translated-string-case-0
    ...
    msgstr[N] translated-string-case-n

Such an entry can look like this:

    #: src/msgcmp.c:338 src/po-lex.c:699
    #, c-format
    msgid "found %d fatal error"
    msgid_plural "found %d fatal errors"
    msgstr[0] "s'ha trobat %d error fatal"
    msgstr[1] "s'han trobat %d errors fatals"

Here also, a msgctxt context can be specified before msgid, like above.

Here, additional kinds of flags can be used:

range:
This flag is followed by a range of non-negative numbers, using the syntax range: minimum-value..maximum-value. It designates the possible values that the numeric parameter of the message can take. In some languages, translators may produce slightly better translations if they know that the value can only take on values between 0 and 10, for example.

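The range syntax above is simple enough to read with a regular expression. The following Python sketch is an illustration only; `parse_range` is a hypothetical helper, and treating the white space after `range:` as optional is an assumption of the sketch rather than something the text states.

```python
import re

# "range:" followed by minimum-value..maximum-value, both non-negative integers.
_RANGE = re.compile(r"range:\s*(\d+)\.\.(\d+)")


def parse_range(flag: str):
    """Return (minimum, maximum) for a range flag, or None for any other flag."""
    match = _RANGE.fullmatch(flag.strip())
    if match is None:
        return None
    minimum, maximum = int(match.group(1)), int(match.group(2))
    if minimum > maximum:
        raise ValueError("range minimum exceeds maximum")
    return minimum, maximum


print(parse_range("range: 0..10"))
```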
The previous-untranslated-string is optionally inserted by the msgmerge program, at the same time as it marks a message fuzzy. It helps the translator see which changes the developers made to the untranslated-string.

It happens that some lines, usually whitespace or comments, follow the very last entry of a PO file. Such lines are not part of any entry, and will be dropped when the PO file is processed by the tools, or may disturb some PO file editors.

The remainder of this section may be safely skipped by those using a PO file editor, yet it may be interesting for everybody to have a better idea of the precise format of a PO file. On the other hand, those wishing to modify PO files by hand should continue reading carefully.

An empty untranslated-string is reserved to contain the header entry with the meta information (see section Filling in the Header Entry). This header entry should be the first entry of the file. The empty untranslated-string is reserved for this purpose and must not be used anywhere else.

Each of untranslated-string and translated-string respects the C syntax for a character string, including the surrounding quotes and embedded backslashed escape sequences, except that universal character escape sequences (‘\u’ and ‘\U’) are not allowed. When the time comes to write multi-line strings, one should not use escaped newlines. Instead, a closing quote should follow the last character on the line to be continued, and an opening quote should resume the string at the beginning of the following PO file line. For example:

    msgid ""
    "Here is an example of how one might continue a very long string\n"
    "for the common case the string represents multi-line output.\n"

In this example, the empty string is used on the first line, to allow better alignment of the ‘H’ from the word ‘Here’ over the ‘f’ from the word ‘for’. In this example, the msgid keyword is followed by three strings, which are meant to be concatenated. Concatenating the empty string does not change the resulting overall string, but it is a way for us to comply with the necessity of msgid to be followed by a string on the same line, while keeping the multi-line presentation left-justified, as we find this to be a cleaner disposition. The empty string could have been omitted, but only if the string starting with ‘Here’ was promoted to the first line, right after msgid. It was not really necessary either to switch between the two last quoted strings immediately after the newline ‘\n’; the switch could have occurred after any other character. We just did it this way because it is neater.

One should carefully distinguish between end of lines marked as ‘\n’ inside quotes, which are part of the represented string, and end of lines in the PO file itself, outside string quotes, which have no effect on the represented string.

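The concatenation rule can be sketched in Python as follows. This is a simplified illustration, not the parser used by the gettext tools: the helper names `unquote` and `join_segments` are hypothetical, and only a small assumed subset of C escape sequences (`\n`, `\t`, `\\`, `\"`) is handled.

```python
# Assumed subset of C escape sequences; a complete reader would handle more.
_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def unquote(segment: str) -> str:
    """Decode one double-quoted PO string segment."""
    segment = segment.strip()
    if len(segment) < 2 or segment[0] != '"' or segment[-1] != '"':
        raise ValueError("segment must be enclosed in double quotes")
    decoded = []
    chars = iter(segment[1:-1])
    for char in chars:
        if char == "\\":
            escaped = next(chars, None)
            if escaped not in _ESCAPES:
                raise ValueError("unsupported or incomplete escape sequence")
            decoded.append(_ESCAPES[escaped])
        else:
            decoded.append(char)
    return "".join(decoded)


def join_segments(segments) -> str:
    """Concatenate the quoted segments that follow a keyword such as msgid."""
    return "".join(unquote(segment) for segment in segments)


segments = [
    '""',
    r'"Here is an example of how one might continue a very long string\n"',
    r'"for the common case the string represents multi-line output.\n"',
]
text = join_segments(segments)
print(text.count("\n"))  # newlines come only from the \n escapes
```

The line breaks between the three segments in the PO file contribute nothing to `text`; only the ‘\n’ escapes inside the quotes do.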
Outside strings, white lines and comments may be used freely. Comments start at the beginning of a line with ‘#’ and extend until the end of the PO file line. Comments written by translators should have the initial ‘#’ immediately followed by some white space. If the ‘#’ is not immediately followed by white space, this comment is most likely generated and managed by specialized GNU tools, and might disappear or be replaced unexpectedly when the PO file is given to msgmerge.

For a PO file to be valid, no two entries without msgctxt may have the same untranslated-string or untranslated-string-singular. Similarly, no two entries may have the same msgctxt and the same untranslated-string or untranslated-string-singular.

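The validity rule above amounts to a uniqueness check on the pair (context, untranslated string). A minimal Python sketch, assuming entries have already been reduced to such pairs: `find_duplicates` is a hypothetical helper, an absent msgctxt is represented as `None`, and for plural entries the singular string is used. Because `None` and the empty string are different keys, the sketch also reflects that an empty context and an absent msgctxt line are not the same thing.

```python
from collections import Counter


def find_duplicates(entries):
    """Return the (msgctxt, msgid) keys that occur more than once.

    msgctxt is None when the entry has no msgctxt line at all.
    """
    counts = Counter(entries)
    return [key for key, count in counts.items() if count > 1]


entries = [
    (None, "Open"),    # no msgctxt line
    ("", "Open"),      # empty context: a different entry
    ("menu", "Open"),  # distinct context: allowed
    (None, "Open"),    # repeats the first entry: invalid
]
print(find_duplicates(entries))
```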
This document was generated by Bruno Haible on February 21, 2024 using texi2html 1.78a.