GNU gettext utilities: 10. Producing Binary MO Files

10. Producing Binary MO Files

10.1 Invoking the `msgfmt` Program

msgfmt [option] filename.po …

The msgfmt program generates a binary message catalog from a textual
translation description.

10.1.1 Input file location

‘filename.po …’: Input .po files.

‘-D directory’
‘--directory=directory’:
Add directory to the list of directories. Source files are
searched relative to this list of directories. The resulting binary
file will be written relative to the current directory, though.

If an input file is ‘-’, standard input is read.

10.1.2 Operation mode

‘-j’
‘--java’:
Java mode: generate a Java ResourceBundle class.

‘--java2’:
Like --java, and assume Java2 (JDK 1.2 or higher).

‘--csharp’:
C# mode: generate a .NET .dll file containing a subclass of
GettextResourceSet.

‘--csharp-resources’:
C# resources mode: generate a .NET ‘.resources’ file.

‘--tcl’:
Tcl mode: generate a tcl/msgcat ‘.msg’ file.

‘--qt’:
Qt mode: generate a Qt ‘.qm’ file.

‘--desktop’:
Desktop Entry mode: generate a ‘.desktop’ file.

‘--xml’:
XML mode: generate an XML file.

10.1.3 Output file location

‘-o file’
‘--output-file=file’:
Write output to specified file.

‘--strict’:
Direct the program to work strictly following the Uniforum/Sun
implementation. Currently this only affects the naming of the output
file. If this option is not given, the name of the output file is the
same as the domain name. If the strict Uniforum mode is enabled, the
suffix ‘.mo’ is added to the file name if it is not already
present.

We find this behaviour of Sun's implementation rather silly and so by
default this mode is not selected.

If the output file is ‘-’, output is written to standard output.

10.1.4 Output file location in Java mode

‘-r resource’
‘--resource=resource’:
Specify the resource name.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory of the classes directory hierarchy.

‘--source’:
Produce a .java source file, instead of a compiled .class file.

The class name is determined by appending the locale name to the resource name,
separated with an underscore. The ‘-d’ option is mandatory. The class
is written under the specified directory.
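The naming rule above can be sketched in a few lines of Python. The class name follows the text directly. The file path is an assumption: it supposes that a dotted resource name is laid out as nested package directories under the ‘-d’ directory, as is conventional for Java class files. Both function names are illustrative and are not part of gettext.

```python
from pathlib import PurePosixPath


def java_class_name(resource, locale):
    """Resource name and locale name joined with an underscore."""
    return f"{resource}_{locale}"


def class_file_path(directory, resource, locale):
    """Assumed location of the compiled class under the base directory.

    Assumption: dots in the resource name become directory separators,
    following the usual Java package layout.
    """
    parts = java_class_name(resource, locale).split(".")
    return PurePosixPath(directory, *parts[:-1], parts[-1] + ".class")
```

For instance, `java_class_name("Messages", "de_AT")` gives `Messages_de_AT`.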

10.1.5 Output file location in C# mode

‘-r resource’
‘--resource=resource’:
Specify the resource name.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory for locale dependent ‘.dll’ files.

The ‘-l’ and ‘-d’ options are mandatory. The ‘.dll’ file is
written in a subdirectory of the specified directory whose name depends on the
locale.

10.1.6 Output file location in Tcl mode

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory of ‘.msg’ message catalogs.

The ‘-l’ and ‘-d’ options are mandatory. The ‘.msg’ file is
written in the specified directory.

10.1.7 Desktop Entry mode operations

‘--template=template’:
Specify a .desktop file used as a template.

‘-k[keywordspec]’
‘--keyword[=keywordspec]’:
Specify keywordspec as an additional keyword to be looked for.
Without a keywordspec, the option means to not use default keywords.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the directory where PO files are read. The directory must
contain the ‘LINGUAS’ file.

To generate a ‘.desktop’ file for a single locale, invoke msgfmt
as follows.

msgfmt --desktop --template=template --locale=locale \
  -o file filename.po …

msgfmt provides a special "bulk" operation mode to process multiple
‘.po’ files at a time.

msgfmt --desktop --template=template -d directory -o file

msgfmt first reads the ‘LINGUAS’ file under directory, and
then processes all ‘.po’ files listed there. You can also limit
the locales to a subset, through the ‘LINGUAS’ environment
variable.

For either operation mode, the ‘-o’ and ‘--template’
options are mandatory.
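The file selection in "bulk" mode can be pictured with a short Python sketch. It is not msgfmt's code, and it rests on assumptions the text above does not state: that the ‘LINGUAS’ file holds whitespace-separated locale names with ‘#’ starting a comment, and that each locale maps to a file named locale.po inside directory. The function name is hypothetical.

```python
import posixpath


def po_files_for_bulk_mode(directory, linguas_text, env_linguas=None):
    """List the .po paths a bulk run would visit, in LINGUAS file order."""
    locales = []
    for line in linguas_text.splitlines():
        # Assumption: '#' starts a comment; names are whitespace-separated.
        locales.extend(line.split("#", 1)[0].split())
    if env_linguas is not None:
        # The LINGUAS environment variable restricts the set of locales.
        wanted = set(env_linguas.split())
        locales = [name for name in locales if name in wanted]
    # Assumption: one catalog per locale, named <locale>.po.
    return [posixpath.join(directory, name + ".po") for name in locales]
```

Passing the content of the environment variable as `env_linguas` models the restriction to a subset; leaving it as `None` keeps every locale listed in the file.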

10.1.8 XML mode operations

‘--template=template’:
Specify an XML file used as a template.

‘-L name’
‘--language=name’:
Specify the language of the input files.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory of ‘.po’ message catalogs.

To generate an XML file for a single locale, invoke msgfmt as follows.

msgfmt --xml --template=template --locale=locale \
  -o file filename.po …

msgfmt provides a special "bulk" operation mode to process multiple
‘.po’ files at a time.

msgfmt --xml --template=template -d directory -o file

For either operation mode, the ‘-o’ and ‘--template’
options are mandatory.

10.1.9 Input file syntax

‘-P’
‘--properties-input’:
Assume the input files are Java ResourceBundles in Java .properties
syntax, not in PO file syntax.

‘--stringtable-input’:
Assume the input files are NeXTstep/GNUstep localized resource files in
.strings syntax, not in PO file syntax.

10.1.10 Input file interpretation

‘-c’
‘--check’:
Perform all the checks implied by --check-format, --check-header,
--check-domain.

‘--check-format’:
Check language dependent format strings.

If the string represents a format string used in a
printf-like function, both the original string and its translation
should have the same number of
‘%’ format specifiers, with matching types. If the flag
c-format or possible-c-format appears in the special
comment ‘#,’ for this entry, a check is performed. For example, the
check will diagnose using ‘%.*s’ against ‘%s’, or ‘%d’
against ‘%s’, or ‘%d’ against ‘%x’. It can even handle
positional parameters.

Normally the xgettext program automatically decides whether a
string is a format string or not. This algorithm is not perfect,
though. It might regard a string as a format string though it is not
used in a printf-like function, and so msgfmt might report
errors where there are none.

To solve this problem the programmer can dictate the decision to the
xgettext program (see section C Format Strings). The translator should not
consider removing the flag from the ‘#,’ line. This "fix" would be
reversed again as soon as msgmerge is called the next time.

‘--check-header’:
Verify presence and contents of the header entry. See section Filling in the Header Entry,
for a description of the various fields in the header entry.

‘--check-domain’:
Check for conflicts between domain directives and the --output-file
option.

‘-C’
‘--check-compatibility’:
Check that GNU msgfmt behaves like X/Open msgfmt. This will give an error
when attempting to use the GNU extensions.

‘--check-accelerators[=char]’:
Check presence of keyboard accelerators for menu items. This is based on
the convention used in some GUIs that a keyboard accelerator in a menu
item string is designated by an immediately preceding ‘&’ character.
Sometimes a keyboard accelerator is also called "keyboard mnemonic".
This check verifies that if the untranslated string has exactly one
‘&’ character, the translated string has exactly one ‘&’ as well.
If this option is given with a char argument, this char should
be a non-alphanumeric character and is used as keyboard accelerator mark
instead of ‘&’.

‘-f’
‘--use-fuzzy’:
Use fuzzy entries in output. Note that using this option is usually wrong,
because fuzzy messages are exactly those which have not been validated by
a human translator.
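The rule stated for --check-accelerators above is simple enough to sketch in Python. This is an illustration of the rule exactly as worded in this section, not msgfmt's implementation; in particular it makes no attempt to treat a doubled mark as an escaped literal, and the function name is hypothetical.

```python
def accelerator_ok(msgid, msgstr, mark="&"):
    """Return True if the translation passes the accelerator check.

    The check only applies when the untranslated string contains exactly
    one accelerator mark; the translation must then contain exactly one too.
    """
    if msgid.count(mark) != 1:
        return True  # nothing to verify for this message
    return msgstr.count(mark) == 1
```

A call such as `accelerator_ok("&File", "Datei")` reports a failure because the translation lost its accelerator, while a message with two marks in the original is not checked at all.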

10.1.11 Output details

‘--no-convert’:
Don't convert the messages to UTF-8 encoding. By default, messages are
converted to UTF-8 encoding before being stored in a MO file; this helps
avoid conversions at run time, since nowadays most locales use the
UTF-8 encoding.

‘--no-redundancy’:
Don't pre-expand ISO C 99 <inttypes.h> format string directive macros.
By default, messages that are marked as c-format and contain
ISO C 99 <inttypes.h> format string directive macros are pre-expanded
for selected platforms, and these redundant expansions are stored in the
MO file. These redundant expansions make the translations of these
messages work with the gettext implementation in the libc
of that platform, without requiring GNU gettext's libintl.
The platforms that benefit from this pre-expansion are those with the
musl libc.

‘-a number’
‘--alignment=number’:
Align strings to number bytes (default: 1).

‘--endianness=byteorder’:
Write out 32-bit numbers in the given byte order. The possible values are
big and little. The default is little.

MO files of any endianness can be used on any platform. When a MO file has
an endianness other than the platform's one, the 32-bit numbers from the MO
file are swapped at runtime. The performance impact is negligible.

This option can be useful to produce MO files that are optimized for one
platform.

‘--no-hash’:
Don't include a hash table in the binary file. Lookup will be more expensive
at run time (binary search instead of hash table lookup).

10.1.12 Informative output

‘-h’
‘--help’:
Display this help and exit.

‘-V’
‘--version’:
Output version information and exit.

‘--statistics’:
Print statistics about translations. When the option --verbose is used
in combination with --statistics, the input file name is printed in
front of the statistics line.

‘-v’
‘--verbose’:
Increase verbosity level.

10.2 Invoking the `msgunfmt` Program

msgunfmt [option] [file]...

The msgunfmt program converts a binary message catalog to a
Uniforum style .po file.

10.2.1 Operation mode

‘-j’
‘--java’:
Java mode: input is a Java ResourceBundle class.

‘--csharp’:
C# mode: input is a .NET .dll file containing a subclass of
GettextResourceSet.

‘--csharp-resources’:
C# resources mode: input is a .NET ‘.resources’ file.

‘--tcl’:
Tcl mode: input is a tcl/msgcat ‘.msg’ file.

10.2.2 Input file location

‘file …’: Input .mo files.

If no input file is given or if it is ‘-’, standard input is read.

10.2.3 Input file location in Java mode

‘-r resource’
‘--resource=resource’:
Specify the resource name.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

The class name is determined by appending the locale name to the resource name,
separated with an underscore. The class is located using the CLASSPATH.

10.2.4 Input file location in C# mode

‘-r resource’
‘--resource=resource’:
Specify the resource name.

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory for locale dependent ‘.dll’ files.

The ‘-l’ and ‘-d’ options are mandatory. The ‘.dll’ file is
located in a subdirectory of the specified directory whose name depends on the
locale.

10.2.5 Input file location in Tcl mode

‘-l locale’
‘--locale=locale’:
Specify the locale name, either a language specification of the form ll
or a combined language and country specification of the form ll_CC.

‘-d directory’:
Specify the base directory of ‘.msg’ message catalogs.

The ‘-l’ and ‘-d’ options are mandatory. The ‘.msg’ file is
located in the specified directory.

10.2.6 Output file location

‘-o file’
‘--output-file=file’:
Write output to specified file.

The results are written to standard output if no output file is specified
or if it is ‘-’.

10.2.7 Output details

‘--color’
‘--color=when’:
Specify whether or when to use colors and other text attributes.
See section The --color option for details.

‘--style=style_file’:
Specify the CSS style rule file to use for --color.
See section The --style option for details.

‘--force-po’:
Always write an output file even if it contains no message.

‘-i’
‘--indent’:
Write the .po file using indented style.

‘--strict’:
Write out a strict Uniforum conforming PO file. Note that this
Uniforum format should be avoided because it doesn't support the
GNU extensions.

‘-p’
‘--properties-output’:
Write out a Java ResourceBundle in Java .properties syntax. Note
that this file format doesn't support plural forms and silently drops
obsolete messages.

‘--stringtable-output’:
Write out a NeXTstep/GNUstep localized resource file in .strings syntax.
Note that this file format doesn't support plural forms.

‘-w number’
‘--width=number’:
Set the output page width. Long strings in the output files will be
split across multiple lines in order to ensure that each line's width
(= number of screen columns) is less than or equal to the given number.

‘--no-wrap’:
Do not break long message lines. Message lines whose width exceeds the
output page width will not be split into several lines. Only file reference
lines which are wider than the output page width will be split.

‘-s’
‘--sort-output’:
Generate sorted output. Note that using this option makes it much harder
for the translator to understand each message's context.

10.2.8 Informative output

‘-h’
‘--help’:
Display this help and exit.

‘-V’
‘--version’:
Output version information and exit.

‘-v’
‘--verbose’:
Increase verbosity level.

10.3 The Format of GNU MO Files

The format of the generated MO files is best described by a picture,
which appears below.

The first two words serve to identify the file. The magic
number will always signal GNU MO files. The number is stored in the
byte order used when the MO file was generated, so the magic number
really is two numbers: 0x950412de and 0xde120495.
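A reader can use the two forms of the magic number to discover the byte order of a file. The following Python sketch illustrates the idea; the function name is hypothetical, and the returned ‘<’ and ‘>’ characters are the byte-order prefixes of Python's struct module.

```python
import struct

MAGIC = 0x950412DE
MAGIC_SWAPPED = 0xDE120495


def mo_byte_order(data):
    """Return '<' (little-endian) or '>' (big-endian) for an MO file image."""
    # Interpret the first word as little-endian and see which form appears.
    (value,) = struct.unpack_from("<I", data, 0)
    if value == MAGIC:
        return "<"
    if value == MAGIC_SWAPPED:
        return ">"
    raise ValueError("not a GNU MO file")
```

Every other 32-bit number in the file is then read with the returned prefix.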

The second word describes the current revision of the file format,
composed of a major and a minor revision number. The revision numbers
ensure that the readers of MO files can distinguish new formats from
old ones and handle their contents, as far as possible. For now the
major revision is 0 or 1, and the minor revision is also 0 or 1. More
revisions might be added in the future. A program seeing an unexpected
major revision number should stop reading the MO file entirely; whereas
an unexpected minor revision number means that the file can be read but
will not reveal its full contents, when parsed by a program that
supports only smaller minor revision numbers.
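The text above does not say how the two revision numbers share the 32-bit word. The sketch below assumes that the major revision occupies the upper 16 bits and the minor revision the lower 16 bits, which is how Python's standard gettext module splits the word when it reads MO files. The function names are illustrative.

```python
def split_revision(revision):
    """Split the revision word into (major, minor).

    Assumption: major is the upper 16 bits, minor the lower 16 bits.
    """
    return revision >> 16, revision & 0xFFFF


def should_read(revision, known_majors=(0, 1)):
    """Apply the rule above: stop on an unexpected major revision only."""
    major, _minor = split_revision(revision)
    return major in known_majors
```

An unexpected minor revision leaves `should_read` true, matching the rule that such a file can still be read even if it does not reveal its full contents.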

The version is kept
separate from the magic number, instead of using different magic
numbers for different formats, mainly because ‘/etc/magic’ is
not updated often.

Next come a number of pointers to later tables in the file, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them. This might become useful for later
inserting a few flag bits, an indication of the charset used, new
tables, or other things.

Then, at offset O and offset T in the picture, two tables
of string descriptors can be found. In both tables, each string
descriptor uses two 32-bit integers, one for the string length,
another for the offset of the string in the MO file, counting in bytes
from the start of the file. The first table contains descriptors
for the original strings, and is sorted so the original strings
are in increasing lexicographical order. The second table contains
descriptors for the translated strings, and is parallel to the first
table: to find the corresponding translation one has to access the
array slot in the second array with the same index.
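To make the layout concrete, here is a minimal Python sketch that writes and reads the header and the two descriptor tables. It is a simplification under stated assumptions: little-endian byte order only, revision 0, no hash table (S = 0), and strings stored right after the tables. The names `build_mo` and `read_mo` are hypothetical and are not part of gettext.

```python
import struct

MAGIC = 0x950412DE
HEADER = struct.Struct("<7I")  # magic, revision, N, O, T, S, H


def build_mo(catalog):
    """Serialise {original: translation} (bytes to bytes) as an MO image."""
    originals = sorted(catalog)            # table O must be sorted
    n = len(originals)
    o = HEADER.size                        # descriptors follow the header
    t = o + 8 * n                          # 8 bytes per descriptor
    pool_start = t + 8 * n                 # strings follow both tables
    pool = bytearray()
    descriptors = []
    for text in originals + [catalog[key] for key in originals]:
        descriptors.append(struct.pack("<2I", len(text), pool_start + len(pool)))
        pool += text + b"\0"               # the NUL is not counted in the length
    header = HEADER.pack(MAGIC, 0, n, o, t, 0, pool_start)
    return header + b"".join(descriptors) + bytes(pool)


def read_mo(data):
    """Inverse of build_mo: walk both tables with the same index."""
    magic, _revision, n, o, t, _s, _h = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not a little-endian GNU MO file")
    result = {}
    for i in range(n):
        o_len, o_off = struct.unpack_from("<2I", data, o + 8 * i)
        t_len, t_off = struct.unpack_from("<2I", data, t + 8 * i)
        result[data[o_off:o_off + o_len]] = data[t_off:t_off + t_len]
    return result
```

Sorting Python bytes objects compares them byte by byte, which gives the increasing lexicographical order the first table requires.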

Having the original strings sorted enables the use of simple binary
search, for when the MO file does not contain a hash table, or
for when it is not practical to use the hash table provided in
the MO file. This also has another advantage: in GNU gettext, the
empty string in a PO file is usually translated into
some system information attached to that particular MO file, and the
empty string necessarily becomes the first in both the original and
translated tables, making the system information very easy to find.
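The binary search over the sorted table of original strings can be sketched as follows. This is an illustration, not the lookup code of GNU gettext: it assumes a little-endian file, ignores the hash table, and compares whole table entries, so it does not handle entries with plural forms. The function name is hypothetical.

```python
import struct


def lookup(data, msgid):
    """Find the translation of msgid (bytes) in an MO image, or None."""
    # N, O and T sit at byte offsets 8, 12 and 16 of the header.
    n, o, t = struct.unpack_from("<3I", data, 8)

    def string_at(table, index):
        length, offset = struct.unpack_from("<2I", data, table + 8 * index)
        return data[offset:offset + length]

    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = string_at(o, mid)
        if candidate == msgid:
            return string_at(t, mid)  # same index in the parallel table
        if candidate < msgid:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Looking up the empty string, `lookup(data, b"")`, returns the system information stored in the first slot.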

The size S of the hash table can be zero. In this case, the
hash table itself is not contained in the MO file. Some people might
prefer this because a precomputed hash table takes disk space, and
does not gain that much speed. The hash table contains indices
to the sorted array of strings in the MO file. Conflict resolution is
done by double hashing. The precise hashing algorithm used is fairly
dependent on GNU gettext code, and is not documented here.

As for the strings themselves, they follow the hash table, and each
is terminated with a <NUL>, and this <NUL> is not counted in
the length which appears in the string descriptor. The msgfmt
program has an option selecting the alignment for MO file strings.
With this option, each string is separately aligned so it starts at
an offset which is a multiple of the alignment value. On some RISC
machines, a correct alignment will speed things up.
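The effect of the alignment value on string offsets can be illustrated with a small Python sketch. It is not msgfmt's code; the use of NUL bytes as padding is an assumption, and the function names are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (alignment >= 1)."""
    return -(-offset // alignment) * alignment


def place_strings(strings, start, alignment=1):
    """Lay out NUL-terminated strings from file offset `start`.

    Returns (descriptors, pool), where each descriptor is (length, offset)
    and the length does not count the terminating NUL.
    """
    pool = bytearray()
    descriptors = []
    for text in strings:
        offset = align_up(start + len(pool), alignment)
        padding = offset - (start + len(pool))
        pool += b"\0" * padding            # assumption: padding bytes are NUL
        descriptors.append((len(text), offset))
        pool += text + b"\0"
    return descriptors, bytes(pool)
```

With an alignment of 4 and a start offset of 28, the strings ‘ab’ and ‘cde’ land at offsets 28 and 32; with the default alignment of 1 the second string starts at offset 31.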

Contexts are stored by storing the concatenation of the context, a
<EOT> byte, and the original string, instead of the original string.
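In Python terms, the key stored in the table of original strings can be built and taken apart as sketched below. <EOT> is the ASCII control character with code 4; the function names are illustrative.

```python
EOT = b"\x04"


def make_key(msgid, msgctxt=None):
    """Key under which a message is stored in the original-strings table."""
    if msgctxt is None:
        return msgid
    return msgctxt + EOT + msgid


def split_key(key):
    """Return (context, original string); context is None if there is none."""
    context, separator, msgid = key.partition(EOT)
    if not separator:
        return None, key
    return context, msgid
```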

Plural forms are stored by letting the plural of the original string
follow the singular of the original string, separated through a
<NUL> byte. The length which appears in the string descriptor
includes both. However, only the singular of the original string
takes part in the hash table lookup. The plural variants of the
translation are all stored consecutively, separated through a
<NUL> byte. Here also, the length in the string descriptor
includes all of them.
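Given the byte strings selected by a pair of descriptors, the plural layout can be unpacked as in this sketch. The function name is hypothetical, and the inputs are assumed to be the slices delimited by the descriptor lengths, that is, without the final terminating <NUL>.

```python
def split_plural_entry(original, translation):
    """Split a plural entry into (singular, plural, translated forms)."""
    singular, _nul, plural = original.partition(b"\0")
    forms = translation.split(b"\0")  # one element per plural variant
    return singular, plural, forms
```

The number of elements in `forms` depends on the language: it is however many plural variants the translation stores.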

The character encoding of the strings can be any standard ASCII-compatible
encoding, such as UTF-8, ISO-8859-1, EUC-JP, etc., as long as the
encoding's name is stated in the header entry (see section Filling in the Header Entry).
Starting with GNU gettext version 0.22, the MO files produced by
msgfmt have them in UTF-8 encoding, unless the msgfmt
option ‘--no-convert’ is used.

Nothing prevents a MO file from having embedded <NUL>s in strings.
However, the program interface currently used already presumes
that strings are <NUL> terminated, so embedded <NUL>s are
somewhat useless. But the MO file format is general enough so other
interfaces would be possible later, if for example, we ever want to
implement wide characters right in MO files, where <NUL> bytes may
accidentally appear. (No, we don't want to have wide characters in MO
files. They would make the file unnecessarily large, and the
‘wchar_t’ type being platform dependent, MO files would be
platform dependent as well.)

This particular issue has been strongly debated in the GNU
gettext development forum, and it is to be expected that the MO file
format will evolve or change over time. It is even possible that many
formats may later be supported concurrently. But surely, we have to
start somewhere, and the MO file format described here is a good start.
Nothing is cast in concrete, and the format may later evolve fairly
easily, so we should feel comfortable with the current approach.

        byte
             +------------------------------------------+
          0  | magic number = 0x950412de                |
             |                                          |
          4  | file format revision = 0                 |
             |                                          |
          8  | number of strings                        |  == N
             |                                          |
         12  | offset of table with original strings    |  == O
             |                                          |
         16  | offset of table with translation strings |  == T
             |                                          |
         20  | size of hashing table                    |  == S
             |                                          |
         24  | offset of hashing table                  |  == H
             |                                          |
             .                                          .
             .    (possibly more entries later)         .
             .                                          .
             |                                          |
          O  | length & offset 0th string  ----------------.
      O + 8  | length & offset 1st string  ------------------.
              ...                                    ...   | |
O + ((N-1)*8)| length & offset (N-1)th string           |  | |
             |                                          |  | |
          T  | length & offset 0th translation  ---------------.
      T + 8  | length & offset 1st translation  -----------------.
              ...                                    ...   | | | |
T + ((N-1)*8)| length & offset (N-1)th translation      |  | | | |
             |                                          |  | | | |
          H  | start hash table                         |  | | | |
              ...                                    ...   | | | |
  H + S * 4  | end hash table                           |  | | | |
             |                                          |  | | | |
             | NUL terminated 0th string  <----------------' | | |
             |                                          |    | | |
             | NUL terminated 1st string  <------------------' | |
             |                                          |      | |
              ...                                    ...       | |
             |                                          |      | |
             | NUL terminated 0th translation  <---------------' |
             |                                          |        |
             | NUL terminated 1st translation  <-----------------'
             |                                          |
              ...                                    ...
             |                                          |
             +------------------------------------------+

This document was generated by Bruno Haible on February, 21 2024 using texi2html 1.78a.