
12. The Translator's View

12.1 Organization

For some software packages, each translator works on her own and communicates directly with the developers of the package. For some other software packages, on the other hand, translators are organized into translation projects and translation teams.

A translation project applies to a group of software packages and shares procedures and methodologies regarding the translation.

There are currently three major translation projects:

A translation team is a group of translators for a single language, in the scope of a translation project.

12.2 Responsibilities in the Translation Project

The following rules and habits apply to the Translation Project.

The translator's responsibilities are:

The Translation Project has a coordinator. He can be reached at ‘coordinator@translationproject.org’. His responsibilities are:

The responsibilities of the package maintainers are:

12.3 Language dialects

For many languages, a translation into the main dialect is intelligible to all speakers of the language. Speakers of another dialect can have a separate translation if they so wish. In fact, since the fallback mechanism implemented in GNU libc and GNU libintl applies on a per-message basis, the message catalog for the dialect needs to contain only the translations that differ from those in the main dialect.

For example, French speakers in Canada (that is, users in the locale fr_CA) can use and do accept translations produced by French speakers in France (typical file name: fr.po). Nevertheless, the translation system with PO files enables them to produce special message catalogs (file name: fr_CA.po) that will take priority over fr.po for users in that locale. Similarly for users in Austria, where message catalogs de_AT.po take priority over the catalogs named de.po that reflect German as spoken in Germany.
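The priority order described above can be pictured with a small sketch. The helper name catalog_candidates is hypothetical and is not part of gettext; it only derives the PO file names from a locale name, most specific first. The real lookup happens per message in the installed catalogs and considers more variants than shown here.

```shell
# Hypothetical helper (not part of gettext): print the PO file names that
# correspond to a locale, from the dialect-specific one to the main one.
catalog_candidates() {
  cc_locale=${1%%[.@]*}       # drop codeset and modifier: fr_CA.UTF-8 -> fr_CA
  cc_lang=${cc_locale%%_*}    # drop the territory:        fr_CA -> fr
  if [ "$cc_locale" != "$cc_lang" ]; then
    printf '%s.po\n' "$cc_locale"
  fi
  printf '%s.po\n' "$cc_lang"
}

catalog_candidates fr_CA.UTF-8   # prints fr_CA.po, then fr.po
catalog_candidates de_AT         # prints de_AT.po, then de.po
```

For a Chinese locale such as zh_TW, the second name printed would be zh.po, which is exactly the file that should not exist; there the territory-specific catalog has to be complete.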

The situation is different for Chinese, though: Since users in the People's Republic of China and in Singapore want translations with Simplified Chinese characters, whereas Chinese users in other territories (such as Taiwan, Hong Kong, and Macao) want translations with Traditional Chinese characters, no translator should ever submit a file named zh.po. Instead, there will typically be two separate translation teams: a team that produces translations with Simplified Chinese characters (file name zh_CN.po) and a team that produces translations with Traditional Chinese characters (file name zh_TW.po).

12.4 Translating plural forms

Suppose you are translating a PO file, and it contains an entry like this:

 
#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] ""
msgstr[1] ""

What does this mean? How do you fill it in?

Such an entry denotes a message with plural forms, that is, a message where the text depends on a cardinal number. The general form of the message, in English, is the msgid_plural line. The msgid line is the English singular form, that is, the form for when the number is equal to 1. More details about plural forms are explained in Additional functions for plural forms.

The first thing you need to look at is the Plural-Forms line in the header entry of the PO file. It contains the number of plural forms and a formula. If the PO file does not yet have such a line, you have to add it. Its contents depend only on the language into which you are translating. You can get this information by using the msginit command (see Creating a New PO File), which contains a database of known plural formulas, or by asking other members of your translation team.

Suppose the line looks as follows:

 
"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"
"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"

It's logically one line; recall that the PO file formatting is allowed to break long lines so that each physical line fits in 80 monospaced columns.
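To see the logical line in one piece, the quoted fragments of the header entry can be joined again. The following is a minimal sketch; the function name plural_forms_line is hypothetical, and it assumes that the header entry is the first entry of the file and ends at the first blank line.

```shell
# Hypothetical helper: print the logical Plural-Forms line of a PO file.
plural_forms_line() {
  awk '
    /^$/ { exit }                          # header entry ends at first blank line
    /^"/ { s = $0
           sub(/^"/, "", s); sub(/"$/, "", s)
           buf = buf s }                   # join the quoted fragments
    END  { n = split(buf, field, /\\n/)    # fields end with a literal backslash-n
           for (i = 1; i <= n; i++)
             if (field[i] ~ /^Plural-Forms:/) print field[i] }
  ' "$1"
}

# Demonstration on a throwaway file containing the header shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Plural-Forms: nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"
"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;\n"

msgid "One file removed"
msgstr ""
EOF
plural_forms_line "$demo"
rm -f "$demo"
```

The output is the Plural-Forms field on a single line, with the break inside "n%10<=4" removed.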

The value of nplurals here tells you that there are three plural forms. The first thing you need to do is to ensure that the entry contains an msgstr line for each of the forms:

 
#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] ""
msgstr[1] ""
msgstr[2] ""

Then translate the msgid_plural line and copy the translation into each msgstr line:

 
#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] "%d datoteka uklonjenih"
msgstr[1] "%d datoteka uklonjenih"
msgstr[2] "%d datoteka uklonjenih"

Now you can refine the translation so that it matches the plural form. According to the formula above, msgstr[0] is used when the number ends in 1 but does not end in 11; msgstr[1] is used when the number ends in 2, 3, 4, but not in 12, 13, 14; and msgstr[2] is used in all other cases. With this knowledge, you can refine the translations:

 
#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] "%d datoteka je uklonjena"
msgstr[1] "%d datoteke uklonjenih"
msgstr[2] "%d datoteka uklonjenih"

You may have noticed that in the English singular form (msgid) the number placeholder could be omitted and replaced by the numeral word “one”. Can you do this in your translation as well?

 
msgstr[0] "jednom datotekom je uklonjen"

Well, it depends on whether msgstr[0] applies only to the number 1, or to other numbers as well. If, according to the plural formula, msgstr[0] applies only to n == 1, then you can use the specialized translation without the number placeholder. In our case, however, msgstr[0] also applies to the numbers 21, 31, 41, etc., and therefore you cannot omit the placeholder.
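If you are unsure which numbers a given msgstr index covers, you can evaluate the formula for a few sample values. The sketch below does this with shell arithmetic, which follows the same operator precedence as the C expression in the Plural-Forms line; the function name plural_index is hypothetical, and the formula is the three-form example used in this section.

```shell
# Hypothetical helper: print the msgstr index that the example formula
# (nplurals=3) selects for the number given as first argument.
plural_index() {
  n=$1
  echo $(( n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2 ))
}

for i in 1 2 5 11 12 21 22 25 101 111; do
  printf '%s -> msgstr[%s]\n' "$i" "$(plural_index "$i")"
done
```

With this formula, 1, 21 and 101 select msgstr[0]; 2 and 22 select msgstr[1]; 5, 11, 12, 25 and 111 select msgstr[2]. Since msgstr[0] is chosen for more numbers than just 1, its translation must keep the placeholder.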

12.5 Prioritizing messages: How to determine which messages to translate first

A translator sometimes has only a limited amount of time per week to spend on a package, and some packages have quite large message catalogs (over 1000 messages). Therefore she wishes to translate first the messages that are most visible to the user, or that occur most frequently. This section describes how to determine these "most urgent" messages. It also applies to determining the "next most urgent" messages after the message catalog has already been partially translated.

In a first step, she uses the programs as a user would. While she does this, the GNU gettext library logs into a file the messages for which the program requested a translation but which are not yet translated.

In a second step, she uses the PO mode to translate precisely this set of messages.

Here are more details. The GNU libintl library (but not the corresponding functions in GNU libc) supports an environment variable GETTEXT_LOG_UNTRANSLATED, whose value is the name of a log file. The GNU libintl library will log into this file the messages for which gettext() and related functions couldn't find the translation. If the file doesn't exist, it will be created as needed. On systems with GNU libc, a shared library ‘preloadable_libintl.so’ is provided that can be used with the ELF ‘LD_PRELOAD’ mechanism.

So, in the first step, the translator uses these commands on systems with GNU libc:

 
$ LD_PRELOAD=/usr/local/lib/preloadable_libintl.so
$ export LD_PRELOAD
$ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
$ export GETTEXT_LOG_UNTRANSLATED

and these commands on other systems:

 
$ GETTEXT_LOG_UNTRANSLATED=$HOME/gettextlogused
$ export GETTEXT_LOG_UNTRANSLATED

Then she uses and peruses the programs. (It is a good and recommended practice to use the programs for which you provide translations: it gives you the needed context.) When done, she removes the environment variables:

 
$ unset LD_PRELOAD
$ unset GETTEXT_LOG_UNTRANSLATED
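As an alternative to exporting the variables and unsetting them afterwards, the variable can be set for a single program invocation only. This is a sketch, not something the gettext package provides: the wrapper name run_logged is hypothetical, and the log file name is the one used above.

```shell
# Hypothetical wrapper: run one command with message logging enabled,
# without changing the environment of the current shell.
run_logged() {
  env GETTEXT_LOG_UNTRANSLATED="$HOME/gettextlogused" "$@"
}

# Example: the variable is visible inside the command only.
run_logged sh -c 'echo "logging to $GETTEXT_LOG_UNTRANSLATED"'
```

On systems with GNU libc, LD_PRELOAD can be passed through env in the same way, next to GETTEXT_LOG_UNTRANSLATED.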

The second step starts with removing duplicates:

 
$ msguniq $HOME/gettextlogused > missing.po

The result is a PO file, but it needs some preprocessing before a PO file editor can be used with it. First, it is a multi-domain PO file, containing messages from many translation domains. Second, it lacks all translator comments and source references. Here is how to get a list of the affected translation domains:

 
$ sed -n -e 's,^domain "\(.*\)"$,\1,p' < missing.po | sort | uniq
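To illustrate what this command extracts, here is the same pipeline wrapped in a function and applied to a tiny multi-domain file. The function name list_domains and the sample messages are made up for illustration; a real file would contain whatever the programs you ran asked for.

```shell
# Hypothetical helper: list the translation domains named in a
# multi-domain PO file, one per line, without duplicates.
list_domains() {
  sed -n -e 's,^domain "\(.*\)"$,\1,p' < "$1" | sort | uniq
}

# Made-up sample input with two domains.
sample=$(mktemp)
cat > "$sample" <<'EOF'
domain "tar"

msgid "Too many errors, quitting"
msgstr ""

domain "coreutils"

msgid "missing operand"
msgstr ""
EOF
list_domains "$sample"    # prints coreutils, then tar
rm -f "$sample"
```

Only the lines that consist of a domain directive are printed; msgid and msgstr lines never match the pattern because it is anchored at the start of the line.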

Then the translator can handle the domains one by one. For simplicity, let's use environment variables to denote the language, domain and source package.

 
$ lang=nl             # your language
$ domain=coreutils    # the name of the domain to be handled
$ package=/usr/src/gnu/coreutils-4.5.4   # the package where it comes from

She takes the latest copy of ‘$lang.po’ from the Translation Project, or from the package (in most cases, ‘$package/po/$lang.po’), or creates a fresh one if she's the first translator (see Creating a New PO File). The commands below assume that this copy is saved as ‘$domain.$lang.po’. She then uses the following commands to mark the non-urgent messages as "obsolete". (This doesn't mean that these messages, translated and untranslated ones, will go away. It simply means that the PO file editor will ignore them in the following editing session.)

 
$ msggrep --domain=$domain missing.po | grep -v '^domain' \
  > $domain-missing.po
$ msgattrib --set-obsolete --ignore-file $domain-missing.po $domain.$lang.po \
  > $domain.$lang-urgent.po

Then she translates ‘$domain.$lang-urgent.po’ using a PO file editor (see section Editing PO Files). (FIXME: I don't know whether Lokalize and gtranslator also preserve obsolete messages, as they should.) Finally she restores the non-urgent messages (with their earlier translations, for those which were already translated) through this command:

 
$ msgmerge --no-fuzzy-matching $domain.$lang-urgent.po $package/po/$domain.pot \
  > $domain.$lang.po

Then she can submit ‘$domain.$lang.po’ and proceed to the next domain.


This document was generated by Bruno Haible on December, 31 2024 using texi2html 1.78a.