Previous Next . Contents |
About . Documentation . License . Download |
gettext()
approach popular for programs written in ANSI C. However, in contrast to the
gettext() approach, the files are translated as a whole and once
for each target language, so you'll get a set of translated files from a
single source file.
The idea of gettext() is to mark all translatable strings in a
program using a special macro named _ (a single underscore).
A simple file scanner can extract all these messages and create a message
file. The message file is then translated to various languages, and the
_ macro calls a function named gettext(), which uses
these translation files to translate the messages on the fly. (This is a bit
oversimplified, but that's the basic idea.)
An input file for P18 uses a syntax similar to the invocation of the
_ macro to mark all translatable messages. Here's an example of
a p18ized HTML file:
## define LANGUAGE en
<html>
<head><title>_(Welcome)_</title></head>
<body>
<h3>_(Welcome)_</h3>
_(Blah...)_
</body>
</html>
The first line sets the variable LANGUAGE to "en", which indicates that the messages in the file are written in English. Each _( / )_ pair marks a translatable message.
So far, this works well for static messages, but we'll run into problems if we want to translate a dynamic message string, e.g. a message string generated by a PHP script. To solve this, P18 provides a way of passing parameters to message strings. Inside the message string, these parameters may be referenced as "$n", where n is the position of the parameter. Example:
<h3>_(Pages $1 to $2){<?=$list_start?>}{<?=$list_end?>}_</h3>
With a German translation, this might expand to:
<h3>Seiten <?=$list_start?> bis <?=$list_end?></h3>
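The positional substitution described above can be sketched in a few lines of Python. This is only an illustration of the "$n" idea, not P18's actual implementation; the function name substitute_params and the choice to leave unmatched references untouched are assumptions made for the example.

```python
import re


def substitute_params(message: str, params: list) -> str:
    """Replace each $n in message with the n-th parameter (1-based)."""

    def replace(match):
        index = int(match.group(1))
        # Assumption: a reference with no matching parameter is left as-is.
        if 1 <= index <= len(params):
            return params[index - 1]
        return match.group(0)

    return re.sub(r"\$(\d+)", replace, message)


# The translated message text plus the two parameters from the example.
translated = "Seiten $1 bis $2"
print(substitute_params(translated, ["<?=$list_start?>", "<?=$list_end?>"]))
```

Because the parameters are inserted by position, a translation is free to reorder them, which matters for languages with a different word order.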
language-code:localization-1:localization-2...
The list of localizations may be empty. Whenever a translation of a message to a language specified by a language identifier id is requested, P18 starts by looking for a translation matching the entire identifier. If no translation is found, P18 strips off localizations from the right of the identifier, one at a time, until either a translation is found or the entire identifier has been stripped away.
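The fallback order described above can be sketched as follows. This is a rough model under stated assumptions, not P18's code: the function names, the dictionary used as a translation store, and the sample identifier "de:AT:formal" are all hypothetical.

```python
def candidate_identifiers(identifier: str) -> list:
    """List the identifiers to try, most specific first."""
    parts = identifier.split(":")
    # Drop one localization from the right on each step.
    return [":".join(parts[:n]) for n in range(len(parts), 0, -1)]


def lookup(translations: dict, identifier: str):
    """Return the first translation found along the fallback chain, or None."""
    for candidate in candidate_identifiers(identifier):
        if candidate in translations:
            return translations[candidate]
    return None


print(candidate_identifiers("de:AT:formal"))
print(lookup({"de": "Oben"}, "de:AT:formal"))
```

Here a request for "de:AT:formal" tries "de:AT:formal", then "de:AT", then "de", so a generic German translation serves every more specific variant that has no translation of its own.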
P18 itself does not care how you build language identifiers. There are several conventions for encoding languages as language identifiers; the choice is up to you.
A message type is a sequence of alphanumeric ASCII characters, underscores and hyphens. You can use arbitrary identifiers for message types. However, some message types are recognized by P18 and treated in a special way. It is guaranteed that message types starting with an underscore will not be recognized by P18, so you may want to prepend your own message types with an underscore to avoid a collision with a recognized message type. See section Recognized Message Types for a list of recognized message types.
There are two ways of specifying the message type. The first is by setting the P18 variable TYPE, the other is by using a message type option in the message escape (see section P18 Syntax).
A message variant is specified by appending the additional context information to the message text, separated by two percent signs. So you might have the English message "Top", which translates to "Oben", and the message "Top%%of-page", which translates to "Zum Seitenanfang".
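As a small illustration of the variant convention, the sketch below splits a message into its text and its optional context at the first "%%". The helper name split_variant and the dictionary are assumptions for the example; P18's internal representation is not described here.

```python
from typing import Optional, Tuple


def split_variant(message: str) -> Tuple[str, Optional[str]]:
    """Split 'text%%context' into (text, context); context is None if absent."""
    text, separator, context = message.partition("%%")
    return (text, context if separator else None)


# The two variants are distinct keys, so each gets its own translation.
translations = {
    "Top": "Oben",
    "Top%%of-page": "Zum Seitenanfang",
}

print(split_variant("Top%%of-page"))
print(translations["Top"], "/", translations["Top%%of-page"])
```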
In contrast to most internationalization schemes, P18 makes no distinction between source and target languages. Messages with different language identifiers are combined into a message set, and all languages present in the message set may be translated in all directions. This means that if you have a source file written in language A, you can change it to contain messages in language B without changing the translation database.
Typically, every message set is associated with a unique message set identifier. However, it is possible to create message sets without a message set identifier. Message set identifiers are useful when a translation file is generated (i.e. a file containing a translation template from one language to another). When the filled-in translation file is fed back into the translation database, the message set IDs may be used to identify the message sets for the translated messages, even if there have been minor modifications to the original messages (e.g. a fixed typo).
It is possible to change a message in the message database without changing the message set identifier. However, one has to be careful when doing this, since importing a translation file generated earlier may lead to false translations. As a general rule, one should create a new message set (with a new message set identifier) whenever the meaning of the message changes. On the other hand, if the modification is just a fixed typo, it is safe to retain the existing message set identifier.
When translating a set of files, the typical approach is to create an initial message database by scanning the source files. The resulting translation database then contains a message set for every message found, with every message set containing only that single message. In most cases all messages found will be in the same language, but it is also possible to have source files written in different languages. The next step is to create translation files for all supported languages, and have these translation files filled in by a translator. The filled-in translation files are then fed back into the translation database.
It is likely that the set of messages changes over time. However, the modifications will only be done on the source files, in the respective source language. Before a release is made, P18 can be used to scan for new messages and insert them into the translation database. The maintainer will then again generate translation files for all supported languages, this time only listing the messages that don't have a translation.
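The step of listing only the messages that still lack a translation can be pictured with a simple model. The sketch below assumes a message set is a dictionary from language identifier to message text; that representation and the function name missing_translations are hypothetical and say nothing about how P18 stores its database.

```python
def missing_translations(message_sets: list, target: str) -> list:
    """Return the message sets that have no message in the target language."""
    return [message_set for message_set in message_sets if target not in message_set]


# A toy database: the second set was added after the last translation round.
database = [
    {"en": "Top", "de": "Oben"},
    {"en": "Welcome"},
]

for message_set in missing_translations(database, "de"):
    print(message_set)
```

Since a message set has no designated source language, the same check works for any target language, including the one the source files happen to be written in.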
As you can see, the translation files are used for communication only. Translation files can (and should) be discarded as soon as they have been fed into the translation database.
The exact syntax of translation files is described in section Translation Files.
The expansion text of an expanded macro is parsed as if it were read from an input file. It is possible to use any preprocessor directive in the expanded text.
The syntax for expanding variables and macros imitates the variable substitution syntax used by the "config.status" script generated by a GNU autoconf style "configure" script, i.e. the name of the variable is enclosed in at signs (@). An example of expanding variables:
_(This is $1, version $2)[en/LATIN-1]{@PACKAGE@}{@VERSION@}_
Expanding a macro is similar: the macro parameters are passed to the macro as a comma-separated list in parentheses. Example:
@FEATURE(A)@
... code relevant only if feature A is enabled ...
@FEATURE_END()@
Macro definitions are mapped to variable definitions using a simple mangling scheme. For macros with no parameters, the name is the same as for a variable. That is, you can use an empty pair of parentheses to expand a variable as a macro, causing the expanded text to be parsed again. For macros with parameters, the mangling is a bit more complex: the macro body is bound to a variable with a name constructed using the following scheme:
macro-name$parameter1$parameter2...
If the macro takes a variable number of arguments, the mangled macro name ends with a single dollar sign ($). If default values were specified for some of the parameters, the default values are bound to variables named:
mangled-macro-name$$parameter
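The mangling scheme above can be written out as a short Python sketch. It only mirrors the naming rules as stated in this section; the function names are hypothetical, and how a variable argument list combines with named parameters is an assumption (here the trailing dollar sign is simply appended after the last named parameter).

```python
def mangle(name: str, params: list, variadic: bool = False) -> str:
    """Build the variable name that a macro body is bound to."""
    # macro-name$parameter1$parameter2...
    mangled = "$".join([name, *params])
    if variadic:
        # Assumption: the single trailing "$" follows the named parameters.
        mangled += "$"
    return mangled


def default_variable(mangled: str, param: str) -> str:
    """Build the variable name that holds a parameter's default value."""
    return f"{mangled}$${param}"


print(mangle("VERSION", []))            # no parameters: same name as a variable
print(mangle("FEATURE", ["name"]))
print(default_variable(mangle("FEATURE", ["name"]), "name"))
```

With no parameters the mangled name equals the plain variable name, which is why a variable can be expanded as a macro with an empty pair of parentheses.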
Macros and variables can be defined using the ##define and/or ##macro directives (see section Syntax for details). The ##macro directive is just a convenient shorthand for a series of ##define directives defining the macro and its default parameters. ##macro also takes care of creating a correct mangling.
The conditional directives of P18 are ##if, ##else, and ##endif; see section Syntax for details.
It is also possible to specify input files or input directories on the command line. If this is done, the specified files are available through an input object named ARGS.