P18 Internationalizing Preprocessor
Documentation
Translation Files
Translation files are used to add translations to a translation database.
The typical procedure is to generate a translation file template for a
translation from language X to language Y, have that translation file template
filled in by a translator, and then feed the translation file back to the
translation database.
A translation file template is generated using the db export command and fed
back to the translation database using the db import command (see section
Commands).
A translation file starts with the line
P18/translation, version 1.0
The version number is the file format version number, not the version number
of P18. The main purpose of the version number is to allow future versions of
P18 which might use a different format to handle old translation
files generated with previous versions of P18. However, I don't expect the
file format for translation files to change.
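As an illustration, a reader of translation files might check the first line and extract the file format version as sketched below. The function name parse_header and the regular expression are assumptions made for this example and are not part of P18; in particular, the sketch assumes the version always has the form major.minor.

```python
import re

# Assumed pattern for the header line; the exact whitespace rules
# accepted by P18 are not documented here, so this is a guess.
HEADER_RE = re.compile(r"^P18/translation, version (\d+)\.(\d+)\s*$")


def parse_header(line):
    """Return the (major, minor) file format version of a translation file.

    Raises ValueError if the line is not a translation file header.
    """
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError("not a P18 translation file: %r" % line)
    return int(match.group(1)), int(match.group(2))
```

A caller could compare the returned tuple against the format versions it knows how to read and reject anything newer.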
The body of a translation file consists of
- Ignored empty lines
Empty lines that are not part of a message text are ignored.
- Comments
Lines starting with a hash mark (#) that are not part of a
message text are ignored. A line is treated as a comment if the first
non-whitespace character on the line is a hash mark.
- Assignments
Assign a value to a symbolic variable.
- Message sets
Sets of message texts of different languages.
- An optional final end marker
A line that is not part of a message set and contains only the keyword
end indicates the end of the translation file. Everything
following the end marker is ignored.
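The rules for ignored lines and the end marker can be sketched as a small classifier. This is only a sketch: the name classify_line is hypothetical, the function must only be applied to lines that are not part of a message text, and it assumes that surrounding whitespace around the end keyword is tolerated, which the description above does not state.

```python
def classify_line(line):
    """Classify a line that is NOT part of a message text.

    Returns "empty", "comment", "end", or "other". Assignments and
    message sets fall under "other" and need further parsing.
    """
    stripped = line.strip()
    if not stripped:
        return "empty"      # ignored
    if stripped.startswith("#"):
        return "comment"    # first non-whitespace character is a hash mark
    if stripped == "end":
        return "end"        # everything after this line is ignored
    return "other"
```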
An assignment has the form:
Assignment:
    Identifier = Value
where Identifier is a sequence of alphanumeric characters, hyphens, and
underscores, and Value is a string constant. If Value starts
with a quoting character, it is interpreted as a string constant containing
C-style character escapes; if Value does not start with a quoting
character, it is taken literally, starting with the first non-whitespace
character following the = sign up to (but not including) the
following newline character.
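A minimal sketch of parsing such an assignment line follows. It makes several assumptions that the description above does not settle: that the quoting characters are the double and single quote, that only a handful of C-style escapes need decoding (octal and hexadecimal escapes are not handled), and that text after the closing quote is ignored. The names parse_assignment and unquote are made up for this example.

```python
import re

# Identifier: alphanumeric characters, hyphens, and underscores.
ASSIGNMENT_RE = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*=[ \t]*(.*)$")

# Only a small subset of C-style escapes is decoded in this sketch.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "r": "\r",
                  "\\": "\\", '"': '"', "'": "'"}


def unquote(text):
    """Decode a quoted string constant; text[0] is the quoting character."""
    quote = text[0]
    out = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == quote:
            return "".join(out)
        if ch == "\\" and i + 1 < len(text):
            i += 1
            # Unknown escapes are passed through unchanged.
            out.append(SIMPLE_ESCAPES.get(text[i], text[i]))
        else:
            out.append(ch)
        i += 1
    raise ValueError("unterminated string constant")


def parse_assignment(line):
    """Return (identifier, value), or None if line is not an assignment."""
    match = ASSIGNMENT_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    identifier, value = match.group(1), match.group(2)
    if value[:1] in ('"', "'"):   # assumed quoting characters
        value = unquote(value)
    return identifier, value
```

For example, parse_assignment("encoding = UTF-8\n") yields the pair ("encoding", "UTF-8"), while a comment line yields None.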
The following identifiers are recognized:
- encoding
Set the message text encoding to the specified value. All message texts
following the assignment up to the next assignment to encoding
(or the end of the translation file) are expected to use the specified
encoding. The initial value of the encoding variable is
ISO-8859-1.
- database-UID
Specify the UID (unique identifier) of the translation database the
translation file template was generated from. All translation file
templates generated using the db export command contain an assignment
specifying the database UID.
If the translation file is imported into a database, it is an error if the
value specified in the assignment does not match the database UID.
Assigning a value to an identifier that is not recognized has no effect
(besides a warning message being issued on import).
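The handling of the recognized identifiers during import could look roughly like the sketch below. The class ImportState and its method are hypothetical and only illustrate the rules stated above; the sketch assumes identifiers are compared case-sensitively, which is not confirmed here.

```python
import warnings


class ImportState:
    """Tracks the symbolic variables while a translation file is imported."""

    def __init__(self, database_uid):
        self.database_uid = database_uid   # UID of the target database
        self.encoding = "ISO-8859-1"       # documented initial value

    def assign(self, identifier, value):
        if identifier == "encoding":
            # Applies to all message texts up to the next encoding assignment.
            self.encoding = value
        elif identifier == "database-UID":
            if value != self.database_uid:
                raise ValueError(
                    "translation file was generated from database %r, "
                    "not %r" % (value, self.database_uid))
        else:
            # Unrecognized identifiers have no effect besides a warning.
            warnings.warn("unrecognized identifier %r ignored" % identifier)
```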
A message set starts with the keyword message and ends with the
keyword /message. The body of a message set consists of a number
of message text sections. A message text section starts with a line stating
the language, message type, and (optionally) message UID:
MessageSectionHead:
    MessageSectionType
    MessageSectionType : MessageUID
MessageSectionType:
    [ LanguageID ]
    [ LanguageID / Type ]
The MessageSectionType specifies the language and type of the message
text. If the message type is omitted, the default type TEXT is
assumed. Note that the message type does not specify the encoding of the
message as given in the translation file, but the encoding of the message
when it is written to an output file. In particular, this means that messages
of the message type HTML or XHTML use the translation file
encoding to represent special characters instead of HTML-style character
references.
The MessageUID is the unique identifier associated with the message.
Typically, only unique identifiers for an entire message set are used (see
below).
The special language identifier * represents a general language.
The message text for * can be used to document the message
parameters and is typically left empty. The message UID of the message
associated with the general language * is used as a unique
identifier for the entire message set.
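To make the section head grammar concrete, here is a rough sketch of a parser for a single head line. It assumes that the square brackets in the grammar are literal characters, that whitespace between tokens is optional, and that language identifiers, types, and UIDs contain no whitespace; none of this is confirmed by the description above. The name parse_section_head and the returned dictionary keys are made up for the example.

```python
import re

# [ LanguageID ] or [ LanguageID / Type ], optionally followed by : MessageUID
SECTION_HEAD_RE = re.compile(
    r"^\s*\[\s*([^\]/\s]+)\s*(?:/\s*([^\]/\s]+)\s*)?\]\s*(?::\s*(\S+))?\s*$"
)


def parse_section_head(line):
    """Parse a message text section head, or return None if it does not match."""
    match = SECTION_HEAD_RE.match(line)
    if match is None:
        return None
    language, message_type, uid = match.groups()
    return {
        "language": language,
        "type": message_type or "TEXT",   # documented default type
        "uid": uid,                       # None if no MessageUID was given
        "general": language == "*",       # the general language *
    }
```

With these assumptions, a head such as [de/HTML]: msg42 would be read as language de, type HTML, and UID msg42, while [*]: msg42 would mark msg42 as the identifier of the entire message set.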