B.1
This test is to verify that all appropriate strings have been
internationalized, and that inappropriate strings have not. This
applies to button and menu labels, footer and other window text
messages and command line messages.
Simply checking the source code is not sufficient; in some cases
it is unclear that a string should be messaged until a problem arises
when the program is run. Even when a string in the source code is
wrapped with the correct message call, such as gettext
or catgets, the string must access the correct catalog.
Additionally, for catgets messages, the string must
access the correct set and message number.
A good approach is to bracket each message with a prefix and
suffix, create a modified catalog and install the catalog in another
locale. For multibyte locales, at least part of the prefix and suffix
should contain multibyte characters.
The prefix and suffix can make the total message length up to 200%
larger than the original message. These longer messages can also show
that various widgets are able to resize to handle longer strings.
- Pseudo-translate every message that can be translated. This
can be accomplished two ways:
- Use the utility that extracted the messages to also change
the messages. The Solaris operating environment provides
utilities to change a message while extracting messages from
the C/C++ source. (For examples of message extraction from C/C++
source code see How to Change Messages for C/C++ Source Files in Section 4.1.)
- Change the message catalogs after they have been extracted
from the source code. The links below show sample scripts and
tools for performing this task.
You can add a prefix and suffix to each message to indicate
the name of the locale. For multibyte locales, use some
multibyte characters as part of the prefix and suffix.
Thus, for the Japanese locale, you can translate the message
"File not found" as "JAXXFile not foundXXJA", where "XX"
represents Japanese multibyte characters. Similarly, translate
the GUI label "Open" as "JAXXOpenXXJA".
- Make sure that the suffix is placed before any final
newlines.
- Watch out for message files with continuation lines;
place the suffix on the last line only.
- Watch out for embedded single and double quotes in
messages and other special characters.
The sample scripts and tools are for products that use
catgets, gettext, and Java resource
bundles as messaging schemes. For messages in X resource files,
the same basic principles of pseudo-localization and using
prefixes and suffixes apply.
Many of these scripts are working versions that change
dynamically to reflect changes in developing products; they may
contain comments and alternate lines.
- Create the message catalog as required by the messaging system
and install it in a correct locale-specific location. This
location will be product dependent.
For Java class files that represent messages using Java
resource bundles, you must rename these files with a
locale-specific suffix and install them in the same location as
the default and other message files. These files are generally not
installed in a locale-specific directory. (See
catgets, gettext, Java language
references, man pages and the example scripts for more information
on how to do this.)
- Run the product or application in the target locale. You
should see the pseudo-translated messages instead of the messages
you would see when the product is run in the C locale.
If a pseudo-translated message fails to appear:
- Verify that your translations did not break the syntax of
the messages files.
- Verify that the resulting message catalogs are installed in
the correct locale-specific location or, for Java resource
bundle class message files, are named with the correct
locale-specific suffix.
If you can verify these two items, then one or more of the
following conditions may be causing the problem:
- The product is not looking for locale-specific message
files.
- The product is looking in another location.
- The product is using the default English message instead of
the message in the locale-specific modified message catalog.
- The message is not in the catalog.
- Simulate as many messages as possible. Early testing can
verify that the overall messaging scheme works. Testing in the
later stages of product development should focus on verifying all
messages.
- By using a locale installed on all machines that use the
Solaris operating environment, such as
fr, you can enlist the help of others in this
testing. You can also encourage others to install other locales,
especially multibyte locales.
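The bracketing rule described above (prefix first, suffix placed before any final newlines) can be sketched in a few lines of Python. This is only an illustration, not one of the sample scripts: the function name bracket and the default prefix and suffix are assumptions, and the two hiragana characters merely stand in for the "XX" multibyte characters mentioned earlier.

```python
def bracket(message, prefix="JA\u3042\u3044", suffix="\u3044\u3042JA"):
    """Wrap a message in a locale-identifying prefix and suffix.

    Trailing newline characters stay at the very end of the result,
    so the suffix always lands before them.
    """
    body = message.rstrip("\n")          # message text without final newlines
    trailing = message[len(body):]       # the final newlines, if any
    return f"{prefix}{body}{suffix}{trailing}"


# ASCII-only demonstration of the same rule.
print(bracket("Open", "AAA", "ZZZ"))
```

The sketch works on actual newline characters. Catalog source files usually store a newline as the two-character escape \n, which needs separate handling in the script that rewrites the file.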
B.2
This test reviews message files in the context of the areas listed
below. It is helpful to have a technical writer review these files.
It is important to review the actual .msg, .po, Java,
or other resource or property files, because the cost of changing
even one message is high and requires more than just changing the
code. It may require changes to documents and other elements as well.
Changes take time, and unnecessary changes can severely delay a
localized product's time to market.
To ensure that messages are translatable, the developer and tester
must consider the following:
- Is the message clear and directed to the user's probable level
of knowledge?
If, for example, the product does not require knowledge of
UNIX® commands or system calls or internals, then messages
should not use words referring to them.
- Is the message composed of message fragments?
Every message should be checked for fragmentation.
For more information
see Section B.6, Compound Messages.
- Is the message a dynamic message with multiple arguments?
For more information see Section B.7,
Dynamic Messages.
- Does the message use slang or other terminology that is hard to
translate (for example, English slang, difficult-to-translate
concepts, or product-specific terminology)?
In the case of product-specific terminology, the product should
also include a glossary of the terminology.
- Is the message commented?
Translators of a product may not know as much about the product
and software details as the developer or even as an end user of
the product. Messages should contain comments to indicate that a
certain word in a message should not be translated, or to clarify
the meaning of certain words or phrases in the message.
- Are the messages numbered consistently?
The catgets messaging scheme uses the idea of
message sets and message numbers within sets. Each development
team should have a way of keeping set and message numbering
organized so that duplicate sets and message numbers do not occur
in the catgets statements in the source code.
NOTE: Do not use duplicate keys in the same message file. For
this type of review, scan the .msg files or use other methods to
see that the files that make up one .cat file do not have
duplicate sets or message numbers within a set. Two messages may
have the same text, but the message numbers must be unique.
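A scan for duplicate set and message numbers could look like the following Python sketch. It assumes a simplified reading of the gencat source format: a "$set N" line starts a set, other lines beginning with "$" are directives or comments, a message line begins with its number, and a trailing backslash continues a message. The function name find_duplicate_ids is hypothetical, and concatenating the .msg files that make up one .cat file is left to the caller.

```python
from collections import Counter


def find_duplicate_ids(lines):
    """Return sorted (set_number, message_number) pairs defined more than once."""
    counts = Counter()
    current_set = 1            # assumption: messages before any $set go in set 1
    in_continuation = False    # True while inside a message continued with "\"
    for raw in lines:
        line = raw.rstrip("\n")
        if in_continuation:
            # Still part of the previous message, not a new definition.
            in_continuation = line.endswith("\\")
            continue
        fields = line.split(None, 1)
        if not fields:
            continue                                  # blank line
        if fields[0] == "$set":
            number = fields[1].split()[0] if len(fields) > 1 else ""
            if number.isdecimal():
                current_set = int(number)
        elif (not line.startswith("$") and fields[0].isdecimal()
              and len(fields) > 1):
            counts[(current_set, int(fields[0]))] += 1
            in_continuation = line.endswith("\\")
    return sorted(key for key, n in counts.items() if n > 1)
```

Two messages with the same text in different sets, or with different numbers, are not reported; only a repeated set and number pair is.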
B.3
For products that use the gettext family of functions
for messaging, the messages are stored in files ending with a
.po extension. When the messages in the original .po
file have been translated, using the prefix and suffix, and compiled
to the proper .mo files with msgfmt(1), and the .mo
files have been properly installed in locale-specific locations, then
the messages should be read from the modified catalog when the
product is run in that locale.
If any messages appear without the prefix and suffix, they may
have been hardcoded or otherwise not prepared properly for messaging.
How to Test
- Get the original .po message files. If the .po message files
do not exist, then they must be extracted from the source using
the xgettext command. For shell scripts that use the
gettext(1) command, there is no extractor bundled
with Solaris.
- Translate the .po message files with the prefix and suffix
using the example scripts as a guide.
- Compile the translated message files to the proper .mo files
using the
msgfmt(1) command.
- Install the compiled message files in their proper locations
for the target locale.
- Set the
LANG and LC_ALL environment
variables to the test locale.
- Run the product in the target locale and verify that all
messages have a prefix and suffix, indicating that they come from
the modified message catalog.
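The translation step for .po files can be sketched as follows. This is a minimal Python illustration under stated assumptions, not one of the example scripts: it assumes the common msgid/msgstr layout with double-quoted strings, leaves entries with an empty msgid and plural forms untouched, and uses the hypothetical name pseudo_translate_po. The suffix is placed before any trailing \n escapes in the message.

```python
import re

QUOTED = re.compile(r'^\s*"(.*)"\s*$')
# Body made of plain characters or escape pairs, then any trailing \n escapes.
TAIL = re.compile(r'^((?:[^\\]|\\.)*?)((?:\\n)*)$')


def pseudo_translate_po(text, prefix="AAA", suffix="ZZZ"):
    """Fill every msgstr with prefix + msgid + suffix."""
    out = []
    msgid_parts = []     # raw (still escaped) pieces of the current msgid
    state = None         # "msgid", "msgstr", "keep", or None
    for line in text.splitlines():
        stripped = line.strip()
        quoted = QUOTED.match(stripped)
        if stripped.startswith(("msgid ", "msgid\t")):
            first = QUOTED.match(stripped[6:])
            msgid_parts = [first.group(1)] if first else []
            state = "msgid"
            out.append(line)
        elif stripped.startswith(("msgstr ", "msgstr\t")):
            raw = "".join(msgid_parts)
            match = TAIL.match(raw)
            if raw and match:
                body, newlines = match.groups()
                out.append(f'msgstr "{prefix}{body}{suffix}{newlines}"')
                state = "msgstr"     # old msgstr continuation lines are dropped
            else:
                out.append(line)     # empty or unparsable msgid: keep as is
                state = "keep"
            msgid_parts = []
        elif quoted and state == "msgid":
            msgid_parts.append(quoted.group(1))   # multi-line msgid
            out.append(line)
        elif quoted and state == "msgstr":
            continue
        else:
            if not quoted:
                state = None
            out.append(line)
    return "\n".join(out) + "\n"
```

The result is still a .po source file; it has to be compiled with msgfmt(1) and installed as described in the steps above.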
Samples and Scripts
These samples and scripts are customized for testing within a
certain product. While these are working versions, they will need
customization for your particular product.
B.4
For products that use the catgets family of functions
for messaging, the messages are stored in files ending with a
.msg extension. When the messages in the original .msg
file have been translated, using the prefix and suffix, and compiled
to the proper .cat files with gencat(1), and the .cat
files have been properly installed in locale-specific locations, then
the messages should be read from the modified catalog when the
product is run in that locale.
If any messages appear without the prefix and suffix, they may
have been hardcoded or otherwise not prepared properly for messaging.
How to Test
- Get the original .msg message files.
- Translate the .msg message files with the prefix and suffix
using the example scripts as a guide.
- Compile the translated message files to the proper .cat files
using the
gencat(1) command.
- Install the compiled message files in their proper locations
for the target locale.
- Set the
LANG and LC_ALL environment
variables to the locale to be tested.
- Run the product in the target locale. Verify that all messages
have the prefix and suffix you inserted to indicate that they come
from the modified message catalog.
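For .msg files, the translation step has to respect continuation lines and quoted messages. The Python sketch below is an illustration only and is not the catprep script: the name pseudo_translate_msg is hypothetical, and it assumes that "$quote c" sets a quote character, that a line ending in an odd number of backslashes is continued, and that a message line is a number followed by one space or tab. The suffix goes on the last line of a message only, before any trailing \n escapes and before the closing quote.

```python
import re

MESSAGE = re.compile(r"^(\d+)([ \t])(.*)$")
TAIL = re.compile(r"^((?:[^\\]|\\.)*?)((?:\\n)*)$")


def _continued(line):
    """A line is continued when it ends in an odd number of backslashes."""
    return (len(line) - len(line.rstrip("\\"))) % 2 == 1


def _with_suffix(text, suffix, quote):
    """Append the suffix before trailing \\n escapes and the closing quote."""
    closing = ""
    if quote and text.endswith(quote):
        text, closing = text[:-1], quote
    match = TAIL.match(text)
    body, newlines = match.groups() if match else (text, "")
    return f"{body}{suffix}{newlines}{closing}"


def pseudo_translate_msg(lines, prefix="AAA", suffix="ZZZ"):
    out, quote, in_message = [], "", False
    for raw in lines:
        line = raw.rstrip("\n")
        if in_message:
            # Continuation line: only the last one receives the suffix.
            in_message = _continued(line)
            out.append(line if in_message else _with_suffix(line, suffix, quote))
            continue
        if line.startswith("$quote"):
            quote = line[6:].strip()[:1]
        match = MESSAGE.match(line)
        if not match:
            out.append(line)          # directive, comment, blank or delete line
            continue
        number, sep, text = match.groups()
        if quote and text.startswith(quote):
            text = quote + prefix + text[1:]
        else:
            text = prefix + text
        in_message = _continued(line)
        if not in_message:
            text = _with_suffix(text, suffix, quote)
        out.append(f"{number}{sep}{text}")
    return out
```

The output lines can be written to a new .msg file and compiled with gencat(1).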
Samples and Scripts
These samples and scripts are customized for testing within a
certain product. While these are working versions, they will need
customization for your particular product.
- catprep
is a Perl program that modifies the message files.
- catprep_wide_perl
is also a Perl program that modifies the message files. This program
expands each message to a specified percentage by using the
-e argument with a percentage amount.
- prepall.c
modifies the message files.
- callprep
this script calls the preparation program. This program must be modified for your environment.
- catgets_msg.txt
An example catgets message file before modification.
- new_catgets_msg.txt
An example catgets message file after it has been modified by calling the preparation script.
You can compile this file into a message catalog (.cat file) by running the gencat(1) command.
B.5
A good approach to testing products using Java
Resource Bundles is to bracket each message with a prefix and suffix,
then create a modified message class file and install it in another
locale. For multibyte locales, at least part of the prefix and suffix
should contain multibyte characters.
Assume that the original message files are text files with lines
of the form key=value. We assume that the message file
is made into a .java file, then compiled to a .class file. When:
- the messages are translated using the prefix and suffix;
- the message file(s) are made into a .java file;
- the .java file is compiled to a Java ResourceBundle class file
with the appropriate locale-specific suffix, and is properly
installed in locale-specific locations or in the default location
the messages should be read from the modified catalog when the
product is run in that locale.
If any messages appear without the prefix and suffix, they may
have been hardcoded or otherwise not prepared properly for messaging.
For example, given a message file named xx.messages, after
translation, a script converts it to a file named MyClass.java. Then
MyClass.java is compiled (assuming it will be installed in the
ja locale) into MyClass_ja.class. This class file is
then installed in the appropriate location.
How to Test
- Access the original message files.
- Translate them using a prefix and suffix as discussed above.
- Convert the message file into a .java file.
- Compile the resulting .java files to the proper .class
files using the
javac command.
- Install the compiled message files at their proper locations
for the target locale. For Java ResourceBundles, you can install
the files at the same location as the default or other
locale-specific message files, since it is the locale-specific
suffix that is used to find the message files.
- Set the
LANG and LC_ALL environment
variables to the locale to be tested. (NOTE: The
Solaris Java Virtual Machine actually looks for the
target locale in the LC_CTYPE variable.)
- Run the product in the target locale. Verify that all messages
have the prefix and suffix you inserted, to indicate that they
come from the modified message catalog.
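The translation step for key=value message files might be sketched in Python as shown below. The names pseudo_translate_properties and java_escape are hypothetical, and two assumptions are made that the text above does not state: that lines starting with "#" or "!" are comments, and that non-ASCII characters should be written as \uXXXX escapes so that the file stays plain ASCII before it is converted to a .java file.

```python
def java_escape(text):
    # Non-ASCII characters become \uXXXX escapes, one per UTF-16 code unit.
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            units = ch.encode("utf-16-be")
            out.extend(
                f"\\u{int.from_bytes(units[i:i + 2], 'big'):04x}"
                for i in range(0, len(units), 2)
            )
    return "".join(out)


def pseudo_translate_properties(lines, prefix="AAA", suffix="ZZZ"):
    """Bracket the value of every key=value line with prefix and suffix."""
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        key, sep, value = line.partition("=")
        if not sep or line.lstrip().startswith(("#", "!")):
            out.append(line)      # comment, blank line, or no key=value pair
            continue
        text = value.lstrip()
        lead = value[: len(value) - len(text)]    # keep spacing after "="
        out.append(f"{key}={lead}{java_escape(prefix + text + suffix)}")
    return out
```

For a multibyte test locale, pass a prefix and suffix that contain multibyte characters; the escaping keeps them readable by tools that expect ASCII input.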
Samples and Scripts
These samples and scripts are customized for testing within a
certain product. While these are working versions, they will need
customization for your particular product.
In the examples below, message files and non-message files, both
localizable and non-localizable, are placed in two .java files. They
are then compiled into a _<locale>.class file and installed.
The locale-specific .class file is named XX_YY.class, where XX is
the class name and YY is the suffix of the locale, such as
fr or ja.
- msgprep_perl
this Perl script modifies each message file with the prefix and suffix.
- prep_all
this script calls the program that modifies each message file with the prefix and suffix.
- i18_prop_to_java_ksh
this script creates a .java file from the prefixed and suffixed message file (in the key=value format).
- call_neil_shx
this script calls i18_prop_to_java_ksh with the expected arguments.
- src_ide_mess_prop.txt
An example localizable message file before the script inserts prefixes and suffixes.
- dest_ide_mess_prop.txt
An example localizable message file after the script inserts prefixes and suffixes.
- properties_ja_java.txt
A sample part of a .java file created by running the script that
converts the message file into a .java file. This file was
generated for the ja locale, and some characters may
not display properly unless you have Japanese fonts on your
system.
B.6 Compound Messages
Sometimes a message that is displayed to the user is
actually composed of several shorter messages. Generally, this is not the
correct way to internationalize a message. A long message or a
message that is displayed in parts should be one message and not a
composition of short messages.
Short Compound Message
An example of a short compound message is the set of messages:
Processing.
Processing..
Processing...
Processing... Finished.
This message could be constructed two ways:
- With 3 short messages:
Processing
...
Finished.
- With 3 longer messages:
Processing
Processing...
Processing... Finished
In the first case, the three messages could be difficult to
translate because "Finished" must be at the end of the line. Not all
languages support this grammar.
In the second case, "Processing... Finished" is a
single message and thus more easily translated. The location of the
dots (preceding or following "Processing") can also be localized.
Longer Multiline Compound Message
An example of a multiline compound message is:
There are %d errors in file %s.
Note: The file %s has been obsoleted and is now called %s.
Please use this new filename when describing errors.
This message could be constructed two ways:
- Incorrectly, with three short messages (one message per line)
- Correctly, with one longer message.
Using three short messages is incorrect because this assumes that
when the three lines are translated, they will all still be on three
lines. In some cases, the order of the sentences may change or the
number of lines may change after translation.
How to Test for Compound Messages
To test for compound messages, add a prefix or suffix to each
message and look for the prefix or suffix string in the middle of a
message.
Once a prefix or suffix is added, the prefix or suffix string will
be visible within a compound message. In the short message example,
the message:
Processing.
Processing..
Processing...
Processing... Finished.
might look like:
AAAProcessing.ZZZ
AAAProcessing.ZZZAAA.ZZZ
AAAProcessing.ZZZAAA..ZZZ
AAAProcessing.ZZZAAA..ZZZAAAFinished.ZZZ
The last line shows that the incorrect method of using three
separate messages was used to construct this single message.
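When displayed strings can be captured, for example from a log or from command-line output, the check for a prefix or suffix in the middle of a message can be automated. The following Python sketch is a hypothetical helper (classify_displayed is not part of the sample scripts) and assumes the AAA and ZZZ markers used in the example above.

```python
def classify_displayed(displayed, prefix="AAA", suffix="ZZZ"):
    """Classify one displayed string by the markers it contains.

    "unbracketed": no marker at all, so the text may be hardcoded.
    "compound":    more than one prefix or suffix, so the text was
                   assembled from several catalog messages.
    "single":      one bracketed message.
    """
    n_prefix = displayed.count(prefix)
    n_suffix = displayed.count(suffix)
    if n_prefix == 0 and n_suffix == 0:
        return "unbracketed"
    if n_prefix > 1 or n_suffix > 1:
        return "compound"
    return "single"


print(classify_displayed("AAAProcessing.ZZZAAA..ZZZAAAFinished.ZZZ"))
```

A dynamic message whose string argument is itself a catalog message is also reported as compound, which is usually worth a manual look as well.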
B.7 Dynamic Messages
Often when a message is displayed to the user, it is actually a
dynamic message that has been constructed at runtime. For example,
suppose a user saw the message:
There are 10 errors in file test.conf.
This message could have been constructed dynamically from a
printf format such as:
There are %d errors in file %s.
This format contains two arguments: the first is an integer
specifying the number of errors, and the second argument is a string
specifying the name of the file. A translator will actually see this
format as the string to translate.
Dynamic messages are difficult to translate for two reasons:
- The grammar of the language may change the position of the
arguments with respect to one another. If the above message were
translated to Chinese, for example, the Chinese grammar rules
would translate the message to:
File %s has %d errors.
If this message were used, the integer and string arguments
would be reversed and would print incorrectly. In addition, if the
file had 0 errors, then a '0' would be used as the first argument
for the '%s' string, which would cause a core dump.
Fortunately, the printf(3) function call provides
a facility to number the parameters. The above example message
should be rewritten as:
There are %1$d errors in file %2$s.
In Java, this problem does not occur if a Format class is used
because these classes only have positional parameters.
- The multiple arguments within a message may produce a message
that is difficult to understand, and thus too difficult to
translate. For example, a message such as:
%1$s in %2$s has %3$d errors.
would be impossible to translate without some kind of comments
to the translator about what the first, second, or third arguments
are. (NOTE: Comments to translators can be included in the C
source file. See the man page for genmsg(1) command
for an example.)
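A message file can be screened for formats that take several arguments without numbering them. The Python sketch below is an assumption-laden illustration: the regular expression covers the common printf conversions but not every platform extension, and the name unnumbered_conversions is hypothetical.

```python
import re

# Either a literal "%%" or one printf conversion, optionally numbered ("1$").
CONVERSION = re.compile(
    r"%%|%(?P<pos>\d+\$)?[-+ #0']*(?:\*|\d+)?(?:\.(?:\*|\d+)?)?"
    r"(?:hh|h|ll|l|L|j|z|t)?[diouxXeEfFgGaAcspn]"
)


def unnumbered_conversions(fmt):
    """Return the conversions that lack a position in a multi-argument format.

    A format with fewer than two conversions cannot be reordered
    incorrectly, so it yields an empty list.
    """
    specs = [m for m in CONVERSION.finditer(fmt) if m.group(0) != "%%"]
    if len(specs) < 2:
        return []
    return [m.group(0) for m in specs if m.group("pos") is None]


print(unnumbered_conversions("There are %d errors in file %s."))
```

Any message for which the function returns a non-empty list is a candidate for rewriting with numbered parameters.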
How to Test
To test for correct dynamic messages, search the message files for
the formatting parameters and manually inspect their correctness.
Message files can be manually changed and have positional parameters
switched. While running the product, the tester can ensure that these
parameters have been switched.
The example Perl script below can be used for either the
gettext or catgets methods to explicitly
show that a message is a dynamic message.
Samples and Scripts
These samples and scripts are customized for testing within a
certain product. While these are working versions, they will need
customization for your particular product.
- transmsg_perl
this Perl script modifies message files that use either the catgets(3c) or
gettext(3c) methods. Messages are modified as follows:
- replaces all vowels in the middle of words with two vowels
- replaces each printf format with the format followed by
the same format in parentheses (for example, %s -> %s(%%s))
- trans_before_msg.txt
An example message file before the script modifies the message.
- trans_after_msg.txt
An example message file after the script has modified the message.
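The format-annotation rule (%s -> %s(%%s)) can be sketched in Python as follows. This is not the transmsg_perl script itself: annotate_formats is a hypothetical name, the regular expression is an assumption covering the common printf conversions, and the vowel-doubling step is left out.

```python
import re

# Either a literal "%%" or one printf conversion, optionally numbered ("1$").
CONVERSION = re.compile(
    r"%%|%(?:\d+\$)?[-+ #0']*(?:\*|\d+)?(?:\.(?:\*|\d+)?)?"
    r"(?:hh|h|ll|l|L|j|z|t)?[diouxXeEfFgGaAcspn]"
)


def annotate_formats(fmt):
    """Follow each conversion with an escaped copy of itself in parentheses."""
    def annotate(match):
        spec = match.group(0)
        if spec == "%%":
            return spec                   # literal percent sign, leave alone
        return f"{spec}(%{spec})"         # "%s" becomes "%s(%%s)"
    return CONVERSION.sub(annotate, fmt)


print(annotate_formats("%d messages moved to %s"))
```

When the annotated format is printed, each argument is followed by the format that produced it, so a tester can see at a glance whether the message used numbered parameters.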
The transmsg_perl script can be used to determine if positional parameters are used correctly. For example,
in the dtmail program, after a message is moved to
another file, a status line appears:
2 messages moved to saved_mailbox
This message means that two email messages were moved to the file
called "saved_mailbox".
Use transmsg_perl on the dtmail message file and install the message file:
$ perl transmsg.pl DtMail.msg > /tmp/new.msg
$ gencat /tmp/new.cat /tmp/new.msg
$ su
# mkdir /usr/dt/lib/nls/msg/en_US
# cp /tmp/new.cat /usr/dt/lib/nls/msg/en_US/DtMail.cat
and now run dtmail:
$ setenv LC_ALL en_US
$ dtmail &
to see the message:
2(%d) mesaages moved to saved_mailbox(%s)
You can now see that the message was constructed incorrectly. It
should have displayed:
2(%1$d) mesaages moved to saved_mailbox(%2$s)