I18n Verification Checklist

 







B.1

This test is to verify that all appropriate strings have been internationalized, and that inappropriate strings have not. This applies to button and menu labels, footer and other window text messages and command line messages.

Simply checking the source code is not sufficient; in some cases it is unclear that a string should be messaged until a problem arises when the program is run. Even when a string in the source code is wrapped with the correct message call, such as gettext or catgets, the string must access the correct catalog. Additionally, for catgets messages, the string must access the correct set and message number.

A good approach is to bracket each message with a prefix and suffix, create a modified catalog and install the catalog in another locale. For multibyte locales, at least part of the prefix and suffix should contain multibyte characters.

The prefix and suffix can make the total message length up to 200% larger than the original message. These longer messages can also show that various widgets are able to resize to handle longer strings.

  1. Pseudo-translate every message that can be translated. This can be accomplished in two ways:
    • Use the utility that extracted the messages to also change the messages. The Solaris operating environment provides utilities to change a message while extracting messages from the C/C++ source. (For examples of message extraction from C/C++ source code see How to Change Messages for C/C++ Source Files in Section 4.1.)
    • Change the message catalogs after they have been extracted from the source code. The links below show sample scripts and tools for performing this task.

      You can add a prefix and suffix to each message to indicate the name of the locale. For multibyte locales, use some multibyte characters as part of the prefix and suffix.

      Thus, for the Japanese locale, you can translate the message "File not found" as "JAXXFile not foundXXJA", where "XX" represents Japanese multibyte characters. Similarly, translate the GUI label "Open" as "JAXXOpenXXJA".

      • Make sure that the suffix is placed before any final newlines.
      • Watch out for message files with continuation lines; place the suffix on the last line only.
      • Watch out for embedded single and double quotes in messages and other special characters.

    The sample scripts and tools are for products that use catgets, gettext, and Java resource bundles as messaging schemes. For messages in X resource files, the same basic principles of pseudo-localization and using prefixes and suffixes apply.

    Many of these scripts are working versions that change dynamically to reflect changes in developing products; they may contain comments and alternate lines.


  2. Create the message catalog as required by the messaging system and install it in a correct locale-specific location. This location will be product dependent.

    For Java class files that represent messages using Java resource bundles, you must rename these files with a locale-specific suffix and install them in the same location as the default and other message files. These files are generally not installed in a locale-specific directory. (See catgets, gettext, Java language references, man pages and the example scripts for more information on how to do this.)

  3. Run the product or application in the target locale. You should see the pseudo-translated messages, instead of the messages you would see when the product is run in the C locale.

    If a pseudo-translated message fails to appear:

    • Verify that your translations did not break the syntax of the messages files.
    • Verify that the resulting message catalogs are installed in the correct locale-specific location or, for Java resource bundle class message files, are named with the correct locale-specific suffix.

    If you can verify these two items, then one or more of the following conditions may be causing the problem:

    • The product is not looking for locale-specific message files.
    • The product is looking in another location.
    • The product is using the default English message instead of the message in the locale-specific modified message catalog.
    • The message is not in the catalog.
  4. Simulate as many messages as possible. Early testing can verify that the overall messaging scheme works. Testing in the later stages of product development should focus on verifying all messages.
  5. By using a locale installed on all machines that use the Solaris operating environment, such as fr, you can enlist the help of others in this testing. You can also encourage others to install other locales, especially multibyte locales.
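The bracketing rules in step 1 can be sketched in a few lines of Python. This is a minimal illustration, not one of the sample scripts: the function names are hypothetical, and the default prefix and suffix use two hiragana characters as a stand-in for the "XX" multibyte characters mentioned above.

```python
def pseudo_translate(message, prefix="JA\u3042\u3044", suffix="\u3042\u3044JA"):
    """Bracket one message, keeping any final newlines after the suffix."""
    body = message.rstrip("\n")
    trailing_newlines = message[len(body):]
    return prefix + body + suffix + trailing_newlines


def growth_percent(original, translated):
    """Return how much longer the bracketed message is, as a percentage."""
    if not original:
        return 0.0
    return 100.0 * (len(translated) - len(original)) / len(original)
```

A four-character prefix and suffix around the label "Open" makes the message 200% longer, which is also a useful check that widgets resize for longer strings.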

B.2

This test reviews message files in the context of the areas listed below. It is helpful to have a technical writer review these files.

It is important to review the actual .msg, .po, Java, or other resource or property files, because the cost of changing even one message is high and requires more than just changing the code. It may require changes to documents and other elements as well. Changes take time, and unnecessary changes can severely delay a localized product's time to market.

To ensure that messages are translatable, the developer and tester must consider the following:

  • Is the message clear and directed to the user's probable level of knowledge?

    If, for example, the product does not require knowledge of UNIX® commands or system calls or internals, then messages should not use words referring to them.

  • Is the message composed of message fragments? Test each message for fragmentation.

    For more information see Section B.6, Compound Messages.

  • Is the message a dynamic message with multiple arguments?

    For more information see Section B.7, Dynamic Messages.

  • Does the message use slang or other hard-to-translate wording (for example, English slang, concepts that are difficult to translate, or product-specific terminology)?

    In the case of product-specific terminology, the product should also include a glossary of the terminology.

  • Is the message commented?

    Translators of a product may not know as much about the product and software details as the developer or even as an end user of the product. Messages should contain comments to indicate that a certain word in a message should not be translated, or to clarify the meaning of certain words or phrases in the message.

  • Are the messages numbered consistently?

    The catgets messaging scheme uses the idea of message sets and message numbers within sets. Each development team should have a way of keeping set and message numbering organized so that duplicate sets and message numbers do not occur in the catgets statements in the source code.

    NOTE: Do not use duplicate keys in the same message file. For this type of review, scan the .msg files or use other methods to see that the files that make up one .cat file do not have duplicate sets or message numbers within a set. Two messages may have the same text, but the message numbers must be unique.
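The duplicate check described in the note can be automated. The Python sketch below scans the lines of one or more .msg files and reports any set and message number pair that appears twice. The function name is hypothetical, and the sketch assumes the usual gencat source conventions: "$set n" directives, other "$" lines as directives or comments, a trailing backslash for continuation lines, and set 1 as the default when no "$set" has been seen.

```python
def find_duplicate_ids(lines):
    """Return (set_number, message_number) pairs that occur more than once."""
    seen = set()
    duplicates = []
    current_set = 1        # assumed default set when no $set directive was seen
    continuing = False     # True while inside a message continued with "\"
    for raw in lines:
        line = raw.rstrip("\n")
        if continuing:
            continuing = line.endswith("\\")
            continue
        if not line.strip():
            continue
        if line.startswith("$"):
            parts = line.split()
            if parts[0] == "$set" and len(parts) > 1 and parts[1].isdigit():
                current_set = int(parts[1])
            continue
        number = line.split(None, 1)[0]
        if number.isdigit():
            key = (current_set, int(number))
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        continuing = line.endswith("\\")
    return duplicates
```

To check everything that goes into one .cat file, pass the concatenated lines of all contributing .msg files in the order they are given to gencat.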

B.3

For products that use the gettext family of functions for messaging, the messages are stored in files ending with a .po extension. When the messages in the original .po file have been translated, using the prefix and suffix, and compiled to the proper .mo files with msgfmt(1), and the .mo files have been properly installed in locale-specific locations, then the messages should be read from the modified catalog when the product is run in that locale.

If any messages appear without the prefix and suffix, they may have been hardcoded or otherwise not prepared properly for messaging.

How to Test

  1. Get the original .po message files. If the .po message files do not exist, then they must be extracted from the source using the xgettext command. For shell scripts which use the gettext(1) command, there is no extractor bundled with Solaris.
  2. Translate the .po message files with the prefix and suffix using the example scripts as a guide.
  3. Compile the translated message files to the proper .mo files using the msgfmt(1) command.
  4. Install the compiled message files in their proper locations for the target locale.
  5. Set the LANG and LC_ALL environment variables to the test locale.
  6. Run the product in the target locale and verify that all messages have a prefix and suffix, indicating that they come from the modified message catalog.
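Step 2 can be sketched in Python as follows. This is not the prepall.c sample; it is a minimal illustration with hypothetical function names. It assumes a Solaris-style .po file made of msgid and msgstr entries (no GNU header entry and no plural forms), writes each msgstr on a single line, and uses the ASCII placeholders "JAXX" and "XXJA" where a real test would use multibyte characters.

```python
import re

# A trailing \n escape that is not itself preceded by an escaping backslash.
TRAILING_NEWLINE = re.compile(r"(?<!\\)(?:\\\\)*(\\n)$")


def bracket(text, prefix, suffix):
    """Bracket escaped .po string content, keeping final \\n escapes last."""
    tail = ""
    match = TRAILING_NEWLINE.search(text)
    while match:
        text, tail = text[:match.start(1)], "\\n" + tail
        match = TRAILING_NEWLINE.search(text)
    return prefix + text + suffix + tail


def pseudo_translate_po(lines, prefix="JAXX", suffix="XXJA"):
    """Return the lines of a .po file with every msgstr pseudo-translated."""
    out = []
    msgid = None          # escaped pieces of the msgid being collected
    in_msgstr = False     # True while skipping the original msgstr lines
    for line in list(lines) + [""]:        # "" is a sentinel that flushes the last entry
        text = line.strip()
        if text.startswith('"') and msgid is not None:
            if not in_msgstr:              # continuation line of a multi-line msgid
                msgid.append(text[1:-1])
                out.append(line)
            continue                       # continuation of the old msgstr is dropped
        if text.startswith("msgstr") and msgid is not None:
            in_msgstr = True
            continue
        if msgid is not None:              # entry is complete: write the new msgstr
            out.append('msgstr "%s"\n' % bracket("".join(msgid), prefix, suffix))
            msgid, in_msgstr = None, False
        if text.startswith("msgid"):
            msgid = [text[len("msgid"):].strip()[1:-1]]
        if line:
            out.append(line)
    return out
```

The result can be written to a new .po file and compiled with msgfmt(1) as described in step 3.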

Samples and Scripts

These samples and scripts are customized for testing within a certain product. While these are working versions, they will need customization for your particular product.

  • extract_gettext.c (in a .c source file) extracts gettext(1) calls from shell scripts.
  • xgettextsh (in a script file) is a wrapper script which calls the Solaris utility xgettext(1).

    These tools are for shell scripts which use the gettext(1) command. For C applications that use the gettext(3) function, you can call the Solaris utility xgettext on the C source file without using extract_gettext.c.

  • prepall.c (in a .c source file) modifies the gettext message files.
  • callprep (in a script file) this script calls the preparation program. This program must be modified for your environment.

  • ltool_po.txt (in a .txt source file) An example gettext message file before it has been modified.
  • ltool_po_chg.txt (in a .txt source file) An example gettext message file after it has been modified by calling the preparation script. You can compile this file into the message file by running msgfmt on it.

B.4

For products that use the catgets family of functions for messaging, the messages are stored in files ending with a .msg extension. When the messages in the original .msg file have been translated, using the prefix and suffix, and compiled to the proper .cat files with gencat(1), and the .cat files have been properly installed in locale-specific locations, then the messages should be read from the modified catalog when the product is run in that locale.

If any messages appear without the prefix and suffix, they may have been hardcoded or otherwise not prepared properly for messaging.

How to Test

  1. Get the original .msg message files.
  2. Translate the .msg message files with the prefix and suffix using the example scripts as a guide.
  3. Compile the translated message files to the proper .cat files using the gencat(1) command.
  4. Install the compiled message files in their proper locations for the target locale.
  5. Set the LANG and LC_ALL environment variables to the locale to be tested.
  6. Run the product in the target locale. Verify that all messages have the prefix and suffix you inserted to indicate that they come from the modified message catalog.
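Step 2 can be sketched in Python as follows. This is not the catprep sample; it is a minimal illustration with a hypothetical function name. It assumes gencat source conventions (a message number followed by one space or tab, "$" directive and comment lines, an optional "$quote c" directive, and a trailing backslash for continuation lines). It places the prefix on the first line of a message and the suffix on the last line only, before any final \n escapes. A message that ends in an escaped backslash is not handled.

```python
import re

MESSAGE = re.compile(r"^(\d+[ \t])(.*)$")


def pseudo_translate_msg(lines, prefix="JAXX", suffix="XXJA"):
    """Return the lines of a .msg file with every message bracketed."""
    out = []
    quote = ""            # quote character set by a "$quote c" directive, if any
    continuing = False    # True while inside a message continued with "\"
    for raw in lines:
        line = raw.rstrip("\n")
        head, text = "", line
        if not continuing:
            match = MESSAGE.match(line)
            if not match:                  # directive, comment, or blank line
                if line.startswith("$quote"):
                    quote = line[len("$quote"):].strip()[:1]
                out.append(raw)
                continue
            head, text = match.groups()
            if quote and text.startswith(quote):
                head, text = head + quote, text[1:]
            text = prefix + text           # prefix goes on the first line only
        continues = text.endswith("\\")
        if not continues:                  # suffix goes on the last line only
            close = ""
            if quote and text.endswith(quote):
                text, close = text[:-1], quote
            tail = ""
            while text.endswith("\\n"):    # keep final \n escapes after the suffix
                text, tail = text[:-2], tail + "\\n"
            text = text + suffix + tail + close
        out.append(head + text + "\n")
        continuing = continues
    return out
```

The output can be written to a new .msg file and compiled with gencat(1) as described in step 3.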

Samples and Scripts

These samples and scripts are customized for testing within a certain product. While these are working versions, they will need customization for your particular product.

  • catprep (in a perl script file) is a Perl program which modifies the message files.
  • catprep_wide_perl (in a perl script file) is also a Perl program which modifies the message files. This program expands each message to a specified percentage by using the -e argument with a percentage amount.
  • prepall.c (in a .c source file) modifies the message files.
  • callprep (in a script file) this script calls the preparation program. This program must be modified for your environment.

  • catgets_msg.txt (in a .txt file) An example catgets message file before modification.
  • new_catgets_msg.txt (in a .txt file) An example catgets message file after it has been modified by calling the preparation script. You can compile this file into the message file by running the gencat(1) command.

B.5

A good approach to testing products using Java Resource Bundles is to bracket each message with a prefix and suffix, then create a modified message class file and install it in another locale. For multibyte locales, at least part of the prefix and suffix should contain multibyte characters.

Assume that the original message files are text files with lines of the form key=value. We assume that the message file is made into a .java file, then compiled to a .class file. When:

  • the messages are translated using the prefix and suffix;
  • the message file(s) are made into a .java file;
  • the .java file is compiled to a Java ResourceBundle class file with the appropriate locale-specific suffix, and is properly installed in locale-specific locations or in the default location

the messages should be read from the modified catalog when the product is run in that locale.

If any messages appear without the prefix and suffix, they may have been hardcoded or otherwise not prepared properly for messaging.

For example, given a message file named xx.messages, after translation, a script converts it to a file named MyClass.java. Then MyClass.java is compiled (assuming it will be installed in the ja locale) into MyClass_ja.class. This class file is then installed in the appropriate location.

How to Test

  • Access the original message files.
  • Translate them using a prefix and suffix as discussed above.
  • Convert the message file into a .java file.
  • Compile the translated message files to the proper .class files using the javac command.
  • Install the compiled message files at their proper locations for the target locale. For Java ResourceBundles, you can install the files at the same location as the default or other locale-specific message files, since it is the locale-specific suffix that is used to find the message files.
  • Set the LANG and LC_ALL environment variables to the locale to be tested. (NOTE: The Solaris Java Virtual Machine actually looks for the target locale in the LC_CTYPE variable.)
  • Run the product in the target locale. Verify that all messages have the prefix and suffix you inserted, to indicate that they come from the modified message catalog.
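The translate and convert steps can be sketched together in Python. This is not the i18_prop_to_java_ksh sample; the function names are hypothetical, and the generated class extends java.util.ListResourceBundle, which is one common way to represent a resource bundle as a class and is assumed here. The sketch also assumes simple key=value lines whose values contain no backslash escapes. Non-ASCII characters are written as \uXXXX escapes so that the generated source compiles regardless of the compiler's default encoding.

```python
def java_escape(text):
    """Escape text for use inside a Java string literal."""
    out = []
    for ch in text:
        if ch in ("\\", '"'):
            out.append("\\" + ch)
        elif " " <= ch <= "~":
            out.append(ch)
        else:                              # non-ASCII: emit UTF-16 code units
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


def properties_to_java(lines, class_name, locale, prefix, suffix):
    """Build Java source for a pseudo-translated, locale-specific bundle class."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue                       # skip blank lines and comments
        key, value = line.split("=", 1)
        rows.append('            {"%s", "%s"},' % (
            java_escape(key.strip()),
            java_escape(prefix + value.strip() + suffix)))
    return "\n".join([
        "import java.util.ListResourceBundle;",
        "",
        "public class %s_%s extends ListResourceBundle {" % (class_name, locale),
        "    protected Object[][] getContents() {",
        "        return new Object[][] {",
    ] + rows + [
        "        };",
        "    }",
        "}",
        "",
    ])
```

Calling properties_to_java(lines, "MyClass", "ja", prefix, suffix) returns source text for a class named MyClass_ja, which can be saved as MyClass_ja.java and compiled to MyClass_ja.class.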

Samples and Scripts

These samples and scripts are customized for testing within a certain product. While these are working versions, they will need customization for your particular product.

In the examples below, message files and non-message files, both localizable and non-localizable, are placed in two .java files. They are then compiled into a _<locale>.class file and installed.

The locale-specific .class file is named XX_YY.class, where XX is the class name and YY is the suffix of the locale, such as fr or ja.

  • msgprep_perl (in a perl script file) this perl script modifies each message file with the prefix and suffix.
  • prep_all (in a script file) this script calls the program that modifies each message file with the prefix and suffix.
  • i18_prop_to_java_ksh (in a script file) this script creates a .java file from the prefixed and suffixed message file (in the key=value format).
  • call_neil_shx (in a script file) this script calls i18_prop_to_java_ksh with the expected arguments.

  • src_ide_mess_prop.txt (in a .txt file) An example localizable message file before the script inserts prefixes and suffixes.
  • dest_ide_mess_prop.txt (in a .txt file) An example localizable message file after the script inserts prefixes and suffixes.
  • properties_ja_java.txt (in a .txt file) sample part of a .java file created as a result of running the script that converted the message file into a .java file. This file was generated for the ja locale and some characters may not display properly unless you have Japanese fonts on your system.

B.6 Compound Messages

Sometimes a message displayed to the user is actually composed of several shorter messages. Generally, this is not the correct way to internationalize a message. A long message, or a message that is displayed in parts, should be one message and not a composition of short messages.

Short Compound Message

An example of a short compound message is the set of messages:

Processing.
Processing..
Processing...
Processing... Finished.

This message could be constructed two ways:

  1. With 3 short messages:
    Processing
    ...
    Finished.
  2. With 3 longer messages:
    Processing
    Processing...
    Processing... Finished

In the first case, the three messages could be difficult to translate because the construction forces "Finished" to appear at the end of the line, a word order that not all languages support.

In the second case, "Processing... Finished" is a single message and thus more easily translated. The location of the dots (preceding or following "Processing") can also be localized.

Longer Multiline Compound Message

An example of a multiline compound message is:

There are %d errors in file %s.
Note: The file %s has been obsoleted and is now called %s.
Please use this new filename when describing errors.

This message could be constructed two ways:

  1. Incorrectly, with three short messages (one message per line)
  2. Correctly, with one longer message.

Using three short messages is incorrect because this assumes that when the three lines are translated, they will all still be on three lines. In some cases, the order of the sentences may change or the number of lines may change after translation.

How to Test for Compound Messages

To test for compound messages, add a prefix or suffix to each message and look for the prefix or suffix string in the middle of a message.

Once a prefix or suffix is added, the prefix or suffix string will be visible within a compound message. In the short message example, the message:

Processing.
Processing..
Processing...
Processing... Finished.

might look like:

AAAProcessing.ZZZ
AAAProcessing.ZZZAAA.ZZZ
AAAProcessing.ZZZAAA..ZZZ
AAAProcessing.ZZZAAA..ZZZAAAFinished.ZZZ

The last line shows that the incorrect method of using three separate messages was used to construct this single message.
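When displayed text can be captured (for example from a log or from command-line output), the check can be automated. The Python sketch below is a minimal illustration with a hypothetical function name; it assumes the "AAA" prefix and "ZZZ" suffix used in the example above.

```python
def classify(displayed, prefix="AAA", suffix="ZZZ"):
    """Classify one displayed line by where the prefix and suffix appear."""
    text = displayed.strip()
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return "not fully bracketed"       # hardcoded text, or a partly hardcoded line
    inner = text[len(prefix):len(text) - len(suffix)]
    if prefix in inner or suffix in inner:
        return "compound"                  # a marker appears in the middle
    return "single"
```

A line reported as "not fully bracketed" points to a hardcoded string rather than a compound message, so the same pass also supports the tests in the earlier sections.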

B.7 Dynamic Messages

Often when a message is displayed to the user, it is actually a dynamic message that has been constructed at runtime. For example, suppose a user saw the message:

There are 10 errors in file test.conf.

This message could have been constructed dynamically from a printf format such as:

There are %d errors in file %s.

This format contains two arguments: the first is an integer specifying the number of errors, and the second argument is a string specifying the name of the file. A translator will actually see this format as the string to translate.

Dynamic messages are difficult to translate for two reasons:

  1. The grammar of the language may change the position of the arguments with respect to one another. If the above message were translated to Chinese, for example, the Chinese grammar rules would translate the message to:

    File %s has %d errors.

    If this message were used with the original argument order, the integer and string arguments would be reversed and would print incorrectly. In addition, if the file had 0 errors, then a '0' would be passed as the argument for the '%s' conversion, which could cause a core dump.

    Fortunately, the printf(3) function call provides a facility to number the parameters. The above example message should be rewritten as:

    There are %1$d errors in file %2$s.

    In Java, this problem does not occur if a Format class is used because these classes only have positional parameters.

  2. The multiple arguments within a message may produce a message that is difficult to understand, and thus too difficult to translate. For example, a message such as:

    %1$s in %2$s has %3$d errors.

    would be impossible to translate without some kind of comments to the translator about what the first, second, or third arguments are. (NOTE: Comments to translators can be included in the C source file. See the man page for genmsg(1) command for an example.)
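The effect of numbered parameters can be illustrated in Python, whose str.format placeholders are index-based in the same spirit as numbered printf conversions and Java's Format classes. This is an analogy only: the template strings and function names are hypothetical, and Python raises an exception where C would misprint or crash.

```python
ENGLISH = "There are {0} errors in file {1}."
REORDERED = "File {1} has {0} errors."     # translation with a different word order


def render(template, error_count, filename):
    """Fill an index-based template; argument order is fixed by the caller."""
    return template.format(error_count, filename)


def render_unnumbered(template, error_count, filename):
    """Fill a printf-style template; conversions consume arguments in order."""
    return template % (error_count, filename)
```

render(REORDERED, 10, "test.conf") returns "File test.conf has 10 errors." even though the caller still passes the count first. By contrast, render_unnumbered("File %s has %d errors.", 10, "test.conf") raises a TypeError, because the %d conversion receives the filename.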

How to Test

To test for correct dynamic messages, search the message files for the formatting parameters and manually inspect their correctness. Message files can be manually changed and have positional parameters switched. While running the product, the tester can ensure that these parameters have been switched.
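The search for formatting parameters can be partly automated. The Python sketch below finds printf conversions in a message and flags messages that take more than one argument without numbering them. The function names and the regular expression are illustrative assumptions; the pattern covers common conversions only and is not a full printf grammar.

```python
import re

# "%%" is matched first so that a literal percent sign is never taken as a conversion.
SPEC = re.compile(r"%%|%(\d+\$)?[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L)?[diouxXeEfgGcsp]")


def conversions(message):
    """Return the regex matches for real conversions, skipping literal %%."""
    return [m for m in SPEC.finditer(message) if m.group(0) != "%%"]


def unnumbered_specs(message):
    """Return the conversions that lack an n$ position, in order of appearance."""
    return [m.group(0) for m in conversions(message) if m.group(1) is None]


def needs_positional(message):
    """True if the message has several arguments and any of them is unnumbered."""
    specs = conversions(message)
    return len(specs) > 1 and any(m.group(1) is None for m in specs)
```

Messages flagged by needs_positional are the ones to inspect by hand and to retest with the parameters switched.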

The example perl script below can be used for either the gettext or catgets methods to explicitly show that a message is a dynamic message.

Samples and Scripts

These samples and scripts are customized for testing within a certain product. While these are working versions, they will need customization for your particular product.

  • transmsg_perl (in a perl script file) this perl script modifies message files using either the catgets(3c) or gettext(3c) methods. Messages are modified as follows:
    • replaces all vowels in the middle of words with two vowels
    • replaces all printf formats with the format and the format in parenthesis. (e.g. %s -> %s(%%s))
  • trans_before_msg.txt (in a .txt file) An example message file before the script modifies the message.
  • trans_after_msg.txt (in a .txt file) An example message file after the script has modified the message.

The transmsg_perl script can be used to determine if positional parameters are used correctly. For example, in the dtmail program, after a message is moved to another file, a status line appears:

2 messages moved to saved_mailbox

This message means that two email messages were moved to the file called "saved_mailbox".

Use transmsg_perl on the dtmail message file and install the message file:

$ perl transmsg.pl DtMail.msg > /tmp/new.msg
$ gencat /tmp/new.cat /tmp/new.msg
$ su 
# mkdir /usr/dt/lib/nls/msg/en_US
# cp /tmp/new.cat /usr/dt/lib/nls/msg/en_US/DtMail.cat

and now run dtmail:

$ setenv LC_ALL en_US
$ dtmail &

to see the message:

	2(%d) mesaages moved to saved_mailbox(%s)

You can now see that the message was constructed incorrectly, without positional parameters. With positional parameters, it would have displayed:

	2(%1$d) mesaages moved to saved_mailbox(%2$s)