FAQ

Your Questions Answered

 
 
    New! Question of the Week: How do I start an application in a terminal window using a different locale from the current desktop locale?

  1. I am new to Java. How should I implement Unicode in my applications?
  2. All my windows have a strange "bar" underneath them...?
  3. Java Encoding Detection?
  4. Physical fonts vs. Asian fonts
  5. How do I install Cyrillic fonts on Solaris?
  6. Displaying multibyte characters
  7. Japanese Solaris
  8. Displaying Asian characters in utf-8 encoded pages
  9. Serving content from a bilingual site
  10. Thai support in Java
  11. Using setlocale() for Arabic
  12. setlocale() and XDrawImage
  13. Resizing Java Applications
  14. Multibyte Swing UIs
  15. Ensuring Arabic text displays right to left
  16. Building a Japanese search engine: I am unable to compare Japanese Strings with each other using Java
  17. How do I type keyboard shortcuts - composite characters - on Solaris using a PC keyboard?


Question of the Week

Dear globalwebmaster,

How do I start an application in a terminal window using a different locale from the current desktop locale?

globalwebmaster says,

You can do this by emulating the CDE login routine in a dtterm window, as described below. The processes that run when you log in to CDE include:

  • Specific configurations for the locale are set up
  • Fontpaths for the locale are loaded from /usr/openwin/lib/locale/<locale>/OWfontpath (and other possible places)
  • Session scripts, including the input method server, are run from /usr/dt/config/Xsession.d/*
  • The user's shell profile and resource files are read

Starting a CDE application from a shell does not load the required font paths or input method support when the target locale needs them and the current desktop does not already provide them. To emulate the CDE login, run xset fp+ for every font path listed in /usr/openwin/lib/locale/<locale>/OWfontpath; you may also need to start the input method server for the locale.

For information on starting the input method server, see /usr/dt/config/<locale>/0020.dtims, which usually runs the script files in /usr/openwin/lib/locale/<locale>/imsscript/ that contain the startup instructions for the input method server. For example, /usr/openwin/lib/locale/en_US.UTF-8/imsscript/S505multi contains the following:

if [ -x /usr/openwin/bin/htt -a -x /usr/lib/im/htt_xbe ]; then
         /usr/openwin/bin/htt -xim htt_xbe &
         unset DTSTARTIMS
fi

This starts the input method server, which loads each required language engine dynamically (if the system has the language engines).

Below are the steps you need to follow. If you have logged in to CDE using the Unicode or UTF-8 locales and you want to start an application in the same locale, you can skip Step 2 (and in most cases Step 1 too).

1. Add the font paths for the locale you want (in the csh example):

        foreach i ( `cat /usr/openwin/lib/locale/<locale>/OWfontpath` )
                xset fp+ $i
                xset fp rehash
        end

    (The xset fp rehash is probably not required, but it is a good idea after adding each font path.)

2. Start the input method server if your locale is an Asian or Unicode locale and has one or more script files in the /usr/openwin/lib/locale/<locale>/imsscript/ directory. For example, the following commands start an input method server for the Japanese PCK locale with the ATOK12 input system:

        setenv LANG ja_JP.PCK
        sh /usr/openwin/lib/locale/ja_JP.PCK/imsscript/S507atok12

3. Start the applications that will work with your locale choice, such as:

        env LANG=ja_JP.PCK dtterm &

If you are running a non-Unicode/UTF-8 locale desktop and start a CDE Unicode/UTF-8 locale application, the fonts in the application and the desktop might not appear to match. This is because the fontset and fontlist definitions used by the CDE desktop locale are not the same as those used by the Unicode/UTF-8 locales. This is a cosmetic issue that will not hinder any Unicode/UTF-8 character processing or other operations. However, if the font display problem is serious enough, consider using a Unicode/UTF-8 locale desktop.

When you specify a locale with setenv or env, make sure that the current shell's LC_ALL environment variable (and every other LC_* locale environment variable) is not set. If any of them is set, "setenv LANG <locale>" or "env LANG=<locale>" does not take effect, because the LC_* locale environment variables take precedence over the LANG environment variable. For more information on the locale environment variables, see the setlocale(3C) man page.
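
The precedence rule above can also be applied when a Java program launches the application. The following is only a sketch under assumptions: the class name LocaleLauncher and the helper withLocale are made up for illustration, and the code does not add font paths or start the input method server (Steps 1 and 2 still have to be done separately). It removes every inherited LC_* variable and then sets LANG, so that LANG alone decides the locale of the child process.

```java
import java.util.Map;

public class LocaleLauncher {

    /**
     * Prepares (but does not start) a process whose locale comes from LANG alone.
     * All inherited LC_* variables are removed because they would override LANG.
     */
    public static ProcessBuilder withLocale(String locale, String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        Map<String, String> env = builder.environment();
        // Drop LC_ALL, LC_CTYPE, LC_MESSAGES and so on, inherited from the parent.
        env.keySet().removeIf(name -> name.startsWith("LC_"));
        env.put("LANG", locale);
        return builder;
    }

    public static void main(String[] args) throws Exception {
        // Roughly what "env LANG=ja_JP.PCK dtterm &" does from a clean shell.
        withLocale("ja_JP.PCK", "dtterm").inheritIO().start();
    }
}
```

The builder is returned unstarted so that the caller can inspect or adjust the environment before launching the application.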



  1. I am new to Java. How should I implement Unicode in my applications?
    Dear globalwebmaster,

    I am new to Java. How should I implement Unicode in my applications?

    globalwebmaster says,

    Unicode is Java's native character set. As such, Unicode is "built-in" to the language. Creating an application that doesn't use Unicode would require effort, while using Unicode comes at minimal or no expense at all.
    The char type represents a character code unit; a UTF-16 code unit, to be precise. That means that any time you use char, String, or StringBuffer, you are using Unicode.
    Typically, you will create your Java source files in a non-Unicode charset. However, the javac compiler converts all characters in your source to Unicode in the class files. This happens automatically when you compile, so there is nothing for you to do unless your source file is in a different encoding from your host's native character encoding, which isn't typically the case.
    If that is the case, you must specify the source file encoding on the javac command line: javac -encoding <encoding> Foo.java
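
To make the "UTF-16 code unit" point concrete, here is a small sketch (the class name CodeUnits is only illustrative). It compares the number of char values in a String with the number of Unicode code points: for most characters the two are equal, but a character outside the Basic Multilingual Plane occupies two char values.

```java
public class CodeUnits {

    /** Number of UTF-16 code units (char values) in the string. */
    public static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points in the string. */
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String thai = "\u0e01\u0e02";   // two Thai letters, one char each
        String clef = "\uD834\uDD1E";   // U+1D11E, stored as a surrogate pair
        System.out.println(codeUnits(thai) + " " + codePoints(thai)); // prints "2 2"
        System.out.println(codeUnits(clef) + " " + codePoints(clef)); // prints "2 1"
    }
}
```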



  2. There is a bar underneath every window saying "[English/European]". How can I get rid of it?
    Dear globalwebmaster,

    I have installed Solaris 9. However there is a bar underneath every window saying "[English/European]". Why? How can I get rid of this?
    Here's the output from locale -a:

    POSIX
    C
    hi_IN.UTF-8
    common
    en_US.UTF-8
    iso_8859_1
    th
    th_TH
    th_TH.ISO8859-11
    th_TH.TIS620
    th_TH.UTF-8

    globalwebmaster says,

    It appears that you have installed only the listed locales. You probably also selected en_US.UTF-8 as the default system locale. Check the /etc/default/init file for the line LANG=en_US.UTF-8. If this line exists, you must have chosen en_US.UTF-8 as the system default locale during the installation.

    Looking at the available locales, the only English locales installed are C and en_US.UTF-8. The C locale is a 7-bit US-ASCII English locale, and en_US.UTF-8 is a Unicode locale for American English. If the 7-bit US-ASCII English locale is acceptable, you can either log in to CDE in the C locale by choosing Options->Languages->C from the menu button on the CDE login screen, or remove the LANG line from the /etc/default/init file and reboot the system, which makes the C locale the system's default locale.

    If neither the C nor the en_US.UTF-8 locales is what you want, then you'll have to add the correct packages that support the other English locales. [See http://developers.sun.com/techtopics/global/reference/techart/index.html#anchor02 for which packages.]



  3. Java Encoding Detection
    Dear globalwebmaster,

    My Java application gets information from various outside processes, for example a database or a web server, that might be running in various locales and on various UNIX or Windows platforms. These outside processes might output their data in various encodings. The application assumes that this outside information is in the system encoding under which the application itself is running; no encoding detection is attempted, so sometimes the information from these outside processes does not appear correctly in the application's GUI.
    Are there algorithms or other approaches that can be used to do some encoding detection, for example for some CJK encodings?

    globalwebmaster says,

    The data format used for the data exchange should be specified. A reasonably well-specified system these days would either require all data to be encoded in UTF-8, use XML (where the standard specifies how to detect the encoding), or specify some other way to communicate the character encoding from data provider to consumer.
    The J2RE has a Japanese-Autodetect converter which does a reasonable job of distinguishing between ISO 2022-JP, Shift-JIS, and EUC-JP, and I've heard about other attempts to create encoding detectors. But I suspect that any such detector would fail more often than you could tolerate.
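
If the data exchange does declare an encoding, the consumer can at least verify that the bytes are valid in it instead of silently displaying garbage. The following is a sketch, not a detector: the class name StrictDecode and the method decodeOrNull are illustrative. It uses java.nio.charset.CharsetDecoder configured to report malformed or unmappable input rather than replace it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {

    /** Decodes data in the declared charset, or returns null if the bytes are not valid in it. */
    public static String decodeOrNull(byte[] data, Charset declared) {
        try {
            return declared.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            return null; // the provider did not send what it declared
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};   // e-acute encoded as UTF-8
        byte[] latin1 = {(byte) 0xE9};              // e-acute encoded as ISO-8859-1
        System.out.println(decodeOrNull(utf8, StandardCharsets.UTF_8) != null);   // prints "true"
        System.out.println(decodeOrNull(latin1, StandardCharsets.UTF_8) != null); // prints "false"
    }
}
```

A successful strict decode does not prove that the declared encoding is the right one (many byte sequences are valid in several encodings), which is one reason guessing is unreliable.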


  4. Physical Fonts vs. Asian Fonts
    Dear globalwebmaster,

    Is it true that with the distribution of JDK 1.3 there are no physical fonts that support Asian characters?
    If the answer is no, I'd like to know the names of the fonts. If the answer is Yes, I'd like your advice on the following: I am using a component that creates a graph, in the form of a jpeg image. The index names can be in more than one language (taken from the properties file). The application may be installed on an English machine, but will need to serve Asian-language clients. So I cannot use logical fonts (and font "dialog" just looks terrible).

    globalwebmaster says,

    The documentation is correct, in that the Lucida fonts in J2RE 1.3.x and 1.4.x don't include glyphs for Chinese, Japanese, or Korean. It's also correct in saying that the logical font definitions on Microsoft Windows™ include these fonts only when running in a corresponding Windows locale.
    If an application needs to display CJK text in non-localized environments, it can either bring along its own physical font or request existing physical fonts by name (the second and fourth options at http://java.sun.com/j2se/1.4.2/docs/guide/intl/faq.html#Text Rendering).
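
One way to find an existing physical font to request by name is to ask each installed font whether it can display a sample string. This is a sketch under assumptions: the class name CjkFontFinder and the sample text are illustrative, and the result depends entirely on which fonts are installed on the machine that runs it.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

public class CjkFontFinder {

    /** Three kanji used as a sample of the text the application must render. */
    public static final String SAMPLE = "\u65e5\u672c\u8a9e";

    /** Returns the names of all installed fonts that have glyphs for every character of text. */
    public static List<String> fontsThatCanDisplay(String text) {
        List<String> names = new ArrayList<>();
        for (Font font : GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts()) {
            // canDisplayUpTo returns -1 when the whole string can be displayed.
            if (font.canDisplayUpTo(text) == -1) {
                names.add(font.getFontName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : fontsThatCanDisplay(SAMPLE)) {
            System.out.println(name);
        }
    }
}
```

A name from this list can then be passed to the Font constructor; an application that must work on machines without such a font would have to bring its own instead.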



  5. How do I install Cyrillic fonts on Solaris?
    Dear globalwebmaster,

    How do I install Cyrillic fonts on Solaris? What packages are required? Once installed, how do I access them?

    globalwebmaster says,

    Cyrillic fonts in Solaris 8 are in the following packages:

    • SUNWkoi8f for KOI8-R fonts
    • SUNWi5rf, SUNWi5rf for ISO8859-5 Cyrillic fonts
    • SUNW1251f for ANSI-1251/windows-1251 Cyrillic fonts

    These packages can be added using the pkgadd command. They can be found on CD 1 of 2 (/cdrom/..../Solaris_8/Product). If you have installed any of the Russian locales (that is, ru_RU.ISO8859-5, ru_RU.KOI8-R, or ru_RU.ANSI1251), then the fonts are already on your system.
    If you have installed any of the Unicode locales, you will have at least SUNWi5rf.
    To use the fonts, you can log in to one of the above locales or set your font paths in ~/.OWfontpath.



  6. Displaying multibyte characters
    Dear globalwebmaster,

    I want to write applications that are able to display multiple language characters such as Chinese, Korean, Japanese. The app is running on Solaris base locale but the data could be other locale's data. Is there any way to display that data?

    globalwebmaster says,

    One way would be to convert the data from the other locale's codeset to the current locale's codeset using the iconv(3) functions. It is possible that the incoming data cannot be represented in the current locale's codeset; in that case, you can run the application in a UTF-8/Unicode locale and convert the data from all other codesets to UTF-8, since Unicode covers most characters in use.
    Another way would be to use multiple locales in one application. This, however, makes your application more complicated, since you have to switch locales whenever necessary; for instance, switch to locale A when you draw text data of locale A, and so on.
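
The answer above refers to the C iconv(3) functions. Purely as an illustration of the same "convert everything to UTF-8" idea, here is a Java sketch (the class name Transcode and the method toUtf8 are illustrative). It uses ISO-8859-1 as the source codeset because every Java runtime supports it; whether CJK charsets such as EUC-KR are available through Charset.forName depends on the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Transcode {

    /**
     * Converts bytes from the source codeset to UTF-8.
     * Bytes that are invalid in the source codeset become U+FFFD.
     */
    public static byte[] toUtf8(byte[] data, Charset source) {
        return new String(data, source).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};                           // e-acute in ISO-8859-1
        byte[] utf8 = toUtf8(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length);                         // prints "2"
    }
}
```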


  7. Japanese Solaris
    Dear globalwebmaster,

    Which locale packages are needed to install the Japanese locale on Solaris 8?
    Is there documentation online on how to install the Japanese locales on top of a base installation of Solaris 8?

    globalwebmaster says,

    The package list for Japanese is available here.

    Adding packages: here.



  8. Displaying Asian characters in utf-8 encoded pages
    Dear globalwebmaster,

    I'm working with NS4.78 on Microsoft Windows 2K. I have a utf-8 encoded page, but the Asian characters are all messed up. What do I do?

    globalwebmaster says,

    It sounds as if you haven't specified the font for the Unicode encoding. Go to Edit->Preferences->Appearance->Fonts, select Unicode for the Encoding, and select Dotum, Batang, Gungsuh, or Lucida Sans Unicode for both the Fixed and Variable Width Fonts. Dotum has certainly worked perfectly for me; the others appear to work to varying degrees.



  9. Serving content from a bilingual site
    Dear globalwebmaster,

    Our site is available in English and Spanish. Currently the user is greeted with a "Please select your language......." page. I know you can configure language preferences in the browser - but can you do anything with this on the server side? - like automatically serving up the correct language?

    globalwebmaster says,

    The browser sends an Accept-Language header with every request; CGI scripts see it as the HTTP_ACCEPT_LANGUAGE environment variable. There are a number of ways this can be detected and used. CGI scripts and servlets can easily access it and serve up the appropriate page. However, you can do without either of these by configuring your web server to look for this preference in every request. Basically, the web server parses each request, and if it finds, for example, a preference for fr, it serves index.fr.html, and so on. If the browser specifies no language preference, a default page is served, probably the English version. The implementation differs slightly depending on your web server. Here's how Apache and iPlanet Web Server do it. See our web faq also.
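
For the servlet or CGI route, the selection logic can be sketched as follows. The class name LanguageNegotiator, the method names, and the index.<language>.html naming are assumptions for illustration; the header value would come from the request (for example HttpServletRequest.getHeader("Accept-Language") in a servlet). The sketch relies on Locale.LanguageRange.parse and Locale.lookup, which require Java 8 or later, and falls back to English when nothing matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LanguageNegotiator {

    // The site in the question is available in English and Spanish.
    private static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.forLanguageTag("es"));
    private static final Locale DEFAULT = Locale.ENGLISH;

    /** Picks the best supported locale for an Accept-Language header value. */
    public static Locale choose(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return DEFAULT; // the browser stated no preference
        }
        try {
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : DEFAULT;
        } catch (IllegalArgumentException e) {
            return DEFAULT; // malformed header
        }
    }

    /** Maps the chosen locale to a page name such as index.es.html. */
    public static String pageFor(String acceptLanguage) {
        return "index." + choose(acceptLanguage).getLanguage() + ".html";
    }

    public static void main(String[] args) {
        System.out.println(pageFor("es-MX,es;q=0.9,en;q=0.8")); // prints "index.es.html"
        System.out.println(pageFor("fr"));                      // prints "index.en.html"
    }
}
```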



  10. Thai support in Java
    Dear globalwebmaster,

    I have been unable to create a Java app with Thai letters. Can you help?

    globalwebmaster says,

    Thai support is only available with the latest J2SDK 1.4. Try the following sample code:

    import java.awt.*;
    import javax.swing.*;

    public class Thai {

        public static void main(String[] args) {
            JFrame f = new JFrame();
            // The first five Thai consonants, written as Unicode escapes
            JTextField t = new JTextField("\u0e01\u0e02\u0e03\u0e04\u0e05");
            //Font font = new Font("Angsana",Font.PLAIN,48);
            Font font = new Font("Lucida Sans", Font.PLAIN, 48);
            t.setFont(font);
            f.getContentPane().add(t);
            f.pack();
            f.setVisible(true);
        }
    }



  11. Using setlocale() for Arabic
    Dear globalwebmaster,

    I've been using the following code


    .....
    if ((locale=setlocale(LC_ALL,"ar_EG.8859-6"))==NULL)
    {
    printf("Can't set locale\n");
    }
    ........
    -but setlocale always fails. Why?

    globalwebmaster says,

    When using setlocale(), use ar instead of ar_EG.8859-6. Neither Arabic nor Hebrew has been changed to the long name format. To see whether you have the ar locale, check for an "ar" directory in /usr/lib/locale/.



  12. setlocale() and XDrawImage
    Dear globalwebmaster,

    I have an Ultra 10 box and I am using Xlib (C++) to render Arabic (ISO8859-6 encoded) strings on an image.

    1. If I have the Arabic locale installed on my system, can I just call setlocale with "ar_EG" (or whatever the name of the Arabic locale is) and use XDrawImageString to render the string with contextual character representation?
    2. Will the OS layer take care of how characters should be displayed?

    globalwebmaster says,

    1. The X output APIs (for example XDrawImageString, XDrawString, XTextExtents, and Xmb/wcDrawString) do not support complex text scripts such as Arabic and Hebrew. So the answer is no.
    2. The OS layer won't take care of the character rendering. Rendering belongs to the graphics toolkit layers; on UNIX, these are X and Motif. The current X APIs are not sufficient to handle complex text scripts. On Solaris, complex text scripts are supported at the Motif layer. So as long as your application is written using the Motif toolkit (static Label, button, TextField, Text widgets, and so on), Arabic will be rendered correctly.



  13. Resizing Java Applications
    Dear globalwebmaster,

    Is there any visual tool to efficiently resize the localized java application, by only altering the .java and the UIResBundle.properties files? Right now I am resizing manually (which is a shot in the dark) in .java source files in lines that look like :

    passwordLabel.setEnabled(false);
    passwordLabel.setLocation(new Point(16, 56));
    passwordLabel.setSize(new Point(64, 23));
    passwordLabel.setTabIndex(4);
    passwordLabel.setTabStop(false);
    passwordLabel.setText("Mot de passe :");

    And in the output resource UIResBundle.properties files in lines that look like:

    userSettingsGroupBox_passwordLabel_Text=	Mot de passe :
    userSettingsGroupBox_passwordLabel_Font=MS Shell Dlg, 11.0, 2, 400, false, false
    userSettingsGroupBox_passwordLabel_Position=16, 56, 64, 23, 0
    userSettingsGroupBox_passwordLabel_Context=passwordLabel

    globalwebmaster says,

    Although there may be an IDE that can parse and visually represent source files, your situation is exactly what we try to avoid. Typically we can avoid hard-coding location and sizes by using well-managed layout managers. Usually a combination of layout managers and nested panels can create sophisticated layouts without resorting to setting exact pixel locations. So, although I don't have a good recommendation for you on this project, I do recommend making better use of layout managers in your next one.
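
As an illustration of the layout-manager approach, here is a sketch that assumes a Swing application (the code in the question appears to use a different toolkit) and uses an illustrative class name, LoginPanel. The label is given no location or size; GridBagLayout derives both from the label's text and font, so a longer translation simply makes the first column wider.

```java
import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.Insets;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JPasswordField;

public class LoginPanel {

    /** Builds a label/field row whose geometry is computed by the layout manager. */
    public static JPanel build(String passwordLabelText) {
        JPanel panel = new JPanel(new GridBagLayout());
        GridBagConstraints c = new GridBagConstraints();
        c.insets = new Insets(4, 4, 4, 4);
        c.anchor = GridBagConstraints.LINE_START;

        // Column 0: the label takes whatever width its (localized) text needs.
        c.gridx = 0;
        c.gridy = 0;
        panel.add(new JLabel(passwordLabelText), c);

        // Column 1: the field absorbs any extra horizontal space.
        c.gridx = 1;
        c.weightx = 1.0;
        c.fill = GridBagConstraints.HORIZONTAL;
        panel.add(new JPasswordField(12), c);
        return panel;
    }
}
```

The label text would normally come from the resource bundle, for example build(bundle.getString("passwordLabel")) with a hypothetical key, so that no geometry has to be stored per language.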



  14. Multibyte Swing UIs
    Dear globalwebmaster,

    I want to write applications that are able to display multiple language characters such as Chinese, Korean, Japanese. The app is running on Solaris base locale but the data could be other locale's data. Is there any way to display that data?

    globalwebmaster says,

    One way would be to convert the data from the other locale's codeset to the current locale's codeset using the iconv(3) functions. It is possible that the incoming data cannot be represented in the current locale's codeset; in that case, you can run the application in a UTF-8/Unicode locale and convert the data from all other codesets to UTF-8, since Unicode covers most characters in use.
    Another way would be to use multiple locales in one application. This, however, makes your application more complicated, since you have to switch locales whenever necessary; for instance, switch to locale A when you draw text data of locale A, and so on.



  15. Displaying Arabic text right to left
    Dear globalwebmaster,

    How do I ensure that my Arabic text displays from right to left? What encoding should I use? Currently the browser output is garbled.

    globalwebmaster says,

    To view text in any language in a browser:
    1. Encode the page using an encoding appropriate to that language.
    2. Specify this encoding in a meta tag inside the HEAD element of the HTML page.
    3. The browser must then:
       a) support that encoding (if your browser is relatively recent, you can be sure of this);
       b) have fonts that can display the characters (not all fonts can display Arabic characters);
       c) be configured to use particular fonts for particular encodings. For help with this see unicode.org and/or here.

    You could encode an Arabic page in "iso-8859-6". Then in the HEAD tag you should include: <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-6"> The browser must then be able to support iso-8859-6 (this should be fine), and have fonts for this encoding (more variable - depends on your browser/platform).

    Of course, you can always use the "utf-8" encoding, as on the page http://www.columbia.edu/kermit/utf8.html. This page displays multiple languages and is a good page for testing the Unicode capabilities of your web browser. You should upgrade your browser to Netscape 6.2.2 or IE6; both provide good Unicode support.

    Around your Arabic text you should include this: <span dir="RTL" lang="AR"> Sample Arabic Text</span>
    Squares and rectangles appear in text when the configured font doesn't contain the required character.
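
When the page is generated by a program, the usual cause of garbled output is that the charset declared in the meta tag differs from the charset actually used to produce the bytes. The sketch below (the class name RtlPage and the method render are illustrative, and the text is not HTML-escaped) uses one Charset object for both, and wraps the Arabic text in the span shown above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RtlPage {

    /** Renders a minimal page whose declared charset matches the bytes returned. */
    public static byte[] render(String arabicText, Charset charset) {
        String html =
              "<html>\n<head>\n"
            + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
            + charset.name() + "\">\n"
            + "</head>\n<body>\n"
            + "<span dir=\"RTL\" lang=\"AR\">" + arabicText + "</span>\n"
            + "</body>\n</html>\n";
        // Encode with the same charset that the meta tag declares.
        return html.getBytes(charset);
    }

    public static void main(String[] args) throws Exception {
        String text = "\u0645\u0631\u062d\u0628\u0627"; // sample Arabic word
        System.out.write(render(text, StandardCharsets.UTF_8));
        System.out.flush();
    }
}
```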



  16. Building a Japanese search engine
    Dear globalwebmaster,

    I need to build a Japanese search engine using Java. However, I am unable to compare Japanese Strings with each other. I have tried Unix programming as well as JBuilder, but have not yet met with success. I'm trying to convert the native code over to Unicode first, then compare the Unicode strings. Do you have any advice on this?

    globalwebmaster says,

    The Java platform provides the java.text.Collator class to provide language-sensitive String comparisons. Convert the native text strings to Unicode using an InputStreamReader. Once you have Unicode String objects, using Collator is straightforward. Please see: http://java.sun.com/j2se/1.4.2/docs/api/java/text/Collator.html and http://java.sun.com/j2se/1.4.2/docs/api/java/io/InputStreamReader.html
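
The two steps might be combined as in the sketch below. The class name JapaneseCompare and its methods are illustrative. The example decodes UTF-8 so that it runs on any Java runtime; for native Japanese data you would pass the charset the data is really in (for example Charset.forName("Shift_JIS"), if your runtime provides it).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

public class JapaneseCompare {

    /** Step 1: convert native bytes to a Unicode String with an InputStreamReader. */
    public static String decode(byte[] data, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (InputStreamReader reader =
                     new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                text.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    /** Step 2: compare the Unicode strings with a Collator for Japanese. */
    public static boolean sameText(String a, String b) {
        Collator collator = Collator.getInstance(Locale.JAPANESE);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        byte[] stored = "\u65e5\u672c\u8a9e".getBytes(StandardCharsets.UTF_8);
        String query = "\u65e5\u672c\u8a9e";
        System.out.println(sameText(decode(stored, StandardCharsets.UTF_8), query)); // prints "true"
    }
}
```

Collator.setStrength can be used to make the comparison more or less tolerant of differences; which setting suits a search engine is a design decision this sketch does not make.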




  17. Keyboard Shortcuts
    Dear globalwebmaster,

    How do I type keyboard shortcuts - composite characters - on Solaris using a PC keyboard that does not have Compose or Alt Graph keys?

    globalwebmaster says,

    The alternative key sequence for the Compose key is Ctrl-Shift-T (press the Control, Shift, and T keys simultaneously).
    There are some common Alt Graph keyboard shortcuts, particularly for typing the Euro symbol. Recent Solaris Unicode locales and some Linux systems support alternative keyboard shortcuts for the Alt Graph key, such as either or both of the following:

    Compose c=
    Compose e=

    where the Compose key can be substituted with Ctrl-T.



    May 2003
