Note: Sun ONE Web Server is now marketed as the Sun Java System Web Server [SJSWS]
1. Introduction
The Internet addresses a global market, and Web site design
should take into consideration the requirements of users from different
countries. In an international setting, Web servers need to accommodate
data exchange in a wide range of character sets. HTTP was initially meant
for exchange of static documents, and internationalization issues were
not addressed properly. For instance, when a client submits a form to a
Web server, there is no mechanism to specify the encoding of the request.
Sun solves this problem and many others by providing a framework that allows
dynamic content exchange in different languages and character encodings
as requested by clients.
2. Static documents
Sun Java System Web Server can serve localized static documents according to
the client's preferred language. This is achieved through content negotiation
between the client and the Web Server. When a client sends a request to
the server using HTTP, it includes the Accept-Language header
listing the languages it accepts, in order of preference. You can configure
the Web Server to parse this language information in order to dynamically
load localized pages.
How to enable Accept-Language parsing in Web Server v6.0
Manual setting: Add acceptlanguage="on" to
the VSCLASS element in the server.xml file of the Web
Server instance.
Through Admin Console: Log into Admin Console and click on the Manage
button on the Servers tab. Click on the Virtual Server Class tab, then
click on the Edit Classes link. Change the Accept-Language value to on
and submit.
How to set the browser language
Netscape:
- Choose Preferences from the Edit menu.
- Choose Languages under the Navigator heading on the dialog
box. A list of preferred languages is displayed.
- Add languages and set the order of preference using the buttons on the
right hand side.
Internet Explorer:
- Choose Internet Options from the Tools menu.
- Click on the Languages... button on the General tab.
- Add languages and set the order of preference using the buttons on the
right hand side.
Example:
Suppose that acceptlanguage is set to on and that a client sends
the Accept-Language header with the values fr-CH, de when requesting
the following URL:
http://www.someplace.com/somepage.html
The Web Server searches for the file in the following order:
1. By each entry in the Accept-Language list, in order (language code plus country code where present)
http://www.someplace.com/fr_ch/somepage.html
http://www.someplace.com/somepage_fr_ch.html
http://www.someplace.com/de/somepage.html
http://www.someplace.com/somepage_de.html
2. By language code only, for entries that carry a country code (fr for fr-CH)
http://www.someplace.com/fr/somepage.html
http://www.someplace.com/somepage_fr.html
3. Using the DefaultLanguage value that is defined in the magnus.conf
file. For instance, if en is set to be the default, the lookup will
continue as follows:
http://www.someplace.com/en/somepage.html
http://www.someplace.com/somepage_en.html
4. If none of these are found, the server tries:
http://www.someplace.com/somepage.html
Note:
When naming your localized files, country codes like CH and TW are
converted to lower case and dashes (-) are converted to underscores (_).
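The lookup order above can be reproduced with a short sketch. This is only an illustration of the documented order, not the server's actual implementation; the class and method names are hypothetical, and any ";q=" weights in the header are simply dropped rather than used for sorting.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: lists candidate paths in the order described above. */
final class LocalizedLookupSketch {

    /** Converts a language tag such as "fr-CH" to the file-naming form "fr_ch". */
    static String toFileSuffix(String tag) {
        return tag.trim().toLowerCase(Locale.ROOT).replace('-', '_');
    }

    /** Adds the directory-style and the suffix-style candidate for one language. */
    private static void addCandidates(Set<String> out, String dir, String name,
                                      String ext, String suffix) {
        out.add(dir + suffix + "/" + name + ext);
        out.add(dir + name + "_" + suffix + ext);
    }

    static List<String> candidates(String path, String acceptLanguage, String defaultLanguage) {
        int slash = path.lastIndexOf('/');
        String dir = path.substring(0, slash + 1);
        String file = path.substring(slash + 1);
        int dot = file.lastIndexOf('.');
        String name = dot < 0 ? file : file.substring(0, dot);
        String ext = dot < 0 ? "" : file.substring(dot);

        // Split the header into tags, dropping any ";q=" weight (a simplification).
        List<String> tags = new ArrayList<>();
        for (String part : acceptLanguage.split(",")) {
            int semi = part.indexOf(';');
            String tag = (semi < 0 ? part : part.substring(0, semi)).trim();
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }

        Set<String> out = new LinkedHashSet<>(); // keeps order, drops duplicates
        // 1. Each entry exactly as sent (language plus country where present).
        for (String tag : tags) {
            addCandidates(out, dir, name, ext, toFileSuffix(tag));
        }
        // 2. Language code only, for entries that carry a country code.
        for (String tag : tags) {
            int dash = tag.indexOf('-');
            if (dash > 0) {
                addCandidates(out, dir, name, ext, toFileSuffix(tag.substring(0, dash)));
            }
        }
        // 3. The DefaultLanguage value.
        addCandidates(out, dir, name, ext, toFileSuffix(defaultLanguage));
        // 4. The unlocalized document.
        out.add(path);
        return new ArrayList<>(out);
    }
}
```

Calling candidates("/somepage.html", "fr-CH, de", "en") yields the nine paths of the example, from /fr_ch/somepage.html down to /somepage.html.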
Using Other Language Settings
The following directives in the magnus.conf file specify language
defaults:
| Directive | Values | Description |
| ClientLanguage | en, fr, de, ja | Specifies the language in which client messages, such as "Not Found" or "Access denied", are to be expressed. This value is used to determine which ns-httpd.db database to use for the localized messages. |
| DefaultLanguage | en, fr, de, ja | Specifies the language used if a resource cannot be found for the client language. |
3. Servlet Internationalization
3.1 Request Character Encoding
When form data is submitted from a browser to the server using
the POST method, the browser URL-encodes the POST data and sets the Content-Type
header to application/x-www-form-urlencoded, but does not send
any charset information.
On the server side, when a servlet accesses POST data using getParameter
or getParameterValues, the servlet container therefore has no information
about which character encoding to use for the parameter strings. You can
configure Sun Java System Web Server 6.0 to tell the servlet container which
character encoding to use for interpreting POST data strings. To do this,
specify the character encoding using the parameter-encoding element
in web-apps.xml:
<parameter-encoding enc="value1" form-hint-field="value2"/>
| Attribute | Value | Description |
| enc | | Allowed values are auto (the default), none, or a specific encoding such as UTF8 or Shift_JIS. |
| | any supported Java character encoding | A specific encoding, such as UTF8 or Shift_JIS. Set this option if you know the encoding that servlet parameters use. A complete list is available here: http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html |
| | none | Uses the system default encoding. Set this option if the encoding of the servlet parameter data is the same as the system default encoding. |
| | auto (default) | Tries to figure out the proper encoding from, in order: 1) the charset specified in the Content-Type header, 2) the parameterEncoding attribute (see ServletRequest.setAttribute), 3) a hidden form field defined in form-hint-field. Otherwise, the system default encoding is used. Set this option to prevent misinterpretation of non-ASCII characters in servlet parameters. |
| form-hint-field | | The name of the hidden field in the form that specifies the encoding. The default is j_encoding. |
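The resolution order used by the auto setting can be sketched as a plain function. This is a minimal illustration under stated assumptions, not the container's code: the class and method names are hypothetical, and the raw form fields are passed in as a simple map. Reading the hint field before the encoding is known works because an encoding name consists of ASCII characters only.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: mirrors the order in which enc="auto" looks for an encoding. */
final class AutoEncodingSketch {

    /** Default name of the hidden form field, as documented above. */
    static final String DEFAULT_HINT_FIELD = "j_encoding";

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /**
     * @param contentType        value of the request's Content-Type header (may be null)
     * @param encodingAttribute  value of the parameterEncoding request attribute (may be null)
     * @param rawFormFields      form fields as received, before any charset conversion
     * @param hintField          name of the hidden field, for example DEFAULT_HINT_FIELD
     */
    static String resolve(String contentType, String encodingAttribute,
                          Map<String, String> rawFormFields, String hintField) {
        // 1) charset in the Content-Type header
        String fromHeader = charsetFromContentType(contentType);
        if (fromHeader != null) {
            return fromHeader;
        }
        // 2) the parameterEncoding request attribute
        if (encodingAttribute != null && !encodingAttribute.isEmpty()) {
            return encodingAttribute;
        }
        // 3) the hidden form field
        String hint = rawFormFields.get(hintField);
        if (hint != null && !hint.isEmpty()) {
            return hint;
        }
        // Otherwise: the system default encoding
        return Charset.defaultCharset().name();
    }
}
```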
Which option you choose from the above list depends on your
application. If you design your application for only one specific language,
for instance Japanese using the Shift_JIS encoding, you can specify the
value: <parameter-encoding enc="Shift_JIS"/>. If you want your
application to be multilingual, you can choose UTF-8 since it covers most
languages: <parameter-encoding enc="UTF8"/>.
However, some types of clients prefer a more locale-specific encoding; in these
cases UTF-8 is not the best choice. The auto choice in conjunction
with the hidden field is a more flexible solution but requires more effort:
every form that is sent to the client must carry the hidden field naming the
encoding in which the form will be submitted, so that the Web Container does
the correct conversion when you call the getParameter function. If the hidden
field is set correctly, the Web Container will automatically do the correct
conversion.
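To see why the container needs to know the request encoding, the following sketch decodes the same URL-encoded bytes with two different charsets using the standard java.net.URLDecoder. The class and method names are hypothetical, and UTF-8 and ISO-8859-1 are used only because every Java runtime supports them; the same effect applies to Shift_JIS data.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Illustrative only: the same POST bytes read with the right and the wrong charset. */
final class FormDecodingSketch {

    /** URL-encodes text the way a browser would for a page served in the given encoding. */
    static String encode(String text, String encoding) {
        try {
            return URLEncoder.encode(text, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    /** Decodes URL-encoded form data using the given encoding. */
    static String decode(String urlEncoded, String encoding) {
        try {
            return URLDecoder.decode(urlEncoded, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        String typed = "caf\u00e9";                 // what the user entered
        String onTheWire = encode(typed, "UTF-8");  // what the browser sends
        System.out.println(onTheWire);
        // Decoding with the encoding the browser used restores the text.
        System.out.println(decode(onTheWire, "UTF-8").equals(typed));
        // Decoding with a different encoding silently produces wrong characters.
        System.out.println(decode(onTheWire, "ISO-8859-1").equals(typed));
    }
}
```

No exception is raised in the mismatched case, which is why a wrong parameter encoding usually shows up only as garbled text in the application.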
Examples using the hidden field:
- The users who access your application are registered users. In this
scenario each user has a profile which contains preferences for language
and charset. The language is used for localized documents and messages.
The charset value is used for data conversion when receiving requests and
sending responses. After the user is authenticated, the charset value is
loaded from the user profile. Every form that is sent to the user includes
the hidden field.
- In this next scenario anyone can access the Web site. No profile data
for the users is saved on the server. When a user accesses the application
for the first time, the request.getLocale() method is called to find the
preferred language of that client. Each language is mapped to a charset
in a special mapping table. A new session object is then created to store
the language and charset values. Each time a form is sent to the user,
a hidden field is included. The value of the hidden field is obtained from
the charset value stored in the session object. The user can also be offered
the option of dynamically switching between languages within the same session.
For example, a banner can always be included in the pages that are sent
to the client. A servlet that modifies the values of language and charset
can be called when the user clicks on a particular language in the banner.
If a user is in an English locale and wishes to switch to a Japanese locale,
they could click on a Japanese language link which invokes a servlet:
http://<host>/ChangeLocaleServlet?lang=ja&charset=Shift_JIS.
The ChangeLocaleServlet will alter the session object to set the value
of language to ja and charset to Shift_JIS;
it will then redirect the call to a Japanese page. The user will now have
a Japanese interface. Later on, when a form is sent to the client, the
hidden field value will be Shift_JIS.
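The second scenario can be sketched without the servlet API by letting a plain map stand in for the session object. Everything here is an assumption made for illustration: the class name, the contents of the language-to-charset table, the UTF-8 fallback, and the check that rejects a charset that does not match the table (a real ChangeLocaleServlet would need some validation of this kind, since the values arrive in the URL).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative only: session-scoped language/charset and the hidden form field. */
final class SessionCharsetSketch {

    static final String HINT_FIELD = "j_encoding"; // default form-hint-field name
    static final String FALLBACK_CHARSET = "UTF-8"; // assumption for unmapped languages

    /** Hypothetical mapping table from language code to charset. */
    private static final Map<String, String> CHARSET_BY_LANGUAGE = new HashMap<>();
    static {
        CHARSET_BY_LANGUAGE.put("en", "ISO-8859-1");
        CHARSET_BY_LANGUAGE.put("fr", "ISO-8859-1");
        CHARSET_BY_LANGUAGE.put("de", "ISO-8859-1");
        CHARSET_BY_LANGUAGE.put("ja", "Shift_JIS");
    }

    static String charsetFor(String language) {
        String charset = CHARSET_BY_LANGUAGE.get(language);
        return charset != null ? charset : FALLBACK_CHARSET;
    }

    /** First visit: derive language and charset from the client's preferred locale. */
    static Map<String, String> newSession(Locale preferred) {
        Map<String, String> session = new HashMap<>();
        session.put("language", preferred.getLanguage());
        session.put("charset", charsetFor(preferred.getLanguage()));
        return session;
    }

    /** What a ChangeLocaleServlet would do with its lang and charset parameters. */
    static void changeLocale(Map<String, String> session, String lang, String charset) {
        if (!charsetFor(lang).equals(charset)) {
            throw new IllegalArgumentException("Unexpected charset for " + lang + ": " + charset);
        }
        session.put("language", lang);
        session.put("charset", charset);
    }

    /** The hidden field to include in every form sent to this user. */
    static String hiddenField(Map<String, String> session) {
        return "<input type=\"hidden\" name=\"" + HINT_FIELD
                + "\" value=\"" + session.get("charset") + "\">";
    }
}
```

In a servlet, newSession would correspond to reading request.getLocale() on the first visit and storing the two values as session attributes.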
3.2 Response Character Encoding
The charset for the response can be specified with the setContentType
method of the ServletResponse interface. For example, the call response.setContentType("text/html;
charset=Shift_JIS") tells the Web Container to encode the response
character stream using the Shift_JIS encoding. The Content-Type header that is sent to the
client will be set to text/html; charset=Shift_JIS, the HTTP counterpart of the tag
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">;
this will allow the client to correctly interpret
the response content. If the response includes a form, browsers typically
encode the POST data using the same encoding that is specified in the header. It
is very important to always call the setContentType method with
the charset value in order to ensure consistent communication between the
client and the server.
If the charset value is not specified in the setContentType call, or
if setContentType is not called at all, ISO-8859-1 will be used
as the default encoding.
The setContentType method should always be called before calling the
getWriter method to get the PrintWriter object.
Another servlet internationalization function is ServletResponse.setLocale(java.util.Locale);
when this function is called, the servlet will set the Locale information
of the response. But since there is no one-to-one mapping between locale
and charset, a best match will be applied to map the charset to the locale
that is passed to the function call.
The setLocale and setContentType methods should be called
before the getWriter method of the ServletResponse interface is called.
This ensures that the returned PrintWriter is configured appropriately
for the target Locale. A call to the setContentType method with a charset
component for a particular content type will override the value set via
a prior call to setLocale.
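The ordering rules above can be modelled with a small stand-in for the response object. This is a sketch under assumptions, not the container's implementation: the class name is hypothetical, the locale-to-charset "best match" is reduced to a single hard-coded case, and the choice that a later setLocale call does not replace an explicitly set charset is an assumption of this sketch that the text above does not cover.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.util.Locale;

/** Illustrative only: why the charset must be settled before getWriter is called. */
final class ResponseEncodingSketch {

    private final ByteArrayOutputStream body = new ByteArrayOutputStream();
    private String charset = "ISO-8859-1"; // default when nothing is specified
    private boolean explicitCharset = false;
    private PrintWriter writer;

    void setLocale(Locale locale) {
        if (writer != null || explicitCharset) {
            return; // too late, or an explicit charset is already in place (assumption)
        }
        // Hypothetical best match from locale to charset.
        charset = "ja".equals(locale.getLanguage()) ? "Shift_JIS" : "ISO-8859-1";
    }

    void setContentType(String contentType) {
        if (writer != null) {
            return; // the writer already encodes with the earlier charset
        }
        int start = contentType.toLowerCase(Locale.ROOT).indexOf("charset=");
        if (start < 0) {
            return;
        }
        int end = contentType.indexOf(';', start);
        String value = contentType
                .substring(start + "charset=".length(), end < 0 ? contentType.length() : end)
                .trim();
        if (!value.isEmpty()) {
            charset = value; // overrides a value set by a prior setLocale call
            explicitCharset = true;
        }
    }

    /** The writer is bound to whatever charset is in effect at this moment. */
    PrintWriter getWriter() {
        if (writer == null) {
            writer = new PrintWriter(new OutputStreamWriter(body, Charset.forName(charset)));
        }
        return writer;
    }

    String charset() {
        return charset;
    }

    byte[] bytes() {
        if (writer != null) {
            writer.flush();
        }
        return body.toByteArray();
    }
}
```

Because the writer is created with a fixed charset, any call made after getWriter has no effect on how the characters are encoded, which is the reason for the ordering requirement.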
4. Posting to JSPs
You can configure parameter-encoding to work the same way when
you are posting to a JSP instead of a servlet. The following example shows
a JSP that, with enc set to auto, uses the parameterEncoding request
attribute to read parameters that are in the Shift_JIS encoding:
<%@ page contentType="text/html; charset=Shift_JIS" %>
<html>
<head>
<title>JSP Test Case</title>
</head>
<body>
<% request.setAttribute("com.iplanet.server.http.servlet.parameterEncoding", "Shift_JIS"); %>
<h1>The Entered Name is : <%= request.getParameter("test") %> </h1>
</body>
</html>
5. Serving all the documents using the same charset
You can override a client's default character set setting for
a document, a set of documents, or a directory by selecting a resource
and entering a character set for that resource. The browser uses a MIME-type
charset parameter in the document header to detect the charset of the requested
document.
To change the charset, go to the Admin Console of the Web Server
and follow these steps:
- From the Class Manager, click the ContentMgmt tab.
- In the left frame, click on the International Characters link.
- Choose The Entire Server from the resource picker to apply your
change to the whole class, or navigate to the document root of a specific
virtual server, or to a specific directory within a specific virtual
server.
- Set the Character set (charset) for all or part of the server. If
you leave this field blank, the charset is set to NONE.
- Click OK.
6. Conclusion
Sun Java System Web Server 6.0 simplifies the management of international
data. With some planning and evaluation of your customers, you can configure
the web server to satisfy their document requirements. For more information about
Sun Java System Web Server, see the product information page at:
http://wwws.sun.com/software/products/web_srvr/home_web_srvr.html