Website Internationalization
There are various ways to implement internationalization (“i18n”) on websites. This article discusses some of the topics that should be considered.
User Interface and Content
It is important to distinguish between the user interface of a website and the content.
The user interface of this website is quite small. On this page, the user interface consists of the site menu (“Articles”, etc.), the article metadata (“Author”, etc.), and the copyright information in the page footer. If the site were localized to Japanese, the site menu items and article metadata headings would be in Japanese (「記事」…) and the published date would be localized (「2021年5月6日」). If tag labels were used, they would be localized as well. Copyright information in the footer could be localized, but it is often not.
On this page, the content is the text of the article. Note that this would include any text in images or other assets if there were any.
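The split between user interface and content can be sketched as a small message catalog that holds only interface strings, plus a date formatter. This is a minimal illustration and not how this site is built: the catalog keys, the helper names, and the Japanese label for "Author" are assumptions made for the example.

```python
from datetime import date

# Hypothetical UI message catalog: interface strings only, never article text.
UI_MESSAGES = {
    "en": {"menu.articles": "Articles", "meta.author": "Author"},
    "ja": {"menu.articles": "記事", "meta.author": "著者"},
}

MONTHS_EN = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def ui_text(locale: str, key: str) -> str:
    """Look up a UI string, falling back to English if it is not translated."""
    return UI_MESSAGES.get(locale, {}).get(key, UI_MESSAGES["en"][key])


def format_published_date(locale: str, day: date) -> str:
    """Format a date for the two locales covered by this sketch."""
    if locale == "ja":
        return f"{day.year}年{day.month}月{day.day}日"
    return f"{MONTHS_EN[day.month - 1]} {day.day}, {day.year}"
```

A real site would more likely rely on an established localization library for date formats and plural rules than on hand-written patterns like these.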
Locale Selection
Users need a way to select their desired locale. There are various ways to do this, including the following:
- Display a locale selection interface on every page, usually in the page header or page footer. This method can provide the best usability on sites where users do not have accounts, as it allows users to switch locales without having to navigate away from the page and then search for it again using the different locale. Sometimes an implementation will link/redirect to the website/locale root when changing locales, which is terrible usability.
- Display a locale selection interface on a settings page. This method is most often used when users have accounts on the site, where the selected locale is persisted in a database and loaded whenever the user logs in.
- Display a locale selection interface when a user first visits the site. In my experience, many users find this to be annoying.
There are various ways to implement locale selection interfaces, including the following:
- Display the locale names in text. Each locale name is usually displayed in that locale because users can often spot their language more easily when it is written in their language.
- Select locales using flags. This is often considered to be insensitive to users. For example, English speakers around the world may find it annoying to have to click on a British or U.S. flag to select their own language.
- Select locales using a map. This method does not work well for small countries and can be difficult for the geographically challenged. It is usually only seen on huge corporate websites that force users to select a locale when they first visit the site.
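The text-based option can be sketched as a small function that renders a list of links, each labelled in its own language. The locale list, the function name, and the markup are assumptions for illustration. The caller supplies the target URL for each locale, so the menu can point at the same page in another locale instead of the locale root.

```python
from html import escape
from typing import Dict

# Each locale is labelled in its own language so users can spot it quickly.
LOCALE_NAMES = {"en": "English", "ja": "日本語", "de": "Deutsch"}


def render_locale_menu(current: str, hrefs: Dict[str, str]) -> str:
    """Render an HTML list of locale links; the current locale is not a link."""
    items = []
    for code, href in hrefs.items():
        name = escape(LOCALE_NAMES.get(code, code))
        if code == current:
            items.append(f'<li lang="{code}" aria-current="true">{name}</li>')
        else:
            items.append(
                f'<li><a href="{escape(href)}" lang="{code}" '
                f'hreflang="{code}">{name}</a></li>'
            )
    return "<ul>" + "".join(items) + "</ul>"
```

For example, `render_locale_menu("en", {"en": "/en/about", "ja": "/ja/about"})` produces a plain-text "English" entry and a link to `/ja/about` labelled 日本語.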
Locale Persistence
When a locale is selected, that locale should be used throughout the site. There are various ways to persist the locale selection, some of which include the locale in URLs and some of which do not. Note that one generally cannot share a link directly to a page in a specific locale if the locale is not included in the URL.
The following types of sites usually do not include the locale in URLs:
- Sites that require login
- Sites with content that is not localized (such as user content)
There are various ways to persist the locale, including the following:
- The locale can be stored in a server-side database.
- The locale can be stored in a session variable on the server. The value may be initialized from a database, while the session variable is used as a cache so that the database does not need to be queried on each page load.
- The locale can be stored in a cookie, perhaps initialized from a database. This is a popular choice for “stateless” server implementations, which makes it trivial to implement load balancing.
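As a rough sketch of the cookie option, the following uses Python's standard `http.cookies` module to read a locale from a `Cookie` request header and to build a `Set-Cookie` value. The cookie name `locale`, the supported locale set, and the one-year lifetime are assumptions for the example, not values taken from any particular site.

```python
from http.cookies import SimpleCookie
from typing import Optional

SUPPORTED = {"en", "ja"}   # assumed list of supported locales
COOKIE_NAME = "locale"     # hypothetical cookie name


def locale_from_cookie_header(header: str) -> Optional[str]:
    """Return the locale stored in a Cookie request header, if it is supported."""
    cookie = SimpleCookie()
    cookie.load(header)
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in SUPPORTED:
        return morsel.value
    return None


def set_locale_cookie_value(locale: str) -> str:
    """Build the value of a Set-Cookie response header that persists the locale."""
    if locale not in SUPPORTED:
        raise ValueError(f"unsupported locale: {locale}")
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = locale
    cookie[COOKIE_NAME]["path"] = "/"                     # valid on every page
    cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365   # about one year
    cookie[COOKIE_NAME]["samesite"] = "Lax"
    return cookie[COOKIE_NAME].OutputString()
```

Validating the cookie value against the supported set matters because the cookie is client-controlled. Because the server keeps no per-user state here, any server behind a load balancer can handle the request.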
Other types of sites usually include the locale in URLs. One reason for doing so is search engine optimization (SEO). There are various ways to include the locale in a URL, including the following:
- The locale can be part of the URL path (example: https://www.example.com/ja/about). This is probably the most popular choice at this time. The biggest issue is that a site might not localize all content to all supported locales. This poses a problem for locale selection, and many developers take the easy route of linking to the locale root when changing locales, which is annoying for users.
- The locale can be part of the URL host (example: https://ja.example.com/about). This method is essentially the same as putting the locale in the path, but there are a number of technical differences. To implement HTTPS, one needs to use either a separate certificate for each locale or a wildcard certificate. Common scripts must either be served from each host or be shared using Cross-Origin Resource Sharing (CORS). If cookies are used, policies must be configured to allow appropriate access across the various hosts used.
- The locale can be passed as a GET parameter. This method allows sharing both links to pages in a specific locale (example: https://www.example.com/about?lang=ja) and links where the locale is determined automatically (example: https://www.example.com/about). It is not popular these days because it does not work well with SEO. It can also be annoying to implement because every site link on a page must be customized for each locale.
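The three URL schemes can be compared with a short sketch that rewrites the current URL for a newly selected locale while staying on the same page. The function names, the supported locale list, the `lang` parameter name, and the `base_domain` argument are assumptions for the example. The sketch does not check whether the page actually exists in the target locale, which a real site would need to handle.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SUPPORTED = ("en", "ja")  # assumed list of supported locales


def switch_path_locale(url: str, locale: str) -> str:
    """Locale as the first path segment: /ja/about -> /en/about."""
    parts = urlsplit(url)
    segments = parts.path.split("/")        # "/ja/about" -> ["", "ja", "about"]
    if len(segments) > 1 and segments[1] in SUPPORTED:
        segments[1] = locale                # replace the existing locale
    else:
        segments.insert(1, locale)          # no locale yet, so add one
    return urlunsplit(parts._replace(path="/".join(segments)))


def switch_host_locale(url: str, locale: str, base_domain: str) -> str:
    """Locale as a subdomain: ja.example.com -> en.example.com (port ignored)."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=f"{locale}.{base_domain}"))


def switch_query_locale(url: str, locale: str) -> str:
    """Locale as a GET parameter: ?lang=ja -> ?lang=en, keeping other parameters."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "lang"]
    query.append(("lang", locale))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

In all three cases the rest of the URL is preserved, so a locale switcher built this way keeps the user on the current page rather than sending them to the locale root.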
Locale Initialization
When a locale is not already selected, the initial locale must be determined. There are many ways to do this, including the following:
- The HTTP Accept-Language header is used to convey the user’s preferred locales, generally based on browser preferences. This list of weighted preferences can be compared to the list of locales supported on the site/page to determine an initial locale. This is usually the most user-friendly method.
- Geolocation can be used to (approximately) determine where a user is accessing the site from based on the IP address, and that location can be used to determine an initial locale. Note that ignoring Accept-Language headers and setting the locale based on geolocation alone is often a source of frustration for users.
- The site may use a default locale. The default locale is often English, but note that it depends on the target audience of the website.
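Comparing Accept-Language preferences with the supported locales can be sketched as follows. This is a simplified matcher written for illustration: it sorts the header entries by their `q` weight and accepts either an exact match or a match on the primary language subtag. It assumes the supported locales are given as lowercase codes and does not implement the full matching rules from the relevant standards.

```python
from typing import List, Optional, Sequence, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept-Language value into (tag, weight) pairs, best first."""
    prefs = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        weight = 1.0                      # a missing q value means weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                continue                  # skip entries with a malformed weight
        if weight > 0:                    # q=0 means "not acceptable"
            prefs.append((tag, weight))
    # sorted() is stable, so entries with equal weight keep header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def match_locale(header: str, supported: Sequence[str]) -> Optional[str]:
    """Return the best supported locale for the header, or None if none match."""
    for tag, _ in parse_accept_language(header):
        primary = tag.split("-")[0]       # "en-us" -> "en"
        for candidate in (tag, primary):
            if candidate in supported:
                return candidate
    return None
```

For a site supporting `["en", "ja"]`, a header of `ja,en-US;q=0.8,en;q=0.5` selects `ja`, while `fr-CH, fr;q=0.9, en;q=0.8` falls through to `en`.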
For example, visiting https://www.example.com may use all of the above methods (in the above priority order) to determine the initial locale. If Accept-Language indicates that Japanese is preferred, and Japanese is supported on the site, then geolocation and the default locale are not used. The user is automatically redirected to https://www.example.com/ja and uses the Japanese locale until a different locale is explicitly selected.
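The priority order in this example can be sketched as a small fallback chain. The function names, the country-to-locale table, and the choice of English as the default are assumptions for illustration. The Accept-Language match and the geolocated country code are taken as inputs, since how they are obtained depends on the server and on whichever geolocation service is used.

```python
from typing import Optional

SUPPORTED = ("en", "ja")                      # assumed supported locales
DEFAULT_LOCALE = "en"                         # assumed site default
COUNTRY_TO_LOCALE = {"JP": "ja", "US": "en"}  # hypothetical geolocation mapping


def initial_locale(header_match: Optional[str], country: Optional[str]) -> str:
    """Pick the initial locale: Accept-Language, then geolocation, then default."""
    if header_match in SUPPORTED:
        return header_match                   # browser preference wins
    geo_locale = COUNTRY_TO_LOCALE.get(country or "")
    if geo_locale in SUPPORTED:
        return geo_locale                     # fall back to the user's location
    return DEFAULT_LOCALE                     # last resort


def redirect_target(base_url: str, header_match: Optional[str],
                    country: Optional[str]) -> str:
    """Build the URL that a visit to the site root is redirected to."""
    return f"{base_url.rstrip('/')}/{initial_locale(header_match, country)}"
```

With a Japanese Accept-Language match, `redirect_target("https://www.example.com", "ja", "US")` yields `https://www.example.com/ja` regardless of the geolocated country.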