Internationalization: Automating localized UI using Selenium WebDriver

At my last company, we had a web app that supported seven languages. The challenge was to develop automated UI tests that could run against every localized UI. We used Selenium WebDriver and its Java binding to simulate browser interactions with the web app. At the time, we had already automated a significant portion of the app for the English UI. All test development was done on Windows in the Eclipse IDE. Eclipse wouldn’t display native characters from most of the languages correctly, even though I experimented with changing the encoding to UTF-8 and so on; the text appeared garbled. (My more recent experience on OS X is that native text displays correctly in Eclipse, perhaps because of the default encoding on OS X.)

So my research began, and I started looking into the internals of WebDriver and how it works. I learnt that Selenium WebDriver uses the JsonWireProtocol to communicate with the browser and that the protocol supports UTF-8. A quick note from the JsonWireProtocol documentation: “Although the server may be extended to respond to other content-types, the wire protocol dictates that all commands accept a content-type of application/json;charset=UTF-8. Likewise, the message bodies for POST and PUT request must use an application/json;charset=UTF-8 content-type.” That told me that the underlying WebDriver implementation for any browser would send the request body as UTF-8.

However, I still had to deal with Eclipse errors and funky characters. So my next step was to find a tool that could convert native characters into something plain ASCII could carry. I found that the JDK provides an excellent utility called native2ascii, which converts native characters into their ASCII representation as \uXXXX Unicode escapes. Since this was a one-time or once-in-a-while process, I decided to leave this step manual. However, there is a native2ascii Maven plugin and an API that could be used if you want to automate this step. You can also specify the encoding of the source file. So let’s take the example below (if you are developing on OS X, this step is perhaps not required).

source.txt
ラドクリフ、マラソン五輪代表に1万m出場にも含み

command
native2ascii -encoding utf8 source.txt output.txt

output.txt
\u30e9\u30c9\u30af\u30ea\u30d5\u3001\u30de\u30e9\u30bd\u30f3\u4e94\u8f2a\u4ee3\u8868\u306b1\u4e07m\u51fa\u5834\u306b\u3082\u542b\u307f
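To make the conversion less mysterious, here is a minimal Java sketch of roughly what it does: every character outside 7-bit ASCII is replaced by a \uXXXX escape of its UTF-16 code unit. The class and method names (NativeToAscii, escape) are made up for illustration, and this is only an approximation of the idea, not the JDK tool's actual implementation.

```java
public class NativeToAscii {
    // Keep 7-bit ASCII as-is; write every other UTF-16 code unit
    // as a backslash-u escape with four lowercase hex digits.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is written with escapes so this source file stays pure ASCII.
        System.out.println(escape("ERROR=\u30a8\u30e9\u30fc"));
    }
}
```

Running main prints the ERROR line in its escaped form, which is the same shape as the content of output.txt.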

Try the quick sample code below:

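import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

WebDriver driver = new FirefoxDriver();
driver.get("http://www.google.com.hk/");
// Escaped form of the Japanese sample text, split only to keep the line short
String textToSend = "\u30e9\u30c9\u30af\u30ea\u30d5\u3001\u30de\u30e9\u30bd\u30f3\u4e94\u8f2a"
        + "\u4ee3\u8868\u306b1\u4e07m\u51fa\u5834\u306b\u3082\u542b\u307f";
WebElement searchBox = driver.findElement(By.name("q"));
searchBox.sendKeys(textToSend);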

You will notice that the search box is actually populated with ラドクリフ、マラソン五輪代表に1万m出場にも含み. Nice!!!! Now I had an easy way to transform text into something that both Selenium WebDriver and my IDE could understand. This was promising indeed. So I decided to externalize the text I was verifying into properties files (key-value pairs), one per locale. The keys would stay the same across locales; only the values would differ. Based on the locale under test, I would load the appropriate properties file and the automation would run seamlessly.

I was happy with this approach and went ahead with it. Our application contained many pages, and two weeks later I realized that my properties files were growing large. I had seven properties files to maintain for seven languages, and things were getting out of control. I wanted to step back, review my approach, and see whether I could change it quickly before it was too late. Agile taught me well! I posted questions on forums but didn’t get any convincing answers.

I went to our front-end developers to understand the actual implementation of the localized UIs and learnt that the backend implementation is independent of locale. They received locale files from external vendors who translated the text into the desired languages. These locale files were essentially key/value pairs, similar to my earlier approach. The keys were the same across the locale files, but the values changed with the language. All HTML files for the individual locales were pre-compiled as part of the build, and the corresponding pre-compiled resources were loaded based on the user’s language preference at the login screen; the backend remained the same. On some pages, JavaScript would return dynamic text based on user actions (locale-specific success/error messages and so on).

I then realized that I had been reinventing the wheel and could simply use the resource bundles from the vendors. So I decided to use the vendors’ locale files directly, parse them as Properties files, and read them in my code. Before parsing them, I converted all resource bundles from native characters to ASCII (Unicode escapes) so that Eclipse wouldn’t complain and WebDriver could understand them as well. As I showed above:

native2ascii -encoding utf8 japanese_native.txt japanese_ascii.txt
native2ascii -encoding utf8 chinese_native.txt chinese_ascii.txt
and so on....

To put things into perspective, below is what we would get from the vendors. Each column is a locale file, and the two rows show sample content of those files.

english.text             japanese_native.txt        chinese_native.txt
SUBMIT_LABLE= Submit     SUBMIT_LABLE= 提出する     SUBMIT_LABLE= 提交
ERROR= Error             ERROR=エラー               ERROR=錯誤

The native2ascii tool easily converted these native source files into ASCII files, which were then fed into my automation suite. The converted files looked like this:

english.text             japanese_ascii.txt                        chinese_ascii.txt
SUBMIT_LABLE= Submit     SUBMIT_LABLE= \u63d0\u51fa\u3059\u308b    SUBMIT_LABLE= \u63d0\u4ea4
ERROR= Error             ERROR=\u30a8\u30e9\u30fc                  ERROR=\u932f\u8aa4
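The reason these escaped files are usable as-is is that java.util.Properties decodes \uXXXX escapes while loading. A minimal sketch to confirm that, with the chinese_ascii.txt sample inlined as a string so it runs without any files (the class and method names are hypothetical):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapedBundleDemo {
    // Parse properties-format text; Properties.load decodes the
    // backslash-u escapes back into the original characters.
    static Properties parse(String bundleText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(bundleText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Same two entries as the chinese_ascii.txt sample, inlined for the demo.
        String bundle = "SUBMIT_LABLE= \\u63d0\\u4ea4\nERROR=\\u932f\\u8aa4\n";
        Properties props = parse(bundle);
        // The loaded value equals the native text (written here as an escaped literal).
        System.out.println(props.getProperty("SUBMIT_LABLE").equals("\u63d0\u4ea4")); // prints true
    }
}
```

Note that the space after the equals sign in the sample files is harmless: leading whitespace before a value is skipped when the file is loaded.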

And here is some pseudo-code to give you an idea. Based on the language under test, I load the appropriate locale file:

// Pick the bundle that matches the language under test
if (language.equalsIgnoreCase("japanese")) {
    properties.load(new FileInputStream("japanese_ascii.txt"));
} else if (language.equalsIgnoreCase("spanish")) {
    properties.load(new FileInputStream("spanish_ascii.txt"));
}
// and so on...
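With seven languages the if/else chain gets long, so one option is to derive the file name from the language instead. The sketch below assumes every converted bundle follows the language_ascii.txt naming used above (english.text would need a special case) and that the files live in one directory; LocaleBundles, fileNameFor and load are hypothetical names, not part of the original suite.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Properties;

public class LocaleBundles {
    // Assumed naming convention: "<language>_ascii.txt".
    static String fileNameFor(String language) {
        return language.trim().toLowerCase(Locale.ROOT) + "_ascii.txt";
    }

    // Load the escaped bundle for a language from the given directory.
    // try-with-resources closes the stream even if loading fails.
    static Properties load(Path bundleDir, String language) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(bundleDir.resolve(fileNameFor(language)))) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot load bundle for " + language, e);
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("Japanese")); // prints japanese_ascii.txt
        // Expects the converted files in the current working directory.
        Properties props = load(Paths.get("."), "japanese");
        System.out.println(props.getProperty("SUBMIT_LABLE"));
    }
}
```

Adding a language then only means dropping a new converted file into the directory; no test code changes.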

My page objects would contain something like the lines below. So, just like the application code, my automation suite depended only on loading the correct file; the page objects themselves were independent of the locale.

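// Look up the expected label for the locale under test, then compare it with the UI
String expectedLabel = properties.getProperty("SUBMIT_LABLE");
boolean labelMatches = expectedLabel.equals(submitButton.getAttribute("value"));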

Benefits of this approach

  • No need to create and maintain separate locale files for the tests.
  • UI functional automated tests ran on all locales seamlessly.
  • Straightforward and less complicated. UI automation is difficult in itself; don’t complicate it further.
  • The automation suite found many locale-specific bugs:
    • Some languages were missing a few keys after translation, due to ordinary human error. This caused the default English text to appear where locale-specific text was expected.
    • Integration bugs with JavaScript, such as dynamic success/error messages.
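The missing-key bugs above were caught through the UI tests. As a complementary idea (not something the original suite did), the bundles can also be compared directly before any browser is started. A minimal sketch with hypothetical names, using in-memory Properties in place of the vendor files:

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

public class MissingKeyCheck {
    // Keys present in the reference bundle but absent from the translated one.
    static Set<String> missingKeys(Properties reference, Properties translated) {
        Set<String> missing = new TreeSet<>(reference.stringPropertyNames());
        missing.removeAll(translated.stringPropertyNames());
        return missing;
    }

    public static void main(String[] args) {
        Properties english = new Properties();
        english.setProperty("SUBMIT_LABLE", "Submit");
        english.setProperty("ERROR", "Error");

        // Simulated translation that forgot SUBMIT_LABLE.
        Properties japanese = new Properties();
        japanese.setProperty("ERROR", "\u30a8\u30e9\u30fc");

        System.out.println(missingKeys(english, japanese)); // prints [SUBMIT_LABLE]
    }
}
```

A check like this only shows that a key is absent; the UI tests are still what confirm the fallback English text actually reaches the page.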

The approach worked great and provided significant value from a testing standpoint. The lesson I learnt from this experience is that we need to retrospect constantly and improve our testing practices. Fail fast so that you can recover quickly. Work closely with developers and product stakeholders: doing so helped me understand the internals of our web application and led me to the answers I was looking for. UI automation is a hard problem to solve, not to mention the ever-changing web app and the need to test on different browsers, platforms, and so on. I believe that complicating browser automation suites further by adding more layers to support different locales is just overhead and is destined to fail. In my case the web app used resource bundles; other apps may have a different implementation. Either way, I believe that for testing localized UIs we should leverage the application’s native internationalization solution as much as possible.

What do you think? How would you approach it?

26 thoughts on “Internationalization: Automating localized UI using Selenium WebDriver”

  1. All I am trying to do is go to the Chinese Salesforce site and click on the products link using By.linkText with Unicode escapes. How do I do it? When I do it my way, I get an exception.

    WebDriver driver = new FirefoxDriver();
    driver.get("http://www.salesforce.com/cn/");
    driver.findElement(By.linkText("\ufeff\u89e3\u51b3\u65b9\u6848")).click();
    System.out.println("finish");
    Error Message:

    Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"link text","selector":"解决方案"} Command duration or timeout: 1.08 seconds For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html

    • That’s because you have a Byte Order Mark (BOM), \ufeff, at the beginning of your link text. Remove it and the linkText locator should work for you. I just tested it and it works.

      WebElement element = driver.findElement(By.linkText("\u89e3\u51b3\u65b9\u6848"));
      element.click();

  2. Pingback: Overview of Localization and Internationalization Testing

  3. You rock, really. I was going through these steps and got it working for Appium + Android mobile automation. I would like to share a shortcut that I came across during my testing. If you are working with Eclipse, create a .properties file inside a package, create a key (any name will do), and paste in the locale data that you intend to convert. Bingo! Eclipse converts it to Unicode escapes automatically. You can use the properties file directly as input to your tests.

    Key=General
    value=सेटिंग (which was auto converted to \u0938\u0947\u091F\u093F\u0902\u0917 )

    String ss = props.getProperty("General");
    driver.findElement(By.name(ss)).click();

    Cheers,

  4. Hi,
    I am having a problem checking text on a page that I find with a UI locator. How can I check that the text exists in the message.properties file? Also, if I use the message.properties file, navigation between the pages poses a problem.

  5. Hi, but what about user data that is entered by the user, stored in the database, and displayed in the GUI? It is never translated according to the language, so those test cases fail. Can you suggest any alternative?

  6. Hi, great post. I am in a similar situation: I need to develop UI automation that runs in 10 different languages. Could you please give me a framework structure, such as how to get the locale info and how to proceed from there? It would be really helpful. Thanks.

  7. Hi Nilesh, I need your suggestion. We have just started localizing our application into Chinese, and in the near future we will be translating it into Russian. We have an automation framework running, but only for English. The framework uses the page object model with TestNG. We have an input file in JSON format that contains all the data we send in the forms, page objects that contain the web elements for each page along with small functions that perform different actions (using if/else and while loops), and finally TestNG classes that call the functions from the page objects and pass in the data from the input JSON file. My issue is that the page objects find elements by XPath, link text, and id, and all of these locators contain English text.
    My question is how I can reuse the existing framework and scripts to test the localized web app without too much effort and time. The only thing I can think of is to replace the English text in each web element locator with its corresponding Unicode-escaped text, but that will take a lot of effort, since almost all the element ids, XPaths, and so on would need updating.
    Can you please suggest a better approach for this problem?
