Automate your Android App Testing Using Robotium

I like that it’s the time of the year when I actually have some time to invest in blogging here. Updates have been pretty sporadic this year (another item to add to my list of resolutions once 2013 hits in a few days).

In this post, I’d like to talk a little about Robotium, and how you can use it to automate the QA of your Android applications. Mobile applications are perfect for Quality Assurance automation – they are usually small, have limited functionality when compared with desktop applications, and are rarely complex in terms of testing steps. Also, if you’re a solo mobile developer, you may not have the time (or money) to invest in proper quality assurance of your applications.

We’ve been using a Robotium-based solution written in Java at my day job for a while to validate the quality of our localized product builds, both as a unit testing tool and as an actual test automation tool used by our QA team.

One of the main reasons we chose to invest time in Robotium is the flexibility it offers in how it recognizes objects within your app (as it must, for example, in order to click a button). This is especially important for us, as we want to develop a solution once and have it support multiple language versions of our application. We also looked at MonkeyRunner, which ships with the Android SDK. MonkeyRunner uses Jython scripts (Jython is a Python implementation in Java) to walk through applications.

I prefer Robotium, as MonkeyRunner lacks the tight UI integration that Robotium offers (although MonkeyRunner is much easier to set up and run).

When developing automated test scripts using Robotium, we can use the text of a UI object, for example the label on a button, to identify that object at run-time. What we do is compile the localized resources for all our languages into our test application. If we switch the device language to German, for example, the test application loads the resources from the ‘values-de’ folder. We can therefore add new languages to our automation solution simply by re-packaging the APK file containing our test automation scripts, and it will run on any of our supported locales without code modifications.

One of the disadvantages of this approach (and, to be fair, of almost all UI-based test automation solutions) is that maintenance may be high. For example, you will probably lose 90% of your code in a UI refresh.

Creating a Simple Robotium Automated Test Case

  • First off, download the latest version of Robotium (3.6 at the time of writing).
  • Now simply create a new Android application in the IDE of your choice. I prefer NetBeans due to familiarity, but I know most people like Eclipse for Android development.
  • Add the Robotium JAR as a reference in your project.

For the purpose of this example, let’s say we want to click the ‘Cancel’ button in the screen capture below. As I mentioned above, our Robotium code uses string IDs to find objects: based on the current device language, we look up the value of the string ID for that language (if available) and use it to find the object on the current screen.

MMS Image

Code (Finally!)


package com.jc.robotium;

import android.app.Activity;
import android.test.ActivityInstrumentationTestCase2;
import com.jayway.android.robotium.solo.Solo;
import android.content.res.Resources;
import android.content.Context;
import java.util.Locale;

public class RobotiumExample extends ActivityInstrumentationTestCase2 // Provides functional testing of an activity
{
    private static final String TARGET_PACKAGE_ID = "com.yourcompany.yourapp";
    private static final String LAUNCHER_ACTIVITY_FULL_CLASSNAME = "com.yourcompany.yourapp.Activities.MainActivity";
    private static Class launcherActivityClass;
    
    private Solo solo;
    private Activity activity;
    private Resources res;
    private Context context;
   
    // Resolve the launcher activity class of the application under test by name
    static 
    {
        try
        {
            launcherActivityClass = Class.forName(LAUNCHER_ACTIVITY_FULL_CLASSNAME);
        }
        catch(ClassNotFoundException e)
        {
            throw new RuntimeException(e);
        }
    }
    
    public RobotiumExample() throws ClassNotFoundException
    {
        super(TARGET_PACKAGE_ID, launcherActivityClass);
    }
    
    @Override
    protected void setUp() throws Exception
    {
        activity = getActivity();
        solo = new Solo(getInstrumentation(), activity);
        context = getInstrumentation().getContext();
        res = context.getResources();
    }
    
    public void testUI() throws Throwable
    {
        // ....

        // Robotium code begins here

        // Wait for the 'Cancel' button
        solo.waitForText(res.getString(R.string.cancel));

        // We may want to take a screen capture...
        solo.takeScreenshot();

        // Click the cancel button
        solo.clickOnText(res.getString(R.string.cancel));

        // ....
    }
    
    @Override
    protected void tearDown() throws Exception
    {
        try
        {
            solo.finalize();
        }
        catch(Throwable e)
        {
            // Catch this
        }
        getActivity().finish();
        super.tearDown();
    }
}

Notice a few things above:

  • We are extending the ActivityInstrumentationTestCase2 class, which provides us with the ability to run test methods on the UI thread.
  • The method that contains your test logic must be prefixed with ‘test’; notice mine is called testUI().
  • We could have looked for the ‘Cancel’ button by passing the text ‘Cancel’ to the waitForText() and clickOnText() methods, but that would only work on the English version of our product. Instead we use the string ID, and look up the value of that string ID based on the current device locale.
  • If you use the ‘takeScreenshot()’ method I have shown an example of above, screen captures will be saved in ‘/sdcard/Robotium-Screenshots/’ on your test device.

Running the Test

Compile the APK containing your test automation script and install it on your test device, along with the APK under test. One ‘gotcha’ here with Robotium is that the APK under test and the APK containing the test logic must be signed with the same certificate. This may not be the case if your APK under test comes out of a build system of some sort. You can use resign.jar to re-sign your APK under test with the same signature as your APK containing the test logic (by running it on the same machine on which you compiled the test APK).

Once both are installed on a test device, we can launch the test from a command prompt via adb:

adb shell am instrument -e class com.jc.robotium.RobotiumExample -w com.jc.robotium/android.test.InstrumentationTestRunner

Conclusion

Robotium is a good test framework for getting automation for Android applications up and running very quickly. It has an active community and updates are released regularly. The latest version (3.6 at the time of writing) also seems to be a lot more stable than previous versions.

On the flip side, since we are relying on text resources (which may change frequently), the maintenance on Robotium-based automation solutions can be high if your UI changes a lot, which is quite likely for a mobile application.

If you’re looking for an automation framework for your Android application that you can get up and running quickly, and use to run scripts across an Android application supported in many languages, I would recommend Robotium, and will be keeping a close eye on it as it develops further.

Interacting with web pages using Selenium WebDriver for C#

I’ve been using Selenium WebDriver for C# a lot lately, for a number of projects that involved interacting with a web browser in some manner. I’ve used a lot of applications and libraries over the past few years that provide this functionality, but I’ve never come across one as intuitive and reliable as Selenium WebDriver – if you work on any projects that involve interacting with a web browser to automate some process, you need to read this post.

In this post I’ll take you through the process of using Selenium WebDriver to automate some interaction with a web browser and hopefully show you how powerful Selenium is. We’ll take a simple scenario as an example – submitting a request to the Google search engine.

Boot up Visual Studio and create a new C# console application. You’ll need to download the Selenium WebDriver for C# ZIP from Google Code, and add the DLLs it contains as references to your project. For this example, you are only required to add WebDriver.dll and Newtonsoft.Json.Net35.dll. You’ll also need to add the following using statements:

using OpenQA.Selenium;
using OpenQA.Selenium.IE;

Now you’re ready to write some code that can drive your web browser. First, we’ll need to create the object that can do just that. With Selenium WebDriver, that is an IWebDriver object. This object can be instantiated to control Internet Explorer, Firefox, or Chrome. In this example, I’m going to use Internet Explorer.

IWebDriver driver = new InternetExplorerDriver();

You could also use ChromeDriver or FirefoxDriver above.

Next, let’s navigate to the Google website. The following code will handle this. One of the nice things about Selenium is that this call won’t return until the page is loaded in the browser. Other frameworks and libraries return immediately, requiring you to add waits/sleeps in your code to ensure the page is actually loaded, which is a terrible approach.

driver.Navigate().GoToUrl("http://www.google.com");

Aside – Internet Explorer Problems

One setting has to be changed in Internet Explorer in order for Selenium WebDriver to work correctly. Your security zones need to have protected mode either enabled or disabled – it doesn’t matter which, as long as it’s the same for each zone. I’ve got it set to on for each zone on my machine. To achieve this follow these steps:

  1. Open IE and go to Internet Options.
  2. Go to the ‘Security’ tab.
  3. Click on each of the zones, i.e. ‘Internet’, ‘Local intranet’, ‘Trusted sites’ and ‘Restricted sites’ and ensure that the ‘Enable Protected Mode (requires restarting Internet Explorer)’ check box is checked.

At this point, build your project in Visual Studio and run it. If you have followed the above steps correctly you will see Internet Explorer open, and navigate to Google automatically.

Now to actually interact with the open web page, Google in this case. In order to submit a search term, we’ll need to interact with the search term text box (to enter a search term), and the ‘Google Search’ button (to submit the request).

To do this, we’ll need to look at the source code for the page, to find the information we’ll need to interact with these controls. From looking at the source code, we can see that the markup for these controls looks like the following (formatted and commented for clarity):

Google Search

So, let’s define the search term text box, and enter a search term:

IWebElement searchTermTB = driver.FindElement(By.Name("q"));
searchTermTB.SendKeys("jimmy collins blog");

Take note of what we’re doing here – we’re using the driver object we defined earlier to find an element with the name ‘q’. This is another great thing about Selenium – we can use just about any element attribute to find it. You could also use the ID, class name, tag name, or even an XPath expression to find an element on the page.

Now build and run your application – you will once again see Internet Explorer opening up, but this time you’ll also see the search term being entered.

The final step is to actually click the ‘Google Search’ button and submit the query. We follow the same approach that we used to find and interact with the search term text box:

IWebElement searchBtn = driver.FindElement(By.Name("btnG"));
searchBtn.Click();

Running your application now will open up Google, enter your search term, and hit the search button. The final thing to do is some cleanup – you will notice that currently when your application runs, the browser is left open once it completes. All we have to do to close the browser is:

driver.Close();

That’s how easy Selenium is to use. The ideal scenario is that you have interaction with your development team, and get them to agree to providing static IDs on all controls, that don’t change between versions of your site (unless in the case of a substantial UI revamp). That would make it a simple task to provide re-usable automation that can automatically verify changes to your site, or be used for regression testing.

Running Selenium Tests with C# & NUnit

Something I’m working on currently requires some automation of a web browser, so what a perfect opportunity to get some exposure to Selenium.

In this post I’ll outline the basics of creating and running a simple Selenium test using Selenium and NUnit. The implementation language will be C#. To get started, you will need to download the Selenium Client Driver and NUnit.

Extract the Selenium Client Driver files; these DLLs will be referenced in the Visual Studio project we create. Install NUnit using the .msi installer.

Now, let’s create the actual test:

  1. Launch Visual Studio 2010 and create a new class library project.
  2. Add a reference to nunit.framework.dll. This can be found under the NUnit installation directory at ‘bin\net-2.0\framework’.
  3. Add references to all the DLLs contained in the Selenium Client Driver package you downloaded earlier.
  4. We’re now ready to add the code that will run a Selenium test. Add the following code to your class library project:

using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

namespace FirstSeleniumTest
{
    [TestFixture]
    public class SeleniumTest
    {
        private IWebDriver driver;

        [SetUp]
        public void SetUp()
        {
            driver = new InternetExplorerDriver();
        }

        [Test]
        public void TestGoogle()
        {
            driver.Navigate().GoToUrl("http://www.google.com");
        }

        [TearDown]
        public void TearDown()
        {
            driver.Quit();
            driver.Dispose();
        } 
    }
}

The above code should be pretty easy to understand. Notice the attributes on the methods.

  • [SetUp] – This is where any test setup should be completed. In the above example we’re creating a new instance of InternetExplorerDriver, setting it up for our test to run later.
  • [Test] – This is where the test steps are defined. In this example, we’re just navigating to Google.
  • [TearDown] – In this section, any steps to be taken to clean up the environment after your test has run can be defined. Here, all we’ll do is close Internet Explorer.

Now that we have written a simple test, let’s try running it using NUnit. Before moving on, ensure that the above code builds successfully in your environment.

To run the test it’s just a matter of launching NUnit and opening up the DLL built from the Visual Studio project created above. You should see the ‘TestGoogle’ test listed. Simply select the test and hit the ‘Run’ button to initiate the test. You will see Internet Explorer launch, and then close.

NUnit

One thing you may need to do, depending on your IE version, is to disable protected mode for Internet and Restricted Zones in Internet Explorer Security Settings (don’t forget to re-enable these once you’ve finished experimenting with Selenium).

In a future post, I’ll outline how to do more complex tasks during your tests, such as taking screenshots and navigating around web pages under test.

Retrieve settings from COM+ components via C#

Recently, I had a requirement to retrieve settings information, such as the Constructor String, from a number of COM+ components running on a server. The idea behind this was to give us a snapshot of a server’s configuration, and also allow easy comparisons between different servers in the event of issues. This is tedious and time-consuming to do manually, especially if you’ve got a large number of components within each COM+ application, so I resolved to write a small C# program to do this for me and write the data to a file.

COM+ provides an administration object model that exposes all of the functionality of the Component Services administrative tool, so anything you can do through the graphical administrative tool can also be done programmatically. To get started, you’ll need to add a reference to the ‘COM+ 1.0 Admin Type Library’. This can be found under the ‘COM’ tab when you go to add a reference to your project in Visual Studio.

You’ll also need to add the following using directive:


using COMAdmin;

First, we’ll need to create an object to hold the catalog of COM+ components installed on the machine. Here’s the code to create this catalog, and also retrieve a list of all the COM+ applications it contains:


COMAdminCatalog catalog;
COMAdminCatalogCollection applications;

// Get the catalog
catalog = new COMAdminCatalog();

// Get the list of all COM+ applications contained within this catalog
applications = (COMAdminCatalogCollection)catalog.GetCollection("Applications");
applications.Populate();

We now have an object, ‘applications’, which contains all the data about the COM+ applications installed on this machine. To go a little deeper, and see which components each application contains, is just as easy:


foreach (COMAdminCatalogObject application in applications)
{
    // Get the components that belong to this application
    COMAdminCatalogCollection components =
        (COMAdminCatalogCollection)applications.GetCollection("Components", application.Key);
    components.Populate();

    foreach (COMAdminCatalogObject component in components)
    {
        Console.WriteLine("Component: " + component.Name);
    }
}

The above code shows you how to get a list of COM+ applications and their components, but what about retrieving or setting the values of specific component settings like the Constructor String of a component?

Here’s how:


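// Set the value of a constructor string
component.set_Value("ConstructorString", "127.0.0.1");
// Persist the change to the COM+ catalog
components.SaveChanges();
// Get the value of a constructor string
string constructorString = (string)component.get_Value("ConstructorString");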

That’s a quick overview. I leave it as an exercise for the reader to explore the other functionality of the ‘COMAdmin’ library, but if you just need to retrieve values of settings from specific components, the above will get you started.

As per normal, MSDN has some great documentation here.

Test Automation Success Criteria

I’ve been thinking recently about how to validate the success of an automation project, and actually prove a return on investment to anyone interested. This is important for a number of reasons.

Firstly, it shows the QA team, (who are most likely set in their ways with the traditional manual testing approach), that at least some of their work can be completed in a more efficient manner, and testing coverage can be increased without any extra effort on their part. Secondly, it shows management that automation is worth investing in. It also boosts the morale of the people who actually worked on the development of the automation, and gives them confidence to continue and make even more improvements.

Bret Pettichord defines what I think are four excellent criteria to validate any test automation project against:

  1. The automation runs
  2. The automation does real testing
  3. The automation finds defects
  4. The automation saves time

I believe the fourth item above is probably the most important (assuming 1 and 2 are satisfied). The whole point of test automation is process improvement, and having the ability to absorb more work without requiring additional resources. Item 3 is important also: any defects should obviously be flagged, if not logged automatically as well.

If the four points mentioned are satisfied once your automation project is complete, I believe it can be qualified as a success.

Programmatically verify resources in a DLL

I had a requirement recently to be able to programmatically check certain resources were contained in a set of (native) DLL resource files. The idea behind this was to add some post-build automated engineering checks to our existing automated test suite, e.g. ensuring resources for all the required languages have been injected correctly.

I wanted to write a simple C# application to perform these checks. I came across this handy library, which contained functions for almost all the functionality I required:

ResourcesLib

Using this, we can perform functions such as importing resources, loading strings, and even injecting new resources.

For example, here’s how you could retrieve all languages contained in a resource DLL:

string file = "resource.dll";
RawResourceFile resFile = new RawResourceFile();
resFile.Load(file);

for (int i = 0; i < resFile.Languages.Count; i++)
{
    Console.WriteLine("Language: " + resFile.Languages[i]);
}

Download the library and take a look. It can save you lots of time if you need to perform any checks/actions on DLL files.

Developing Language Independent test automation

At my day job, we’re currently developing an automation suite to perform Build Verification Testing (BVT) and some basic functional testing on new builds coming into QA. This is a challenge in itself, but becomes even more difficult when you consider we currently release products across 28 distinct locales, including right-to-left languages like Arabic and Hebrew.

The drawback of this is that we need to think of the bigger picture when writing an automation script/program. A program we develop to perform some action on an English build, on an English XP, may not perform as expected on, say, a German build on a German language version of XP.

Here are some general guidelines I’ve learnt along the way, to keep in mind when developing automation scripts/programs that you intend to run across multiple languages/platforms.

1. Never hard code language dependent information

I’ve seen this a lot. Such information may be the expected title of an alert, that appears when the script performs some action. Hard-coding the expected alert title in English will cause the script to be useless on any other language. All these strings should be externalised in some way, even to a simple text file.
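
As a minimal sketch of this guideline in Java, the expected alert title below is looked up by key from a per-language table rather than hard-coded in the test. The class name ExpectedStrings, the key ‘alert.title’ and the sample translations are all hypothetical; a real suite would load the tables from external files instead of building them in code.

```java
import java.util.HashMap;
import java.util.Map;

public class ExpectedStrings {
    // Hypothetical per-language tables; a real suite would load these from external files.
    private static final Map<String, Map<String, String>> TABLES = new HashMap<>();

    static {
        Map<String, String> en = new HashMap<>();
        en.put("alert.title", "Error");
        Map<String, String> de = new HashMap<>();
        de.put("alert.title", "Fehler");
        TABLES.put("en", en);
        TABLES.put("de", de);
    }

    // Returns the expected string for the given language, falling back to English
    // when the language or the key is missing.
    public static String get(String language, String key) {
        Map<String, String> table = TABLES.getOrDefault(language, TABLES.get("en"));
        String value = table.get(key);
        return value != null ? value : TABLES.get("en").get(key);
    }

    public static void main(String[] args) {
        // The test script asks for the string by key, never by its English text.
        System.out.println(get("de", "alert.title"));
    }
}
```

Adding a language then means adding a table, not touching the test logic.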

2. Never hard code paths to Windows system folders

Hard-coding these paths, such as the paths to ‘Program Files’ or the ‘Documents and Settings’ folder will cause your automation to fail on non-English platforms. The problem with these is that they may be localized on some environments. For example, ‘Program Files’ becomes ‘Programme’ on a German environment. If you need to use these paths, always use the Windows environment variables to retrieve the value, that way you can be sure the path will be valid for that platform.

For example, retrieving the path to the ‘Program Files’ directory (in VBScript):


Set oShell = CreateObject( "WScript.Shell")
strProgramFilesDir = oShell.ExpandEnvironmentStrings("%PROGRAMFILES%")
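
If your automation is written in Java rather than VBScript, the same idea might look like the sketch below. System.getenv is the standard call; the helper name fromEnvironment and the fallback value are only illustrative assumptions.

```java
public class SystemFolders {
    // Returns the value of an environment variable, or the fallback if it is not set
    // (for example when the code runs on a non-Windows machine).
    public static String fromEnvironment(String variable, String fallback) {
        String value = System.getenv(variable);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // "ProgramFiles" is the Windows variable used above; the fallback is illustrative.
        System.out.println(fromEnvironment("ProgramFiles", "(not set)"));
    }
}
```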

Where an environment variable is not available, you can usually find the path in the Windows registry with some Googling. For example, the ‘All Users\Application Data’ folder can be found at the registry location below on all Windows variants:


HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders\Common AppData

3. Don’t assume elements will always be in the same place

In controls such as drop-down menus, the order of the items may differ from language to language (for example, when the items are sorted alphabetically). Assuming the same order will cause unexpected results or cause your automated tests to fail outright. Also, in RTL languages such as Arabic, the elements themselves will be in a completely different area of the dialog.

4. Never hard-code date formats

Always use the date and time formatting functions provided by the development language you are working with. A common error I’ve seen is that developers assume the date separator is always the same. Not true! It varies by locale: a ‘.’ (period) is used on a German environment, for example, while other locales use a ‘/’ (slash) or a ‘-’ (hyphen).

For example,


31.12.2009 - Germany
31/12/2009 - Belgium
31-12-2009 - Ireland
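
To illustrate, here is a small Java sketch that asks the runtime for each locale’s short date format instead of hard-coding a pattern. The class and method names are hypothetical, and the exact output depends on the locale data shipped with your Java runtime, so no expected output is shown.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class LocaleDates {
    // Builds the short date formatter for a locale instead of hard-coding a pattern.
    public static DateTimeFormatter shortDate(Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT).withLocale(locale);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2009, 12, 31);
        for (Locale locale : new Locale[] { Locale.GERMANY, Locale.UK, Locale.US }) {
            // Separator and field order come from the runtime's locale data.
            System.out.println(locale + ": " + shortDate(locale).format(date));
        }
    }
}
```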

5. Never use ‘record and playback’ automation tools

These are pretty useless in an environment where you are attempting to develop automation to run across multiple languages, with the obvious example being attempting to run a script on an Arabic build which you previously recorded on an English build, where the dialogs are mirrored and the elements are in completely different areas of the screen.

6. Create automation which is resistant to changes in the UI

A couple of string changes in the UI should not cause your automation to fail. Use object IDs where possible, as these will rarely change, and it’s easy to add a fix if they do.
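
The idea can be sketched in plain Java, independent of any particular automation tool. The Control class below is a hypothetical stand-in for a UI element and the IDs are made up; the point is that a lookup by ID keeps working when the label is translated or the controls are reordered.

```java
import java.util.Arrays;
import java.util.List;

public class ControlLookup {
    // Hypothetical stand-in for a UI control: a stable ID plus a localized label.
    public static final class Control {
        public final String id;
        public final String label;

        public Control(String id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    // Finds a control by its stable ID, regardless of label text or position.
    // Returns null when no control has that ID.
    public static Control findById(List<Control> controls, String id) {
        for (Control control : controls) {
            if (control.id.equals(id)) {
                return control;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Control> english = Arrays.asList(
                new Control("btn_ok", "OK"), new Control("btn_cancel", "Cancel"));
        // The same dialog in German: translated label, different order.
        List<Control> german = Arrays.asList(
                new Control("btn_cancel", "Abbrechen"), new Control("btn_ok", "OK"));

        System.out.println(findById(english, "btn_cancel").label);
        System.out.println(findById(german, "btn_cancel").label);
    }
}
```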

Just adhering to the above simple guidelines should solve many of the common issues encountered when attempting to develop an automation suite to run across multiple languages.

Automating Virtual Machine operations on ESXi Server from C#

VMware provides two really useful APIs for automating virtual machine (VM) tasks on both VMware Workstation and VMware ESXi server.

  • VMware Infrastructure (VI) API
  • VIX API

These are extremely easy to use from C#. In a QA environment, the automation of VMs can be hugely beneficial, whether you are automating an environment for build sanity checks or for functional tests.

This post will outline the basics of using the VIX API from C#, in order to perform operations on VMware ESXi server. If you don’t have access to an ESXi server, you can install it on a VM; it’s free to download from the VMware website!

For starters, you will need to install the APIs on your development machine. In order to download them, you will need a VMware account, which you may already have if you have downloaded Workstation or ESXi server in the past. If you don’t, you can create an account for free. Once logged into your account, you can download both APIs from the ‘Support & Downloads’ section.

Let me explain the difference between these two APIs. From VMware’s own documentation:

The VI API provides access to the VMware Infrastructure management components—the managed objects that can be used to manage, monitor, and control life-cycle operations of virtual machines and other VMware infrastructure components (datacenters, datastores, networks, and so on).”

VIX, on the other hand, is used to automate the actual operations on VMs, such as booting them, copying in files, getting/setting VM environment variables and other tasks you may wish to perform. The coolest part of VIX is that a wrapper for C# exists, created by Daniel Doubrovkine over at dblock.org. This wrapper, ‘VMwareTasks’, provides a simple object-orientated approach to VIX, which will be familiar to C# developers. Download the wrapper here.

Now for the basics of using the VIX API and VMwareTasks wrapper. Create a new console application project in Visual Studio. You will need to add a reference to the VMwareTasks DLL, which is located in the ‘bin’ directory when you extract the VMwareTasks download.

Look how simple it is to power on a VM!


// Declare a new virtual host
VMWareVirtualHost host = new VMWareVirtualHost();

// Connect to the ESXi server
host.ConnectToVMWareVIServer("192.168.1.39", "root", "password123");

// Open an existing VM by its datastore path and power it on
VMWareVirtualMachine machine = host.Open("[datastore1] XPP_SP2.vmx");
machine.PowerOn();

The simple code above just connects to an ESXi server and powers on an existing VM, but you can see how easy it is to perform operations on VMs.

Here’s how to create and revert to a snapshot:


VMWareVirtualHost host = new VMWareVirtualHost();
host.ConnectToVMWareVIServer("192.168.1.39", "root", "password123");
VMWareVirtualMachine machine = host.Open("[datastore1] Vista_EN.vmx");
machine.PowerOn();
machine.Login("Tester", "testing");

string snapShotName = "base";
machine.Snapshots.CreateSnapshot(snapShotName, "Clean");
machine.PowerOff();

VMWareSnapshot snapshot = machine.Snapshots.GetNamedSnapshot(snapShotName);
snapshot.RevertToSnapshot();

Or to create a directory:


machine.CreateDirectoryInGuest(@"C:\TestDir");

You can see from the above examples how easy it is to perform operations on VMs using these APIs. Install the APIs and play around with the functionality. I guarantee you’ll be impressed!