Automate your Android App Testing Using Robotium

I like that it’s the time of the year when I actually have some time to invest in blogging here. Updates have been pretty sporadic this year (another item to add to my list of resolutions once 2013 hits in a few days).

In this post, I’d like to talk a little about Robotium, and how you can use it to automate the QA of your Android applications. Mobile applications are perfect for Quality Assurance automation – they are usually small, have limited functionality when compared with desktop applications, and are rarely complex in terms of testing steps. Also, if your a solo mobile developer, you may not have the time (or money) to invest in proper quality assurance of your applications.

We’ve been using a Robotium based solution written in Java at my day job for a while, to validate the quality of our localized product builds, both as a unit testing tool and as an actual test automation tool used by our QA team.

One of the main reasons why we chose to invest time in Robotium is the fact that it offers flexibility in terms of how it recognizes objects within your app (e.g. like it would have to in order to click a button). This is especially important for us, as we want to develop a solution once, and have it support multiple language versions of our application. We also looked at MonkeyRunner, which ships with the Android SDK. MonkeyRunner uses Jython (a Python implementation in Java) scripts to walk through applications.

I prefer Robotium as MonkeyRunner lacks the tight UI integration offered by Robotium, (although MonkeyRunner is much easier to setup and run).

When developing automated test scripts using Robotium, we can use the text from a UI object to identify that object at run-time, for example the label on a button. What we do is compile the localized resources from all our languages with our test application, so if we switch our device language to German for example, the resources from the ‘values-de’ folder are loaded by our test application, and thus we can add new languages to our automation solution simply by re-packaging the APK file containing our test automation scripts, and it will run on any of our supported locales without code modifications.

One of the disadvantages of this approach, (and almost all UI based test automation solutions to be fair), is that the maintenance may be high – for example, you will probably lose 90% of your code in a UI refresh.

Creating a Simple Robotium Automated Test Case

  • First off, download the latest version of Robotium, (3.6 at the time of writing).
  • Now simply create a new Android application in the IDE of your choice, I prefer NetBeans due to familiarity, but I know most people like Eclipse for Android development.
  • Add the Robotium JAR as a reference in your project.

For the purpose of this example, let’s say we want to click the ‘Cancel’ button in the below screen capture. As I mentioned above, our Robotium code uses the string ID’s to find objects (based on the current device language, we will look for the value of that string ID for that language if available, and use that to look for the object on the current screen).

MMS Image

Code (Finally!)


package com.jc.robotium

import android.app.Activity;
import android.test.ActivityInstrumentationTestCase2;
import com.jayway.android.robotium.solo.Solo;
import android.content.res.Resources;
import android.content.Context;
import java.util.Locale;

public class RobotiumExample extends ActivityInstrumentationTestCase2 // Provides functional testing of an activity
{
    private static final String TARGET_PACKAGE_ID = "com.yourcompany.yourapp";
    private static final String LAUNCHER_ACTIVITY_FULL_CLASSNAME = "com.yourcompany.yourapp.Activities.MainActivity";
    private static Class launcherActivityClass;
    
    private Solo solo;
    private Activity activity;
    private Resources res;
    private Context context;
   
    // This will launch the application specified above on the device
    static 
    {
        try
        {
            launcherActivityClass = Class.forName(LAUNCHER_ACTIVITY_FULL_CLASSNAME);
        }
        catch(ClassNotFoundException e)
        {
            throw new RuntimeException(e);
        }
    }
    
    public RobotiumExample() throws ClassNotFoundException
    {
        super(TARGET_PACKAGE_ID, launcherActivityClass);
    }
    
    @Override
    protected void setUp() throws Exception
    {
        activity = getActivity();
        solo = new Solo(getInstrumentation(), activity);
        context = getInstrumentation().getContext();
        res = context.getResources();
    }
    
    public void testUI() throws Throwable
    {
        // ....
		
	// Robotium code begins here
		
	// Wait for the 'Cancel' button
	solo.waitForText(res.getString(R.string.cancel));  
		
	// We may want to take a screen capture...
	solo.takeScreenshot();
		
	// Click the cancel button
	solo.clickOnText(res.getString(R.string.cancel)); 
	
	// ....

    }
    
    @Override
    protected void tearDown() throws Exception
    {
        try
        {
            solo.finalize();
        }
        catch(Throwable e)
        {
            // Catch this
        }
        getActivity().finish();
        super.tearDown();
    }

Notice a few things above:

  • We are extending the ActivityInstrumentationTestCase2 class, which provides us with the ability to run test methods on the UI thread.
  • The method that contains your test logic must be prefixed with ‘test’, notice mine is called testUI().
  • We could have looked for the ‘Cancel’ button by passing the text ‘Cancel’ to the waitForText() and clickOnText() methods, but that would only work on the English version of our product. Instead we use the string ID, and look up the value of that string ID based on the current device locale.
  • If you use the ‘takeScreenshot()’ method I have shown an example of above, screen captures will be saved in ‘/sdcard/Robotium-Screenshots/’ on your test device.

Running the Test

Compile the APK containing your test automation script and install it on your test device, along with the APK under test. One ‘gotcha’ here with Robotium is that the APK under test and the APK containing the test logic must be signed with the same certificate. This may not be the case if your APK under test comes out of a build system of some sort. You can use resign.jar to re-sign your APK under test with the same signature as your APK containing the test logic (by running it on the same machine on which you compiled the test APK).

Once both are installed on a test device, we can launch the test from a command prompt via adb:

adb shell am instrument -e class com.jc.robotium.RobotiumExample -w com.jc.robotium/android.test.InstrumentationTestRunner

Conclusion

Robotium is a good test framework for getting automation for Android applications up and running very quickly. It has an active community and updates are released regularly, the latest version (3.6 at time of writing), also seems to be a lot more stable than previous versions.

On the flip side, since we are relying on text resources (which may change frequently), the maintenance on Robotium based automation solutions can be high if your UI changes a lot, which is probably a high possibility for a mobile application.

If your looking for an automation framework for your Android application that you can get up and running quickly, and use to run scripts across an Android application supported in many languages, I would recommend Robotium, and will be keeping a close eye on it as it develops further.

Interacting with web pages using Selenium WebDriver for C#

I’ve been using Selenium WebDriver for C# a lot lately, for a number of projects that involved interacting with a web browser in some manner. I’ve used a lot of applications and libraries over the past few years that provide this functionality, but I’ve never come across one as intuitive and reliable as Selenium WebDriver – if you work on any projects that involve interacting with a web browser to automate some process, you need to read this post.

In this post I’ll take you through the process of using Selenium WebDriver to automate some interaction with a web browser and hopefully show you how powerful Selenium is. We’ll take a simple scenario as an example – submitting a request to the Google search engine.

Boot up Visual Studio and create a new C# console application. You’ll need to download the Selenim WebDriver for C# ZIP from Google Code, and add the DLL’s it contains as references to your project. For this example, you are only required to add WebDriver.dll and Newtonsoft.Json.Net35.dll. You’ll need to add the following using statements also:

using OpenQA.Selenium;
using OpenQA.Selenium.IE;

Now your ready to write some code that can drive your web browser. First, we’ll need to create the object that can do just that. With Selenium WebDriver, that is an IWebDriver object. This object can be instantiated to control Internet Explorer, Firefox, or Chrome. In this example, I’m going to use Internet Explorer.

IWebDriver driver = new InternetExplorerDriver();

You could also use ChromeDriver or FirefoxDriver above.

Next, let’s navigate to the Google website. The following code will handle this. One of the nice things about Selenium is that this call won’t return until the page is loaded in the browser. Other frameworks and libraries return immediately, requiring you to add waits/sleeps in your code to ensure the page is actually loaded, which is a terrible approach.

driver.Navigate().GoToUrl("http://www.google.com");

Aside – Internet Explorer Problems

One setting has to be changed in Internet Explorer in order for Selenium WebDriver to work correctly. Your security zones need to have protected mode either enabled or disabled – it doesn’t matter which, as long as it’s the same for each zone. I’ve got it set to on for each zone on my machine. To achieve this follow these steps:

  1. Open IE and go to Internet Options.
  2. Go to the ‘Security’ tab.
  3. Click on each of the zones, i.e. ‘Internet’, ‘Local intranet’, ‘Trusted sites’ and ‘Restricted sites’ and ensure that the ‘Enable Protected Mode (requires restarting Internet Explorer)’ check box is checked.

At this point, build your project in Visual Studio and run it. If you have followed the above steps correctly you will see Internet Explorer open, and navigate to Google automatically.

Now to actually interact with the open web page, Google in this case. In order to submit a search term, we’ll need to interact with the search term text box (to enter a search term), and the ‘Google Search’ button (to submit the request).

To do this, we’ll need to look at the source code for the page, to find the information we’ll need to interact with these controls. From looking at the source code, we can see that the markup for these controls looks like the following (formatted and commented for clarity):

Google Search

So, lets define the search term text box, and enter a search search:

IWebElement searchTermTB = driver.FindElement(By.Name("q"));
searchTermTB.SendKeys("jimmy collins blog");

Take note of what we’re doing here – we’re using the browser object we defined earlier to find an element with the name ‘q’. This is another great thing about Selenium – we can use just about any element attribute to try to find it, you could also use the ID, class name, tag name, or even the XPATH to find an element on the page.

Now build and run your application – you will once again see Internet Explorer opening up, but this time you’ll also see the search term being entered.

The final step is to actually click the ‘Google Search’ button and submit the query. The same approach that we used to find and interact the search term text box is followed:

IWebElement searchBtn = driver.FindElement(By.Name("btnG"));
searchBtn.Click();

Running your application now will open up Google, enter your search term, and hit the search button. The final thing to do is some cleanup – you will notice that currently when your application runs, the browser is left open once it completes. All we have to do to close the browser is:

driver.Close();

That’s how easy Selenium is to use. The ideal scenario is that you have interaction with your development team, and get them to agree to providing static IDs on all controls, that don’t change between versions of your site (unless in the case of a substantial UI revamp). That would make it a simple task to provide re-usable automation that can automatically verify changes to your site, or be used for regression testing.

Running Selenium Tests with C# & NUnit

Something I’m working on currently requires some automation of a web browser, so what a perfect opportunity to get some exposure to Selenium.

In this post I’ll outline the basics of creating and running a simple Selenium test using Selenium and NUnit. The implementation language will be C#. To get started, you will need to download the following:

Extract the Selenium Client Driver files, these DLL’s will be referenced in the Visual Studio project we create. Install NUnit using the .msi installer.

Now, let’s create the actual test:

  1. Launch Visual Studio 2010 and create a new class library project.
  2. Add a reference to nunit.framework.dll. This can be found under the NUnit installation directory at ‘bin\net-2.0\framework’.
  3. Add references to all the DLL’s contained in the Selenium Client Driver package you downloaded earlier.
  4. We’re now ready to add the code that will run a Selenium test. Add the following code to your class library project:
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

namespace FirstSeleniumTest
{
    [TestFixture]
    public class SeleniumTest
    {
        private IWebDriver driver;

        [SetUp]
        public void SetUp()
        {
            driver = new InternetExplorerDriver();
        }

        [Test]
        public void TestGoogle()
        {
            driver.Navigate().GoToUrl("http://www.google.com");
        }

        [TearDown]
        public void TearDown()
        {
            driver.Quit();
            driver.Dispose();
        } 
    }
}

The above code should be pretty easy to understand. Notice the annotations around the functions.

  • [SetUp] – This is where any test setup should be completed. In the above example we’re creating a new instance of InternetExplorerDriver, setting it up for our test to run later.
  • [Test] – This is where the test steps are defined. In this example, we’re just navigating to Google.
  • [TearDown] – In this section, any steps to be taken to cleanup the environment after your test has run can be defined. Here, all we’ll do is close Internet Explorer

Now that we have written a simple test, let’s try running it using NUnit. Before moving on, ensure that the above code builds successfully in your environment.

To run the test it’s just a matter of launching NUnit and opening up the DLL built from the Visual Studio project created above. You should see the ‘TestGoogle’ test listed. Simply select the test and hit the ‘Run’ button to initiate the test. You will see Internet Explorer launch, and then close.

NUnit

One thing you may need to do, depending on your IE version, is to disable protected mode for Internet and Restricted Zones in Internet Explorer Security Settings (don’t forget to re-enable these once you’ve finished experimenting with Selenium).

In a future post, I’ll outline how to do more complex tasks during your tests such as taking screenshots and navigating around web pages under tests.

Retrieve settings from COM+ components via C#

Recently, I had a requirement to be able to retrieve settings information from a number of COM+ components running on a server, such as the Constructor String etc. The idea behind this was to give us a snapshot of a servers configuration, and also allow easy comparisons between different servers in the event of issues. This is tedious and time consuming to do manually, especially if you’ve got a large number of components within each COM+ application, so I resolved to write a small C# program to do this for me and write the data to a file.

COM+ provides an administration object model that exposes all of the functionality of the Component Services administrative tool, so by adding a reference to the necessary library, you can achieve anything you can do through the graphical administrative tool, programmatically. To get started, you’ll need to add a reference to the necessary library – ‘COM + 1.0 Admin Type Library’. This can be found under the ‘COM’ tab when you go to add a reference to your project in Visual Studio.

You’ll need to add the following import also:


using COMAdmin;

First, we’ll need to create an Object to store the catalog of COM+ components installed on the machine. Here’s the code to create this catalog, and also retrieve a list of all the COM+ applications it contains:


COMAdminCatalog catalog;
COMAdminCatalogCollection applications;

// Get the catalog
catalog = new COMAdminCatalog();

// Get the list of all COM+ applications contained within this catalog
applications = (COMAdminCatalogCollection)catalog.GetCollection("Applications");
applications.Populate();

Now we have an Object above, ‘applications’, which contains all the data regarding what COM+ applications are installed on this machine. To go a little deeper, and see which components each application contains, it’s just as easy:


foreach (COMAdminCatalogObject application in applications)
{
COMAdminCatalogCollection components;
components = (COMAdminCatalogCollection)
components = (COMAdminCatalogCollection)applications.GetCollection ("Components", Application.Key);
components.Populate();

foreach (COMAdminCatalogObject component in components)
{
Console.WriteLine("Component: " + component.Name);
}
}

The above code shows you how to get a list of COM+ applications and their components, but what about retrieving or setting the values of specific component settings like the Constructor String of a component?

Here’s how:


// Set the value of a constructor string
component.set_Value("ConstructorString", "127.0.0.1");
// Get the value of a constructor string
component.get_Value("ConstructorString"));

That’s a quick overview, I leave it as an exercise to the reader to explore the other functionality of the ‘COMAdmin’ library, but if you just need to retrieve values of settings from specific components, the above will get you started.

As per normal, MSDN has some great documentation here.

Test Automation Success Criteria

I’ve been thinking recently about how to validate the success of an automation project, and actually prove a return on investment to anyone interested. This is important for a number of reasons.

Firstly, it shows the QA team, (who are most likely set in their ways with the traditional manual testing approach), that at least some of their work can be completed in a more efficient manner, and testing coverage can be increased without any extra effort on their part. Secondly, it shows management that automation is worth investing in. It also boosts the morale of the people who actually worked on the development of the automation, and gives them confidence to continue and make even more improvements.

Brett Pettichord defines what I think are 4 excellent items to validate any test automation project against:

  1. The automation runs
  2. The automation does real testing
  3. The automation finds defects
  4. The automation saves time

I believe the fourth item above is probably the most important, (assuming 1 and 2 are satisifed). The whole point of test automation is process improvement, and having the ability to absorb more work without requiring additional resources. Item 3 is important also, any defects should obviously be flagged, if not logged automatically also.

If the four points mentioned are satisfied once your automation project is complete, I believe it can be qualified as a success.

Programmatically verify resources in a DLL

I had a requirement recently to be able to programmatically check certain resources were contained in a set of (native) DLL resource files. The idea behind this was to add some post-build automated engineering checks to our existing automated test suite, e.g. ensuring resources for all the required languages have been injected correctly.

I wanted to write a simple C# application to perform these checks. I came accross this handy library which contained functions for almost all the functionality I required:

ResourcesLib

Using this, we can perform functions such as importing resources, loading strings, and even injecting new resources.

For example, here’s how you could retrieve all languages contained in a resource DLL:

string file = "resource.dll";
RawResourceFile resFile = new RawResourceFile();
resFile.Load(file);

for (int i = 0; i < resFile.Languages.Count; i++) { Console.WriteLine("Language: " + resFile.Languages[i]); }

Download the library and take a look, it can save you lots of time if you need to perform any checks/actions on DLL file.