Interacting with web pages using Selenium WebDriver for C#

I’ve been using Selenium WebDriver for C# a lot lately, for a number of projects that involved interacting with a web browser in some manner. I’ve used a lot of applications and libraries over the past few years that provide this functionality, but I’ve never come across one as intuitive and reliable as Selenium WebDriver – if you work on any projects that involve interacting with a web browser to automate some process, you need to read this post.

In this post I’ll take you through the process of using Selenium WebDriver to automate some interaction with a web browser and hopefully show you how powerful Selenium is. We’ll take a simple scenario as an example – submitting a request to the Google search engine.

Boot up Visual Studio and create a new C# console application. You’ll need to download the Selenium WebDriver for C# ZIP from Google Code and add the DLLs it contains as references to your project. For this example, you only need to add WebDriver.dll and Newtonsoft.Json.Net35.dll. You’ll also need to add the following using statements:

using OpenQA.Selenium;
using OpenQA.Selenium.IE;

Now you’re ready to write some code that can drive your web browser. First, we’ll need to create the object that can do just that. With Selenium WebDriver, that is an IWebDriver object. This object can be instantiated to control Internet Explorer, Firefox, or Chrome. In this example, I’m going to use Internet Explorer.

IWebDriver driver = new InternetExplorerDriver();

You could also use ChromeDriver or FirefoxDriver above.
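If you want to switch browsers without touching the rest of your code, one option is a small factory method. The sketch below is only an illustration: the DriverFactory class, the CreateDriver method, and the browser-name strings are my own hypothetical names, not part of Selenium, and it assumes the Chrome and Firefox driver classes are available in the assemblies you have referenced.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Firefox;
using OpenQA.Selenium.IE;

static class DriverFactory
{
    // Hypothetical helper: maps a browser name to a concrete IWebDriver.
    public static IWebDriver CreateDriver(string browserName)
    {
        switch (browserName.ToLowerInvariant())
        {
            case "chrome":
                return new ChromeDriver();
            case "firefox":
                return new FirefoxDriver();
            case "ie":
                return new InternetExplorerDriver();
            default:
                throw new ArgumentException("Unsupported browser: " + browserName, "browserName");
        }
    }
}
```

With this in place, the driver could be created with something like IWebDriver driver = DriverFactory.CreateDriver("firefox"); and everything that follows stays the same, because the rest of the code only talks to the IWebDriver interface.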

Next, let’s navigate to the Google website. The following code will handle this. One of the nice things about Selenium is that this call won’t return until the page is loaded in the browser. Other frameworks and libraries return immediately, requiring you to add waits/sleeps in your code to ensure the page is actually loaded, which is a terrible approach.

driver.Navigate().GoToUrl("http://www.google.com");
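One caveat: waiting for the page load covers the initial document, but content that scripts add afterwards may not be there yet when the call returns. For those cases Selenium offers an explicit wait that polls for a condition instead of sleeping for a fixed time. The following is a minimal sketch, assuming you have also referenced the separate WebDriver.Support assembly (where WebDriverWait lives); the 10-second timeout is an arbitrary choice of mine, and the element name ‘q’ is the search box used later in this post.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;
using OpenQA.Selenium.Support.UI; // WebDriverWait is in the WebDriver.Support assembly

class WaitExample
{
    static void Main()
    {
        IWebDriver driver = new InternetExplorerDriver();
        try
        {
            driver.Navigate().GoToUrl("http://www.google.com");

            // Poll for up to 10 seconds (arbitrary) until the element can be found.
            // WebDriverWait ignores "element not found" errors while it polls and
            // throws a timeout exception if the element never appears.
            WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            IWebElement searchBox = wait.Until(d => d.FindElement(By.Name("q")));

            Console.WriteLine("Found element with tag: " + searchBox.TagName);
        }
        finally
        {
            // End the session even if the wait times out.
            driver.Quit();
        }
    }
}
```

Unlike a fixed sleep, the wait returns as soon as the element is found, so it only costs time when the page really is slow.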

Aside – Internet Explorer Problems

One setting has to be changed in Internet Explorer in order for Selenium WebDriver to work correctly. Your security zones need to have protected mode either enabled or disabled – it doesn’t matter which, as long as it’s the same for every zone. I’ve got it switched on for each zone on my machine. To achieve this, follow these steps:

  1. Open IE and go to Internet Options.
  2. Go to the ‘Security’ tab.
  3. Click on each of the zones, i.e. ‘Internet’, ‘Local intranet’, ‘Trusted sites’ and ‘Restricted sites’ and ensure that the ‘Enable Protected Mode (requires restarting Internet Explorer)’ check box is checked.

At this point, build your project in Visual Studio and run it. If you have followed the above steps correctly you will see Internet Explorer open, and navigate to Google automatically.

Now to actually interact with the open web page, Google in this case. In order to submit a search term, we’ll need to interact with the search term text box (to enter a search term), and the ‘Google Search’ button (to submit the request).

To do this, we’ll need to look at the source code for the page, to find the information we’ll need to interact with these controls. From looking at the source code, we can see that the search term text box has the name ‘q’ and the ‘Google Search’ button has the name ‘btnG’.

So, let’s define the search term text box, and enter a search term:

IWebElement searchTermTB = driver.FindElement(By.Name("q"));
searchTermTB.SendKeys("jimmy collins blog");

Take note of what we’re doing here – we’re using the driver object we defined earlier to find an element with the name ‘q’. This is another great thing about Selenium: we can use just about any attribute of an element to find it. You could also use the ID, class name, tag name, or even an XPath expression to find an element on the page.
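To make the other locator strategies concrete, here is a rough sketch of what they look like side by side. The By.Name lookup is the one from this post; the ID and class name values are hypothetical placeholders of mine and are not taken from Google’s markup, so for those I use FindElements, which returns an empty collection instead of throwing when nothing matches.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

class LocatorExamples
{
    static void Main()
    {
        IWebDriver driver = new InternetExplorerDriver();
        try
        {
            driver.Navigate().GoToUrl("http://www.google.com");

            // By name, as used in this post.
            IWebElement byName = driver.FindElement(By.Name("q"));

            // By XPath: the same lookup expressed as a path query.
            IWebElement byXPath = driver.FindElement(By.XPath("//*[@name='q']"));

            // By tag name: FindElements returns every matching element.
            var inputs = driver.FindElements(By.TagName("input"));

            // By ID and by class name. "search-box" and "search-field" are
            // hypothetical values, so these may well match nothing.
            var byId = driver.FindElements(By.Id("search-box"));
            var byClass = driver.FindElements(By.ClassName("search-field"));

            Console.WriteLine("Tag of element named q: " + byName.TagName);
            Console.WriteLine("Tag of element found via XPath: " + byXPath.TagName);
            Console.WriteLine("input elements on page: " + inputs.Count);
            Console.WriteLine("Matches by hypothetical ID: " + byId.Count);
            Console.WriteLine("Matches by hypothetical class: " + byClass.Count);
        }
        finally
        {
            driver.Quit();
        }
    }
}
```

FindElement (singular) throws a NoSuchElementException when nothing matches, which is usually what you want for a control the script cannot continue without.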

Now build and run your application – you will once again see Internet Explorer opening up, but this time you’ll also see the search term being entered.

The final step is to actually click the ‘Google Search’ button and submit the query. We follow the same approach that we used to find and interact with the search term text box:

IWebElement searchBtn = driver.FindElement(By.Name("btnG"));
searchBtn.Click();

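Running your application now will open up Google, enter your search term, and hit the search button. The final thing to do is some cleanup – you will notice that currently, when your application runs, the browser is left open once it completes. All we have to do to close the browser is:

driver.Close();

Close() closes the current browser window, which is enough here because only one is open. To end the whole WebDriver session and shut down the driver process behind it, call driver.Quit() instead.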
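Putting the steps from this post together, the whole console application might look something like the sketch below. The try/finally block and the use of Quit() rather than Close() are my additions, so that the browser is shut down even if one of the lookups fails; the element names ‘q’ and ‘btnG’ are the ones used above and may not match whatever markup Google serves when you run it.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

class Program
{
    static void Main()
    {
        // Start Internet Explorer under WebDriver control.
        IWebDriver driver = new InternetExplorerDriver();
        try
        {
            // Blocks until the page has loaded.
            driver.Navigate().GoToUrl("http://www.google.com");

            // Type the search term into the text box named "q".
            IWebElement searchTermTB = driver.FindElement(By.Name("q"));
            searchTermTB.SendKeys("jimmy collins blog");

            // Click the button named "btnG" to submit the search.
            IWebElement searchBtn = driver.FindElement(By.Name("btnG"));
            searchBtn.Click();
        }
        finally
        {
            // Close all windows and end the session, even after an exception.
            driver.Quit();
        }
    }
}
```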

That’s how easy Selenium is to use. The ideal scenario is that you work with your development team and get them to agree to provide static IDs on all controls, IDs that don’t change between versions of your site (except in the case of a substantial UI revamp). That would make it a simple task to build reusable automation that can automatically verify changes to your site, or be used for regression testing.