Make Test Automation Scripts Fast Using HTML Parsing Frameworks



If test execution speed matters most, an HTML parser library such as JSOUP can replace Selenium WebDriver for scripts that are too slow.

JSOUP is a Java library for working with real-world HTML.

It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.


Selenium WebDriver scripts are very slow

Selenium WebDriver scripts are slow because their execution time depends on:
  • browser start-up and page load time, which depend on how powerful the test computer is
  • the site's performance, which depends on the server hardware, the number of concurrent users, and the site architecture
  • the internet connection's performance

Let's take for example a simple script that implements the following test case:
  • open the home page of the http://www.vpl.ca site
  • do a keyword search
  • on the results page, click on the title link for the first result
  • on the details page, check that the book title and book author values exist

All code samples are just that, samples.

No page object classes are being used.
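The WebDriver snippets below use a driver and an explicit wait without showing how they are created. A minimal setup sketch, assuming JUnit 4, Firefox, and a 10-second explicit wait (the field names driver and wait match the ones used in the tests):

import org.junit.After;
import org.junit.Before;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.WebDriverWait;

WebDriver driver;
WebDriverWait wait;

@Before
public void setUp() {
    //start the browser and create the explicit wait used by the tests
    driver = new FirefoxDriver();
    driver.manage().window().maximize();
    wait = new WebDriverWait(driver, 10);
}

@After
public void tearDown() {
    driver.quit();
}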

@Test
public void test2() throws InterruptedException {

    driver.get("http://www.vpl.ca");

    WebElement searchField = wait.until(ExpectedConditions.
        visibilityOfElementLocated(By.xpath("//input[@id='globalQuery']")));

    searchField.click();
    searchField.sendKeys("java");

    WebElement searchButton = wait.until(ExpectedConditions.
        elementToBeClickable(By.xpath("//input[@class='search_button']")));

    searchButton.click();

    WebElement resultTitle = wait.until(ExpectedConditions.
        elementToBeClickable(By.xpath("(//a[@testid='bib_link'])[1]")));

    resultTitle.click();

    String bookTitleValue = "", bookAuthorValue = "";

    WebElement bookTitleElement = wait.until(ExpectedConditions.
        visibilityOfElementLocated(By.xpath("//h1[@id='item_bib_title']")));

    bookTitleValue = bookTitleElement.getText();
    assertTrue(bookTitleValue.length() > 0);

    //some books do not have an author, so a try/catch is needed
    try {
        WebElement bookAuthorElement = wait.until(ExpectedConditions.
            visibilityOfElementLocated(By.xpath("//a[@testid='author_search']")));

        bookAuthorValue = bookAuthorElement.getText();
        assertTrue(bookAuthorValue.length() > 0);
    }
    catch (Exception e) { }
}


The test script runs correctly in about 15 seconds.

Let's assume that we want to create a script for another test case that does the same things as the previous one but for all book title links from the results page (10 links).

The script is a bit more complicated as it needs to iterate through all book title links:
  • open the home page of the http://www.vpl.ca site
  • do a keyword search
  • on the results page, do the following for each book title link
    • click on the title link
    • on the details page, check that the book title and book author values exist
    • go back
    • continue with the next link

@Test
public void test1() throws InterruptedException {

    driver.get("http://www.vpl.ca");

    WebElement searchField = wait.until(ExpectedConditions.
        visibilityOfElementLocated(By.xpath("//input[@id='globalQuery']")));

    searchField.click();
    searchField.sendKeys("java");

    WebElement searchButton = wait.until(ExpectedConditions.
        elementToBeClickable(By.xpath("//input[@class='search_button']")));

    searchButton.click();

    for (int i = 1; i <= 10; i++) {

        WebElement resultTitle = wait.until(ExpectedConditions.
            elementToBeClickable(By.xpath("(//a[@testid='bib_link'])[" + i + "]")));

        resultTitle.click();

        String bookTitleValue = "", bookAuthorValue = "";

        WebElement bookTitleElement = wait.until(ExpectedConditions.
            visibilityOfElementLocated(By.xpath("//h1[@id='item_bib_title']")));

        bookTitleValue = bookTitleElement.getText();
        assertTrue(bookTitleValue.length() > 0);

        try {
            WebElement bookAuthorElement = wait.until(ExpectedConditions.
                visibilityOfElementLocated(By.xpath("//a[@testid='author_search']")));

            bookAuthorValue = bookAuthorElement.getText();
            assertTrue(bookAuthorValue.length() > 0);
        }
        catch (Exception e) { }

        driver.navigate().back();
    }
}


The script runs successfully but it needs 98 seconds to complete.

Is there another way to make the second script faster?

The first script proves that the site navigation between the home page, results page, and details page works well.

If we agree that navigation is not important for the second script, we can implement it with the JSOUP HTML parser library instead of the Selenium WebDriver framework.


Implement time-consuming automation scripts using the JSOUP library

Let's start cooking :)

We need the soup ingredients first: vegetables, herbs, oil...

Just kidding.



We will be cooking a different type of soup: JSOUP.

A few words about JSOUP (from the jsoup official site):

jsoup is a Java library for working with real-world HTML.
It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods.

jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. With it, you can:
  1. scrape and parse HTML from a URL, file, or string
  2. find and extract data, using DOM traversal or CSS selectors
  3. manipulate the HTML elements, attributes, and text
  4. clean user-submitted content against a safe white-list, to prevent XSS attacks
  5. output tidy HTML
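A minimal sketch of the first two capabilities, parsing HTML from a string and extracting data with a CSS selector; the HTML fragment here is made up for illustration:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupSketch {

    public static void main(String[] args) {
        //parse an HTML string into a Document
        Document doc = Jsoup.parse("<ul><li><a href='/b1'>Book 1</a></li>"
            + "<li><a href='/b2'>Book 2</a></li></ul>");

        //select all links inside list items using a CSS selector
        Elements links = doc.select("li > a");

        for (Element link : links) {
            //print the link text and its href attribute
            System.out.println(link.text() + " --> " + link.attr("href"));
        }
    }
}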

To use JSOUP, first download the JAR file from http://jsoup.org/ and add it to the project's build path.
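If the project uses Maven (like the screenshot project later in this post), the library can be added as a dependency instead of a manually downloaded JAR; a sketch, where the version number is only an example (check the jsoup site for the current one):

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.15.3</version>
</dependency>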

The second script written with JSOUP looks like this:

@Test
public void test1() throws IOException {

    Document resultsPage = Jsoup.connect("https://vpl.bibliocommons.com/search?q=java&t=keyword").get();

    Elements titles = resultsPage.select("span.title");

    for (Element title : titles) {

        Element link = title.child(0);

        String detailsPageUrl = "https://vpl.bibliocommons.com" + link.attr("href");

        Document detailsPage = Jsoup.connect(detailsPageUrl).get();

        Elements bookTitle = detailsPage.getElementsByAttributeValue("testid", "text_bibtitle");

        if (bookTitle.size() > 0)
            assertTrue(bookTitle.get(0).text().length() > 0);

        Elements bookAuthor = detailsPage.getElementsByAttributeValue("testid", "author_search");

        if (bookAuthor.size() > 0)
            assertTrue(bookAuthor.get(0).text().length() > 0);
    }
}

Let's see what each line does:

//establishes a connection to the page;
//uses the get method to get the page content and return a document object
Document resultsPage = Jsoup.connect("https://vpl.bibliocommons.com/search?q=java&t=keyword").get();

//selects all span elements that have the title class from the document object
Elements titles = resultsPage.select("span.title");

//for each span element from the list
for (Element title : titles) {

//gets the first child of the span element; this is the title link
Element link = title.child(0);

//builds the details page URL from the href attribute of the title link
String detailsPageUrl = "https://vpl.bibliocommons.com" + link.attr("href");

//establishes a connection to the details page
//gets the page and returns a document object
Document detailsPage = Jsoup.connect(detailsPageUrl).get();

//finds all elements in the details page that have a testid attribute with the text_bibtitle value
Elements bookTitle = detailsPage.getElementsByAttributeValue("testid", "text_bibtitle");

//gets the first found element using get(0) and its text using text()
//asserts that the text length is > 0
if (bookTitle.size() > 0)
assertTrue(bookTitle.get(0).text().length() > 0);

//finds all elements in the details page that have a testid attribute with the author_search value
Elements bookAuthor = detailsPage.getElementsByAttributeValue("testid", "author_search");

//gets the first found element using get(0) and its text using text()
//asserts that the text length is > 0
if (bookAuthor.size() > 0)
assertTrue(bookAuthor.get(0).text().length() > 0); 

The script is simpler than the WebDriver one.

It does not interact with the site through a browser.

It uses HTTP requests instead to get the page content and then parses or navigates through it.
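Some sites reject requests that do not look like they come from a real browser, or respond slowly; jsoup's Connection API lets you tune the request. A short sketch, with made-up values for the user agent string and the timeout:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ConnectSketch {

    public static void main(String[] args) throws IOException {
        Document resultsPage = Jsoup.connect("https://vpl.bibliocommons.com/search?q=java&t=keyword")
            .userAgent("Mozilla/5.0")   //present ourselves as a regular browser
            .timeout(10 * 1000)         //wait up to 10 seconds for the response
            .get();

        //print the page title to confirm the request worked
        System.out.println(resultsPage.title());
    }
}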

The best part is that this script executes in 8 seconds!

Compare this with 98 seconds needed for executing the WebDriver script.




Hope that you liked the soup recipe!

If you have any questions about the recipe, ingredients or the cook, please post them in the comments section.

The cook appreciates your interest in his recipes :)


How To Take Entire Page Screenshots In Selenium WebDriver Scripts

The AShot framework from Yandex can be used in Selenium WebDriver scripts for taking screenshots of
  1. full web pages
  2. web elements
The framework can be found at https://github.com/yandex-qatools/ashot.


Selenium WebDriver's TakesScreenshot interface is rather limited


By default, screenshots can be taken in Selenium WebDriver scripts using the TakesScreenshot interface:

import static org.openqa.selenium.OutputType.*;

File screenshotFile = ((TakesScreenshot) driver).getScreenshotAs(FILE);
String screenshotBase64 = ((TakesScreenshot) driver).getScreenshotAs(BASE64);

If you need to take entire page screenshots or screenshots of a web element, this is possible but complicated.
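For example, one way of getting a web element screenshot with plain WebDriver is to take a viewport screenshot and crop the element out of it. A sketch, which assumes the page is not scrolled, the element is fully visible in the viewport, and there is no browser zoom or high-DPI scaling:

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.Point;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class ElementScreenshot {

    //crops a web element out of a viewport screenshot;
    //fragile: fails if the element is outside the viewport or the screen is scaled
    public static void saveElementScreenshot(WebDriver driver, WebElement element, File target)
            throws IOException {
        File viewportShot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
        BufferedImage viewportImage = ImageIO.read(viewportShot);

        Point location = element.getLocation();
        BufferedImage elementImage = viewportImage.getSubimage(
            location.getX(), location.getY(),
            element.getSize().getWidth(), element.getSize().getHeight());

        ImageIO.write(elementImage, "PNG", target);
    }
}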


Take Entire Page Screenshots With The AShot Framework



Yandex has other frameworks available in addition to Allure.

The AShot framework is helpful for taking screenshots.

Let's see some code samples.

The following test script implements a test case for the www.vpl.ca site that
  1. opens the home page of the site
  2. executes a keyword search
  3. clicks on the title of the first result on the results page
  4. checks that the book title is displayed on the details page
  5. checks that the book title's length is greater than 0

import static org.junit.Assert.*;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class TestWithoutScreenshots {

    WebDriver driver;

    @Before
    public void setUp() {
        driver = new FirefoxDriver();
        driver.manage().window().maximize();
    }

    @After
    public void tearDown() {
        driver.quit();
    }

    @Test
    public void testFirstResult() throws InterruptedException {

        //opens the home page of the site
        driver.get("http://www.vpl.ca");

        //executes a keyword search
        WebElement searchField = driver.findElement(By.xpath("//input[@id='globalQuery']"));
        searchField.click();
        searchField.sendKeys("java");

        WebElement searchButton = driver.findElement(By.xpath("//input[@class='search_button']"));
        searchButton.click();

        Thread.sleep(3000);

        //clicks on the title of the first result on the results page
        WebElement searchResultLink = driver.findElement(By.xpath("(//a[@testid='bib_link'])[2]"));
        searchResultLink.click();

        Thread.sleep(3000);

        WebElement bookTitleElement = driver.findElement(By.xpath("//h1[@id='item_bib_title']"));
        String bookTitleValue = bookTitleElement.getText();

        //checks that the book title is displayed on the details page
        assertEquals(bookTitleElement.isDisplayed(), true);

        //checks that the book title's length is greater than 0
        assertTrue(bookTitleValue.length() > 0);
    }
}

The test script is part of a Maven Project.


Add The ashot Dependency To The pom.xml File

Add the following lines to the pom.xml file:

<dependency>
    <groupId>ru.yandex.qatools.ashot</groupId>
    <artifactId>ashot</artifactId>
    <version>1.4.12</version>
</dependency>


Then, change the code so that entire page screenshots are taken for each page:


import static org.junit.Assert.*;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import ru.yandex.qatools.ashot.AShot;
import ru.yandex.qatools.ashot.Screenshot;
import ru.yandex.qatools.ashot.screentaker.ViewportPastingStrategy;

public class TestWithScreenshots {

    WebDriver driver;

    @Before
    public void setUp() {
        driver = new FirefoxDriver();
        driver.manage().window().maximize();
    }

    @After
    public void tearDown() {
        driver.quit();
    }

    @Test
    public void testFirstResult() throws InterruptedException, IOException {

        driver.get("http://www.vpl.ca");

        //take the screenshot of the entire home page and save it to a png file
        Screenshot screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\home.png"));

        WebElement searchField = driver.findElement(By.xpath("//input[@id='globalQuery']"));
        searchField.click();
        searchField.sendKeys("java");

        WebElement searchButton = driver.findElement(By.xpath("//input[@class='search_button']"));
        searchButton.click();

        Thread.sleep(3000);

        //take the screenshot of the entire results page and save it to a png file
        screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\results.png"));

        WebElement searchResultLink = driver.findElement(By.xpath("(//a[@testid='bib_link'])[2]"));
        searchResultLink.click();

        Thread.sleep(3000);

        //take the screenshot of the entire details page and save it to a png file
        screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\details.png"));

        WebElement bookTitleElement = driver.findElement(By.xpath("//h1[@id='item_bib_title']"));
        String bookTitleValue = bookTitleElement.getText();

        assertEquals(bookTitleElement.isDisplayed(), true);
        assertTrue(bookTitleValue.length() > 0);
    }
}

Let's see how the screenshots look.

The screenshot for the home page looks good:





The screenshot for the details page looks good as well:



The screenshot for the results page does not look too good:



How can we get a proper screenshot for the results page?


Take Web Element Screenshots 

The framework offers the option of getting screenshots of web elements.

The changed code is below:


import static org.junit.Assert.*;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import ru.yandex.qatools.ashot.AShot;
import ru.yandex.qatools.ashot.Screenshot;
import ru.yandex.qatools.ashot.screentaker.ViewportPastingStrategy;

public class TestWithScreenshots {

    WebDriver driver;

    @Before
    public void setUp() {
        driver = new FirefoxDriver();
        driver.manage().window().maximize();
    }

    @After
    public void tearDown() {
        driver.quit();
    }

    @Test
    public void testFirstResult() throws InterruptedException, IOException {

        driver.get("http://www.vpl.ca");

        //take the screenshot of the entire home page and save it to a png file
        Screenshot screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\home.png"));

        WebElement searchField = driver.findElement(By.xpath("//input[@id='globalQuery']"));
        searchField.click();
        searchField.sendKeys("java");

        WebElement searchButton = driver.findElement(By.xpath("//input[@class='search_button']"));
        searchButton.click();

        Thread.sleep(3000);

        //take the screenshot of the entire results page and save it to a png file
        screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\results.png"));

        //take the screenshot of a div element that includes all results page details
        screenshot = new AShot().takeScreenshot(driver, driver.findElement(By.xpath("(//div[@id='ct_search'])[1]")));
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\div_element.png"));

        WebElement searchResultLink = driver.findElement(By.xpath("(//a[@testid='bib_link'])[2]"));
        searchResultLink.click();

        Thread.sleep(3000);

        //take the screenshot of the entire details page and save it to a png file
        screenshot = new AShot().shootingStrategy(new ViewportPastingStrategy(1000)).takeScreenshot(driver);
        ImageIO.write(screenshot.getImage(), "PNG", new File("c:\\temp\\details.png"));

        WebElement bookTitleElement = driver.findElement(By.xpath("//h1[@id='item_bib_title']"));
        String bookTitleValue = bookTitleElement.getText();

        assertEquals(bookTitleElement.isDisplayed(), true);
        assertTrue(bookTitleValue.length() > 0);
    }
}


The screenshot of the div element shows the full results page:




