The class com.gargoylesoftware.htmlunit.WebClient is the main starting point.
It simulates a web browser and is used to execute all of the tests
(see WebClient - the browser).
Android
Using HtmlUnit on Android has some challenges because of subtle technical differences
in Java on Android. Because of this, we offer a customized distribution that works around these problems.
Please check out htmlunit-android on GitHub.
Most unit testing will be done within a framework like JUnit,
so all the examples here assume that we are using it.
In the first sample, we create the web client and have it load the homepage from the HtmlUnit website.
We then verify that this page has the correct title. Note that getPage() can return different types
of pages based on the content type of the returned data. In this case we are expecting a content
type of text/html so we cast the result to a com.gargoylesoftware.htmlunit.html.HtmlPage.
@Test
public void homePage() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("https://htmlunit.sourceforge.io/");
        Assert.assertEquals("HtmlUnit – Welcome to HtmlUnit", page.getTitleText());

        final String pageAsXml = page.asXml();
        Assert.assertTrue(pageAsXml.contains("<body class=\"topBarDisabled\">"));

        final String pageAsText = page.asNormalizedText();
        Assert.assertTrue(pageAsText.contains("Support for the HTTP and HTTPS protocols"));
    }
}
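The remark that getPage() returns different page types depending on the content type can be tried offline. The following is a minimal sketch, assuming HtmlUnit 2.x (the com.gargoylesoftware.htmlunit packages) is on the classpath. It serves canned responses through MockWebConnection instead of a real server. The class name PageTypeSketch, the method pageKind and the URL are hypothetical and used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;

import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.TextPage;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class PageTypeSketch {

    // Serves 'content' with the given content type from a mock connection
    // and reports which kind of Page HtmlUnit created for the response.
    public static String pageKind(final String content, final String contentType) {
        try (final WebClient webClient = new WebClient()) {
            // Hypothetical URL; the mock connection answers, no network is used.
            final URL url = new URL("http://localhost/sample");
            final MockWebConnection connection = new MockWebConnection();
            connection.setResponse(url, content, contentType);
            webClient.setWebConnection(connection);

            final Page page = webClient.getPage(url);
            if (page instanceof HtmlPage) {
                return "html: " + ((HtmlPage) page).getTitleText();
            }
            if (page instanceof TextPage) {
                return "text: " + ((TextPage) page).getContent();
            }
            return "other: " + page.getClass().getSimpleName();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Checking the type with instanceof before casting avoids a ClassCastException when the server answers with something other than text/html.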
Frequently we want to change values in a form and submit the form back to the server. The
following example shows how you might do this.
@Test
public void submittingForm() throws Exception {
    try (final WebClient webClient = new WebClient()) {

        // Get the first page
        final HtmlPage page1 = webClient.getPage("http://some_url");

        // Get the form that we are dealing with and within that form,
        // find the submit button and the field that we want to change.
        final HtmlForm form = page1.getFormByName("myform");

        final HtmlSubmitInput button = form.getInputByName("submitbutton");
        final HtmlTextInput textField = form.getInputByName("userid");

        // Change the value of the text field
        textField.type("root");

        // Now submit the form by clicking the button and get back the second page.
        final HtmlPage page2 = button.click();
    }
}
Often you will want to simulate a specific browser. This is done by passing a
com.gargoylesoftware.htmlunit.BrowserVersion into the WebClient constructor.
Constants have been provided for some common browsers but you can create your own specific
version by instantiating a BrowserVersion.
@Test
public void homePage_Firefox() throws Exception {
    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
        final HtmlPage page = webClient.getPage("https://htmlunit.sourceforge.io/");
        Assert.assertEquals("HtmlUnit – Welcome to HtmlUnit", page.getTitleText());
    }
}
Specifying this BrowserVersion will change the user agent header that is sent up to the
server and will change the behavior of some of the JavaScript.
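Creating your own browser version might look like the sketch below. It assumes an HtmlUnit 2.x release that provides BrowserVersion.BrowserVersionBuilder, which copies the settings of an existing constant and lets you override individual values. The class name CustomBrowserSketch, its methods and the user agent string are hypothetical.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;

public class CustomBrowserSketch {

    // Builds a BrowserVersion that starts from the Firefox constant
    // but reports a different user agent string.
    public static BrowserVersion customFirefox(final String userAgent) {
        return new BrowserVersion.BrowserVersionBuilder(BrowserVersion.FIREFOX)
                .setUserAgent(userAgent)
                .build();
    }

    // Creates a client with the custom version and returns the user agent
    // that the client is configured to send.
    public static String userAgentSeenByClient(final String userAgent) {
        try (final WebClient webClient = new WebClient(customFirefox(userAgent))) {
            return webClient.getBrowserVersion().getUserAgent();
        }
    }
}
```

Starting from an existing constant keeps the JavaScript behavior of that browser family and changes only the values you set explicitly.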
Once you have a reference to an HtmlPage, you can search for a specific HtmlElement by one of
the 'get' methods, or by using XPath or CSS selectors.
Below is an example of finding a 'div' by its ID and getting an anchor by name:
@Test
public void getElements() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("http://some_url");
        final HtmlDivision div = page.getHtmlElementById("some_div_id");
        final HtmlAnchor anchor = page.getAnchorByName("anchor_name");
    }
}
A simple way to find elements is to look up all elements of a specific type.
@Test
public void getElements() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("http://some_url");
        // getElementsByTagName returns a live list of the matching elements
        final DomNodeList<DomElement> inputs = page.getElementsByTagName("input");
        final Iterator<DomElement> nodesIterator = inputs.iterator();
        // now iterate
    }
}
There is a rich set of methods for locating page elements, e.g.:
HtmlPage.getAnchors(); HtmlPage.getAnchorByHref(String); HtmlPage.getAnchorByName(String); HtmlPage.getAnchorByText(String)
HtmlPage.getElementById(String); HtmlPage.getElementsById(String); HtmlPage.getElementsByIdAndOrName(String);
HtmlPage.getElementByName(String); HtmlPage.getElementsByName(String)
HtmlPage.getFormByName(String); HtmlPage.getForms()
HtmlPage.getFrameByName(String); HtmlPage.getFrames()
You can also start searching from the document element (HtmlPage.getDocumentElement()) and then traverse the DOM tree:
HtmlElement.getElementsByAttribute(String, String, String)
DomElement.getElementsByTagName(String); DomElement.getElementsByTagNameNS(String, String)
DomElement.getChildElements(); DomElement.getChildElementCount()
DomElement.getFirstElementChild(); DomElement.getLastElementChild()
HtmlElement.getEnclosingElement(String); HtmlElement.getEnclosingForm()
DomNode.getChildNodes(); DomNode.getChildren(); DomNode.getDescendants(); DomNode.getDomElementDescendants(); DomNode.getFirstChild(); DomNode.getHtmlElementDescendants()
DomNode.getLastChild(); DomNode.getNextElementSibling(); DomNode.getNextSibling(); DomNode.getPreviousElementSibling(); DomNode.getPreviousSibling()
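The traversal methods listed above can be combined as in the following minimal sketch. It assumes HtmlUnit 2.x on the classpath and uses MockWebConnection to serve the HTML, so no real server is needed. The class name DomTraversalSketch, the method listItemTexts and the URL are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class DomTraversalSketch {

    // Loads 'html' through a mock connection, then walks from the document
    // element to every 'ul' and collects the text of its direct child elements.
    public static List<String> listItemTexts(final String html) {
        try (final WebClient webClient = new WebClient()) {
            final MockWebConnection connection = new MockWebConnection();
            connection.setDefaultResponse(html); // served as text/html
            webClient.setWebConnection(connection);

            // Hypothetical URL; answered by the mock connection.
            final HtmlPage page = webClient.getPage("http://localhost/list");

            final List<String> texts = new ArrayList<>();
            final DomElement root = page.getDocumentElement();
            for (final DomElement list : root.getElementsByTagName("ul")) {
                for (final DomElement item : list.getChildElements()) {
                    texts.add(item.getTextContent().trim());
                }
            }
            return texts;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

getElementsByTagName searches all descendants, while getChildElements only returns direct children, so nested lists would be visited once per 'ul'.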
Below is an example of finding elements with XPath:
@Test
public void xpath() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("https://htmlunit.sourceforge.io/");

        //get list of all divs
        final List<?> divs = page.getByXPath("//div");

        //get div which has a 'id' attribute of 'banner'
        final HtmlDivision div = (HtmlDivision) page.getByXPath("//div[@id='banner']").get(0);
    }
}
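Calling get(0) on the result list throws an exception when nothing matches. As a hedged alternative, the sketch below uses getFirstByXPath(), which returns null when there is no match. It assumes HtmlUnit 2.x and serves the HTML through MockWebConnection. The class name XPathLookupSketch, the method divTextById and the URL are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import com.gargoylesoftware.htmlunit.MockWebConnection;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class XPathLookupSketch {

    // Returns the text of the first div with the given id, or null if the
    // page contains no such div. The id is inserted into the XPath expression
    // as-is, so this sketch assumes it contains no quote characters.
    public static String divTextById(final String html, final String id) {
        try (final WebClient webClient = new WebClient()) {
            final MockWebConnection connection = new MockWebConnection();
            connection.setDefaultResponse(html); // served as text/html
            webClient.setWebConnection(connection);

            // Hypothetical URL; answered by the mock connection.
            final HtmlPage page = webClient.getPage("http://localhost/xpath");
            final HtmlDivision div = page.getFirstByXPath("//div[@id='" + id + "']");
            return div == null ? null : div.getTextContent();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```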
You can also use CSS selectors:
@Test
public void cssSelector() throws Exception {
    try (final WebClient webClient = new WebClient()) {
        final HtmlPage page = webClient.getPage("https://htmlunit.sourceforge.io/");

        //get list of all divs
        final DomNodeList<DomNode> divs = page.querySelectorAll("div");
        for (DomNode div : divs) {
            // process each div here
        }

        //get div which has the id 'breadcrumbs'
        final DomNode breadcrumbs = page.querySelector("div#breadcrumbs");
    }
}