A key feature of New Relic Synthetics is scripted browser monitors. These are scripts you create with a JavaScript-like scripting language that drives Selenium-powered Google Chrome browsers. A scripted browser monitor lets you track multi-step processes on a website with a real browser. Among several object locator options available when creating scripted monitors, XPath selectors are perhaps the most powerful.
XPath selectors are a type of locator used in New Relic Synthetics to identify specific elements on a webpage. They allow you to pinpoint the exact location of elements such as buttons, forms, or links, and interact with them during script execution.
Optimizing your XPath selectors can greatly improve the performance and reliability of your New Relic Synthetics monitors. By using efficient and specific selectors, you can reduce the time it takes for scripts to locate elements on the page, which in turn improves the overall runtime of your monitor.
The basics: What is an XPath selector?
XPath stands for XML Path Language, a language for selecting nodes in an XML document. In the context of New Relic Synthetics, XPath selectors are used to locate elements on a webpage. These selectors use a path-like syntax to navigate through the HTML structure of a page.
XPath selectors are built from steps separated by forward slashes (/), similar to a file path in your computer's directory. A selector that starts with a single slash is anchored at the root of the document, while one that starts with a double slash (//) matches elements at any depth. For example, if you want to select an element with the ID "btn-submit" that is nested inside a form element with the class "login-form", your XPath selector would look like:
//form[@class="login-form"]//input[@id="btn-submit"]
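To make this concrete, here is a minimal sketch of how such a selector might be used in a scripted browser monitor. It assumes the $browser and $driver globals and the waitForAndFindElement call from the Synthetics scripted browser API (names can differ between runtime versions). The URL, the 10-second timeout, and the clickSubmit helper are hypothetical and only there for illustration. The selector uses the double-slash form so that the form can sit anywhere in the page.

```javascript
// Hypothetical page and selector, used only for illustration
const LOGIN_URL = 'https://example.com/login';
const SUBMIT_XPATH = '//form[@class="login-form"]//input[@id="btn-submit"]';

// Loads the page, waits for the submit button, and clicks it.
// browser and driver are passed in so the function can also be exercised
// outside Synthetics with stand-in objects.
async function clickSubmit(browser, driver) {
  await browser.get(LOGIN_URL);
  // Wait up to 10 seconds (value in milliseconds) for the element to appear
  const button = await browser.waitForAndFindElement(
    driver.By.xpath(SUBMIT_XPATH),
    10000
  );
  await button.click();
}

// $browser and $driver only exist inside the Synthetics runtime
if (typeof $browser !== 'undefined' && typeof $driver !== 'undefined') {
  clickSubmit($browser, $driver);
}
```

Keeping the selector in a named constant makes it easy to find and update when the page changes.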
XPath selectors point to object elements within the tree structure of an XML document. By optimizing the XPath selectors in your Synthetics monitors, you can:
- Reduce the likelihood that changes in your web application will break XPath selectors. This helps avoid unnecessary monitor failures and alerts, as application changes could cause your script to find no matches, or cause it to interact with incorrect elements.
- Improve the readability of your monitors, making troubleshooting and future revisions easier.
XPath syntax
XPath syntax is similar to that of a file system path, with some additional rules specific to XML documents. XPath expressions consist of a series of steps separated by forward slashes (/), where each step represents a location in the document. These steps can be element names, attribute names, or special characters such as a period (.) for the current node and a double period (..) for the parent node.
The examples and tips included in this post can help you make the best XPath optimizations and avoid unnecessary toil in your monitor scripts.
When to use XPath Selectors
XPath selectors are commonly used in web scraping and test automation. They can also be used to locate specific elements within an XML document, making it easier to extract data from large datasets. XPath selectors are particularly useful when the structure of the document is known and consistent, as they can efficiently navigate through the elements without manual inspection.
XPath selectors can also be used in conjunction with other tools and languages, such as Selenium and Python, to enhance web scraping and test automation processes. This gives developers more flexibility in their approach and allows for more precise targeting of elements on a webpage.
Why create your own XPath selectors?
XPath selectors created by browser exports or recorders like the Katalon Automation Recorder work, but often provide overlong XPath selector statements that impact readability and could lead to stability issues if you make changes in your application. The Katalon Recorder does well with ID, linkText, and other simple object identifiers, but it can create XPath selectors that are hard to read and may not correctly identify the necessary objects.
Identifying and revising XPath selectors
Developer tools in Google Chrome greatly simplify the process of writing monitor scripts. To get started, find the element you want to inspect, right-click it, and select Inspect.
This opens Chrome’s developer tools. Then, within the elements view, right-click the object you want to interact with in your monitor and select Copy > Copy XPath, which creates an XPath statement; in this example:
//*[@id="post-47124"]/div[1]/h3[1]/strong
The XPath the browser produced in this example is far from ideal. First, the XPath will no longer match the correct object if you make changes to the page structure under the post-47124 object. Second, this XPath makes the script more difficult to read and troubleshoot. We could find this same object with a simpler XPath selector, for example:
//*[text()="What it means to be entity-centric"]
We can even validate this selector without a monitor. In the elements view, activate Find with command + F (Ctrl + F on Windows and Linux). In the Find search field, enter the XPath selector:
//*[text()="What it means to be entity-centric"]
Chrome will highlight the matching object and indicate how many matches were found.
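The same check can be done from the DevTools console, either with Chrome's built-in $x("...") console utility or with the standard document.evaluate DOM API. The sketch below shows the second approach; the countMatches helper is a hypothetical name, and the literal 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Counts how many nodes in a DOM document match an XPath expression.
function countMatches(xpath, doc) {
  const snapshot = doc.evaluate(
    xpath,
    doc,  // context node: search the whole document
    null, // no namespace resolver needed for HTML pages
    7,    // XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
    null  // let the browser create a new result object
  );
  return snapshot.snapshotLength;
}

// Only runs where a DOM exists, for example when pasted into the DevTools console
if (typeof document !== 'undefined') {
  console.log(countMatches('//*[text()="What it means to be entity-centric"]', document));
}
```

A count of exactly 1 is usually what you want: 0 means the selector matches nothing, and more than 1 means the monitor might interact with the wrong element.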
Learning from examples
Let’s look at a couple of examples of poorly constructed “bad” XPaths and show suggestions for how to improve them.
1. Bad XPath: /html/body/div[2]/form/div/div[1]/div[1]
Suggestion: Avoid using an absolute XPath, which includes the entire path from the object you need all the way up to the HTML element. With an absolute path, if you make any application changes that impact the path down to the div object your script interacts with, the monitor will either fail or cause actions (clicks, sendKeys, etc.) to interact with an incorrect object on the page.
Use a relative XPath instead, and use as few levels in the XPath as possible. Check the div element to determine if there are any identifying attributes that you could use in your XPath. For example, if the element had the text of "Hello World", you could use //div[text()="Hello World"] as your selector.
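If you maintain many monitors, a rough check like the following sketch could flag selectors that show the problems described above. The findBrittlePatterns helper and the threshold of three levels are assumptions made for illustration, not part of any New Relic tooling. The level count is a simple heuristic that splits on slashes, so it will overcount when a slash appears inside a quoted string.

```javascript
// Returns a list of warnings for patterns that tend to make an XPath fragile.
function findBrittlePatterns(xpath) {
  const warnings = [];
  // A single leading slash anchors the path at the document root
  if (/^\/(?!\/)/.test(xpath)) {
    warnings.push('absolute path');
  }
  // Numeric predicates such as div[2] depend on sibling order
  if (/\[\d+\]/.test(xpath)) {
    warnings.push('positional index');
  }
  // Every additional level is one more thing a page change can break
  const levels = xpath.split('/').filter(Boolean).length;
  if (levels > 3) {
    warnings.push('more than 3 levels');
  }
  return warnings;
}

console.log(findBrittlePatterns('/html/body/div[2]/form/div/div[1]/div[1]'));
console.log(findBrittlePatterns('//div[text()="Hello World"]'));
```

The first call reports all three warnings and the second reports none, which matches the advice above.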
2. Bad XPath: (.//*[normalize-space(text()) and normalize-space(.)='Next Slide'])[1]/following::span[2]
Suggestion: This example is an XPath created by the Katalon Recorder that will click the "Introducing New Relic One" link from the New Relic homepage. While this XPath technically works, it’s complicated, difficult to read, and fragile should the application change.
The XPath targets the second span element after the first element with the text of "Next Slide," which isn’t ideal for a few reasons. First, it would work better to target the <a> element instead of the span. Second, the example unnecessarily locates an object based on the location of a separate element. Changes to this object could break the selector. Finally, the XPath itself doesn’t make it clear what it’s looking for, which could complicate troubleshooting efforts. Better to use the title attribute of the <a> element in the selector:
//a[@title="Introducing New Relic One"]
Five tips for XPath selector optimization
Finally, these tips can help optimize your XPath syntax:
1. Use the contains function to match a selector to an object based on partial text contents or portions of an attribute:
Text: //strong[contains(text(), "Absolute")]
Attribute: //img[contains(@src, "attachment-menu-button.png")]
2. If there are additional tags, such as <strong>, as part of the object you’re matching, use the dot (“.”) instead of text(). The dot evaluates to the element's full text content, including text inside child tags:
//div[contains(., "Hello World")]
3. Instead of looking for a specific div or span, use conditional logic such as and/or expressions and search for any objects that match using the “*” wildcard. This XPath looks for an object with a class attribute that contains panel-title and also has the text Reports:
//*[contains(@class, "panel-title") and text() = "Reports"]
4. For case-insensitive text matching, use the translate function. This converts the object text to lower case so that you no longer need to worry about case sensitivity:
//a[contains(translate(text(), "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"), "newsroom")]
5. Use the normalize-space function to handle new lines, trailing spaces, and so on. This removes leading and trailing whitespace and replaces multiple spaces with a single space:
//span[normalize-space(text()) = "Introducing New Relic One >"]
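The tips above can be wrapped in small helper functions so that the long expressions are written only once. The following is a minimal sketch; all function names are hypothetical. As a design choice it uses the dot (tip 2) rather than text() in every helper, and it adds an xpathLiteral function because XPath 1.0 has no escape character for quotes inside string literals.

```javascript
const UPPER = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
const LOWER = 'abcdefghijklmnopqrstuvwxyz';

// Quotes a string for use inside an XPath expression.
// If the value contains both quote types, it is assembled with concat().
function xpathLiteral(value) {
  if (!value.includes('"')) return '"' + value + '"';
  if (!value.includes("'")) return "'" + value + "'";
  const parts = value.split('"').map((part) => '"' + part + '"');
  return 'concat(' + parts.join(", '\"', ") + ')';
}

// Tips 1 and 2: partial match against the element's full text content
function containsText(tag, text) {
  return `//${tag}[contains(., ${xpathLiteral(text)})]`;
}

// Tip 4: case-insensitive partial match (ASCII letters only)
function containsTextIgnoreCase(tag, text) {
  const needle = xpathLiteral(text.toLowerCase());
  return `//${tag}[contains(translate(., "${UPPER}", "${LOWER}"), ${needle})]`;
}

// Tip 5: exact match after trimming and collapsing whitespace
function normalizedTextEquals(tag, text) {
  return `//${tag}[normalize-space(.) = ${xpathLiteral(text)}]`;
}

console.log(containsText('div', 'Hello World'));
console.log(containsTextIgnoreCase('a', 'Newsroom'));
console.log(normalizedTextEquals('span', 'Introducing New Relic One >'));
```

Passing "*" as the tag gives the wildcard form from tip 3, for example containsText('*', 'Reports').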
Conclusion
This isn’t an exhaustive list of XPath basics and best practices, but these tips should give you some ideas on how to best use XPath syntax in your scripted monitors. For more information, check out this Guru99 tutorial on XPath in Selenium Webdrivers, and don’t miss our blog posts on Tips for Creating Resilient Selenium Scripts with CSS Selectors and How to monitor the performance of your web pages with Selenium 4.
The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub ( discus.newrelic.com ) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.