cx:selenium

The cx:selenium step is an atomic step:

<p:declare-step type="cx:selenium">
  <p:input port="source" content-types="text xml"/>
  <p:output port="result" sequence="true"/>
  <p:option name="browser" as="xs:string?"/>
  <p:option name="capabilities" as="map(xs:QName, item())?"/>
  <p:option name="arguments" as="xs:string*"/>
</p:declare-step>

Here’s a pipeline that uses cx:selenium instead of p:load to get a web page. Like our earlier pipeline, this one loads a page and serializes it. But this time, it loads the page with Selenium, allowing the browser to evaluate any scripts it contains:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                name="main" version="3.1">
  <p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
  <p:output port="result"/>
  <p:option name="uri"/>

  <cx:selenium>
    <p:with-option name="arguments" select="('--headless')"/>
    <p:with-input>
      <p:inline content-type="text/plain">script version 0.2 .
      page "{$uri}" .
      pause PT0.5S .
      output to result .
      </p:inline>
    </p:with-input>
  </cx:selenium>
</p:declare-step>

Running that script gives us table data:

<html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta charset="utf-8"/>
      <title>Some cities in the UK</title>
      <script defer="defer" src="cities.js"/>
      <link href="../style.css" rel="stylesheet"/>
      <link href="cities.css" rel="stylesheet"/>
   </head>
   <body>
      <p>[<a href="/">Home</a>]</p>
      <h1>Some cities in the UK</h1>
      <table>
         <thead>
            <tr>
               <th>City</th>
               <th>Country</th>
               <th>Latitude</th>
               <th>Longitude</th>
            </tr>
         </thead>
         <tbody>
            <tr>
               <td>Abbots Langley</td>
               <td>England</td>
               <td>51.701 </td>
               <td>-0.416 </td>
            </tr>
            <tr>
               <td>Aberaman</td>
               <td>Wales</td>
               <td>51.7   </td>
               <td>-3.4333</td>
            </tr>
            <tr>
               <td>Addlestone</td>
               <td>England</td>
               <td>51.3695</td>
               <td>-0.4901</td>
            </tr>
         </tbody>
      </table>
      <p>Load <button id="more">More</button>
      </p>
   </body>
</html>

How does it work? The page command loads the page, the pause command waits for half a second so the browser has a chance to fill the table, and the output command sends the current page DOM to the result port.

We can also interact with the page. This script:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                name="main" version="3.1">
  <p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
  <p:output port="result"/>
  <p:option name="uri"/>

  <cx:selenium>
    <p:with-option name="arguments" select="('--headless')"/>
    <p:with-input>
      <p:inline content-type="text/plain">script version 0.2 .
      page "{$uri}" .

      find $button by id = "more" .
      click $button .
      pause PT0.5S .

      output to result .
      </p:inline>
    </p:with-input>
  </cx:selenium>
</p:declare-step>

finds the button on the page with the id “more”, clicks it, waits half a second, and then returns the page. That will return the second page of results.

As a final example, consider this script:

script version 0.2 .
page "{$uri}" .

until "not(empty($row))" do
  find $row by selector = "table tbody tr" .
  pause PT0.25S .
done

# Search for $city, hit more until we find it
find $city by xpath = "//td[. = '{$city}']".
while "empty($city)" do
  call clickNext .
  find $city by xpath = "//td[. = '{$city}']".
done

find $row by xpath = "//tr[td[. = '{$city}']]" .

output xpath "normalize-space(replace($row/*:td[3], ' ', ' '))" to result .
output xpath "normalize-space(replace($row/*:td[4], ' ', ' '))" to result .

close .

subroutine clickNext
   find $button by selector = "button" .
   scroll to $button .
   click $button .
   pause PT0.25S .
end

This script:

  1. Loads the page.

  2. Waits until the table has data, rather than assuming 0.5s will be long enough.

  3. Looks for the city with an XPath expression. While it isn’t present, it uses a subroutine to click the next button.

  4. Once we’ve found the city, we get its row.

  5. Then we output the latitude and longitude after doing a little cleanup.

  6. Then we close the browser.

  7. The “clickNext” subroutine finds the button, scrolls to it, clicks it, and waits ¼s.
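
To run this script, it has to be embedded in a pipeline, as in the earlier examples. The following is a minimal sketch under a few assumptions, not a listing from the step's reference: it adds a city option next to uri (both are expanded through the {$uri} and {$city} value templates before the script reaches the step), and it declares the pipeline's result port as a sequence because the script writes to result twice. The script body follows the one shown above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                name="main" version="3.1">
  <p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
  <!-- Two "output ... to result" commands, so allow a sequence -->
  <p:output port="result" sequence="true"/>
  <p:option name="uri"/>
  <!-- Assumed option: the city to search for -->
  <p:option name="city"/>

  <cx:selenium>
    <p:with-option name="arguments" select="('--headless')"/>
    <p:with-input>
      <p:inline content-type="text/plain">script version 0.2 .
      page "{$uri}" .

      until "not(empty($row))" do
        find $row by selector = "table tbody tr" .
        pause PT0.25S .
      done

      find $city by xpath = "//td[. = '{$city}']" .
      while "empty($city)" do
        call clickNext .
        find $city by xpath = "//td[. = '{$city}']" .
      done

      find $row by xpath = "//tr[td[. = '{$city}']]" .

      output xpath "normalize-space(replace($row/*:td[3], ' ', ' '))" to result .
      output xpath "normalize-space(replace($row/*:td[4], ' ', ' '))" to result .

      close .

      subroutine clickNext
         find $button by selector = "button" .
         scroll to $button .
         click $button .
         pause PT0.25S .
      end
      </p:inline>
    </p:with-input>
  </cx:selenium>
</p:declare-step>
```

Because the city name is substituted directly into the XPath string literal, a value containing an apostrophe would break the expression in this sketch. A pipeline meant for arbitrary input would need to escape it first.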