Headless Chrome and the Puppeteer Library for Scraping and Testing the Web
Written by Nikos Vaggalis
Wednesday, 29 November 2017
With the advent of Single Page Applications, scraping pages for information and running automated user interaction tests have both become much harder, because such pages are highly dynamic. The solution? Headless Chrome and the Puppeteer library.
Selenium, PhantomJS and others have been around for a long time, and headless Chrome and Puppeteer arrive late to the party. Even so, they make valuable additions to the set of web testing automation tools that allow developers to simulate the interaction of real users with a web site or application.
Headless Chrome can run without Puppeteer, as it can be controlled programmatically through the Chrome DevTools Protocol (CDP), typically by attaching to a Chrome instance that was started with remote debugging enabled.
The example in Puppeteer's official documentation navigates to https://example.com and saves a screenshot as example.png.
As another example, let's go to www.smadeseek.com and load the list of all available smartphones, then programmatically click the img element of the second displayed device to bring up its detailed specifications page. From there we can access the innerHTML of the first table element.
There's just one caveat. Since the Chrome DevTools Protocol only works with Chromium, Chrome and other Blink-based browsers, so does Puppeteer. If you need to cover other browsers, then Selenium and its WebDriver API remain the best option.