
Other fds cannot be passed to the child process. The WinApi call to create a child process ( CreateProcess) allows to setup pipes for the three common fds ( stdin, stdout, stderr) using the STARTUPINFO structure, see CreateProcessA function (processthreadsapi.h) and STARTUPINFOA structure (processthreadsapi.h). Passing fds from a parent process to child process is common under Unix, but not under Windows.
#Google chrome webscraper code#
That's why the code uses pipes to communicate with Edge.Įdge uses the third file descriptor (fd) for reading messages and the fourth fd for writing messages. But if the process is run on a terminal server with more than one user, this is not acceptable. This may pose no risks on single user computers or dedicated virtual containers. Any user on the computer has access to the webserver. The Webserver lacks any security features. The low-level access to the CDP protocol is avaible by two means: Either Edge starts a small Webserver on a specific port or via pipes. To generate and parse JSON, the code uses the VBA-JSON library from here. The CDP protocol is a message-based protocol. The class clsEdge implements the CDP protocol. MsgBox ( " finish! Vote count is " & strVotes) StrVotes = objBrowser.jsEval( " ctl00_RateArticle_VountCountHist.innerText") ' click on codeproject link Call objBrowser.jsEval( " document.evaluate("".//h3"", document).iterateNext().click()")ĭim strVotes As String ' if a javascript expression evaluates to a plain type it is passed back to VBA ' wait till search has finished Call objBrowser.waitCompletion ' run search Call objBrowser.jsEval( " document.getElementsByName(""q"").form.submit()") ' fill search form (textbox is named q) Call objBrowser.jsEval( " document.getElementsByName(""q"").value=""automate edge vba""") ' evaluate javascript Call objBrowser.jsEval( " alert(""hi"")") ' navigate Call objBrowser.navigate( " ") ' Attach to any ("") or a specific page Call objBrowser.attach( " ") ' Start Browser Dim objBrowser As clsEdge
#Google chrome webscraper how to#
The main code is as follows:Ĭopy Code ' This is an example of how to use the classes Sub runedge() Evaluate arbitrary JavaScript expressions in the context of a page and return the resultīut these functions should suffice to do basic Webscraping.Basic functions to set up the communication channel.The code implements only a very narrow set of functions:
#Google chrome webscraper full#
A full documentation of the protocol can be found here. The code uses the Chrome Devtools Protocol (CDP) to communicate with the browser. Otherwise the tabs are opened in the currently running process, not the one that has been started and subsequent communication between VBA and Edge fails. Keep in mind, that all running Edge procceses must be terminated before running the code. The following solution needs no additional software, apart from a Chrome-based browser. But this requires the installation of a Webdriver, which might not be feasible in some environments. There are libraries that try to fill this gap using Selenium, see Seleniumbasic as an example.

Microsoft seems uninterested in creating a drop-in replacement for the IE OLE Object. Microsoft Edge is no longer based on ActiveX technology. But Microsoft will end support for IE in the near future and wants users to move to newer browsers like Microsoft Edge. It was very easy to automate IE for tasks like Webscraping or testing from OLE-aware programming languages like VBA. Internet Explorer classic (IE in the following) was based on ActiveX technology.
