Package Torello.Browser
Class BRDPC
- java.lang.Object
-
- Torello.Browser.BRDPC
-
public class BRDPC extends java.lang.Object
This class helps to start a headless instance of a web browser, and to open a connection to that browser.
BRDPC: Browser Remote Debug-Port Connection.
This class is the launch-pad for the commands provided by this package.
Below is a simple example of using this class to start a browser and open a specific page.
Example:
import Torello.Browser.*;
import static Torello.Java.Shell.C.*;   // Color text to terminal
import Torello.Java.StorageWriter;      // The Log

public class BrowseChrome
{
    public static void main(String[] argv) throws Exception
    {
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        // Start up Google Chrome in Headless Mode
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

        // Make sure to tell BRDPC where the Chrome-Binary is
        BRDPC.CHROME_EXE = "/usr/bin/google-chrome";

        System.out.println(BCYAN + "Starting Chrome" + RESET);

        // This just runs the binary.  It passes some command line arguments to the binary
        // when it invokes it using the java.lang.Process stuff...
        BRDPC.startBrowserExecutable
            (true, 9223, null, 4); //"https://en.wikipedia.org/wiki/Christopher_Columbus");

        // This makes a "WebSocket Connection" with the Chrome-Instance that was started
        if (! BRDPC.buildDefaultWebSocketConnection
            (9223, false, (BrowserEvent e) -> System.out.println(e.toString())))
        {
            System.out.println
                (BCYAN + "Could Not Get a Web-Socket Connection.\n" + RESET + "Exiting.");

            System.exit(0);
        }

        System.out.println(BGREEN + "Starting Commands" + RESET);
        BRDPC.defaultSender.sw = new StorageWriter();

        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        // Send some commands to the Web-Browser using its Remote-Debug-Port
        // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

        String tID = Target.createTarget
            ("http://www.news.yahoo.com", null, null, null, null, null, null)
            .exec().await();

        Target.activateTarget(tID).exec().await();
    }
}
These are the commands that I type inside of a GCP (Google Cloud Platform) Debian Terminal/Shell to make sure that a Chrome Headless Browser is working. These commands are copied from a Google Cloud Platform Help-Page about running Chrome inside the Cloud. This is a "Headless Instance" - meaning you aren't going to see anything at all, except some text printed to the screen.
UNIX or DOS Shell Command:
# Install manually all the missing libraries
sudo apt-get update
sudo apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

# Install Chrome
sudo wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt-get -fy install
sudo dpkg -i google-chrome-stable_current_amd64.deb
These are the complete contents of a Google-Cloud Help-Page about running Chrome. That version was written for a Python application, but it can easily be adapted for Java. See: Installing and running Google-Chrome as a Headless Browser Google Cloud Shell:
UNIX or DOS Shell Command:
# Use the official Python image.

## MY COMMENTS ALL HAVE DOUBLE-HASH'S (##)
## THIS PAGE WAS COPIED, VERBATIM, FROM:
## https://dev.to/googlecloud/using-headless-chrome-with-cloud-run-3fdp
## IT WAS FOR PYTHON (OR SOMETHING), DON'T USE PYTHON!
##
## REPLACE ALL OF THE "RUN" WITH "SUDO"

## THIS LINE IS NOT NEEDED, BUT I'M LEAVING IT AS IS...
# https://hub.docker.com/_/python
FROM python:3.7

# Install manually all the missing libraries
RUN apt-get update
RUN apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

# Install Chrome
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install

## =================================================================================
## EVERYTHING FROM THIS POINT, AND FORWARD IS COMPLETELY UNNECESSARY, AS LONG AS THE
## BRDPC CLASS IS CAPABLE OF STARTING CHROME FROM JAVA
##
## IGNORE THIS STUFF BELOW
## =================================================================================

# Install Python dependencies.
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .

# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app
Hi-Lited Source-Code: Torello/Browser/BRDPC.java
File Size: 22,754 Bytes
Line Count: 543 '\n' Characters Found
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static boolean ALLOW_NULLABLE_PARAMETERS
    Configuration to override the null-check.
static String CANARY_EXE
    A common location for the Chrome-Canary Binary on a Windows Machine.
static String CHROME_EXE
    A common location for the Chrome-Binary on a Windows Machine.
static WebSocketSender defaultSender
    A singleton instance of the web-socket connection.
static boolean QUIET
    Configuration-flag for setting the verbosity of this library-package.
static StorageWriter sw
    The log output.
-
Method Summary
Methods (Modifier and Type, Method, Description):
static boolean buildDefaultWebSocketConnection(Integer port, boolean quiet, Consumer<BrowserEvent> eventHandler)
    After an instance of Chrome has been started, this method should be used to create a WebSocket-Connection to the browser.
static String readAPIFromServer(Integer port)
    You may use this method (if you so choose, for whatever reason) to retrieve the Remote Debugging Port API from Google.
static Ret2<String,String> readURLToUseFromServer(Integer port)
    This is supposed to be the easy part.
static void startBrowserExecutable(boolean headless, Integer port, String initialURL, int maxCount)
    A browser-instance must be loaded before using this package's commands.
-
-
-
Field Detail
-
defaultSender
public static WebSocketSender defaultSender
A singleton instance of the web-socket connection.
-
CHROME_EXE
public static java.lang.String CHROME_EXE
A common location for the Chrome-Binary on a Windows Machine.
IMPORTANT: When this class' startBrowserExecutable(boolean, Integer, String, int) method is invoked, it will attempt to run the program specified by this field. It is of utmost importance to set this field to whatever location has your Chrome-Binary!
-
CANARY_EXE
public static java.lang.String CANARY_EXE
A common location for the Chrome-Canary Binary on a Windows Machine.
IMPORTANT: I don't actually know when or why one would use the Canary-Binary, rather than the standard Chrome executable. I have read that the original intent of this particular application (which is a full-fledged instance of Chrome) was to provide a headless version of the standard Google-Browser.
The current version of a Chrome-Executable seems to accept Remote Debug Port commands quite readily, without having to use Canary. I will leave it here as a discussion-point.
-
QUIET
-
sw
public static StorageWriter sw
The log output. Output text will be sent to this field. This field is not declared final, and (obviously) may be set to any desired log output. This field is not thread-safe: it is shared by all threads which use the headless browser library tools!
Because a web-socket connection is itself multi-threaded, there isn't really a way to identify which thread has caused a particular response. Some percentage of the messages received from the browser will be events, and in a multi-threaded application of this tool, there would be no way to identify which thread caused which event.
-
ALLOW_NULLABLE_PARAMETERS
public static final boolean ALLOW_NULLABLE_PARAMETERS
Configuration to override the null-check.
- See Also:
Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final boolean ALLOW_NULLABLE_PARAMETERS = false;
-
-
Method Detail
-
startBrowserExecutable
public static void startBrowserExecutable(boolean headless, java.lang.Integer port, java.lang.String initialURL, int maxCount) throws java.io.IOException
A browser-instance must be loaded before using this package's commands.
Starts a headless (or full/complete) browser instance. All this method does is use the Standard-Java java.lang.Process class to start an operating-system executable. This just means invoking the executable specified by CHROME_EXE.
- Parameters:
headless - If FALSE is passed to this method, then the option -headless will not be passed to the browser at startup. In such scenarios, an actual browser should pop up on your desktop. Normally, this parameter should receive TRUE.
port - It is standard operating procedure to pass 9222 to this parameter. 9223 is also very common. You may pass null to this parameter, and the default value will be used (which is 9223).
initialURL - You may elect to have the page open at the specified URL. This parameter may be null, and when it is, it shall be ignored.
maxCount - Some delay is inserted between the starting of a browser executable, and this method exiting. While waiting, the method periodically prints a '.' and returns once maxCount of them have been printed, or sooner if either of the readers attached to the browser's output streams stops reading. Values less than 2 are treated as 2.
- Throws:
java.io.IOException - If there is a problem starting the browser executable.
- Code:
- Exact Method Body:
if (port == null) port = 9223;
if (maxCount < 2) maxCount = 2;

Vector<String> comm = new Vector<>();

comm.add(CHROME_EXE);
if (! headless) comm.add("-disable-gpu");
comm.add("-remote-debugging-port=" + port);
if (headless) comm.add("-headless");

// I don't know as much about Google-Chrome as you might think...
// I don't know what this does...
// comm.add("-no-sandbox");

if (initialURL != null) comm.add(initialURL);

String[] COMM = comm.toArray(new String[0]);

for (String s : COMM) System.out.print(s + " ");
System.out.println();

Process pro = java.lang.Runtime.getRuntime().exec(COMM);

R reader1 = new R(pro.getInputStream());
R reader2 = new R(pro.getErrorStream());

Thread t1 = new Thread(reader1);
Thread t2 = new Thread(reader2);

t1.setDaemon(true);
t2.setDaemon(true);

t1.start();
t2.start();

int i = 0;
System.out.println("Waiting to read the code.");

int counter = 0;

while (reader1.stillReading && reader2.stillReading)
    if (i++ % 500000000 == 0)
    {
        System.out.print(".");
        if (++counter == maxCount) break;
    }
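The argument assembly in the method body above can be isolated into a small pure function, which makes it easy to check which flags are passed for a given combination of parameters. The following is only a sketch: the class name ChromeCommandSketch and the method buildCommand are hypothetical and are not part of this package. The flags and their order are copied from the body shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class ChromeCommandSketch
{
    // Hypothetical helper (NOT part of Torello.Browser).  It mirrors the order in which
    // the method body above adds its command-line arguments.
    public static List<String> buildCommand
        (String chromeExe, boolean headless, Integer port, String initialURL)
    {
        if (port == null) port = 9223; // same default as the method above

        List<String> comm = new ArrayList<>();

        comm.add(chromeExe);
        if (! headless) comm.add("-disable-gpu");
        comm.add("-remote-debugging-port=" + port);
        if (headless) comm.add("-headless");
        if (initialURL != null) comm.add(initialURL);

        return comm;
    }

    public static void main(String[] argv)
    {
        // Only builds and prints the command; nothing is launched here.
        List<String> comm = buildCommand("/usr/bin/google-chrome", true, null, null);

        // prints: /usr/bin/google-chrome -remote-debugging-port=9223 -headless
        System.out.println(String.join(" ", comm));
    }
}
```

A list built this way could be handed to the standard java.lang.ProcessBuilder constructor that accepts a List<String>, as an alternative to Runtime.exec.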
-
readAPIFromServer
public static java.lang.String readAPIFromServer(java.lang.Integer port) throws java.io.IOException
You may use this method (if you so choose, for whatever reason) to retrieve the Remote Debugging Port API from Google. This method will return a (very long) JSON-File, as a java.lang.String.
- Parameters:
port - This should be the port that was used to start the server. This is, by default, port 9223.
- Returns:
A very-long JSON-String containing the currently-exported API that your web-browser provides.
- Throws:
java.io.IOException - If there are problems sending this request, or retrieving the response, this throws.
- Code:
- Exact Method Body:
if (port == null) port = 9223;

String  urlStr  = "http://127.0.0.1:" + port + "/json/protocol";
URL     url     = new URL(urlStr);

System.out.println("Querying WS Server for JSON\n" + url);

// NOTE: This is a very large JSON File - more than 20,000 lines of text
return Scrape.scrapePage(url);
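For readers who want to see the same request without this library's Scrape class, the sketch below performs it using only the standard JDK. It is an illustration under stated assumptions, not this package's implementation: the class ProtocolFetchSketch and its two methods are hypothetical names, and the time-out values are arbitrary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class ProtocolFetchSketch
{
    // Hypothetical helper: builds the same address as the method body above.
    public static String protocolUrl(Integer port)
    {
        if (port == null) port = 9223;
        return "http://127.0.0.1:" + port + "/json/protocol";
    }

    // Hypothetical helper: reads the entire response body into a String.
    public static String fetch(String urlStr) throws IOException
    {
        HttpURLConnection con = (HttpURLConnection) URI.create(urlStr).toURL().openConnection();

        con.setConnectTimeout(5000);    // arbitrary values, chosen for this sketch
        con.setReadTimeout(30000);

        try (BufferedReader br = new BufferedReader
            (new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8)))
        {
            StringBuilder   sb  = new StringBuilder();
            char[]          buf = new char[8192];
            int             n;

            while ((n = br.read(buf)) != -1) sb.append(buf, 0, n);

            return sb.toString();
        }
        finally { con.disconnect(); }
    }

    public static void main(String[] argv)
    {
        String url = protocolUrl(null);
        System.out.println("Querying " + url);

        // This only succeeds if a browser is already listening on the port.
        try
            { System.out.println("Received " + fetch(url).length() + " characters of JSON"); }
        catch (IOException e)
            { System.out.println("No browser reachable: " + e); }
    }
}
```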
-
readURLToUseFromServer
public static Ret2<java.lang.String,java.lang.String> readURLToUseFromServer (java.lang.Integer port) throws java.io.IOException
This is supposed to be the easy part. There is a very long "Identifier String" that contains some connection information.
NOTE: There is really no need to use this method, as this "Connection Stuff" is all taken care of inside this class (and the class WebSocketSender).
If, for some reason, there is a need to use the feature of having multiple browser connections, and it is deemed important to build your own instance of WebSocketSender, you will have to retrieve this URL information in order to construct a WebSocketSender instance.
- Parameters:
port - The browser-port upon which the WebSocket was connected. If null is passed, 9223 is used.
- Returns:
An instance of Ret2, or null if neither attempt yields a URL:
String (BrowserURL): The complete URL to use when connecting to the browser over its RDP Connection.
String (BrowserURL): A "Connection Code" that the browser uses to identify instances of a Web-Socket Connection.
- Throws:
java.io.IOException - When connecting to the browser, this is always a possibility.
- Code:
- Exact Method Body:
if (port == null) port = 9223;

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// Retrieve the Web-Socket Address from URL - Attemt #1
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

URL url = new URL("http://127.0.0.1:" + port + "/json/list");

if (! QUIET) sw.println(
    '\n' + BRED + "ATTEMPT #1:" + RESET + " (Works when there is already a page opened)\n" +
    "Querying WebSocketServer for Listening Address Using:\n\t" +
    BYELLOW + url.toString() + RESET
);

try
{
    String          json    = Scrape.scrapePage(url);
    StringReader    sr      = new StringReader(json);

    if (! QUIET) sw.println("Attempt #1, Server Responded With:\n" + json);

    if (json.length() > 10) // else it gave us an empty response
    {
        JsonArray   jArr    = Json.createReader(sr).readArray();
        JsonObject  jObj    = null; // used in next-section

        if ((jArr != null) && (jArr.size() > 0))
        {
            String res  = jArr.getJsonObject(0).getString("webSocketDebuggerUrl", null);
            String code = StringParse.fromLastFrontSlashPos(res);

            if (res != null) return new Ret2<>(res, code);
        }
    }
}
catch (Exception e)
{
    sw.println(
        BGREEN + "NOTE: FOR ATTEMPT #1, THIS IS SOMEWHAT EXPECTED" + RESET + '\n' +
        BRED + "Exception Message Attempt #1:\n\t" + RESET + e.getMessage() + '\n' +
        "-------------------------------------------------------------------\n" +
        EXCC.toString(e) +
        "-------------------------------------------------------------------\n"
    );
}

// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
// Retrieve the Web-Socket Address from URL #2
// *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

url = new URL("http://127.0.0.1:" + port + "/json/version");

// Again, I'm not 100% what chrome is really doing here...
// url = new URL("http://127.0.0.1/json/version");

if (! QUIET) sw.println(
    BRED + "ATTEMPT #2:" + RESET + " (Creates an entirely new connection)\n" +
    "Querying WebSocketServer for Listening Address Using:\n\t" +
    BYELLOW + url.toString() + RESET
);

try
{
    String          json    = Scrape.scrapePage(url);
    StringReader    sr      = new StringReader(json);
    JsonObject      jObj    = Json.createReader(sr).readObject();

    if (! QUIET) sw.println("Attempt #2, Server Responded With:\n" + json);

    if (jObj != null)
    {
        String res  = jObj.getString("webSocketDebuggerUrl", null);
        String code = StringParse.fromLastFrontSlashPos(res);

        sw.println(
            "res: " + res + '\n' +
            "code: " + code
        );

        if ((res != null) && res.startsWith("ws://")) return new Ret2<>(res, code);
    }
}
catch (Exception e)
{
    sw.println(
        BRED + "Exception Message Attempt #2:\n\t" + RESET + e.getMessage() + '\n' +
        "-------------------------------------------------------------------\n" +
        EXCC.toString(e) +
        "-------------------------------------------------------------------\n"
    );
}

// Failed
sw.println(BRED + "Unable to retrieve Chrome Web-Socket URL.  Sorry buddy." + RESET);
return null;
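Both attempts above end the same way: a "webSocketDebuggerUrl" value is read from the JSON, and a "Connection Code" is cut from the end of it. The sketch below illustrates only that last step, using plain JDK string methods. It assumes that StringParse.fromLastFrontSlashPos returns the text following the final '/' character; the body of that method is not shown on this page, so this is a guess at its behavior. The class ConnectionCodeSketch, its methods, and the sample URL are all invented for illustration.

```java
public class ConnectionCodeSketch
{
    // Hypothetical stand-in for the "Connection Code" extraction.  ASSUMPTION: the code
    // is whatever follows the last '/' in the web-socket URL.
    public static String connectionCode(String wsUrl)
    {
        if (wsUrl == null) return null;

        int pos = wsUrl.lastIndexOf('/');

        return (pos == -1) ? wsUrl : wsUrl.substring(pos + 1);
    }

    // The same acceptance test that Attempt #2 applies before returning a result.
    public static boolean isUsable(String wsUrl)
    { return (wsUrl != null) && wsUrl.startsWith("ws://"); }

    public static void main(String[] argv)
    {
        // An invented URL, shaped like the ones a browser reports
        String wsUrl = "ws://127.0.0.1:9223/devtools/browser/0a1b2c3d";

        System.out.println(isUsable(wsUrl));        // prints: true
        System.out.println(connectionCode(wsUrl));  // prints: 0a1b2c3d
    }
}
```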
-
buildDefaultWebSocketConnection
public static boolean buildDefaultWebSocketConnection (java.lang.Integer port, boolean quiet, java.util.function.Consumer<BrowserEvent> eventHandler) throws java.io.IOException, WebSocketException
After an instance of Chrome has been started, this method should be used to create a WebSocket-Connection to the browser.
You must invoke this method before sending commands.
After starting a browser instance using the method startBrowserExecutable(boolean, Integer, String, int), you will also have to open a Web-Socket Connection to the browser using this method. Then (and only then) is it possible to begin sending requests to the browser.
- Parameters:
port - You should just elect to use 9223; that is what is recommended by all of the sources on the Internet that explain this browser-feature. Chrome will be listening on whatever port you started the headless version with.
NOTE: You don't have to start the browser in headless mode. Controlling a full-fledged, and opened, browser is allowed with Chrome.
quiet - You may configure the verbosity-level for the WebSocket-Connection to be different from the verbosity-level for this class (class BRDPC).
eventHandler - It is advisable to register an event-handler. This parameter may be passed null, and if it is, events will not be handled (they are then ignored).
- Returns:
TRUE if the connection was opened, and FALSE if no Web-Socket URL could be retrieved from the browser.
- Throws:
java.io.IOException - Connecting to the browser may cause this exception.
WebSocketException - The NeoVisionaries package for WebSockets may throw this when attempting to build the connection.
- Code:
- Exact Method Body:
Ret2<String, String> browserListenerURL = readURLToUseFromServer(port);

if ((browserListenerURL == null) || (browserListenerURL.a == null))
{
    sw.println("Unable to retrieve a Web-Socket Connection.  No Request URL's");
    return false;
}

BRDPC.defaultSender = new WebSocketSender(
    browserListenerURL.a,
    quiet,
    (eventHandler == null) ? null : (Object o) -> eventHandler.accept((BrowserEvent) o)
);

if (! QUIET) sw.println
    ("Web Socket Connection Opened:\n" + BYELLOW + browserListenerURL.a + RESET);

return true;
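The third constructor argument in the body above wraps the typed event-handler in an untyped one, and passes a null handler through unchanged. The sketch below shows that adapter pattern in isolation. It is only an illustration: the nested class Event is a stand-in for BrowserEvent, and the names HandlerAdapterSketch and adapt are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class HandlerAdapterSketch
{
    // Stand-in for BrowserEvent (NOT the real class from this package)
    public static class Event
    {
        public final String name;
        public Event(String name) { this.name = name; }
    }

    // Same shape as the expression in the method body above: a null handler stays null,
    // otherwise each incoming Object is cast and forwarded to the typed handler.
    public static Consumer<Object> adapt(Consumer<Event> handler)
    {
        return (handler == null)
            ? null
            : (Object o) -> handler.accept((Event) o);
    }

    public static void main(String[] argv)
    {
        List<String> seen = new ArrayList<>();

        Consumer<Object> untyped = adapt((Event e) -> seen.add(e.name));
        untyped.accept(new Event("Page.loadEventFired"));

        System.out.println(seen); // prints: [Page.loadEventFired]
    }
}
```

Note that the cast means an object of any other type would cause a ClassCastException inside the adapted handler.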
-
-