Class BRDPC


  • public class BRDPC
    extends java.lang.Object
    This class helps to start a headless instance of a Web-Browser, and to open a connection to that browser.

    BRDPC: Browser Remote Debug-Port Connection.

    This class is the launch-pad for the commands provided by this package.

    Below is a simple example of using this class to start a browser, connect to it, and open a specific page.

    Example:
    import Torello.Browser.*;
    import static Torello.Java.Shell.C.*;   // Color text to terminal
    import Torello.Java.StorageWriter;      // The Log
    
    public class BrowseChrome
    {
        public static void main(String[] argv) throws Exception
        {
            // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
            // Start up Google Chrome in Headless Mode
            // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    
            // Make sure to tell BRDPC where the Chrome-Binary is
            BRDPC.CHROME_EXE = "/usr/bin/google-chrome";
    
            System.out.println(BCYAN + "Starting Chrome" + RESET);
    
            // This just runs the binary.  It passes some command line arguments to the binary
            // when it invokes it using the java.lang.Process stuff...
    
            // The third argument (initialURL) may be null, or a page to open at startup,
            // e.g. "https://en.wikipedia.org/wiki/Christopher_Columbus"
            BRDPC.startBrowserExecutable(true, 9223, null, 4);
    
            // This makes a "WebSocket Connection" with the Chrome-Instance that was started
            if (! BRDPC.buildDefaultWebSocketConnection
                (9223, false, (BrowserEvent e) -> System.out.println(e.toString())))
            {
                System.out.println
                    (BCYAN + "Could Not Get a Web-Socket Connection.\n" + RESET + "Exiting.");
                System.exit(1); // non-zero exit status, since the connection failed
            }
    
            System.out.println(BGREEN + "Starting Commands" + RESET);
    
            BRDPC.defaultSender.sw = new StorageWriter();
    
    
            // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
            // Send some commands to the Web-Browser using its Remote-Debug-Port
            // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
    
            String tID = Target.createTarget
                ("http://www.news.yahoo.com", null, null, null, null, null, null)
                .exec().await();
    
            Target.activateTarget(tID).exec().await(); 
        }
    }
    

    These are the commands that I type inside of a GCP (Google Cloud Platform) Debian Terminal/Shell to make sure that a Chrome Headless Browser is working. These commands are copied from a Google Cloud Platform Help-Page about running Chrome inside the Cloud. This is a "Headless Instance" - meaning you aren't going to see anything at all, except some text printed to the screen.

    UNIX or DOS Shell Command:
    # Install manually all the missing libraries
    sudo apt-get update
    sudo apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

    # Install Chrome
    sudo wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
    sudo apt-get -fy install
    sudo dpkg -i google-chrome-stable_current_amd64.deb


    This is the complete-contents of a Google-Cloud Help-Page about running Chrome. This version is actually something that is run inside of Python, but can be easily changed to Java. See: Installing and running Google-Chrome as a Headless Browser Google Cloud Shell:

    UNIX or DOS Shell Command:
    # Use the official Python image.
    ## MY COMMENTS ALL HAVE DOUBLE-HASH'S (##)
    ## THIS PAGE WAS COPIED, VERBATIM, FROM:
    ## https://dev.to/googlecloud/using-headless-chrome-with-cloud-run-3fdp
    ## IT WAS FOR PYTHON (OR SOMETHING), DON'T USE PYTHON!
    ##
    ## REPLACE ALL OF THE "RUN" WITH "SUDO"
    ## THIS LINE IS NOT NEEDED, BUT I'M LEAVING IT AS IS...
    # https://hub.docker.com/_/python
    FROM python:3.7

    # Install manually all the missing libraries
    RUN apt-get update
    RUN apt-get install -y gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

    # Install Chrome
    RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
    RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install

    ## =================================================================================
    ## EVERYTHING FROM THIS POINT, AND FORWARD IS COMPLETELY UNNECESSARY, AS LONG AS THE
    ## BRDPC CLASS IS CAPABLE OF STARTING CHROME FROM JAVA
    ##
    ## IGNORE THIS STUFF BELOW
    ## =================================================================================

    # Install Python dependencies.
    COPY requirements.txt requirements.txt
    RUN pip install -r requirements.txt

    # Copy local code to the container image.
    ENV APP_HOME /app
    WORKDIR $APP_HOME
    COPY . .

    # Run the web service on container startup. Here we use the gunicorn
    # webserver, with one worker process and 8 threads.
    # For environments with multiple CPU cores, increase the number of workers
    # to be equal to the cores available.
    CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app


    • Field Summary

      Fields 
      Modifier and Type Field Description
      static boolean ALLOW_NULLABLE_PARAMETERS
      Configuration to Override the null-check.
      static String CANARY_EXE
      A common location for the Chrome-Canary Binary on a Windows Machine.
      static String CHROME_EXE
      A common location for the Chrome-Binary on a Windows Machine.
      static WebSocketSender defaultSender
      A singleton instance of the web-socket connection.
      static boolean QUIET
      Configuration-flag for setting the verbosity of this library-package.
      static StorageWriter sw
      The log output.
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static boolean buildDefaultWebSocketConnection​(Integer port, boolean quiet, Consumer<BrowserEvent> eventHandler)
      After an instance of Chrome has been started, this method should be used to create a WebSocket-Connection to the browser.
      static String readAPIFromServer​(Integer port)
      You may use this method (if you so choose, for whatever reason) to retrieve the Remote Debugging Port API from the running browser.
      static Ret2<String,​String> readURLToUseFromServer​(Integer port)
      This is supposed to be the easy part.
      static void startBrowserExecutable​(boolean headless, Integer port, String initialURL, int maxCount)
      A browser-instance must be loaded before using this package's commands.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • CANARY_EXE

        public static java.lang.String CANARY_EXE
        A common location for the Chrome-Canary Binary on a Windows Machine.

        IMPORTANT: I don't actually know when or why one would use the Canary-Binary, rather than the standard Chrome executable. I have read that the original intent of this particular application (which is a full-fledged instance of Chrome) was to provide a headless version of the standard Google-Browser.

        The current version of a Chrome-Executable seems to accept Remote Debug Port commands quite readily, without having to use Canary. I will leave it here as a discussion-point.
      • QUIET

        public static boolean QUIET
        Configuration-flag for setting the verbosity of this library-package.

        To send more verbose messages to your StorageWriter instance (field sw), set this field to FALSE. The methods of this class print their status messages only when QUIET is FALSE.
      • sw

        public static StorageWriter sw
        The log output. Output text will be sent to this field. This field is not declared final, and (obviously) may be set to any desired log output. This field is not exactly thread-safe - it is shared by all threads which use the headless browser library tools!

        Because web-sockets are themselves multi-threaded, there isn't really a way to identify which thread has caused a particular response. Some percentage of the messages received from the browser will be events, and in a multi-threaded application of this tool, there would be no way to identify which thread caused which event.
    • Method Detail

      • startBrowserExecutable

        public static void startBrowserExecutable​(boolean headless,
                                                  java.lang.Integer port,
                                                  java.lang.String initialURL,
                                                  int maxCount)
                                           throws java.io.IOException

        A browser-instance must be loaded before using this package's commands.


        Starts a headless (or full/complete) browser instance. All this method does is use the Standard-Java java.lang.Process class to start an operating-system executable. This just means invoking the executable specified by CHROME_EXE.
        Parameters:
        headless - If FALSE is passed to this method, then the option -headless will not be passed to the browser at startup. In such scenarios, an actual browser window should pop up on your desktop. Normally, this parameter should receive TRUE.
        port - It is standard operating procedure to pass 9222 to this parameter. 9223 is also very common. You may pass null to this parameter, and the default value will be used (which is 9223).
        initialURL - You may elect to have the page open at the specified URL. This parameter may be null, and when it is, it shall be ignored.
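        maxCount - Controls the delay between starting the browser executable and this method returning. The method busy-waits, printing one '.' per 500,000,000 loop iterations, and returns once maxCount dots have been printed, or sooner if either of the process's output readers stops reading. Values less than 2 are treated as 2. The delay is counted in loop iterations, not in seconds, so its duration depends on the machine.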
        Throws:
        java.io.IOException - Thrown by java.lang.Runtime.exec if the executable named by CHROME_EXE cannot be started.
        Code:
        Exact Method Body:
         if (port == null) port = 9223;
        
         if (maxCount < 2) maxCount = 2;
        
         Vector<String> comm = new Vector<>();
        
         comm.add(CHROME_EXE);
        
         if (! headless) comm.add("-disable-gpu");
        
         comm.add("-remote-debugging-port=" + port);
        
         if (headless) comm.add("-headless");
        
         // I don't know as much about Google-Chrome as you might think...
         // I don't know what this does...
         // comm.add("-no-sandbox");
        
         if (initialURL != null) comm.add(initialURL);
        
         String[] COMM = comm.toArray(new String[0]);
        
         for (String s : COMM) System.out.print(s + " "); System.out.println();
        
         Process pro = java.lang.Runtime.getRuntime().exec(COMM);
        
         R reader1 = new R(pro.getInputStream());
         R reader2 = new R(pro.getErrorStream());
        
         Thread t1 = new Thread(reader1);
         Thread t2 = new Thread(reader2);
        
         t1.setDaemon(true);
         t2.setDaemon(true);
        
         t1.start();
         t2.start();
        
         int i = 0;
        
         System.out.println("Waiting to read the code.");
        
         int counter = 0;
         while (reader1.stillReading && reader2.stillReading)
             if (i++ % 500000000 == 0)
             {
                 System.out.print(".");
                 if (++counter == maxCount) break;
             }
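
The command-line assembly in the method body above can be isolated into a small helper that is easy to inspect before launching anything. This is only a sketch: the class `ChromeCommandLine` and its method `build` are hypothetical names, not part of BRDPC. The flags, their order, and the default port of 9223 are copied from the method body shown above.

```java
import java.util.ArrayList;
import java.util.List;

final class ChromeCommandLine
{
    // Hypothetical helper (not part of BRDPC).  Mirrors the argument order of the
    // method body above, but returns the list instead of starting a process.
    static List<String> build(String chromeExe, boolean headless, Integer port, String initialURL)
    {
        if (port == null) port = 9223;   // same default as the method body above

        List<String> comm = new ArrayList<>();
        comm.add(chromeExe);

        if (! headless) comm.add("-disable-gpu");
        comm.add("-remote-debugging-port=" + port);
        if (headless) comm.add("-headless");
        if (initialURL != null) comm.add(initialURL);

        return comm;
    }

    public static void main(String[] argv)
    {
        // Prints the argument list.  To actually launch the browser, the list could be
        // handed to: new ProcessBuilder(list).start()
        System.out.println(build("/usr/bin/google-chrome", true, null, null));
    }
}
```

Keeping the list-building step separate from process creation makes it possible to log or check the exact arguments without a Chrome binary being installed.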
        
      • readAPIFromServer

        public static java.lang.String readAPIFromServer​(java.lang.Integer port)
                                                  throws java.io.IOException
        You may use this method (if you so choose, for whatever reason) to retrieve the Remote Debugging Port API from the running browser, which serves it at the path /json/protocol on its debug port. This method will return a (very long) JSON-File, as a java.lang.String.
        Parameters:
        port - This should be the port that was used to start the server. This is, by default, port 9223.
        Returns:
        A very-long JSON-String containing the currently-exported API that your web-browser provides.
        Throws:
        java.io.IOException - If there are problems sending this request, or retrieving the response, this throws.
        Code:
        Exact Method Body:
         if (port == null) port = 9223;
        
         String  urlStr  = "http://127.0.0.1:" + port + "/json/protocol";
         URL     url     = new URL(urlStr);
        
         System.out.println("Querying WS Server for JSON\n" + url);
        
         // NOTE: This is a very large JSON File - more than 20,000 lines of text
         return Scrape.scrapePage(url);
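
For readers who want to make the same request without the Scrape class, the following sketch uses only the JDK. The class `ProtocolFetch` and both of its methods are hypothetical names introduced for illustration; the URL layout and the default port are taken from the method body above, and the timeout values are arbitrary assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class ProtocolFetch
{
    // Builds the same address the method body above queries.
    static String protocolURL(Integer port)
    {
        if (port == null) port = 9223;
        return "http://127.0.0.1:" + port + "/json/protocol";
    }

    // Downloads the response body as text.  Timeouts are illustrative values.
    static String fetch(String urlStr) throws IOException
    {
        HttpURLConnection con = (HttpURLConnection) new URL(urlStr).openConnection();
        con.setConnectTimeout(2000);
        con.setReadTimeout(10000);

        try (InputStream in = con.getInputStream())
        {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        finally
        {
            con.disconnect();
        }
    }

    public static void main(String[] argv) throws IOException
    {
        // Requires a browser already listening on the debug port.
        String json = fetch(protocolURL(null));
        System.out.println("Received " + json.length() + " characters of JSON");
    }
}
```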
        
      • readURLToUseFromServer

        public static Ret2<java.lang.String,​java.lang.String> readURLToUseFromServer​
                    (java.lang.Integer port)
                throws java.io.IOException
        
        This is supposed to be the easy part. There is a very long "Identifier String" that contains some connection information.

        NOTE: There is really no need to use this method, as this "Connection Stuff" is all taken care of inside this class (and the class WebSocketSender).

        If, for some reason, there is a need to use the feature of having multiple browser connections, and it is deemed important to build your own instance of WebSocketSender, you will have to retrieve this URL information in order to construct a WebSocketSender instance.
        Parameters:
        port - The browser port on which the WebSocket is connected. If null is passed, 9223 is used.
        Returns:
        An instance of Ret2, or null if neither attempt yields a Web-Socket URL:

        • String (Browser URL)

          The complete URL to use when connecting to the browser over its RDP Connection.

        • String (Connection Code)

          A "Connection Code" that the browser uses to identify instances of a Web-Socket Connection.

        Throws:
        java.io.IOException - When connecting to the browser, this is always a possibility.
        Code:
        Exact Method Body:
         if (port == null) port = 9223;
        
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
         // Retrieve the Web-Socket Address from URL - Attempt #1
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
         URL url = new URL("http://127.0.0.1:" + port + "/json/list");
        
         if (! QUIET) sw.println(
             '\n' +
             BRED + "ATTEMPT #1:" + RESET + " (Works when there is already a page opened)\n" +
             "Querying WebSocketServer for Listening Address Using:\n\t" +
             BYELLOW + url.toString() + RESET
         );
        
         try
         {
             String          json    = Scrape.scrapePage(url);
             StringReader    sr      = new StringReader(json);
        
             if (! QUIET) sw.println("Attempt #1, Server Responded With:\n" + json);
        
             if (json.length() > 10) // else it gave us an empty response
             {
                 JsonArray    jArr    = Json.createReader(sr).readArray();
                 JsonObject   jObj    = null; // used in next-section
        
                 if ((jArr != null) && (jArr.size() > 0))
                 {        
                     String res  = jArr.getJsonObject(0).getString("webSocketDebuggerUrl", null);
                     String code = StringParse.fromLastFrontSlashPos(res);
        
                     if (res != null) return new Ret2<>(res, code);
                 }
             }
         }
        
         catch (Exception e)
         {
             sw.println(
                 BGREEN + "NOTE: FOR ATTEMPT #1, THIS IS SOMEWHAT EXPECTED" + RESET + '\n' +
                 BRED + "Exception Message Attempt #1:\n\t" + RESET +
                 e.getMessage() + '\n' +
                 "-------------------------------------------------------------------\n" +
                 EXCC.toString(e) +
                 "-------------------------------------------------------------------\n"
             );
         }
        
        
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
         // Retrieve the Web-Socket Address from URL #2
         // *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
        
         url = new URL("http://127.0.0.1:" + port + "/json/version");
        
         // Again, I'm not 100% what chrome is really doing here...
         // url = new URL("http://127.0.0.1/json/version");
        
         if (! QUIET) sw.println(
             BRED + "ATTEMPT #2:" + RESET + " (Creates an entirely new connection)\n" +
             "Querying WebSocketServer for Listening Address Using:\n\t" +
             BYELLOW + url.toString() + RESET
         );
        
         try
         {
             String          json    = Scrape.scrapePage(url);
             StringReader    sr      = new StringReader(json);
             JsonObject      jObj    = Json.createReader(sr).readObject();
        
             if (! QUIET)
                 sw.println("Attempt #2, Server Responded With:\n" + json);
        
             if (jObj != null)
             {
                 String res  = jObj.getString("webSocketDebuggerUrl", null);
                 String code = StringParse.fromLastFrontSlashPos(res);
        
                 sw.println(
                     "res:  " + res + '\n' +
                     "code: " + code
                 );
        
                 if ((res != null) && res.startsWith("ws://")) return new Ret2<>(res, code);
             }
         }
        
         catch (Exception e)
         {
             sw.println(
                 BRED + "Exception Message Attempt #2:\n\t" + RESET +
                 e.getMessage() + '\n' +
                 "-------------------------------------------------------------------\n" +
                 EXCC.toString(e) +
                 "-------------------------------------------------------------------\n"
             );
         }
            
         // Failed
         sw.println(BRED + "Unable to retrieve Chrome Web-Socket URL.  Sorry buddy." + RESET);
         return null;
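
The two values returned by this method can be illustrated on a sample response. The sketch below is not the library's implementation: it swaps the javax.json parsing used above for a regular expression so that it runs with the JDK alone, and the class `DebuggerUrl`, its methods, and the sample JSON are all made up for the example. The `connectionCode` method assumes that StringParse.fromLastFrontSlashPos returns the text after the last '/', which is inferred from its name rather than confirmed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DebuggerUrl
{
    // Matches:  "webSocketDebuggerUrl" : "<value>"
    private static final Pattern WS_URL =
        Pattern.compile("\"webSocketDebuggerUrl\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the first webSocketDebuggerUrl value found in the JSON text, or null.
    static String extract(String json)
    {
        if (json == null) return null;
        Matcher m = WS_URL.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Returns the text after the last '/', or null if the input is null or has no '/'.
    static String connectionCode(String wsUrl)
    {
        if (wsUrl == null) return null;
        int pos = wsUrl.lastIndexOf('/');
        return (pos < 0) ? null : wsUrl.substring(pos + 1);
    }

    public static void main(String[] argv)
    {
        // Made-up sample, shaped like a /json/version response.
        String json =
            "{ \"Browser\": \"Chrome\", " +
            "\"webSocketDebuggerUrl\": \"ws://127.0.0.1:9223/devtools/browser/1234-abcd\" }";

        String url = extract(json);
        System.out.println("res:  " + url);
        System.out.println("code: " + connectionCode(url));
    }
}
```

Checking for null before extracting the code avoids relying on an exception handler when the response carries no webSocketDebuggerUrl property.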
        
      • buildDefaultWebSocketConnection

        public static boolean buildDefaultWebSocketConnection​
                    (java.lang.Integer port,
                     boolean quiet,
                     java.util.function.Consumer<BrowserEvent> eventHandler)
                throws java.io.IOException,
                       WebSocketException
        
        After an instance of Chrome has been started, this method should be used to create a WebSocket-Connection to the browser.

        You must invoke this method before sending commands.

        After starting a browser instance using the method startBrowserExecutable(boolean, Integer, String, int), you will also have to open a Web-Socket Connection to the browser using this method. Then (and only then) is it possible to begin sending requests to the browser.
        Parameters:
        port - You should just elect to use 9223; that is what is recommended by the sources on the Internet that explain this browser-feature. Chrome will be listening on whatever port you started the headless version with. If null is passed, 9223 is used.

        NOTE: You don't have to start the browser in headless mode. Controlling a full-fledged, and opened, browser is allowed with Chrome.
        quiet - You may configure the verbosity-level for the Websocket-Connection to be different from the verbosity-level for this class (class BRDPC).
        eventHandler - It is advisable to register an event-handler. This parameter may be passed null, and if it is, events will not be handled (they are then ignored).
        Throws:
        java.io.IOException - Connecting to the browser may cause this exception.
        WebSocketException - The NeoVisionaries package for WebSockets may throw this when attempting to build the connection.
        Code:
        Exact Method Body:
         Ret2<String, String> browserListenerURL = readURLToUseFromServer(port);
        
         if ((browserListenerURL == null) || (browserListenerURL.a == null))
         {
             sw.println("Unable to retrieve a Web-Socket Connection.  No Request URL's");
             return false;
         }
        
         BRDPC.defaultSender = new WebSocketSender(
             browserListenerURL.a, quiet,
             (eventHandler == null)
                 ? null
                 : (Object o) -> eventHandler.accept((BrowserEvent) o)
         );
        
         if (! QUIET) sw.println
             ("Web Socket Connection Opened:\n" + BYELLOW + browserListenerURL.a + RESET);
        
         return true;
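
The method body above hands WebSocketSender a handler that accepts Object, and casts each incoming object before passing it to the caller's typed handler. A minimal, library-free sketch of that adapter pattern follows. The class `HandlerAdapter` and its method `adapt` are hypothetical, and String stands in for BrowserEvent so the example runs without this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class HandlerAdapter
{
    // Wraps a typed handler as a Consumer<Object>.  A null handler stays null,
    // which matches the "events are ignored" behavior described above.
    static <T> Consumer<Object> adapt(Class<T> type, Consumer<T> handler)
    {
        return (handler == null)
            ? null
            : (Object o) -> handler.accept(type.cast(o));
    }

    public static void main(String[] argv)
    {
        List<String> seen = new ArrayList<>();

        // String is a stand-in for BrowserEvent in this sketch.
        Consumer<Object> c = adapt(String.class, seen::add);
        c.accept("sample event");

        System.out.println(seen);
    }
}
```

Using `Class.cast` rather than an unchecked cast means an object of the wrong type fails immediately with a ClassCastException at the adapter, instead of somewhere inside the caller's handler.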