java.lang.Object
- Torello.Browser.Example01

```
public class Example01
extends java.lang.Object
```
An example of this package's utility. This class initiates a connection to a headless Chrome instance and visits a page.

Viewing the Output:
The text that this example prints to the terminal may be viewed at the link below:

Example01.out.html

Installing Chrome in GCP Cloud Shell:

These are the commands that I type inside of a GCP (Google Cloud Platform) Debian Terminal/Shell to make sure that a Chrome Headless Browser is working. ChatGPT explained it to me, and wrote me a shell script to do the installation. I only do development on cloud servers, rather than local machines. I use laptops way too much.

🔑 If you are programming on your own computer, you likely already have a CDP-compatible web browser installed. If so, you should skip the installation step completely.

✔️ If you need to install Chrome, here's the script that A.I. wrote for me in the summer of '25. It still works great in GCP.

UNIX or DOS Shell Command:

    # Update package list
    sudo apt-get update

    # Install just the essentials for headless Chrome
    sudo apt-get install -y \
        fonts-liberation \
        libnss3 \
        libatk1.0-0 \
        libxss1 \
        libgdk-pixbuf2.0-0 \
        libgtk-3-0 \
        libasound2t64 \
        libnspr4 \
        xdg-utils \
        wget \
        ca-certificates

    # Download Chrome manually
    wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb

    # Install Chrome and auto-fix dependencies
    sudo dpkg -i google-chrome-stable_current_amd64.deb || sudo apt-get -fy install

The shell commands above were, again, generated by ChatGPT on July 11th, 2025. They installed a working copy of Chrome on my Linux instance without any errors. The output generated by the above commands is reproduced here.
Starting Chrome in the Cloud:
Once Google Chrome has been installed in your GCP Cloud Shell environment, you can start a headless Chrome instance that continues running in the background — even if you hit ^C, close your terminal, or go refill your drink at Starbucks.

This isn't the same as launching a full Compute Engine instance — you're just spinning up a background terminal process inside your ephemeral Cloud Shell session. The process will live until you shut down your shell, or until the session times out.

To launch Chrome headlessly in a way that ignores ^C and keyboard input:

UNIX or DOS Shell Command:

    nohup google-chrome --headless --disable-gpu --remote-debugging-port=9222 \
        --no-sandbox --disable-dev-shm-usage > /dev/null 2>&1 & disown

This command uses:
- nohup - Prevents the process from dying when the terminal closes or is interrupted.
- & - Puts the Chrome process in the background immediately.
- disown - Detaches the process from the shell's job control, so ^C has no effect.
To check if Chrome is running later:

UNIX or DOS Shell Command:

    ps aux | grep '[g]oogle-chrome'

    ## The above command should produce output such as:
    narrati+ 5916 5.9 1.5 34396956 249664 pts/3 S<l 20:20 0:01 /usr/bin/google-chrome ...
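The same check can be made from Java by probing the debugging port rather than the process table. The following is a minimal sketch that uses only the JDK; the class `DebugPortProbe` and its methods are hypothetical helpers, not part of the Torello.Browser package, and port 9222 is assumed from the launch command above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper (not part of Torello.Browser): checks whether a TCP port accepts connections
final class DebugPortProbe
{
    // Returns true if something accepts a TCP connection on host:port within timeoutMillis
    static boolean isListening(InetAddress host, int port, int timeoutMillis)
    {
        try (Socket socket = new Socket())
        {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        }
        catch (IOException e)
        {
            // Connection refused or timed out: nothing is listening
            return false;
        }
    }

    // Self-check that needs no Chrome: opens a throwaway loopback server and probes it
    static boolean probeThrowawayServer()
    {
        final InetAddress loopback = InetAddress.getLoopbackAddress();

        try (ServerSocket server = new ServerSocket(0, 1, loopback))
        {
            return isListening(loopback, server.getLocalPort(), 1000);
        }
        catch (IOException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        // 9222 is the --remote-debugging-port value assumed from the launch command
        final boolean up = isListening(InetAddress.getLoopbackAddress(), 9222, 1000);
        System.out.println("Chrome debugging port 9222 reachable: " + up);
    }
}
```

An open port only shows that some process is listening on 9222. It does not prove that the listener is Chrome.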

To kill the headless Chrome instance when you're done:

UNIX or DOS Shell Command:

    pkill -f 'google-chrome.*--headless'

    ## To kill by Process-ID
    kill <PID>
Page originally drafted by ChatGPT on July 11th, 2025.
Edited and formatted by Ralph Torello for use in the Java HTML Library documentation.
Hi-Lited Source-Code:
- View Here: Torello/Browser/Example01.java
- Open New Browser-Tab: Torello/Browser/Example01.java
File Size: 15,133 Bytes Line Count: 388 '\n' Characters Found

Field Summary

Fields
Modifier and Type	Field
`protected static ConnRecord`	`connRec`
`protected static String`	`samAltmanURL`

Method Summary

Main Method

Modifier and Type	Method
`static void`	`main(String[] argv)`

Example Steps
Modifier and Type	Method
`protected static WebSocketSender`	`STEP_01_openBrowserWebSocket()`
`protected static void`	`STEP_02_closeAllPages(WebSocketSender bws)`
`protected static String`	`STEP_03_openSamAltmanPage(WebSocketSender bws)`
`protected static WebSocketSender`	`STEP_04_getPageWebSocket(String targetID)`
`protected static String`	`STEP_05_runJavaScript(WebSocketSender pws)`
`protected static String[]`	`STEP_06_extractImageURLs(String html)`
`protected static void`	`STEP_07_downloadImages(String[] imageURLs)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

samAltmanURL


protected static final java.lang.String samAltmanURL

The URL that is being scraped in this example

See Also:

Constant Field Values

Code:

Exact Field Declaration Expression:

 protected static final String samAltmanURL = "https://en.wikipedia.org/wiki/Sam_Altman";

connRec


protected static final ConnRecord connRec

Code:

Exact Field Declaration Expression:

 protected static final ConnRecord connRec = new ConnRecord();

Method Detail

main


public static void main(java.lang.String[] argv)
                 throws java.lang.Exception

This class is intended to be invoked from the Command Line.

Throws:

java.lang.Exception

Code:

Exact Method Body:

 // Opening a WebSocket Browser-Connection to the currently running Chrome-Instance
 final WebSocketSender bws = STEP_01_openBrowserWebSocket();

 // Close any currently opened pages / tabs inside the browser
 STEP_02_closeAllPages(bws);

 // Open a Browser-Page (using 'bws') for reading Sam Altman's Wikipedia Profile
 final String targetID = STEP_03_openSamAltmanPage(bws);

 // Create / Build a WebSocket-Connection object to the newly opened Sam Altman Page.
 final WebSocketSender pws = STEP_04_getPageWebSocket(targetID);

 // Execute some Java-Script so that the scrape code may run
 final String html = STEP_05_runJavaScript(pws);

 // Print the Image-URL's, retrieve those URL's too
 final String[] imgURLs = STEP_06_extractImageURLs(html);

 // Download the Images into a download folder
 STEP_07_downloadImages(imgURLs);

 bws.disconnect();
 pws.disconnect();

STEP_01_openBrowserWebSocket

```
protected static WebSocketSender STEP_01_openBrowserWebSocket
            ()
        throws java.lang.Exception
```
This method demonstrates the first step in connecting to Chrome via the Chrome DevTools Protocol (CDP). It connects to a headless Chrome instance that is already running with remote debugging enabled, and establishes the primary WebSocket connection that will be used for all subsequent CDP communication. This connection targets the browser-level control endpoint, not a tab-specific page socket.

Internally, the method expects Chrome to have been started with the --remote-debugging-port=9222 flag, and queries the /json/version endpoint to retrieve WebSocket metadata. It uses that metadata to construct a WebSocketSender for JSON request-response communication with Chrome, then pauses for one second before returning.

If you're trying to automate or control browser behavior from Java, this is where it all begins: getting a working WebSocket connection to the Chrome backend.
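
The /json/version lookup described above can be sketched with the JDK alone. This is an illustration of the idea, not the implementation inside BrowserConn: the class `VersionEndpointSketch` is hypothetical, the regular expression stands in for a real JSON parser, and the browser ID in the sample response is made up.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not the BrowserConn implementation)
final class VersionEndpointSketch
{
    // Matches:  "webSocketDebuggerUrl": "ws://..."
    private static final Pattern WS_URL =
        Pattern.compile("\"webSocketDebuggerUrl\"\\s*:\\s*\"([^\"]+)\"");

    // The HTTP endpoint Chrome serves when started with --remote-debugging-port=<port>
    static String versionEndpoint(int port)
    { return "http://localhost:" + port + "/json/version"; }

    // Pulls the browser-level WebSocket URL out of the /json/version response body
    static Optional<String> webSocketUrl(String json)
    {
        final Matcher m = WS_URL.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args)
    {
        // Sample response body; the browser ID "abc-123" is invented for illustration
        final String sample =
            "{ \"Browser\": \"HeadlessChrome/0.0.0.0\", " +
            "\"webSocketDebuggerUrl\": \"ws://localhost:9222/devtools/browser/abc-123\" }";

        System.out.println(versionEndpoint(9222));
        System.out.println(webSocketUrl(sample).orElse("(not found)"));
    }
}
```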
Throws:

java.lang.Exception

Code:
Exact Method Body:

 Printing.notice("Opening a WebSocket Browser Connection...");

 final BrowserConn browserConn = BrowserConn.getBrowserConn(9222, false);

 System.out.println(
     '\n' + BCYAN + "Example01.java: " + RESET +
     BRED + "Opened Browser Connection:\n" + RESET + browserConn.toString()
 );

 final WebSocketSender bws = browserConn.createSender(Example01.connRec);

 // Chat-GPT once suggested this line.  I just haven't removed it.  It's not hurting anyone!
 Thread.sleep(1000);

 return bws;

STEP_02_closeAllPages


protected static void STEP_02_closeAllPages(WebSocketSender bws)
                                     throws java.lang.Exception

This step closes all existing pages (i.e., browser tabs) currently open in the Chrome instance. CDP allows enumeration of all tabs via a call to /json/list, and each tab provides a targetId property that can be passed to Target.closeTarget.

The method calls Target.getTargets() to obtain all open targets, then iterates through them and sends a Target.closeTarget(tID) command for each one. The isSamAltman predicate in the method body is left over from an earlier version and is not applied. This is useful to start from a clean browser state before performing automation.

If Chrome was already running with many tabs open, this call helps ensure that subsequent tab-based automation starts in a predictable environment.
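
On the wire, each of these calls is a small JSON message sent over the browser-level WebSocket. The sketch below builds those messages by hand to show their shape. `CloseTargetsSketch` is a hypothetical class, the target IDs are invented, and the string concatenation assumes IDs contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: shows the JSON shape of the CDP commands, not the library's own classes
final class CloseTargetsSketch
{
    // CDP request asking the browser for every open target
    static String getTargetsRequest(int id)
    { return "{\"id\":" + id + ",\"method\":\"Target.getTargets\"}"; }

    // CDP request closing one target; assumes targetId needs no JSON escaping
    static String closeTargetRequest(int id, String targetId)
    {
        return "{\"id\":" + id + ",\"method\":\"Target.closeTarget\"," +
               "\"params\":{\"targetId\":\"" + targetId + "\"}}";
    }

    // One close request per target, each with its own message id
    static List<String> closeAll(int firstId, List<String> targetIds)
    {
        final List<String> messages = new ArrayList<>();
        int id = firstId;
        for (String targetId : targetIds) messages.add(closeTargetRequest(id++, targetId));
        return messages;
    }

    public static void main(String[] args)
    {
        System.out.println(getTargetsRequest(1));

        // "AB12" and "CD34" are made-up target IDs
        for (String msg : closeAll(2, Arrays.asList("AB12", "CD34"))) System.out.println(msg);
    }
}
```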

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Closing All Currently Open Pages, using BrowserConn");


 // This is currently unused.  I used to filter for only the opened Wiki-Pages, but now this
 // method simply closes every open page.  No sense in deleting this line, though

 final Predicate<Target.TargetInfo> isSamAltman = (Target.TargetInfo t) ->
         t.type.equals("page")
     &&  (t.url != null)
     &&  (t.url.startsWith(samAltmanURL));

 System.out.println
     ('\n' + BCYAN + "Example01.java: " + RESET + "Getting all tabs...");
    
 final Target.TargetInfo[] allTabs = Target
     .getTargets(null /* FilterEntry[] */)
     .exec(bws)
     .await();

 System.out.println
     ('\n' + BCYAN + "Example01.java: " + RESET + "Found " + allTabs.length + " tabs.");

 if (allTabs.length > 0)

     for (int i = 0; i < allTabs.length; i++)
     {
         final String tid = allTabs[i].targetId;
         System.out.println(BRED + "Closing Tab: " + RESET + tid);
         Target.closeTarget(tid).exec(bws).await();
     }

STEP_03_openSamAltmanPage


protected static java.lang.String STEP_03_openSamAltmanPage
            (WebSocketSender bws)
        throws java.lang.Exception

This step creates a new browser tab (a new "target") by invoking Target.createTarget with a specific URL — in this case, the Sam Altman Wikipedia page. This uses the WebSocketSender connection previously established to send a CDP request and parse the result.

The return value is the new tab's identifier (a Target.TargetID, represented as a java.lang.String), which is used in the next step to get the tab's associated WebSocket.

This step illustrates how CDP allows opening URLs without user interaction — one of the key features that powers headless automation.
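
As a rough illustration of the exchange, the sketch below builds a Target.createTarget request and reads the targetId out of a reply. `CreateTargetSketch` is a hypothetical class, the reply shown is a hand-written sample with an invented ID, and the regular expression is a stand-in for proper JSON parsing.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the request / response pair behind Target.createTarget
final class CreateTargetSketch
{
    private static final Pattern TARGET_ID =
        Pattern.compile("\"targetId\"\\s*:\\s*\"([^\"]+)\"");

    // Assumes the URL contains no quotes or backslashes, so no JSON escaping is needed
    static String createTargetRequest(int id, String url)
    {
        return "{\"id\":" + id + ",\"method\":\"Target.createTarget\"," +
               "\"params\":{\"url\":\"" + url + "\"}}";
    }

    // Reads the new tab's identifier from the reply, if one is present
    static Optional<String> targetId(String response)
    {
        final Matcher m = TARGET_ID.matcher(response);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args)
    {
        System.out.println(createTargetRequest(1, "https://en.wikipedia.org/wiki/Sam_Altman"));

        // Hand-written sample reply; "AB12" is a made-up target ID
        System.out.println(targetId("{\"id\":1,\"result\":{\"targetId\":\"AB12\"}}").orElse("?"));
    }
}
```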

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Opening a Sam Altman Wikipedia Page, using BrowserConn.");

 final String targetID = Target
     .createTarget()
     .accept("url", samAltmanURL)
     .build()
     .exec(bws)
     .await();

 final Target.TargetInfo targetInfo = Target
     .getTargetInfo(targetID)
     .exec(bws)
     .await();

 System.out.println(
     '\n' + BCYAN + "Example01.java: " + RESET +
     BRED + "Created New Tab:\n" + RESET + targetInfo.toString()
 );


 // I leave these one second delays here.  AGAIN - Chat-GPT suggested them to me once.
 // Chat-GPT, in every sense of the word, knows more about my code than I do!  (The CDP 
 // Protocol is a very well understood protocol - just not in Java so much)

 Thread.sleep(1000);

 return targetID;

STEP_04_getPageWebSocket


protected static WebSocketSender STEP_04_getPageWebSocket
            (java.lang.String targetID)
        throws java.lang.Exception

Once a tab is opened with a known targetId, this step retrieves the specific WebSocket endpoint associated with that tab. CDP uses one WebSocket per tab, and this is necessary for interacting with page-level domains such as Page, Runtime, or DOM.

The method uses the /json/list HTTP-Endpoint to get metadata for all tabs and filters by targetId to find the matching webSocketDebuggerUrl. Then, it opens a new WebSocketSender for that tab.

From this point forward, CDP messages targeting the loaded page must use this tab-specific WebSocket.
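
The filter-by-ID lookup can be sketched without the library as follows. `PageSocketLookupSketch` and its sample data are hypothetical. The sketch assumes each /json/list entry is a flat JSON object whose string values contain no braces, which a real JSON parser would not need to assume.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: picks one tab's WebSocket URL out of a /json/list style response
final class PageSocketLookupSketch
{
    // Hand-written sample; "AAA" and "BBB" are invented target IDs
    static final String SAMPLE_LIST =
        "[ {\"id\": \"AAA\", \"type\": \"page\", " +
            "\"webSocketDebuggerUrl\": \"ws://localhost:9222/devtools/page/AAA\"}, " +
          "{\"id\": \"BBB\", \"type\": \"page\", " +
            "\"webSocketDebuggerUrl\": \"ws://localhost:9222/devtools/page/BBB\"} ]";

    // One flat JSON object (assumes no nested objects and no braces inside strings)
    private static final Pattern FLAT_OBJECT = Pattern.compile("\\{[^{}]*\\}");

    // Reads a string-valued field from a flat JSON object
    static Optional<String> field(String object, String name)
    {
        final Matcher m = Pattern
            .compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"")
            .matcher(object);

        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Finds the entry whose "id" equals targetId and returns its webSocketDebuggerUrl
    static Optional<String> pageSocketUrl(String jsonList, String targetId)
    {
        final Matcher m = FLAT_OBJECT.matcher(jsonList);

        while (m.find())
        {
            final String entry = m.group();
            if (field(entry, "id").filter(targetId::equals).isPresent())
                return field(entry, "webSocketDebuggerUrl");
        }

        return Optional.empty();
    }

    public static void main(String[] args)
    { System.out.println(pageSocketUrl(SAMPLE_LIST, "BBB").orElse("(not found)")); }
}
```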

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Create PageConn Web-Socket Connection to Altman's Wiki");

 // Attach to that Sam Altman Page (switch to tab-level WebSocket)
 final PageConn pageConn = PageConn
     .getAllPageConn(9222, false)
     .filter((PageConn pc) -> pc.id.equals(targetID))
     .findFirst()
     .orElseThrow(() -> new RuntimeException("The Page-Connection was Not found !!!"));

 System.out.println(
     '\n' + BCYAN + "Example01.java: " + RESET +
     BRED + "Found Page Connection to Sam Altman Wiki:\n" + RESET + pageConn.toString()
 );

 final WebSocketSender pws = pageConn.createSender(Example01.connRec);


 // I think this is the last one...  Wait 1 second, it might make a difference while the 
 // page actually loads, and the Web-Socket connects... I have no idea!  It's just 1 second!

 Thread.sleep(1000);

 return pws;

STEP_05_runJavaScript


protected static java.lang.String STEP_05_runJavaScript
            (WebSocketSender pws)
        throws java.lang.Exception

Before sending any JavaScript commands to the browser tab, certain CDP domains must be enabled. This method sends Page.enable(), DOM.enable() and RunTime.enable() commands to inform Chrome that you intend to receive events and execute script. It then evaluates document.documentElement.outerHTML with RunTime.evaluate() and returns the page's HTML as a String.

Without this step, attempts to run JavaScript via RunTime.evaluate() may fail or be ignored. Enabling the domains registers your WebSocket session as a subscriber for those event types.

Think of this as turning on the light switches — telling Chrome what features you intend to use during the session.
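
For orientation, the sketch below shows roughly what the enable and evaluate messages look like as raw CDP JSON. `EvaluateRequestSketch` is a hypothetical class that builds the strings by hand. Note that the protocol itself spells the domain "Runtime", while this library's Java class is named RunTime.

```java
// Hypothetical sketch: raw JSON for the enable / evaluate commands sent to a page socket
final class EvaluateRequestSketch
{
    // Minimal JSON string escaping for quotes, backslashes and control characters
    static String jsonEscape(String s)
    {
        final StringBuilder sb = new StringBuilder();

        for (char c : s.toCharArray())
        {
            switch (c)
            {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else          sb.append(c);
            }
        }

        return sb.toString();
    }

    // e.g. domain "Page" produces a "Page.enable" request
    static String enableRequest(int id, String domain)
    { return "{\"id\":" + id + ",\"method\":\"" + domain + ".enable\"}"; }

    // Runtime.evaluate with returnByValue, so the result comes back as a plain JSON value
    static String evaluateRequest(int id, String expression)
    {
        return "{\"id\":" + id + ",\"method\":\"Runtime.evaluate\",\"params\":{" +
               "\"expression\":\"" + jsonEscape(expression) + "\",\"returnByValue\":true}}";
    }

    public static void main(String[] args)
    {
        System.out.println(enableRequest(1, "Page"));
        System.out.println(enableRequest(2, "DOM"));
        System.out.println(enableRequest(3, "Runtime"));
        System.out.println(evaluateRequest(4, "document.documentElement.outerHTML"));
    }
}
```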

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Execute the needed Java Script, so the Scraper can Run");

 // Enable the Page domain
 System.out.println('\n' + BCYAN + "Example01.java: " + RESET + "Page.enable()");
 Page.enable(null /* Boolean */).exec(pws).await();

 // Enable the DOM domain
 System.out.println('\n' + BCYAN + "Example01.java: " + RESET + "DOM.enable()");
 DOM.enable(null /* String */).exec(pws).await();

 // Enable the Runtime domain
 System.out.println('\n' + BCYAN + "Example01.java: " + RESET + "RunTime.enable()");
 RunTime.enable().exec(pws).await();

 // This is the actual last one.  Make sure that the DOM & RunTime modules are running!
 Thread.sleep(1000);

 // 5. Evaluate the HTML via JavaScript
 System.out.println('\n' + BCYAN + "Example01.java: " + RESET + "RunTime.evaluate()");

 final RunTime.evaluate$$RET r = RunTime
     .evaluate()
     .accept("expression", "document.documentElement.outerHTML")
     .accept("returnByValue", true)
     .build()
     .exec(pws)
     .await();

 System.out.println(
     '\n' + BCYAN + "Example01.java: " + RESET + "Response RemoteObject:" + '\n' +
     r.result.toString()
 );

 final String html = ((JsonString) r.result.value).getString();

 return html;

STEP_06_extractImageURLs


protected static java.lang.String[] STEP_06_extractImageURLs
            (java.lang.String html)
        throws java.lang.Exception

This method parses the HTML returned by the previous step and collects the src attribute of every <img> element. No CDP call is made here: the page is tokenized with HTMLPage.getPageTokens, the opening <img> tags are located with TagNodeFind.all, and their src values are read with Attributes.retrieve.

The resulting String[] of image URLs is printed to the terminal and returned to the caller.

Together with the previous step, this completes the cross-boundary data flow: CDP pulls the rendered HTML out of Chrome, and ordinary Java code extracts the data of interest from it.
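
The method body below collects the src attribute of every <img> tag using this library's HTML parser. For readers without the library at hand, here is a rough JDK-only sketch of the same idea. `ImgSrcSketch` is a hypothetical class, and a regular expression is a much cruder tool than a real HTML tokenizer, so treat it as an illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: regex-based stand-in for the library's tag / attribute search
final class ImgSrcSketch
{
    // An <img ...> tag with a quoted src attribute; the whitespace before "src" keeps
    // attributes such as data-src from matching, and "\s*=" keeps srcset from matching
    private static final Pattern IMG_SRC = Pattern.compile(
        "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE
    );

    // Returns the src value of each <img> tag, in document order
    static List<String> imageSources(String html)
    {
        final List<String> urls = new ArrayList<>();
        final Matcher m = IMG_SRC.matcher(html);
        while (m.find()) urls.add(m.group(2));
        return urls;
    }

    public static void main(String[] args)
    {
        // Made-up HTML fragment
        final String html =
            "<p><img alt=\"x\" src=\"//upload.wikimedia.org/a.jpg\">" +
            "<IMG SRC='b.png' srcset=\"c.png 2x\"></p>";

        for (String url : imageSources(html)) System.out.println(url);
    }
}
```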

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Parsing HTML for Images Printing the URL's");

 final Vector<HTMLNode>      altPage = HTMLPage.getPageTokens(html, false);
 final int[]                 images  = TagNodeFind.all(altPage, TC.OpeningTags, "img");
 final String[]              imgURLs = Attributes.retrieve(altPage, images, "src");
 final int                   numImg  = imgURLs.length;

 System.out.println
     ('\n' + BCYAN + "Example01.java: " + RESET + "Number of Images Found: " + numImg);

 for (int i = 0; i < numImg; i++) System.out.println("    " + imgURLs[i]);

 return imgURLs;

STEP_07_downloadImages


protected static void STEP_07_downloadImages(java.lang.String[] imageURLs)
                                      throws java.lang.Exception

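The final step downloads each image URL retrieved in the previous step and saves the results to disk. Only protocol-relative URLs (those beginning with //) are kept, and each is prefixed with https: before downloading. Files are written to the directory named by req.targetDirectory, and because useDefaultCounterForImageFileNames is set, file names appear to come from the scraper's counter rather than from the URL path.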

This method doesn't involve CDP. It is traditional HTTP file downloading using ImageScraper.download(), but it completes the use case: open a tab, pull the page's HTML out of the browser, scrape it, and persist the result.

This step closes the automation loop: going from page navigation to content extraction and finally saving that content offline.

Make sure that a directory named image-downloads/ exists as a sub-directory of the directory from which this method is invoked.
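
The preparation work around the download can be sketched with the JDK alone. `DownloadPrepSketch` is a hypothetical class: `toHttps` mirrors the `startsWith("//")` filter in the method body below, and `ensureDirectory` is one possible way to create image-downloads/ up front instead of creating it by hand.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the steps that surround the actual download call
final class DownloadPrepSketch
{
    // Keeps only protocol-relative URLs ("//host/path") and gives them an https: scheme
    static List<String> toHttps(String[] imageURLs)
    {
        final List<String> out = new ArrayList<>();

        for (String url : imageURLs)
            if (url.startsWith("//")) out.add("https:" + url);

        return out;
    }

    // Creates the directory (and any missing parents); returns false if that fails
    static boolean ensureDirectory(Path dir)
    {
        try
        {
            Files.createDirectories(dir);
            return true;
        }
        catch (IOException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        // Made-up input: only the first entry is protocol-relative
        final String[] urls = {
            "//upload.wikimedia.org/x/Sam.jpg",
            "/static/images/icon.png",
            "https://example.org/a.png"
        };

        System.out.println(toHttps(urls));
        System.out.println("Directory ready: " + ensureDirectory(Paths.get("image-downloads")));
    }
}
```

Relative and already-absolute URLs are dropped by this filter, just as they are in the method body.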

Throws:

java.lang.Exception

Code:

Exact Method Body:

 Printing.notice("Download the Image's into a folder");

 final Stream.Builder<String> builder = Stream.builder();

 for (int i = 0; i < imageURLs.length; i++)
     if (imageURLs[i].startsWith("//"))
         builder.accept("https:" + imageURLs[i]);

 // Build a Request-Object
 final List<String>  imgURLsList = builder.build().collect(Collectors.toList());
 final Request       req         = Request.buildFromStrIter(imgURLsList);

 // Add a few more Scraper-Configurations to the Request Object
 req.targetDirectory                     = "image-downloads/";
 req.useDefaultCounterForImageFileNames  = true;
 req.skipOnDownloadException             = true;
 req.verbosity                           = Verbosity.Normal;

 try 
     // Run the scraper, Send all Text-Output to 'System.out' (Ignore / Discard Results)
     { final Results results = ImageScraper.download(req, System.out); }

 catch (Exception e)
     { System.out.println(EXCC.toString(e)); }

 finally 
     // This needs to happen, or this entire program will hang / lock up the terminal
     { ImageScraper.shutdownTOThreads(); }
