Package Torello.Java

Class LFEC.GCSSB

  • Enclosing class:
    LFEC

    public static class LFEC.GCSSB
    extends java.lang.Object
    The Google Cloud Server Storage Bucket extension of "Load File Exception Catch" does the work that LFEC does, but for GCS Storage Buckets, rather than operating system files.


    The following public static (inner) class provides the same set of features that class LFEC provides - namely data-file loading with failure print-outs and system halts - but for files that are stored on the Google Cloud Server Storage Bucket infrastructure.

    The classes and methods that are 'exported' by the HTML search and scrape package, and by the other packages in this Torello Jar File, neither depend upon nor require the use of Google Cloud Services, the 'Cloud Shell' development-environment that Google provides (for free), or any of the Storage-Buckets or storage-systems that Google operates. Literally all of the development work for package HTML was done on Google's Cloud Shell, but hopefully it will all run - in the spirit of Java's write-once, run-anywhere platform proclamations - anywhere that the JAR files are loaded. If you are planning to utilize Google's Cloud Server to either develop your code, or to host your web-site, the methods & classes in this class-infrastructure that link to its servers might seem invaluable. The "Storage Buckets" system that they offer, for instance, is pretty cheap - I have about 20 to 30 Gigabytes there right now, and have paid probably $1.50 per month to them...

    USING GCS: In order to use the Google Cloud Services platform, you will have to use one of the following two means of communicating commands and instructions to their servers:

    1. Make json calls to the appropriate server-names using whatever "usual http-connect methods" that you employ, and parse the json response object. Parsing json-response objects can be really easy if you know how to use regular-expressions, but the "Jackson" library also does this for free. Download a copy of the "Jackson" java jar-file, and read the "Jackson JSON Interpreter JavaDoc's" on the internet. (Type: Jackson Java JSON parse at a "google search prompt").
    2. ... or ... spend however long it takes to read, figure-out, understand (or interpret) Google's GCS Java-Library and write java-code using their JSON-free java-library making direct calls to Google's GCS servers.
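    As a rough illustration of option 1, the sketch below builds (but does not send) an HTTP request for the contents of one object. It assumes the Cloud Storage JSON API download form "https://storage.googleapis.com/storage/v1/b/BUCKET/o/OBJECT?alt=media", Java 11's java.net.http package, and a placeholder access token. The class name, method names and token are hypothetical and are not part of LFEC.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class GcsJsonRequestSketch
{
    // Builds the download URI for one object.  The object name is percent-encoded as a
    // single path segment, so a '/' inside the name becomes "%2F".
    static URI objectMediaUri(String bucket, String objectName)
    {
        // URLEncoder produces form-encoding, which writes a space as '+'.
        // A URL path needs "%20" instead.
        String encoded =
            URLEncoder.encode(objectName, StandardCharsets.UTF_8).replace("+", "%20");

        return URI.create(
            "https://storage.googleapis.com/storage/v1/b/" + bucket + "/o/" + encoded +
            "?alt=media"
        );
    }

    // Builds a GET request carrying an access token.  Sending it is left to the caller.
    static HttpRequest downloadRequest(String bucket, String objectName, String accessToken)
    {
        return HttpRequest.newBuilder(objectMediaUri(bucket, objectName))
            .header("Authorization", "Bearer " + accessToken)
            .GET()
            .build();
    }

    public static void main(String[] args)
    {
        // "HYPOTHETICAL_TOKEN" is a placeholder, not a working credential
        HttpRequest req = downloadRequest("mybucket", "dir/my file.txt", "HYPOTHETICAL_TOKEN");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

    With "alt=media" the response body is the raw file content. Without that query parameter the same address is expected to return the object's metadata as JSON, which is where a JSON parser would come in.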


    HERE, JAVA: The methods in this class do not use the "JSON" version of communicating with Google-Servers; instead, they use the java class-libraries that Google exports. I do not know which jar-file is the "official version", but you may download an "unofficial-google version" from my web-site, which does work. I have not tampered with it, or altered the classes in any way, other than to remove some of the complicated "extras" that were added.

    DOWNLOAD HERE: This is a download of file "GCS.jar" which has the classes and methods needed to access the translate API, the Storage Bucket API, and the Authorization classes "o-auth 2.0". If you make any calls to the methods here that facilitate access to the Storage Buckets, you will need to include some version of Google's GCS jar file. This is the jar-file I created, if you don't trust Torello.Directory, please don't use it:


    I am not really able to make any proclamations for or against the contents of this particular jar-file. I am, personally, not a fan of the Java-enabled versions of the services that Google provides, primarily because I have almost no idea how to use them (Google-Java is very poorly documented, and both the classes and the methods they offer are not well thought-out as API's - rather, they just expose their own internal stuff). One may usually figure out GCS Services using JSON, and just go with making JSON calls to-and-from GCS. Anyway, the above java-jar file contains many (but not all) of the classes necessary to avoid falling back on JSON calls to GCS. Using Google's jar-files, it is possible to make "plain old java method calls" instead of using JSON. Most Importantly: You do not need this jar file included in your classpath for any of the methods or classes in this scrape package at all - unless - you have a GCS account, and want to use their Storage Buckets. If so, the two methods in class 'Programming' that use the term 'GCSSB' (an acronym that means "Google Cloud Server Storage Buckets") are the ones that need it.

    FINALLY: The following code snippet should explain how to supply an "o-auth-2.0" key to obtain an instance of the class Storage needed to begin communicating with Google Cloud Server Storage Buckets:
    import com.google.cloud.storage.*;
    import com.google.auth.oauth2.*;
    import java.io.FileInputStream;
    ...
    
    String PROJECT_ID        = "The name you gave your project in which your storage buckets reside";
    String PATH_TO_JSON_KEY  = "The path to the JSON key file generated when you create a 'Service Account'.";
    
    Storage storage = StorageOptions
        .newBuilder()
        .setProjectId(PROJECT_ID)
        .setCredentials(GoogleCredentials.fromStream(new FileInputStream(PATH_TO_JSON_KEY)))
        .build()
        .getService();
    


    SPECIAL NOTE: I have never needed my GoogleCredentials "o auth 2.0" key until I started using the google-java jar-libraries for connecting with Storage Buckets directly.

    • When writing to my web-domain, using the command line program GSUTIL was usually much easier and eliminated the strict dependency on Google's web-hosting platform that would be mandated if using their Java-Libraries.
    • When translating Chinese Government web-publications using the Translate API, a Google "API Key" usually sufficed. An "API Key" is a string of 30 to 40 characters that identifies the billable account to the server. Just don't share it on the web, and it will work fine.
    • Even the Vision / OCR (Optical Character Recognition) API's seemed to work without oauth2.0. I did have to make HTTP connections and interpret JSON, rather than using bona-fide java method invocations, but it was all fine. If you are going to use java-methods with the storage-buckets, get an "o-auth-2.0" key, and save the JSON file to your file-system. Keep that key private, and the previous code is how to "login to google" from java and start storing files to the cloud using their java jar-files.


    It is interesting to note that the following invocation often works - and I cannot fully explain it. It leaves the authentication portion out of the Storage object, and sometimes still allows access to the files in your storage-buckets. (A likely explanation: the default options look for "Application Default Credentials" in the environment, for example the GOOGLE_APPLICATION_CREDENTIALS variable or the credentials that Cloud Shell already holds.) You may play around with it:
    import com.google.cloud.storage.*;
    ...
    
    private static final Storage storage = StorageOptions.getDefaultInstance().getService();
    


    NOTE: The easiest way to use this class is to include the following line in the import section of your class-file. When the complete path-name of a static inner class appears in the import section of a '.java' file, then throughout the remainder of your code you may make calls to this class directly, without having to type the entire class-name on each and every line.
    // When this line is included, this class' static methods are called as follows:
    // GCSSB.loadFileToString(storage, "mybucket", "myFile.txt");
    // If this import-statement is not used, then (as for all static/internal classes), you must use
    // the entire class-name:
    // LFEC.GCSSB.loadFileToString( ... );
    
    import Torello.Java.LFEC.GCSSB;
    



    Stateless Class:
    This class neither contains any program-state, nor can it be instantiated. The @StaticFunctional Annotation may also be called 'The Spaghetti Report'. Static-Functional classes are, essentially, C-Styled Files, without any public constructors or non-static member fields. It is a concept very similar to the Enterprise Java Beans @Stateless Annotation.
    • 1 Constructor(s), 1 declared private, zero-argument constructor
    • 5 Method(s), 5 declared static
    • 0 Field(s)


    • Method Summary

       
      Read Files from Google Cloud Platform Storage Buckets
      Modifier and Type Method
      static String loadFileToString​(com.google.cloud.storage.Storage storage, String bucket, String completeFileName)
      static Vector<String> loadFileToVector​(com.google.cloud.storage.Storage storage, String bucket, String completeFileName, boolean includeNewLine)
      static <T> T readObjectFromFile​(com.google.cloud.storage.Storage storage, String bucket, String completeFileName, boolean zip, Class<T> returnClass)
       
      Write Files to Google Cloud Platform Storage Buckets
      Modifier and Type Method
      static void writeFile​(CharSequence fileAsStr, com.google.cloud.storage.Storage storage, String bucket, String completeFileName, boolean ASCIIorUTF8)
      static void writeObjectToFile​(Object o, com.google.cloud.storage.Storage storage, String bucket, String completeFileName, boolean zip)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • readObjectFromFile

        public static <T> T readObjectFromFile​
                    (com.google.cloud.storage.Storage storage,
                     java.lang.String bucket,
                     java.lang.String completeFileName,
                     boolean zip,
                     java.lang.Class<T> returnClass)
        
        This will read a Java Serialized java.lang.Object from a location in a Google Cloud Server Storage Bucket.

        NOTE: This method uses Java's generic type-parameter syntax. To provide a value for parameter 'returnClass', type the name of the Java class needed as the return-type, followed by '.class'.

        For example, to read a java.util.Vector from a data-file, pass the value 'Vector.class' to this parameter. It is a 'little-known' fact that every Java class has a corresponding java.lang.Class object, which both the class-literal syntax ('Vector.class') and the method Object.getClass() return. (Now, boys and girls, don't start jumping up and down just yet!)
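        A minimal sketch of how such a class parameter can drive a type-safe cast is shown here. The class name ClassTokenSketch and the helper castTo are hypothetical. They only illustrate the Class.isInstance / Class.cast pattern and throw an exception where the real method would halt the program.

```java
import java.util.Vector;

final class ClassTokenSketch
{
    // Returns 'o' typed as T, or throws if 'o' is not an instance of the requested class
    static <T> T castTo(Object o, Class<T> returnClass)
    {
        if (! returnClass.isInstance(o)) throw new ClassCastException(
            "Expected " + returnClass.getName() + " but found " +
            ((o == null) ? "null" : o.getClass().getName())
        );

        return returnClass.cast(o);
    }

    public static void main(String[] args)
    {
        Object o = new Vector<String>();

        // 'Vector.class' is the class literal for java.util.Vector
        Vector<?> v = castTo(o, Vector.class);

        System.out.println(v.getClass() == Vector.class);
    }
}
```

        The caller receives a value of the requested type without writing a cast or suppressing an unchecked-warning at the call site.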
        Type Parameters:
        T - This type parameter is simply provided for convenience, to allow the user to specify the return class, without having to cast the object and suppress warnings, or catch exceptions.
        Parameters:
        storage - This must be an instance of Google Cloud Server's class Storage. The description at the top of this class should elucidate how to obtain such an Object-instance. If the explanation is not very clear, or if it is not working (any more), please just go to a google-search-bar and look for information about the Google Cloud Server Storage Buckets Java API, class 'Storage'. API's have been known to change, once in a while.
        bucket - The bucket name of the bucket from a Google Cloud Server account.
        completeFileName - This String-parameter needs to be the complete String representation of the directory-name plus the file-name of the location where the java serialized Object file was saved, or will be saved.

        GCS Storage-Buckets: The bucket-name as-a-string should **not** be included as a part of this String-parameter. However, both the file-name, and the directory-name where this file is residing must be present.
        zip - When this parameter is TRUE the serialized object will be run through Java's java.util.zip.GZIPInputStream and GZIPOutputStream when reading/writing the serialized-Object. If this parameter is FALSE, GZIP Compression will not be used when serializing or de-serializing the Object.
        returnClass - This is the type expected to be found by Java in the Serialized Object Data-File. If an Object is read from this location, but it does not have the type indicated by this parameter, the program will halt, and an explanatory exception message will be printed to the console/terminal.
        Returns:
        A de-serialized java.lang.Object that has been read from a GCS Storage Bucket, and cast to the type denoted by parameter 'returnClass'.
        See Also:
        FileRW.readObjectFromFile(String, boolean), FileRW.readObjectFromFileNOCNFE(String, boolean), LFEC.readObjectFromFile(String, boolean, Class), LFEC.ERROR_EXIT(String)
        Code:
        Exact Method Body:
         try
         {
             // Read Storage Bucket Data into a byte[] array
             byte[] bArr = storage.get(bucket, completeFileName).getContent();
        
             // Build an Input Stream, using that byte[] array as input
             ByteArrayInputStream bis = new ByteArrayInputStream(bArr);
        
             // Build an Object Input Stream, using the byte-array input-stream as input
             ObjectInputStream ois = zip
                 ? new ObjectInputStream(new GZIPInputStream(bis))
                 : new ObjectInputStream(bis);
        
             // Use Java's Object Serialization method to read the Object
             Object ret = ois.readObject();
        
             if (! returnClass.isInstance(ret)) ERROR_EXIT(
                 "Serialized Object read from GCS Storage Bucket: " + bucket + "\n" +
                 "And file-name: " + completeFileName + "\n" +
                 "Using expected (" + (zip ? "zip-compression" : "no-compression") + ")\n" +
                 "Didn't have an object with class-name: " + returnClass + "\n" +
                 "But rather with className: " + ret.getClass().getName()
             );
        
             ois.close(); bis.close();
        
             return returnClass.cast(ret);
         }
         catch (Throwable t)
         {
             ERROR_EXIT(
                 t,
                 "Serialized Object read from GCS Storage Bucket: " + bucket + "\n" +
                 "And file-name: " + completeFileName + "\n" +
                 "Using expected (" + (zip ? "zip-compression" : "no-compression") + ")\n" +
                 "And Expected class-name: " + returnClass + "\n"
             );
        
             throw new UnreachableError(); // Cannot reach this statement
         }
        
      • loadFileToString

        public static java.lang.String loadFileToString​
                    (com.google.cloud.storage.Storage storage,
                     java.lang.String bucket,
                     java.lang.String completeFileName)
        
        This merely loads a text-file from Google's Storage Bucket infrastructure into a String. Make sure to check that the file you are loading does indeed have text-content.
        Parameters:
        storage - This must be an instance of Google Cloud Server's class Storage. The description at the top of this class should elucidate how to obtain such an Object-instance. If the explanation is not very clear, or if it is not working (any more), please just go to a google-search-bar and look for information about the Google Cloud Server Storage Buckets Java API, class 'Storage'. API's have been known to change, once in a while.
        bucket - The bucket name of the bucket from a Google Cloud Server account.
        completeFileName - This String-parameter needs to be the complete String representation of the directory-name plus the file-name of the location where the text-file is stored.

        GCS Storage-Buckets: The bucket-name as-a-string should **not** be included as a part of this String-parameter. However, both the file-name, and the directory-name where this file is residing must be present.
        Returns:
        The text file on Google Cloud Server's Storage Bucket file/directory returned as a java.lang.String
        Code:
        Exact Method Body:
         return new String(storage.get(bucket, completeFileName).getContent());
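        The one-argument String constructor used above decodes with the JVM's default charset. The sketch below shows the alternative of naming the charset explicitly, on the assumption that the stored file is UTF-8. DecodeSketch and decodeUtf8 are hypothetical names, and the byte array stands in for the content downloaded from the bucket.

```java
import java.nio.charset.StandardCharsets;

final class DecodeSketch
{
    // Decodes downloaded bytes as UTF-8, regardless of the JVM's default charset
    static String decodeUtf8(byte[] content)
    { return new String(content, StandardCharsets.UTF_8); }

    public static void main(String[] args)
    {
        // These three bytes are the UTF-8 encoding of the single character U+4E2D
        byte[] content = { (byte) 0xE4, (byte) 0xB8, (byte) 0xAD };

        System.out.println(decodeUtf8(content).length());
    }
}
```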
        
      • loadFileToVector

        public static java.util.Vector<java.lang.String> loadFileToVector​
                    (com.google.cloud.storage.Storage storage,
                     java.lang.String bucket,
                     java.lang.String completeFileName,
                     boolean includeNewLine)
        
        This loads a text-file from Google's Storage Bucket infrastructure into a Vector of String's, one line of text per element. Make sure to check that the file you are loading does indeed have text-content.
        Parameters:
        storage - This must be an instance of Google Cloud Server's class Storage. The description at the top of this class should elucidate how to obtain such an Object-instance. If the explanation is not very clear, or if it is not working (any more), please just go to a google-search-bar and look for information about the Google Cloud Server Storage Buckets Java API, class 'Storage'. API's have been known to change, once in a while.
        bucket - The bucket name of the bucket from a Google Cloud Server account.
        completeFileName - This String-parameter needs to be the complete String representation of the directory-name plus the file-name of the location where the text-file is stored.

        GCS Storage-Buckets: The bucket-name as-a-string should **not** be included as a part of this String-parameter. However, both the file-name, and the directory-name where this file is residing must be present.
        includeNewLine - This tells the method to include, or not-include, the '\n' (newline) character at the end of each String.
        Returns:
        The text file from the Google Cloud Server Storage Bucket, as a Vector of String's.
        See Also:
        loadFileToString(Storage, String, String)
        Code:
        Exact Method Body:
         String          s   = loadFileToString(storage, bucket, completeFileName);
         Vector<String>  ret = new Vector<>();
        
         int pos     = 0;
         int delta   = includeNewLine ? 1 : 0;
         int lastPos = 0;
        
         while ((pos = s.indexOf('\n')) != -1)
         {
             ret.add(s.substring(lastPos, pos + delta));
             lastPos = pos + 1;
         }
        
         if (lastPos < s.length()) ret.add(s.substring(lastPos));
        
         return ret;
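        As printed above, the loop calls s.indexOf('\n') with no start index, so each pass would appear to find the same first newline again. The sketch below is one way to write the same splitting logic with the search resuming at 'lastPos'. It is a stand-alone illustration working on a String already in memory, not the library's code, and the names LineSplitSketch and splitLines are hypothetical.

```java
import java.util.Vector;

final class LineSplitSketch
{
    // Splits 's' at each '\n', optionally keeping the newline at the end of each element
    static Vector<String> splitLines(String s, boolean includeNewLine)
    {
        Vector<String> ret = new Vector<>();

        int delta   = includeNewLine ? 1 : 0;
        int lastPos = 0;
        int pos;

        // Passing 'lastPos' as the start index makes each search resume after the
        // previous match
        while ((pos = s.indexOf('\n', lastPos)) != -1)
        {
            ret.add(s.substring(lastPos, pos + delta));
            lastPos = pos + 1;
        }

        // Keep a final line that has no terminating newline
        if (lastPos < s.length()) ret.add(s.substring(lastPos));

        return ret;
    }

    public static void main(String[] args)
    {
        System.out.println(splitLines("a\nb\nc", false));
    }
}
```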
        
      • writeFile

        public static void writeFile​(java.lang.CharSequence fileAsStr,
                                     com.google.cloud.storage.Storage storage,
                                     java.lang.String bucket,
                                     java.lang.String completeFileName,
                                     boolean ASCIIorUTF8)
        This will write the contents of a java 'CharSequence' (which includes String, StringBuffer & StringBuilder) to a file on Google Cloud Server's storage bucket system.
        Parameters:
        storage - This must be an instance of Google Cloud Server's class Storage. The description at the top of this class should elucidate how to obtain such an Object-instance. If the explanation is not very clear, or if it is not working (any more), please just go to a google-search-bar and look for information about the Google Cloud Server Storage Buckets Java API, class 'Storage'. API's have been known to change, once in a while.
        bucket - The bucket name of the bucket from a Google Cloud Server account.
        completeFileName - This String-parameter needs to be the complete String representation of the directory-name plus the file-name of the location where the text-file will be saved.

        GCS Storage-Buckets: The bucket-name as-a-string should **not** be included as a part of this String-parameter. However, both the file-name, and the directory-name where this file is residing must be present.
        ASCIIorUTF8 - When writing java String's to the file-system, it is generally not too important to worry about whether java has stored an 'ASCII' encoded String, or a String encoded using 'UTF-8'. Most foreign-language news-sites require the latter ('UTF-8'), but any site that is strictly English can get by with plain old ASCII.

        IMPORTANT: When this boolean is TRUE, this method presumes the character-sequence you have passed is ASCII and, as the method body below shows, calls getBytes() with no charset argument, which encodes using the JVM's default charset. When this boolean is set to FALSE, this method writes the String's bytes using the 'UTF-8' character-set.

        ALSO: I have not made any allowance for Unicode or Unicode little endian, because I have never used them with either the Chinese or Spanish sites I scrape. UTF-8 has been the only other character set I have encountered.
        Code:
        Exact Method Body:
         BlobInfo blobInfo = BlobInfo.newBuilder
             (BlobId.of(bucket, completeFileName)).setContentType("text/plain").build();
        
         byte[] file = ASCIIorUTF8
             ? fileAsStr.toString().getBytes()
             : fileAsStr.toString().getBytes(java.nio.charset.Charset.forName("UTF-8"));
        
         Blob blob = storage.create(blobInfo, file);
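        To make the effect of the 'ASCIIorUTF8' choice concrete, here is a small sketch that encodes the same text both ways. It deliberately names StandardCharsets.US_ASCII for the ASCII branch, which is an assumption that differs from the getBytes() call in the body above. EncodeSketch and encode are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class EncodeSketch
{
    // Encodes 'text' as strict ASCII when 'ascii' is true, otherwise as UTF-8
    static byte[] encode(CharSequence text, boolean ascii)
    {
        return text.toString().getBytes(
            ascii ? StandardCharsets.US_ASCII : StandardCharsets.UTF_8
        );
    }

    public static void main(String[] args)
    {
        // "caf" followed by U+00E9 (e with acute accent), written as an escape so that
        // the encoding of this source file does not matter
        String s = "caf\u00e9";

        // ASCII cannot represent U+00E9, so it is replaced by a single '?' byte
        System.out.println(encode(s, true).length);

        // UTF-8 represents U+00E9 with two bytes
        System.out.println(encode(s, false).length);
    }
}
```

        Text that contains only ASCII characters produces identical bytes under both settings, which is why strictly-English content is unaffected by the flag.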
        
      • writeObjectToFile

        public static void writeObjectToFile​
                    (java.lang.Object o,
                     com.google.cloud.storage.Storage storage,
                     java.lang.String bucket,
                     java.lang.String completeFileName,
                     boolean zip)
                throws java.io.IOException
        
        This will write a Java Serializable Object to a location in a Google Cloud Server Storage Bucket.
        Parameters:
        storage - This must be an instance of Google Cloud Server's class Storage. The description at the top of this class should elucidate how to obtain such an Object-instance. If the explanation is not very clear, or if it is not working (any more), please just go to a google-search-bar and look for information about the Google Cloud Server Storage Buckets Java API, class 'Storage'. API's have been known to change, once in a while.
        o - This may be any Serializable Java Object. Serializable Java Objects are ones which implement the interface java.io.Serializable.
        bucket - The bucket name of the bucket from a Google Cloud Server account.
        completeFileName - This String-parameter needs to be the complete String representation of the directory-name plus the file-name of the location where the java serialized Object file was saved, or will be saved.

        GCS Storage-Buckets: The bucket-name as-a-string should **not** be included as a part of this String-parameter. However, both the file-name, and the directory-name where this file is residing must be present.
        zip - When this parameter is TRUE the serialized object will be run through Java's java.util.zip.GZIPInputStream and GZIPOutputStream when reading/writing the serialized-Object. If this parameter is FALSE, GZIP Compression will not be used when serializing or de-serializing the Object.
        Throws:
        java.io.IOException
        Code:
        Exact Method Body:
         // Retrieves a file-name object using a GCS BUCKET-NAME, and the FILE-NAME (in
         // the bucket)            
        
         BlobId blobId = BlobId.of(bucket, completeFileName);
        
         // This BlobInfo is GCS version of "java.io.File".  It points to a specific file
         // inside a GCS Bucket (which was specified earlier)
        
         BlobInfo blobInfo = BlobInfo.newBuilder(blobId)
             /* .setContentType("text/plain") */.build();
        
         // This will save the Serialized Object Data to a Stream (and eventually an array)
         ByteArrayOutputStream baos = new ByteArrayOutputStream();
        
         // This stream writes serialized Java-Objects to the Storage Bucket
         ObjectOutputStream oos = zip
             ? new ObjectOutputStream(new GZIPOutputStream(baos))
             : new ObjectOutputStream(baos);
        
         oos.writeObject(o);
        
         oos.flush(); baos.flush(); oos.close();
        
         // Convert that BAOS to a Byte-Array
         byte[]  bArr = baos.toByteArray();
        
         // Write the BYTE-ARRAY to the GCS Bucket and file using the "BlobInfo" that was built
         // a few lines ago.
        
         Blob    blob = storage.create(blobInfo, bArr);
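        The serialization steps in writeObjectToFile and readObjectFromFile can be tried out without a GCS account. The sketch below performs the same Object-to-byte[] and byte[]-to-Object conversions in memory, leaving out the storage.create and storage.get calls. SerializeRoundTripSketch, toBytes and fromBytes are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class SerializeRoundTripSketch
{
    // Serializes 'o' into a byte[] array, optionally GZIP-compressed
    static byte[] toBytes(Object o, boolean zip) throws IOException
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();

        // Closing the ObjectOutputStream also finishes the GZIP stream, so the array
        // retrieved afterwards is complete
        try (ObjectOutputStream oos = zip
                ? new ObjectOutputStream(new GZIPOutputStream(baos))
                : new ObjectOutputStream(baos))
        { oos.writeObject(o); }

        return baos.toByteArray();
    }

    // De-serializes an object from 'bArr' and casts it to the requested class
    static <T> T fromBytes(byte[] bArr, boolean zip, Class<T> returnClass)
        throws IOException, ClassNotFoundException
    {
        ByteArrayInputStream bis = new ByteArrayInputStream(bArr);

        try (ObjectInputStream ois = zip
                ? new ObjectInputStream(new GZIPInputStream(bis))
                : new ObjectInputStream(bis))
        { return returnClass.cast(ois.readObject()); }
    }

    public static void main(String[] args) throws Exception
    {
        byte[] bArr = toBytes("hello", true);

        System.out.println(fromBytes(bArr, true, String.class));
    }
}
```

        The 'zip' value used for reading has to match the one used for writing. Reading compressed data as uncompressed, or the reverse, fails with an IOException as soon as the input stream is opened.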