In this post I’ll show you how to use a Google Cloud Function to access the machine learning API for natural language processing. Cloud Functions are one of the serverless features of the GCP. Please keep in mind that serverless does not mean that your code does not run on a virtual machine; you just don’t have to provision and maintain that machine.
Google Cloud Function Basics
A Cloud Function is implemented in JavaScript and executed within a Node.js runtime environment. Cloud Functions are triggered by certain events:
- Cloud Pub/Sub messaging
- Cloud Storage operations
- HTTP requests
Depending on the event source, your program has to export a JavaScript function with a given signature. For Pub/Sub and Storage events your function may look like this:
exports.entry_point = function(event, callback) {...}
The event parameter carries the data relevant to the given event; callback is a function reference that has to be called at the end of your Cloud Function.
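As a minimal sketch (the function name and log message are made up for illustration), a Storage-triggered background function could look like this:

// Minimal sketch of a background Cloud Function (names are illustrative).
// For a Storage trigger, event.data carries the metadata of the affected object.
exports.log_upload = function(event, callback) {
  const file = event.data;
  console.log('File uploaded: ' + file.name + ' to bucket ' + file.bucket);
  callback(); // signal successful completion
};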
When implementing a Cloud Function for handling HTTP requests, you will use a function like this
exports.entry_point = function(request, response) {...}
where request and response represent the HTTP request and response objects.
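For example, a minimal HTTP-triggered function (the name is illustrative) could echo a greeting; request and response are Express-style objects:

// Minimal sketch of an HTTP-triggered Cloud Function (name is illustrative).
exports.hello = function(request, response) {
  const name = request.query.name || 'World';
  response.status(200).send('Hello ' + name + '!');
};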
Please note that your Cloud Function should be stateless. Depending on the load, the cloud environment will spawn multiple instances of the function, so you cannot rely on global state in your JavaScript.
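To illustrate with a made-up example: a global counter will not behave as you might expect, because every instance keeps its own copy and instances come and go:

// Anti-pattern sketch: this counter is per-instance, not global across
// all executions, and is lost whenever an instance is recycled.
let invocations = 0;

exports.count = function(request, response) {
  invocations++; // only counts calls handled by *this* instance
  response.send('Invocation ' + invocations);
};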
A minimal Node.js project consists of a single JavaScript file, typically named index.js, and a JSON file package.json that defines metadata such as dependencies. You can edit and manage these resources in the Google Cloud dashboard or deploy them from your local file system with the gcloud CLI from the Cloud SDK. We’ll use the second option.
Machine Learning Use Case
The machine learning use case we are going to implement looks like this:
An upload to the Storage bucket nlp-test-in will trigger our Cloud Function, which calls the Natural Language API to perform a text analysis on the content of the uploaded file. The results will be written to a JSON file in the Storage bucket nlp-test-out.
First, we add the required dependencies for the Natural Language and Storage APIs to the package.json file:
1{ 2 "name": "gcf-ml-aes", 3 "version": "1.0.0", 4 "dependencies": { 5 "@google-cloud/language": "^1.2.0", 6 "@google-cloud/storage": "^1.6.0" 7 } 8}
Our Cloud Function implementation in index.js may look like this:
const Storage = require('@google-cloud/storage');
const languageApi = require('@google-cloud/language');

const OUT_BUCKET_NAME = "nlp-test-out";

// Storage API
const storage = new Storage();
const outputBucket = storage.bucket(OUT_BUCKET_NAME);

// Language API
const client = new languageApi.LanguageServiceClient();

function gcsUri(bucket, file) {
  return `gs://${bucket}/${file}`;
}

function outputFilename(inputFilename) {
  return inputFilename.replace(".txt", "-results.json");
}

/**
 * Triggered by a change to a Cloud Storage bucket.
 *
 * @param {!Object} event The Cloud Functions event.
 * @param {!Function} callback The callback function.
 */
exports.analyse_entity_sentiment = function(event, callback) {
  const data = event.data;
  const inputFileUri = gcsUri(data.bucket, data.name);
  const outFilename = outputFilename(data.name);

  console.log('Processing text from: ' + inputFileUri);
  const aesRequest = {
    gcsContentUri: inputFileUri,
    type: 'PLAIN_TEXT'
  };

  // Call to Language API
  client
    .analyzeEntitySentiment({document: aesRequest})
    .then(results => {
      const outputFile = outputBucket.file(outFilename);
      // save() is asynchronous; return the promise so the function
      // is not finished before the upload completes.
      return outputFile.save(JSON.stringify(results));
    })
    .then(() => {
      console.info('Text analysis results written to: ' + gcsUri(OUT_BUCKET_NAME, outFilename));
      callback();
    })
    .catch(err => {
      console.error('Text analysis failed: ' + err);
      callback(err);
    });
}
There are two interesting things to observe:
- The client to the Natural Language API can read its input directly from a Storage bucket by setting the gcsContentUri field in the request document (an inline alternative is sketched after this list).
- The text analysis results are returned as a JSON object that we are saving to file in a separate output bucket.
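For comparison, a hypothetical variant (not part of our function) passes the text inline through the content field instead of gcsContentUri:

// Hypothetical variant: analyze inline text instead of a file in a bucket.
const inlineRequest = {
  content: 'I love the Louvre, but the crowds are annoying.',
  type: 'PLAIN_TEXT'
};

client
  .analyzeEntitySentiment({document: inlineRequest})
  .then(results => console.log(JSON.stringify(results)));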
Deployment
Let’s assume our project looks like this:
$ ls -la
-rw-r--r--    1 tobias  staff     541  6 Apr 11:27 .gcloudignore
-rw-r--r--@   1 tobias  staff      41  6 Apr 11:12 .gitignore
drwxr-xr-x    4 tobias  staff     128  6 Apr 11:14 data
-rwxr-xr-x@   1 tobias  staff     170  6 Apr 10:23 deploy.sh
-rw-r--r--@   1 tobias  staff    1414  6 Apr 11:24 index.js
drwxr-xr-x  277 tobias  staff    8864  6 Apr 11:05 node_modules
-rw-r--r--    1 tobias  staff  118819  6 Apr 11:05 package-lock.json
-rw-r--r--    1 tobias  staff     152  6 Apr 11:05 package.json
The .gcloudignore file lists files and folders that should not be uploaded to the GCP, including the node_modules and data folders. Given that, we can simply deploy our Cloud Function with
gcloud functions deploy aes-1 \
    --entry-point=analyse_entity_sentiment \
    --trigger-resource=nlp-test-in \
    --trigger-event=google.storage.object.finalize
The name of the Cloud Function will be aes-1; the exported JavaScript function analyse_entity_sentiment is called whenever an event of type google.storage.object.finalize is triggered on the bucket nlp-test-in. There are several other options, see gcloud functions deploy --help.
After successful deployment, the new Cloud Function aes-1 shows up in the GCP dashboard.
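If you prefer the command line over the dashboard, you can also verify the deployment there, for example by describing the function:

$ gcloud functions describe aes-1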
Test
To test our Cloud Function, we upload a test file to the nlp-test-in bucket:
$ gsutil cp data/louvre.txt gs://nlp-test-in
Looking at the log shows a successful execution:
$ gcloud functions logs read --limit 4
LEVEL  NAME   EXECUTION_ID    TIME_UTC                 LOG
D      aes-1  73379230141287  2018-04-13 13:02:04.535  Function execution started
I      aes-1  73379230141287  2018-04-13 13:02:04.753  Processing text from: gs://nlp-test-in/louvre.txt
I      aes-1  73379230141287  2018-04-13 13:02:06.255  Text analysis results written to: gs://nlp-test-out/louvre-results.json
D      aes-1  73379230141287  2018-04-13 13:02:06.354  Function execution took 1820 ms, finished with status: 'ok'
We download the file holding the analysis results to the local file system:
$ gsutil cp gs://nlp-test-out/louvre-results.json .
I wrote a separate blog post that explains the results of the entity sentiment analysis in detail.
There is also the Cloud Functions Emulator that lets you deploy and run your functions on your local machine before deploying them to the GCP.
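A rough sketch of such a local test run could look like this. Note that the deploy flags below are an assumption based on the emulator’s documentation at the time of writing and may differ in your version; functions --help lists the current options:

# Install and start the local emulator (flags below are an assumption).
$ npm install -g @google-cloud/functions-emulator
$ functions start
# Deploy the function locally and invoke it with a fake event payload.
$ functions deploy aes-1 --entry-point=analyse_entity_sentiment \
    --trigger-provider=cloud.storage --trigger-event=object.change \
    --trigger-resource=nlp-test-in
$ functions call aes-1 --data='{"bucket":"nlp-test-in","name":"louvre.txt"}'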
Summary
You learned how to implement and deploy a Cloud Function to the GCP. With the text analysis use case I demonstrated how easy it is to use the Natural Language API from your Node.js program.
The full source code can be found in my GitHub repo ttrelle/gcf/ml-aes.
If you are interested in AI and machine learning, have a look at our codecentric.AI channel on YouTube.