Learn how to perform optical character recognition (OCR) on Google Cloud Platform. This tutorial demonstrates how to upload image files to Google Cloud Storage, extract text from the images using the Google Cloud Vision API, translate the text using the Google Cloud Translation API, and save your translations back to Cloud Storage. Google Cloud Pub/Sub is used to queue various tasks and trigger the right Cloud Functions to carry them out.
Objectives
- Write and deploy several Background Cloud Functions.
- Upload images to Cloud Storage.
- Extract, translate and save text contained in uploaded images.
Costs
This tutorial uses billable components of Cloud Platform, including:
- Google Cloud Functions
- Google Cloud Pub/Sub
- Google Cloud Storage
- Google Cloud Translation API
- Google Cloud Vision API
Use the Pricing Calculator to generate a cost estimate based on your projected usage.
New Cloud Platform users might be eligible for a free trial.
Before you begin
- Sign in to your Google Account. If you don't already have one, sign up for a new account.
- Select or create a Google Cloud Platform project.
- Make sure that billing is enabled for your Google Cloud Platform project.
- Enable the Cloud Functions, Cloud Pub/Sub, Cloud Storage, Cloud Translation, and Cloud Vision APIs.
- Update gcloud components:
gcloud components update
- Prepare your development environment for Node.js or Python.
Visualizing the flow of data
The flow of data in the OCR tutorial application involves several steps:
- An image is uploaded to Cloud Storage with text in any language (text that appears in the image itself).
- A Cloud Function is triggered, which uses the Vision API to extract the text, and queues the text to be translated into the configured translation languages.
- For each queued translation, a Cloud Function is triggered which uses the Translation API to translate the text and queue it to be saved to Cloud Storage.
- For each translated text, a Cloud Function is triggered which saves the translated text to Cloud Storage.
It may help to visualize the steps:
Preparing the application
Create a Cloud Storage bucket to upload your images, where YOUR_IMAGE_BUCKET_NAME is a globally unique bucket name:
gsutil mb gs://YOUR_IMAGE_BUCKET_NAME
Create a Cloud Storage bucket to save the translations, where YOUR_TEXT_BUCKET_NAME is a globally unique bucket name:
gsutil mb gs://YOUR_TEXT_BUCKET_NAME
Clone the sample app repository to your local machine:
Node.js
git clone https://github.com/GoogleCloudPlatform/nodejs-docs-samples.git
Alternatively, you can download the sample as a zip file and extract it.
Python
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
Alternatively, you can download the sample as a zip file and extract it.
Change to the directory that contains the Cloud Functions sample code:
Node.js
cd nodejs-docs-samples/functions/ocr/app/
Python
cd python-docs-samples/functions/ocr/app/
Configure the app:
Node.js
Using the config.default.json file as a template, create a config.json file in the app directory with the following contents:
{
  "RESULT_TOPIC": "YOUR_RESULT_TOPIC_NAME",
  "RESULT_BUCKET": "YOUR_TEXT_BUCKET_NAME",
  "TRANSLATE_TOPIC": "YOUR_TRANSLATE_TOPIC_NAME",
  "TRANSLATE": true,
  "TO_LANG": ["en", "fr", "es", "ja", "ru"]
}
- Replace YOUR_RESULT_TOPIC_NAME with a topic name to be used for saving results.
- Replace YOUR_TEXT_BUCKET_NAME with a bucket name used for saving results.
- Replace YOUR_TRANSLATE_TOPIC_NAME with a topic name to be used for translating results.
Python
Edit the config.json file in the app directory to have the following contents:
{
  "RESULT_TOPIC": "YOUR_RESULT_TOPIC_NAME",
  "RESULT_BUCKET": "YOUR_TEXT_BUCKET_NAME",
  "TRANSLATE_TOPIC": "YOUR_TRANSLATE_TOPIC_NAME",
  "TRANSLATE": true,
  "TO_LANG": ["en", "fr", "es", "ja", "ru"]
}
- Replace YOUR_RESULT_TOPIC_NAME with a topic name to be used for saving results.
- Replace YOUR_TEXT_BUCKET_NAME with a bucket name used for saving results.
- Replace YOUR_TRANSLATE_TOPIC_NAME with a topic name to be used for translating results.
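Because a forgotten placeholder in config.json tends to surface only as a failed deployment or a publish error, it can help to check the file before deploying. The following Python sketch is not part of the sample app. The helper names parse_config and load_config are hypothetical, and treating these four keys as required is an assumption based on the configuration shown above.

```python
import json

# Keys taken from the config.json contents in this tutorial; treating all
# of them as required is an assumption of this sketch.
REQUIRED_KEYS = ("RESULT_TOPIC", "RESULT_BUCKET", "TRANSLATE_TOPIC", "TO_LANG")


def parse_config(raw_json):
    """Parse config.json text and fail early on missing or placeholder values."""
    config = json.loads(raw_json)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("config.json is missing keys: " + ", ".join(missing))
    # Template values that were never replaced still start with "YOUR_".
    placeholders = [
        key for key, value in config.items()
        if isinstance(value, str) and value.startswith("YOUR_")
    ]
    if placeholders:
        raise ValueError("Replace placeholder values for: " + ", ".join(placeholders))
    if not isinstance(config["TO_LANG"], list):
        raise ValueError("TO_LANG must be a list of language codes")
    return config


def load_config(path="config.json"):
    """Read and validate the file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())
```

Calling load_config() from the app directory either returns the parsed dictionary or raises a ValueError that names the offending keys.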
Understanding the code
Importing dependencies
The application must import several dependencies in order to communicate with Google Cloud Platform services:
Node.js
Python
Processing images
The following function reads an uploaded image file from Cloud Storage and calls a function to detect whether the image contains text:
Node.js
Python
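As a rough illustration only, and not the sample's actual code, an entry point of this kind might look like the sketch below. A background function triggered by Cloud Storage receives an event dictionary that includes the bucket and name of the uploaded object. The detect_text parameter and the detect_text_stub and validate_field helpers are hypothetical names introduced so the sketch runs without any Cloud client libraries.

```python
def validate_field(message, field):
    """Return message[field], or raise if it is absent or empty."""
    value = message.get(field)
    if not value:
        raise ValueError(
            "Field '{}' is missing from the event: {}".format(field, message)
        )
    return value


def detect_text_stub(bucket, filename):
    """Hypothetical placeholder for the Vision API step."""
    print("Would look for text in gs://{}/{}".format(bucket, filename))


def process_image(event, context, detect_text=detect_text_stub):
    """Entry point for a Cloud Storage trigger: validate the event, then hand off."""
    bucket = validate_field(event, "bucket")
    filename = validate_field(event, "name")
    detect_text(bucket, filename)
    return "gs://{}/{}".format(bucket, filename)
```

Validating the two fields first means a malformed event fails with a clear message instead of an error deep inside the text-detection step.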
The following function extracts text from the image using the Cloud Vision API and queues the text for translation:
Node.js
Python
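The queueing half of this step can be sketched without the Cloud client libraries. In the sketch below, extract_text and publish are hypothetical callables standing in for the Vision API call and a Pub/Sub publisher, and the message fields (text, filename, lang) are assumptions rather than the sample's confirmed schema. The one firm constraint is that Pub/Sub message data must be bytes, so each payload is JSON-encoded as UTF-8.

```python
import json


def queue_translations(text, filename, target_langs, topic, publish):
    """Publish one JSON message per target language and return the payloads.

    `publish(topic, data)` is a hypothetical stand-in for a Pub/Sub
    publisher; `data` is passed as UTF-8 encoded bytes.
    """
    messages = []
    for lang in target_langs:
        message = {"text": text, "filename": filename, "lang": lang}
        publish(topic, json.dumps(message).encode("utf-8"))
        messages.append(message)
    return messages


def extract_and_queue(bucket, filename, config, extract_text, publish):
    """Extract text via `extract_text` (a stand-in for the Vision API call)
    and queue it for every language listed in config["TO_LANG"]."""
    image_uri = "gs://{}/{}".format(bucket, filename)
    text = extract_text(image_uri)
    return queue_translations(
        text, filename, config.get("TO_LANG", []), config["TRANSLATE_TOPIC"], publish
    )
```

Publishing one message per language lets each translation run, fail, and retry independently of the others.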
Translating text
The following function translates the extracted text and queues the translated text to be saved back to Cloud Storage:
Node.js
Python
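A function triggered by Cloud Pub/Sub receives the message payload as a base64-encoded string in the event's data field, so the first job is to decode it. The sketch below illustrates that step under stated assumptions: translate and publish are hypothetical callables standing in for the Translation API and Pub/Sub clients, and the extra parameters exist only so the sketch can run on its own. A deployed function would take just (event, context) and use module-level clients and configuration.

```python
import base64
import json


def decode_pubsub_event(event):
    """Decode the base64-encoded JSON payload of a Pub/Sub-triggered event."""
    if not event.get("data"):
        raise ValueError("Pub/Sub event has no data field")
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def translate_text(event, context, translate, publish, result_topic):
    """Translate the queued text and queue the result to be saved.

    `translate(text, target_lang)` and `publish(topic, data)` are
    hypothetical stand-ins, not real client library signatures.
    """
    message = decode_pubsub_event(event)
    translated = translate(message["text"], message["lang"])
    result = {
        "text": translated,
        "filename": message["filename"],
        "lang": message["lang"],
    }
    publish(result_topic, json.dumps(result).encode("utf-8"))
    return result
```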
Saving the translations
Finally, the following function receives the translated text and saves it back to Cloud Storage:
Node.js
Python
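The saving step follows the same decode-then-act pattern. In the minimal sketch below, upload is a hypothetical callable standing in for a Cloud Storage client call, and the object naming scheme (source file name, underscore, language code, .txt) is an assumption chosen for illustration, not necessarily what the sample produces.

```python
import base64
import json


def result_object_name(filename, lang):
    """Name for the saved translation; this naming scheme is an assumption."""
    return "{}_{}.txt".format(filename, lang)


def save_result(event, context, upload, bucket_name):
    """Decode the Pub/Sub payload and store the text via `upload`.

    `upload(bucket_name, object_name, text)` is a hypothetical stand-in
    for writing an object to Cloud Storage.
    """
    if not event.get("data"):
        raise ValueError("Pub/Sub event has no data field")
    message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    object_name = result_object_name(message["filename"], message["lang"])
    upload(bucket_name, object_name, message["text"])
    return object_name
```

Including the language code in the object name keeps the translations of one image from overwriting each other in the results bucket.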
Deploying the functions
This section describes how to deploy your functions.
To deploy the image processing function with a Cloud Storage trigger, run the following command in the app directory:
Node.js 8
gcloud functions deploy ocr-extract --runtime nodejs8 --trigger-bucket YOUR_IMAGE_BUCKET_NAME --entry-point processImage
Node.js 10 (Beta)
gcloud functions deploy ocr-extract --runtime nodejs10 --trigger-bucket YOUR_IMAGE_BUCKET_NAME --entry-point processImage
Node.js 6 (Deprecated)
gcloud functions deploy ocr-extract --runtime nodejs6 --trigger-bucket YOUR_IMAGE_BUCKET_NAME --entry-point processImage
Python
gcloud functions deploy ocr-extract --runtime python37 --trigger-bucket YOUR_IMAGE_BUCKET_NAME --entry-point process_image
where YOUR_IMAGE_BUCKET_NAME is the name of the Cloud Storage bucket where you will be uploading images.
To deploy the text translation function with a Cloud Pub/Sub trigger, run the following command in the app directory:
Node.js 8
gcloud functions deploy ocr-translate --runtime nodejs8 --trigger-topic YOUR_TRANSLATE_TOPIC_NAME --entry-point translateText
Node.js 10 (Beta)
gcloud functions deploy ocr-translate --runtime nodejs10 --trigger-topic YOUR_TRANSLATE_TOPIC_NAME --entry-point translateText
Node.js 6 (Deprecated)
gcloud functions deploy ocr-translate --runtime nodejs6 --trigger-topic YOUR_TRANSLATE_TOPIC_NAME --entry-point translateText
Python
gcloud functions deploy ocr-translate --runtime python37 --trigger-topic YOUR_TRANSLATE_TOPIC_NAME --entry-point translate_text
where YOUR_TRANSLATE_TOPIC_NAME is the name of the Cloud Pub/Sub topic that triggers translations.
To deploy the function that saves results to Cloud Storage with a Cloud Pub/Sub trigger, run the following command in the app directory:
Node.js 8
gcloud functions deploy ocr-save --runtime nodejs8 --trigger-topic YOUR_RESULT_TOPIC_NAME --entry-point saveResult
Node.js 10 (Beta)
gcloud functions deploy ocr-save --runtime nodejs10 --trigger-topic YOUR_RESULT_TOPIC_NAME --entry-point saveResult
Node.js 6 (Deprecated)
gcloud functions deploy ocr-save --runtime nodejs6 --trigger-topic YOUR_RESULT_TOPIC_NAME --entry-point saveResult
Python
gcloud functions deploy ocr-save --runtime python37 --trigger-topic YOUR_RESULT_TOPIC_NAME --entry-point save_result
where YOUR_RESULT_TOPIC_NAME is the name of the Cloud Pub/Sub topic that triggers the saving of results.
Uploading an image
Upload an image to your image Cloud Storage bucket:
gsutil cp PATH_TO_IMAGE gs://YOUR_IMAGE_BUCKET_NAME
where PATH_TO_IMAGE is a path to an image file (that contains text) on your local system, and YOUR_IMAGE_BUCKET_NAME is the name of the bucket where you are uploading images.
You can download one of the images from the sample project.
Watch the logs to be sure the executions have completed:
gcloud functions logs read --limit 100
You can view the saved translations in the Cloud Storage bucket specified by the RESULT_BUCKET value in your configuration file.
Cleaning up
To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:
Deleting the project
The easiest way to eliminate billing is to delete the project that you created for the tutorial.
To delete the project:
- In the GCP Console, go to the Projects page.
- In the project list, select the project you want to delete and click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Deleting the Cloud Functions
Deleting Cloud Functions does not remove any resources stored in Cloud Storage.
To delete a Cloud Function, run the following command:
gcloud functions delete NAME_OF_FUNCTION
where NAME_OF_FUNCTION is the name of the function to delete.
You can also delete Cloud Functions from the Google Cloud Platform Console.


