Recognise Text in Images using Computer Vision

The Computer Vision API in Azure Cognitive Services lets developers increase content discoverability and automate text extraction by analyzing rich visual content in photos and live video, without prior machine learning experience. Visual data processing can be used to label content with objects and concepts, extract text, generate image descriptions, moderate content, and analyze people’s movement in physical spaces, among other things.

In this post, we will discuss the optical character recognition (OCR) service, which lets developers extract printed or handwritten text from images, such as photos of street signs and products, as well as from documents such as invoices, bills, financial reports, and articles.

Process Flow

As shown in the figure below, we will create a Timer Trigger Function that runs at a scheduled time. The Function reads the stream data of an image in an Azure Blob Storage container and calls the Azure Cognitive Services – Computer Vision API. The API recognizes the text in the image and returns it as output, which is then uploaded to a file in the blob storage container.

Prerequisites

  1. An Azure Subscription
  2. Images stored in an Azure Blob Storage container

Provision Azure Cognitive Service – Computer Vision

Provision a Computer Vision resource so that the client APIs can be called to recognize and extract text from an image in Azure Blob Storage.

  • Log in to the Azure Portal
  • Search for “Computer Vision” and click Create
  • Provide a Name, Region and Pricing Tier, then click Review + Create
  • Go to Resource > Keys and Endpoint and copy the Key and Endpoint values. These values are needed to call the APIs

Create Azure Function

Create a timer-triggered Azure Function. The trigger fires at a scheduled time so that the Function can read images from blob storage for text recognition.

  • Open Visual Studio
  • Click on Create a new project
  • Search for Azure Functions Template
  • Select the template and click on Next
  • Provide Project Name and click on Create
  • Select “Timer Trigger” and configure the Schedule by providing a CRON expression. Click Create.
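
The schedule step can be sketched as follows. This is a minimal, illustrative timer-triggered function, assuming the in-process Azure Functions model used by the Visual Studio template; the class name ScheduledOcrSketch and the five-minute schedule are assumptions, not values from this post. Azure Functions timer schedules use six-field NCRONTAB expressions, with a leading seconds field.

```csharp
using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

// Hypothetical example class; the name and schedule are placeholders.
public static class ScheduledOcrSketch
{
    // NCRONTAB fields: {second} {minute} {hour} {day} {month} {day-of-week}
    // "0 */5 * * * *" fires at second 0 of every fifth minute.
    [FunctionName("ScheduledOcrSketch")]
    public static void Run([TimerTrigger("0 */5 * * * *")] TimerInfo myTimer, ILogger log)
    {
        if (myTimer.IsPastDue)
        {
            // The host sets IsPastDue when an invocation runs later than scheduled.
            log.LogInformation("Timer is running later than scheduled.");
        }

        log.LogInformation("Timer fired at {Time} (UTC)", DateTime.UtcNow);
    }
}
```

Other schedules follow the same pattern; for example, "0 0 9 * * 1-5" would fire at 09:00 on weekdays.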

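Add Cognitive Service NuGet Package

  • Right-click the Project > Manage NuGet Packages
  • Search for “Newtonsoft.Json”, select it and click Install
  • Search for “Computer Vision”, select the Microsoft.Azure.CognitiveServices.Vision.ComputerVision package (the namespace imported by the code below) and click Install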

Create Azure Cognitive Services – Computer Vision Client

  • Right-click the Project > Add > New Item…
  • Select “Class”, name it “RecognizeText.cs” and click Add
  • Update RecognizeText.cs with the code below.

In this code, the static RecognizeText class creates a ComputerVisionClient object, submits the image stream for text recognition, and polls the service until the recognized text is available.

using System;
using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System.IO;
using System.Threading.Tasks;

namespace DemoFuntionApp
{
    public static class RecognizeText
    {
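        public static async Task<string> RunAsync(string endpoint, string key, Stream stream)
        {
            // Create the client from the key and endpoint copied from the Azure Portal.
            ComputerVisionClient computerVision = new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
            {
                Endpoint = endpoint
            };
            // The operation ID is a GUID, which is 36 characters long including hyphens.
            const int numberOfCharsInOperationId = 36;

            // Await instead of blocking on .Result, which can deadlock and wraps errors in AggregateException.
            return await RecognizeTextFromStreamAsync(computerVision, stream, numberOfCharsInOperationId, TextRecognitionMode.Handwritten);
        }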
       
        private static async Task<string> RecognizeTextFromStreamAsync(ComputerVisionClient computerVision, Stream imageStream, int numberOfCharsInOperationId, TextRecognitionMode textRecognitionMode)
        {
            // Submit the image; the response headers carry the Operation-Location URL to poll for the result.
            RecognizeTextInStreamHeaders textHeaders = await computerVision.RecognizeTextInStreamAsync(imageStream, textRecognitionMode);
            return await GetTextAsync(computerVision, textHeaders.OperationLocation, numberOfCharsInOperationId);
        }

        private static async Task<string> GetTextAsync(ComputerVisionClient computerVision, string operationLocation, int numberOfCharsInOperationId)
        {
            string resultstring = "";
            string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

            TextOperationResult result = await computerVision.GetTextOperationResultAsync(operationId);

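            // Poll once per second until the operation finishes or the retry budget is used up.
            int i = 0;
            int maxRetries = 10;
            while ((result.Status == TextOperationStatusCodes.Running ||
                    result.Status == TextOperationStatusCodes.NotStarted) && i++ < maxRetries)
            {
                await Task.Delay(1000);
                result = await computerVision.GetTextOperationResultAsync(operationId);
            }

            // RecognitionResult is only populated once the operation has succeeded.
            var recResults = result.RecognitionResult;
            if (result.Status != TextOperationStatusCodes.Succeeded || recResults == null)
            {
                throw new InvalidOperationException($"Text recognition did not complete successfully (status: {result.Status}).");
            }

            foreach (Line line in recResults.Lines)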
            {
                foreach (Word word in line.Words)
                {
                    resultstring = resultstring + $"\nWord:\t{word.Text}" +
                                    $"\tLocation:\t {word.BoundingBox[0]}, " +
                                    $"{word.BoundingBox[1]}, {word.BoundingBox[2]}, {word.BoundingBox[3]}";
                }
                resultstring = resultstring + "\n";
            }
            return resultstring;
        }
    }
}
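
Two details in the listing above are easy to miss: the operation ID is taken as the last 36 characters of the Operation-Location URL (the length of a hyphenated GUID), and the polling loop stops after a fixed number of retries even if the operation is still running. The sketch below isolates both ideas without calling the Computer Vision SDK. The names OcrPollingSketch, ExtractOperationId and PollAsync are hypothetical, the plain status strings stand in for TextOperationStatusCodes, and any URL passed in is whatever the service returned.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, independent of the Computer Vision SDK.
public static class OcrPollingSketch
{
    // A GUID written as 8-4-4-4-12 hex digits with hyphens is 36 characters long.
    private const int OperationIdLength = 36;

    // Returns the trailing operation ID of an Operation-Location URL.
    public static string ExtractOperationId(string operationLocation)
    {
        if (operationLocation == null || operationLocation.Length < OperationIdLength)
        {
            throw new ArgumentException("Operation location is too short to contain an operation ID.", nameof(operationLocation));
        }

        string candidate = operationLocation.Substring(operationLocation.Length - OperationIdLength);

        // Fail early if the tail is not a GUID, e.g. when the URL has a trailing slash.
        if (!Guid.TryParse(candidate, out _))
        {
            throw new FormatException("Trailing segment is not a GUID: " + candidate);
        }

        return candidate;
    }

    // Calls getStatus until it stops reporting "Running" or "NotStarted",
    // or until maxRetries additional calls have been made.
    public static async Task<string> PollAsync(Func<Task<string>> getStatus, int maxRetries, TimeSpan delay)
    {
        string status = await getStatus();
        int attempt = 0;
        while ((status == "Running" || status == "NotStarted") && attempt++ < maxRetries)
        {
            await Task.Delay(delay);
            status = await getStatus();
        }

        // May still be "Running" if the retry budget ran out; callers should check.
        return status;
    }
}
```

With maxRetries set to 10 and a one-second delay, as in the listing, an operation that takes longer than roughly ten seconds is returned in a non-final state, which is why the caller should check the status before reading the result.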

Update the Azure Function

  • Right-click the Function and rename it.
  • Update the Function with the code below.
  • Update the blob connection string, Azure Cognitive Services endpoint, subscription key and subscription region.
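
For local runs, the setting names that the Function code reads (blobconnstr, ComputerVision_ENDPOINT and ComputerVision_SUBSCRIPTION_KEY) could be supplied through local.settings.json. The fragment below is a minimal sketch, assuming the default project layout; every value is a placeholder, and the AzureWebJobsStorage entry assumes a local storage emulator is available.

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "blobconnstr": "<Blob Storage Connection String>",
    "ComputerVision_ENDPOINT": "<Computer Vision Endpoint>",
    "ComputerVision_SUBSCRIPTION_KEY": "<Computer Vision Key>"
  }
}
```

When the Function runs locally, the host exposes the entries under Values as environment variables, which is how a lookup such as config["ComputerVision_ENDPOINT"] resolves. In Azure, the same names go into the Function App's application settings, since local.settings.json is not deployed.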

The code below reads the stream data from the image file in blob storage and calls the RunAsync method on the RecognizeText class. The output of the method is written back to a file in the blob storage. Note that the Function as written is bound with a BlobTrigger, so it runs whenever an image is added to the container rather than on a timer schedule.

using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

namespace DemoFuntionApp
{
    public static class ComputerVisionFn
    {
        [FunctionName("ComputerVisionFn")]
        public static async Task Run([BlobTrigger("<Blob Storage Container>/{name}", Connection = "blobconnstr")] Stream myBlob,
            [Blob("%<Blob Storage Container>%/{name}.txt", FileAccess.Write)] Stream validationOutput, string name, ILogger log,
             ExecutionContext context)
        {
            // Build configuration from local.settings.json (optional) and environment variables / app settings.
            var config = new ConfigurationBuilder()
               .SetBasePath(context.FunctionAppDirectory)
               .AddJsonFile("local.settings.json", optional: true, reloadOnChange: true)
               .AddEnvironmentVariables()
               .Build();
            var Endpoint = config["ComputerVision_ENDPOINT"];
            var SubscriptionKey = config["ComputerVision_SUBSCRIPTION_KEY"];

            // Await the OCR call rather than blocking the Function host thread on .Result.
            var output = await RecognizeText.RunAsync(Endpoint, SubscriptionKey, myBlob);
            using (var sw = new StreamWriter(validationOutput))
            {
                try
                {
                    sw.Write(output);
                    sw.Flush();
                }
                catch (Exception ex)
                {
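                    // Log failures at error level and keep the exception details.
                    log.LogError(ex, "Failed to write the recognized text for blob {Name}", name);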
                }
            }
        }

        public static CloudBlobContainer GetContainerObj(string containerName)
        {
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse("<Blob Storage Connection String>");

            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            CloudBlobContainer blobContainer = blobClient.GetContainerReference(containerName);

            return blobContainer;
        }

        private static string GetBlobSasUri(CloudBlobContainer container, string blobName, string policyName = null)
        {
            string sasBlobToken;

            CloudBlockBlob blob = container.GetBlockBlobReference(blobName);

            if (policyName == null)
            {

                SharedAccessBlobPolicy adHocSAS = new SharedAccessBlobPolicy()
                {
                    SharedAccessExpiryTime = DateTime.UtcNow.AddHours(24),
                    Permissions = SharedAccessBlobPermissions.Read | SharedAccessBlobPermissions.Write | SharedAccessBlobPermissions.Create
                };


                sasBlobToken = blob.GetSharedAccessSignature(adHocSAS);

                Console.WriteLine("SAS for blob (ad hoc): {0}", sasBlobToken);
                Console.WriteLine();
            }
            else
            {
                sasBlobToken = blob.GetSharedAccessSignature(null, policyName);
                Console.WriteLine("SAS for blob (stored access policy): {0}", sasBlobToken);
                Console.WriteLine();
            }


            return blob.Uri + sasBlobToken;
        }

    }
}

The Azure Cognitive Services – Computer Vision API also provides many other capabilities, including analyzing images to extract features and generating thumbnails. You can explore and use these methods with the framework discussed in this post.


