Discovering the Computer Vision API

Over the last few years, cloud services have been evolving from simple storage/VM hosting platforms to smart compute platforms that can provide fundamental shifts to the way we deliver solutions. The latest service I have played with in that space is the Computer Vision API. Be ready to be amazed at its simplicity.

The objective of the Computer Vision API is rather simple: to determine what an image contains. A car? An animal? Both? Is it moving? If it shows a person, is it a man or a woman, and of approximately what age? The ability to understand quickly what a picture contains, and to get back tags that provide near-instant feedback on an image, can be very powerful. The service can be used to inspect videos in near real time by extracting frames and sending them for analysis. The service can also be used to recognize handwriting.

But even more powerful: it has become exceedingly simple to use. Just about any developer (including junior developers), using any programming language, can tap into this service.

In the example below I will use Fiddler (an HTTP Proxy tool) to send an image to the Computer Vision API, and inspect the response. I will send my own picture and see what comes back. First, you will need to create a Computer Vision API service in Microsoft Azure and extract the Subscription Keys (see this Microsoft link: https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/vision-api-how-to-topics/howtosubscribe).

Once you have created the service, you can use any HTTP utility that allows you to send requests; I am using Fiddler.

Once Fiddler has started, stop capturing traffic (press F12) so you do not see any requests other than your own. Then click on the Composer tab. Select the POST operation, and paste the following URL (NOTE: your URL might be different depending on the region; see your service in Azure to obtain your service URI, and append /analyze?visualFeatures=Description,Tags):

https://eastus2.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description,Tags
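If your service lives in another region, the URL above can be put together from its parts. Here is a minimal Python sketch of that; the function name build_analyze_url and its parameters are my own for illustration, and it assumes your endpoint follows the same host pattern as the URL above (check your service in Azure for the real URI).

```python
from urllib.parse import urlencode


def build_analyze_url(region, features, api_version="v1.0"):
    """Return the analyze URL for a region and a list of visual features.

    Assumption: the host follows the <region>.api.cognitive.microsoft.com
    pattern used in this post; your service URI in Azure is authoritative.
    """
    base = f"https://{region}.api.cognitive.microsoft.com/vision/{api_version}/analyze"
    # safe="," keeps the comma-separated feature list unescaped, as in the URL above
    query = urlencode({"visualFeatures": ",".join(features)}, safe=",")
    return f"{base}?{query}"


print(build_analyze_url("eastus2", ["Description", "Tags"]))
```

Called with eastus2 and the two features used in this post, it produces the same URL shown above.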

In the Headers section, add your subscription key:

Ocp-Apim-Subscription-Key: <your key goes here>

Last but not least, choose a file: click on the Upload file... link and select a file to inspect. I chose my own picture...

Your Fiddler request should look like this:

You are now ready to press Execute. Once the call completes, you will see an HTTP 200 result in the left pane; double-click on it. The call returns a JSON object that can easily be inspected with Fiddler as well. For example, the service estimates that there is a 99.99% chance this picture shows a male. The description also suggests that I am wearing glasses, which is incorrect; the confidence for glasses must be below 50%, though, because it is not listed in the tags (the tags returned with 50%+ confidence are: man, person, wearing, suit, posing).
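The JSON described above is just as easy to read from code as it is in Fiddler. The Python sketch below pulls out the tags above a confidence threshold and the most confident caption. Treat it as an illustration under assumptions: the sample payload is hand-written with made-up numbers (it is not the actual response for my picture), the field layout (tags, description, captions) reflects the v1.0 response as I understand it, and the helper names and the 0.5 threshold are my own.

```python
import json

# Hand-written, illustrative payload: made-up values, NOT real service output.
# The field layout is an assumption based on the v1.0 analyze response.
SAMPLE_RESPONSE = """
{
  "tags": [
    {"name": "man", "confidence": 0.99},
    {"name": "person", "confidence": 0.98},
    {"name": "glasses", "confidence": 0.31}
  ],
  "description": {
    "tags": ["man", "person", "glasses"],
    "captions": [{"text": "a man wearing glasses", "confidence": 0.42}]
  }
}
"""


def confident_tags(analysis, threshold=0.5):
    """Return the names of tags whose confidence is at or above the threshold."""
    return [
        tag["name"]
        for tag in analysis.get("tags", [])
        if tag.get("confidence", 0.0) >= threshold
    ]


def best_caption(analysis):
    """Return the text of the most confident caption, or None if there is none."""
    captions = analysis.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c.get("confidence", 0.0))["text"]


analysis = json.loads(SAMPLE_RESPONSE)
print(confident_tags(analysis))
print(best_caption(analysis))
```

Filtering on the client side like this lets you pick your own cut-off instead of relying on whatever the service decides to list.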

What I find the most amazing is the level of simplicity of this service. No programming was needed to obtain an analysis of a picture. The service call is so simple that anyone with basic web development experience can call this service programmatically and begin leveraging image analysis as part of their solution.
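To give an idea of what the programmatic version of the same call could look like, here is a minimal Python sketch using only the standard library. It reuses the URL and the Ocp-Apim-Subscription-Key header from the Fiddler walkthrough. The function names, the file name in the usage comment, and the choice to send the raw image bytes as application/octet-stream are my assumptions, not something taken from the Fiddler session.

```python
import json
import urllib.request

# Same URL as in the Fiddler example; yours may differ by region.
ANALYZE_URL = (
    "https://eastus2.api.cognitive.microsoft.com/vision/v1.0/analyze"
    "?visualFeatures=Description,Tags"
)


def build_request(image_bytes, subscription_key, url=ANALYZE_URL):
    """Build (but do not send) the POST request carrying the image."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Assumption: the image is sent as a raw binary body.
            "Content-Type": "application/octet-stream",
        },
    )


def analyze_image(path, subscription_key):
    """Send the image at `path` for analysis and return the parsed JSON."""
    with open(path, "rb") as image_file:
        request = build_request(image_file.read(), subscription_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical usage (needs a real key and a real image file):
# result = analyze_image("my_picture.jpg", "<your key goes here>")
# print(result)
```

Keeping build_request separate from the network call makes it possible to inspect the request without actually sending it, much like reviewing it in the Composer tab before pressing Execute.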