
    Powerful Image Analysis With Google Cloud Vision And Python


    Bartosz Biskupski

    2019-01-09T13:45:32+01:00
    2019-01-10T12:02:39+00:00

    Quite recently, I built a web app to manage users’ personal expenses. Its main features are scanning shopping receipts and extracting data for further processing. The Google Vision API turned out to be a great tool for extracting text from a photo. In this article, I will guide you through the development process with Python in a sample project.

    If you’re a novice, don’t worry. You will only need a very basic knowledge of this programming language — with no other skills required.

    Let’s get started, shall we?

    Never Heard Of Google Cloud Vision?

    It’s an API that allows developers to analyze the content of an image through extracted data. For this purpose, Google utilizes machine learning models trained on a large dataset of images. All of that is available with a single API request. The engine behind the API classifies images, detects objects and people’s faces, and recognizes printed words within images.

    To give you an example, let’s bring up the well-liked Giphy. They’ve adopted the API to extract caption data from GIFs, which resulted in a significant improvement in user experience. Another example is realtor.com, which uses the Vision API’s OCR to extract text from images of For Sale signs taken on a mobile app to provide more details on the property.

    Machine Learning At A Glance

    Let’s start by answering a question many of you have probably heard before: what is machine learning?

    The broad idea is to develop a programmable model that finds patterns in the data it’s given. The higher the quality of the data you deliver and the better the design of the model you use, the smarter the outcome will be. With ‘friendly machine learning’ (as Google calls their Machine Learning through API services), you can easily incorporate a chunk of Artificial Intelligence into your applications.

    Recommended reading: Getting Started With Machine Learning


    How To Get Started With Google Cloud

    Let’s start with registering for Google Cloud. Google requires authentication, but it’s simple and painless: you only need to store a JSON file containing your API key, which you can get directly from the Google Cloud Platform.

    Download the file and add its path to your environment variables:

    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/apikey.json
    

    Alternatively, in development, you can use the from_service_account_file() method, which I’ll describe further in this article. To learn more about authentication, check out Google Cloud’s official documentation.
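
    Before going any further, it can be worth confirming that the variable is set and points at a readable JSON file. Here is a minimal sketch of such a check, using only the standard library. The helper name check_credentials_file is my own and is not part of the Google package, and the check only verifies that the file parses as JSON, not that the key itself is valid.

```python
import json
import os


def check_credentials_file(env=os.environ):
    """Return the credentials path if it is set and holds valid JSON.

    Raises RuntimeError with a readable message otherwise.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError("No credentials file found at {}".format(path))
    with open(path) as handle:
        try:
            json.load(handle)  # only checks that the file parses as JSON
        except ValueError as exc:
            raise RuntimeError("{} is not valid JSON: {}".format(path, exc))
    return path
```

    Calling check_credentials_file() at startup lets the app fail early with a clear message instead of failing on the first API request.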

    Google provides a Python package to deal with the API. Let’s add google-cloud-vision==0.33, the latest version at the time of writing, to your app. Time to code!

    How To Combine Google Cloud Vision With Python

    Firstly, let’s import classes from the library.

    from google.cloud import vision
    from google.cloud.vision import types
    

    With that taken care of, you’ll need an instance of a client, which you’ll then use for the text recognition feature.

    client = vision.ImageAnnotatorClient()
    

    If you don’t store your credentials in environment variables, you can pass them directly to the client at this stage.

    client = vision.ImageAnnotatorClient.from_service_account_file(
        '/path/to/apikey.json'
    )
    

    Assuming that you store the images to be processed in an ‘images’ folder inside your project directory, let’s open one of them.

    Image: an example of a simple receipt that could be processed by Google Cloud Vision.

    image_to_open = 'images/receipt.jpg'
    
    with open(image_to_open, 'rb') as image_file:
        content = image_file.read()
    

    The next step is to create a Vision Image object, which you can then send in a request for text recognition.

    image = vision.types.Image(content=content)
    
    text_response = client.text_detection(image=image)
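
    A request like this can come back with an error description instead of annotations, for example when the image data is unreadable. Here is a minimal sketch of a guard, assuming the response exposes error.message the way Google’s own samples check it. The helper name ensure_ok is hypothetical, and the SimpleNamespace objects are stand-ins so the sketch runs without credentials or network access.

```python
from types import SimpleNamespace


def ensure_ok(response):
    """Raise if the response carries an error message, else return it unchanged."""
    message = response.error.message
    if message:
        raise RuntimeError("Vision API error: {}".format(message))
    return response


# Stand-in objects that mimic only the error field of a real response.
ok = SimpleNamespace(error=SimpleNamespace(message=""))
failed = SimpleNamespace(error=SimpleNamespace(message="Bad image data"))

ensure_ok(ok)  # passes silently
try:
    ensure_ok(failed)
except RuntimeError as exc:
    print(exc)  # Vision API error: Bad image data
```

    With a real client, you would wrap the call as ensure_ok(client.text_detection(image=image)).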
    

    The response consists of detected words stored as description keys, their location on the image, and a language prediction. For example, let’s take a closer look at the first word:

    [
    ...
    description: "SHOPPING"
    bounding_poly {
      vertices {
        x: 1327
        y: 1513
      }
      vertices {
        x: 1789
        y: 1345
      }
      vertices {
        x: 1821
        y: 1432
      }
      vertices {
        x: 1359
        y: 1600
      }
    }
    ...
    ]
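
    The four vertices describe a polygon that is not necessarily aligned with the image axes (the receipt in the photo is slightly rotated). If you want to crop or highlight a word, an axis-aligned box is often easier to work with. Here is a small sketch that uses plain (x, y) tuples; the helper name bounding_box is my own.

```python
def bounding_box(vertices):
    """Return (left, top, right, bottom) enclosing a list of (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Vertices of the word "SHOPPING", copied from the response above.
shopping = [(1327, 1513), (1789, 1345), (1821, 1432), (1359, 1600)]
left, top, right, bottom = bounding_box(shopping)
print(left, top, right, bottom)  # 1327 1345 1821 1600
```

    With a real annotation, the tuples could be built as [(v.x, v.y) for v in annotation.bounding_poly.vertices].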
    

    As you can see, to keep only the text, you need to read the description of every element. Luckily, Python’s powerful list comprehensions come to the rescue.

    texts = [text.description for text in text_response.text_annotations]
    
    ['SHOPPING STORE\nREG 12-21\n03:22 PM\nCLERK 2\n618\n1 MISC\n1 STUFF\n$0.49\n$7.99\n$8.48\n$0.74\nSUBTOTAL\nTAX\nTOTAL\nCASH\n6\n$9. 22\n$10.00\nCHANGE\n$0.78\nNO REFUNDS\nNO EXCHANGES\nNO RETURNS\n', 'SHOPPING', 'STORE', 'REG', '12-21', '03:22', 'PM', 'CLERK', '2', '618', '1', 'MISC', '1', 'STUFF', '$0.49', '$7.99', '$8.48', '$0.74', 'SUBTOTAL', 'TAX', 'TOTAL', 'CASH', '6', '$9.', '22', '$10.00', 'CHANGE', '$0.78', 'NO', 'REFUNDS', 'NO', 'EXCHANGES', 'NO', 'RETURNS']
    

    If you look carefully, you’ll notice that the first element of the list contains all the text detected in the image as a single string, while the others are individual words. Let’s print it out.

    print(texts[0])
    
    SHOPPING STORE
    REG 12-21
    03:22 PM
    CLERK 2
    618
    1 MISC
    1 STUFF
    $0.49
    $7.99
    $8.48
    $0.74
    SUBTOTAL
    TAX
    TOTAL
    CASH
    6
    $9. 22
    $10.00
    CHANGE
    $0.78
    NO REFUNDS
    NO EXCHANGES
    NO RETURNS
    

    Pretty accurate, right? And obviously quite useful, so let’s play more.
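
    For an expenses app, the next natural step is pulling the amounts out of that text. Here is a rough sketch that uses a regular expression and tolerates the stray space in “$9. 22”. The pattern and the helper name extract_amounts are my own assumptions, tuned to this one receipt rather than to receipts in general.

```python
import re
from decimal import Decimal

# Matches "$7.99" and also "$9. 22", where OCR split the number after the dot.
AMOUNT_PATTERN = re.compile(r"\$(\d+)\.\s?(\d{2})")


def extract_amounts(text):
    """Return every dollar amount found in the text as a Decimal."""
    return [
        Decimal("{}.{}".format(dollars, cents))
        for dollars, cents in AMOUNT_PATTERN.findall(text)
    ]


# A shortened copy of the detected text printed above.
receipt_text = (
    "1 MISC\n1 STUFF\n$0.49\n$7.99\n$8.48\n$0.74\n"
    "TOTAL\nCASH\n6\n$9. 22\n$10.00\nCHANGE\n$0.78\n"
)
amounts = extract_amounts(receipt_text)
print(max(amounts))  # 10.00
```

    Decimal is used instead of float so that prices keep their exact two-digit precision when they are summed later.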

    What Can You Get From Google Cloud Vision?

    As I’ve mentioned above, Google Cloud Vision is not only about recognizing text; it also lets you discover faces, landmarks, image properties, and web connections. With that in mind, let’s find out what it can tell you about the web associations of the image.

    web_response = client.web_detection(image=image)
    

    Okay Google, do you actually know what is shown on the image you received?

    web_content = web_response.web_detection
    web_content.best_guess_labels
    >>> [label: "Receipt"]
    

    Good job, Google! It’s a receipt indeed. But let’s give you a bit more exercise: can you see anything else? How about more predictions expressed as percentages?

    predictions = [
        (entity.description, '{:.2%}'.format(entity.score))
        for entity in web_content.web_entities
    ]
    
    >>> [('Receipt', '70.26%'), ('Product design', '64.24%'), ('Money', '56.54%'), ('Shopping', '55.86%'), ('Design', '54.62%'), ('Brand', '54.01%'), ('Font', '53.20%'), ('Product', '51.55%'), ('Image', '38.82%')]
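
    Not every guess in that list is equally useful, so you may want to keep only the stronger ones. Here is a small sketch that works on plain (description, score) pairs. The 0.55 threshold and the helper name confident_entities are arbitrary choices of mine, not values recommended by Google.

```python
def confident_entities(entities, threshold=0.55):
    """Keep (description, score) pairs at or above the threshold, best first."""
    kept = [(name, score) for name, score in entities if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Scores copied from the output above, written as fractions instead of percentages.
entities = [
    ("Receipt", 0.7026), ("Product design", 0.6424), ("Money", 0.5654),
    ("Shopping", 0.5586), ("Design", 0.5462), ("Brand", 0.5401),
    ("Font", 0.5320), ("Product", 0.5155), ("Image", 0.3882),
]
labels = [name for name, _ in confident_entities(entities)]
print(labels)  # ['Receipt', 'Product design', 'Money', 'Shopping']
```

    With a real response, the pairs could be built as [(e.description, e.score) for e in web_content.web_entities].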
    

    Lots of valuable insights, well done, my almighty friend! Can you also find out where the image comes from and whether it has any copies?

    web_content.full_matching_images
    >>> [
    url: "http://www.rcapitalassociates.com/wp-content/uploads/2018/03/receipts.jpg",
    url: "https://media.istockphoto.com/photos/shopping-receipt-picture-id901964616?k=6&m=901964616&s=612x612&w=0&h=RmFpYy9uDazil1H9aXkkrAOlCb0lQ-bHaFpdpl76o9A=",
    url: "https://www.pakstat.com.au/site/assets/files/1172/shutterstock_573065707.500x500.jpg"
    ]
    

    I’m impressed. Thanks, Google! But one is not enough. Can you please give me three examples of similar images?

    web_content.visually_similar_images[:3]
    >>> [
    url: "https://thumbs.dreamstime.com/z/shopping-receipt-paper-sales-isolated-white-background-85651861.jpg",
    url: "https://thumbs.dreamstime.com/b/grocery-receipt-23403878.jpg",
    url: "https://image.shutterstock.com/image-photo/closeup-grocery-shopping-receipt-260nw-95237158.jpg"
    ]
    

    Sweet! Well done.
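
    If you collect many of these URLs, grouping them by host gives a quick picture of where matching images tend to live. Here is a short sketch with the standard library, using the three similar-image URLs shown above. The helper name count_hosts is my own.

```python
from collections import Counter
from urllib.parse import urlparse


def count_hosts(urls):
    """Count how many of the given URLs are served from each host."""
    return Counter(urlparse(url).netloc for url in urls)


similar = [
    "https://thumbs.dreamstime.com/z/shopping-receipt-paper-sales-isolated-white-background-85651861.jpg",
    "https://thumbs.dreamstime.com/b/grocery-receipt-23403878.jpg",
    "https://image.shutterstock.com/image-photo/closeup-grocery-shopping-receipt-260nw-95237158.jpg",
]
print(count_hosts(similar).most_common(1))  # [('thumbs.dreamstime.com', 2)]
```

    With a real response, the list could be built as [image.url for image in web_content.visually_similar_images].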

    Is There Really An Artificial Intelligence In Google Cloud Vision?

    As you can see in the image below, dealing with receipts can get a bit emotional.

    Image: a man screaming and looking stressed while holding a long receipt, an example of the stress you can experience while getting a receipt.

    Let’s have a look at what the Vision API can tell you about this photo.

    image_to_open = 'images/face.jpg'
    
    with open(image_to_open, 'rb') as image_file:
        content = image_file.read()
    image = vision.types.Image(content=content)
    
    face_response = client.face_detection(image=image)
    face_content = face_response.face_annotations
    
    face_content[0].detection_confidence
    >>> 0.5153166651725769
    

    Not too bad: the algorithm is more than 50% sure that there is a face in the picture. But can you learn anything about the emotions behind it?

    face_content[0]
    >>> [
    ...
    joy_likelihood: VERY_UNLIKELY
    sorrow_likelihood: VERY_UNLIKELY
    anger_likelihood: UNLIKELY
    surprise_likelihood: POSSIBLE
    under_exposed_likelihood: VERY_UNLIKELY
    blurred_likelihood: VERY_UNLIKELY
    headwear_likelihood: VERY_UNLIKELY
    ...
    ]
    

    Surprisingly, with a simple command, you can check the likelihood of some basic emotions as well as headwear or photo properties.
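
    The likelihood values form an ordered scale, so they can be compared rather than just printed. Here is a sketch that mirrors the scale in a local IntEnum and picks out the emotions rated at least POSSIBLE. The Likelihood class below is my own stand-in for the enum in the Google package, and plausible_emotions is a hypothetical helper.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Local mirror of the Vision API likelihood scale, lowest to highest."""
    UNKNOWN = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


def plausible_emotions(likelihoods, threshold=Likelihood.POSSIBLE):
    """Return the emotion names rated at or above the threshold."""
    return [
        name for name, level in likelihoods.items()
        if Likelihood[level] >= threshold
    ]


# Values copied from the face annotation above.
face = {
    "joy": "VERY_UNLIKELY",
    "sorrow": "VERY_UNLIKELY",
    "anger": "UNLIKELY",
    "surprise": "POSSIBLE",
}
print(plausible_emotions(face))  # ['surprise']
```

    Lowering the threshold to Likelihood.UNLIKELY would also return anger for this photo, which shows how much the reading depends on where you draw the line.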

    When it comes to the detection of faces, I need to direct your attention to some of the potential issues you may encounter. You need to remember that you’re handing a photo over to a machine, and although Google’s API utilizes models trained on huge datasets, it’s possible that it will return some unexpected and misleading results. Online you can find photos showing how easily artificial intelligence can be tricked when it comes to image analysis. Some of them are funny, but there is a fine line between innocent and offensive mistakes, especially when a mistake concerns a human face.

    Without a doubt, Google Cloud Vision is a robust tool. Moreover, it’s fun to work with. The API’s REST architecture and the widely available Python package make it accessible to everyone, regardless of how advanced you are in Python development. Just imagine how significantly you can improve your app by utilizing its capabilities!

    Recommended reading: Applications Of Machine Learning For Designers

    How Can You Broaden Your Knowledge On Google Cloud Vision?

    The scope of possibilities for applying the Google Cloud Vision service is practically endless. With the Python library available, you can utilize it in any project based on the language, whether it’s a web application or a scientific project. It can certainly help you develop a deeper interest in Machine Learning technologies.

    Google’s documentation provides some great ideas on how to apply the Vision API features in practice and gives you the chance to learn more about Machine Learning. I especially recommend checking out the guide on how to build an advanced image search app.

    One could say that what you’ve seen in this article is like magic. After all, who would’ve thought that a simple and easily accessible API is backed by such a powerful, scientific tool? All that’s left to do is write a few lines of code, unwind your imagination, and experience the boundless potential of image analysis.

    Smashing Editorial
    (rb, ra, il)
