Machine Learning – Text Recognition API


On-device machine learning can help you solve real-world problems when developing your iOS mobile app with UX Builder. The ML Vision and Natural Language APIs are often used to solve app challenges or to create brand-new user experiences. The next few posts will cover some of these APIs in detail, so that you better understand the benefits of including them when developing your first or next app.

This week’s post is dedicated to the machine learning Text Recognition application programming interface (API), which can recognize text in any Latin-based character set. You can also use this API to automate data-entry tasks such as processing credit cards, business cards, and receipts.
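To make the data-entry idea concrete, here is a minimal sketch of what could happen after recognition: picking the total out of the lines of a receipt. The function name `extractTotal(fromRecognizedLines:)` and the sample receipt are hypothetical and made up for illustration; they are not part of the Text Recognition API, and real receipts will need more robust parsing.

```swift
// Hypothetical helper, not part of any text recognition API.
// It takes lines of already-recognized text and returns the amount found on
// the first line that starts with "total" (case-insensitive).
func extractTotal(fromRecognizedLines lines: [String]) -> Double? {
    for line in lines {
        // Drop leading spaces so indented lines are handled too.
        let trimmed = String(line.drop(while: { $0 == " " }))
        guard trimmed.lowercased().hasPrefix("total") else { continue }
        // Take the last space-separated token, for example "$6.21".
        guard let lastToken = trimmed.split(separator: " ").last else { continue }
        // Keep only digits and the decimal point before parsing.
        let numeric = String(lastToken).filter { $0.isNumber || $0 == "." }
        if let amount = Double(numeric) {
            return amount
        }
    }
    return nil
}

// Made-up sample data standing in for the recognizer's output.
let receiptLines = ["Coffee  3.50", "Bagel  2.25", "Subtotal 5.75", "TOTAL  $6.21"]
if let total = extractTotal(fromRecognizedLines: receiptLines) {
    print("Total: \(total)")
} else {
    print("No total found")
}
```

Note that the "Subtotal" line is skipped because it does not start with "total"; a production app would likely also validate currency symbols and decimal separators.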

Key capabilities

As with the apps on our phones, we always look at what an ML API has to offer. Here are the capabilities that text recognition gives you when developing your iOS mobile app:

  • Recognizes text in Latin-based languages. Supports the recognition of text written in Latin script.
  • Analyzes the structure of the text. Detects words, lines, and paragraphs in the text.
  • Identifies the language of the text. Determines the language of the detected text.
  • Small application footprint. Lets you create an application that won’t take up too much space on your computer, mobile, or tablet.
  • Real-time recognition. Recognizes text in real time on a wide range of devices.

Text structure 

As mentioned above, the text recognizer segments text into three levels: blocks, lines, and elements. (1) A block is a set of adjoining text lines that often form a paragraph or column; (2) a line is a set of adjoining words on the same axis; and (3) an element is a set of adjoining alphanumeric characters (a “word” in most Latin languages).

Source: http://developers.google.com/ml-kit/vision/text-recognition, 2022
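The block, line, and element hierarchy can be pictured as nested data types. The sketch below is illustrative only: `TextBlock`, `TextLine`, `TextElement`, and `makeBlock(from:)` are hypothetical names chosen for this example, not the ML Kit classes, and the builder simply splits a string on newlines and spaces, whereas a real recognizer groups text by its position in the image.

```swift
// Hypothetical model types mirroring the block / line / element hierarchy.
// They are illustrative only and are not the ML Kit classes.
struct TextElement {
    let text: String              // one word-like run of characters
}

struct TextLine {
    let elements: [TextElement]
    // A line's text is its elements joined by spaces.
    var text: String { elements.map { $0.text }.joined(separator: " ") }
}

struct TextBlock {
    let lines: [TextLine]
    // A block's text is its lines joined by newlines.
    var text: String { lines.map { $0.text }.joined(separator: "\n") }
}

// Assumption for this sketch: build a block by splitting raw text on
// newlines (lines) and spaces (elements).
func makeBlock(from paragraph: String) -> TextBlock {
    let lines = paragraph.split(separator: "\n").map { rawLine in
        TextLine(elements: rawLine.split(separator: " ").map { TextElement(text: String($0)) })
    }
    return TextBlock(lines: lines)
}

// Walk the hierarchy from block to lines to elements.
let block = makeBlock(from: "Wege der\nparlamentarischen Demokratie")
for (lineIndex, line) in block.lines.enumerated() {
    print("Line \(lineIndex): \(line.text)")
    for element in line.elements {
        print("  Element: \(element.text)")
    }
}
```

Walking the structure top-down like this is a common way to decide how much context you need: whole blocks for paragraphs, lines for form fields, or single elements for individual words.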

When it comes to language support, the Text Recognition API can recognize text in a variety of scripts and languages. Languages fall into three categories based on their level of support:

  1. Supported (prioritized, with performance regularly evaluated),
  2. Experimental (under active development, but not regularly evaluated), and
  3. Mapped (supported by mapping them to the code of another language).

You can find the detailed lists for all three categories in the ML Kit documentation. It also offers step-by-step guides on using the Text Recognition API, and currently promotes the beta version of Text Recognition v2.

Have any questions?