Comparing Firebase ML Kit’s Text Recognition on Android & iOS

Firebase’s ML Kit: Android vs iOS using DeltaML

In our post from about a month ago, we compared two of the major on-device text recognition SDKs on iOS: Firebase’s ML Kit & Tesseract OCR. The results were pretty one-sided, as ML Kit outclassed Tesseract on many of the predictions and showed much better accuracy even when it failed.

At the very end of that post, we showed a summary of results on Android devices. We saw that Tesseract OCR on Android wasn’t far behind ML Kit. But we also saw that ML Kit doesn’t perform as well on Android devices as it does on iOS devices. We shared details of our Android results here:

To compare ML Kit’s performance on Android and iOS, let’s take a slightly different route than we took earlier. First, we’ll split our dataset into subsets based on the kind of content in the images. This gives us 6 subsets, as follows:

  1. Normal text
  2. Curved or tilted text
  3. Numeric images
  4. Images with low contrast or disturbance over text
  5. Text with unusual styles and/or fonts
  6. Text that was not easy for humans

We’ll run ML Kit on each of these subsets to measure the difference in performance between the two platforms. Here’s the summary of our results.
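
The post doesn’t spell out how “success rate” and “correctness” were computed, so the following is only a rough sketch under stated assumptions: a prediction counts as a success when it exactly matches the expected text, an empty prediction counts as “no result”, and correctness is approximated by a character-level similarity score from Python’s `difflib`. The function names and the `(category, expected, predicted)` sample format are hypothetical, not part of ML Kit or DeltaML.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify(expected, predicted):
    """Label one prediction as 'success', 'failure', or 'no_result'."""
    if not predicted or not predicted.strip():
        return "no_result"  # the recognizer returned nothing
    if predicted.strip() == expected.strip():
        return "success"    # assumed definition: exact match
    return "failure"


def correctness(expected, predicted):
    """Assumed stand-in for 'correctness': character similarity in [0, 1]."""
    return SequenceMatcher(None, expected.strip(), (predicted or "").strip()).ratio()


def summarize(samples):
    """Aggregate per-category metrics from (category, expected, predicted) tuples."""
    outcomes = {}  # category -> Counter of outcome labels
    scores = {}    # category -> list of similarity scores
    for category, expected, predicted in samples:
        outcomes.setdefault(category, Counter())[classify(expected, predicted)] += 1
        scores.setdefault(category, []).append(correctness(expected, predicted))

    report = {}
    for category, counts in outcomes.items():
        total = sum(counts.values())
        report[category] = {
            "success_rate": counts["success"] / total,
            "no_result_rate": counts["no_result"] / total,
            "mean_correctness": sum(scores[category]) / total,
        }
    return report
```

Running `summarize` once on the Android predictions and once on the iOS predictions would give two reports that can be compared subset by subset. A different similarity measure (for example, edit distance) would change the correctness numbers but not the overall structure.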

Normal text

These images have good contrast, and the text is set in typical fonts, which makes them the easiest of all to recognize. Key findings:

  • Correctness is 20% less on Android.
  • Success rate is 50% on Android; 86% on iOS.

Curved or tilted text

In these images, the text isn’t in a straight line; rather, it’s either been rotated or is in a curved shape. This makes it a bit more difficult to recognize correctly. Thus, there was a significant drop in correct recognition on both platforms. Key findings:

  • Correctness is 25% less on Android.
  • Success rate is a mere 20% on Android; 42% on iOS.

Numeric images

These images contain only numbers or special characters, with no letters. This gives us good insight into cases where one has to recognize numeric strings, for example license plates, lottery tickets, or credit card numbers. Key findings:

  • Correctness is 43% less on Android.
  • Success rate is 27% on Android; 70% on iOS.
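
The “numbers or special characters only” criterion for this subset can be expressed as a small check on each image’s expected text. This is a sketch only: the post doesn’t say how the subsets were actually built, and `is_numeric_only` is a hypothetical helper that assumes a ground-truth label is available for every image.

```python
def is_numeric_only(label):
    """Return True if the label is non-empty and contains no letters."""
    text = label.strip()
    # Digits, spaces, and punctuation are allowed; any alphabetic character is not.
    return bool(text) and not any(ch.isalpha() for ch in text)
```

For example, `is_numeric_only("4111-1111")` is true, while a label such as `"KA 01 1234"` would be excluded from this subset because it contains letters.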

Images with low contrast or disturbance over text

We also found 141 images with some disturbance over the text or with low contrast. We didn’t expect very good performance on either platform, but iOS came out ahead again. Key findings:

  • Correctness is 33% less on Android.
  • Success rate is 22% on Android; 58% on iOS.

Text with unusual styles and/or fonts

These images have text presented in uncommon fonts that make them tougher to detect with 100% accuracy. In most cases, the text was detected but wasn’t recognized perfectly. As seen below, the failure rate is higher than the success rate for both platforms. Additionally, the success rate for iOS is quite close to the rate of “No Results” on Android. Key findings:

  • Correctness is 34% less on Android.
  • Success rate is 15% on Android; 44% on iOS.

Text that was not easy for humans

43 images contain text that humans couldn’t read without some effort. Predictably, ML Kit struggled with these images as well. Key findings:

  • Correctness is 35% less on Android.
  • Success rate is a mere 2% on Android; 9% on iOS.
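
To see the six success-rate pairs side by side, the figures reported in the sections above can be collected and compared in two ways: as a gap in percentage points and as a ratio of iOS to Android. The snippet below is a small sketch that only restates the numbers already given; the shortened category names and the `gaps` helper are ours.

```python
# Success rates (percent) as reported above: (Android, iOS).
SUCCESS_RATES = {
    "Normal": (50, 86),
    "Curved or tilted": (20, 42),
    "Numeric": (27, 70),
    "Low contrast": (22, 58),
    "Unusual fonts": (15, 44),
    "Hard for humans": (2, 9),
}


def gaps(rates):
    """Return (category, android, ios, point_gap, ios_to_android_ratio) rows."""
    rows = []
    for category, (android, ios) in rates.items():
        rows.append((category, android, ios, ios - android, ios / android))
    return rows


for category, android, ios, points, ratio in gaps(SUCCESS_RATES):
    print(f"{category:<18} {android:>3}% {ios:>3}%  {points:>+3} pts  {ratio:.1f}x")
```

Looking at both columns helps because a small point gap can still hide a large ratio when the absolute rates are low.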

Final words

These results clearly show that iOS is leading in the race to make mobile devices more AI-efficient, at least in terms of text recognition. And it isn’t really close. As mentioned in Benchmarking TensorFlow Mobile on Android devices in production by Jameson Toole, one of the top-notch Android devices (the Samsung Galaxy S9) is still 10X slower than the iPhone X and 100X slower than the new iPhone XS.

When it comes to on-device machine learning, the Android team really has to react quickly to stay in this race.
