Computer vision continues to be one of the fastest-growing fields in machine learning. Both web and mobile developers are harnessing the power of cameras as sensors and the hardware behind them to create unique and impactful user experiences.
Eye-D is one example of such a use case. Starting in 2012, the team at the Bangalore-based outfit has been working with computer vision to help the visually impaired live more independent and confident lives. Eye-D helps people with visual impairments be location-aware, explore and navigate nearby places of interest, evaluate surroundings with their smartphone cameras, and read printed text, to name just a few features.
And now the team has extended Eye-D’s reach and accessibility to more than 160 countries and 14 languages, with plans to continue growing. We had a chance to chat with Eye-D Founder Gaurav Mittal about the app’s history, how it’s changed over the years, and more. Here’s our interview:
What’s your background and how did you come up with the idea for Eye-D?
Eye-D was conceptualized back in 2012 after we spent a day at the National Association for the Blind in Bangalore. We observed a very low technology footprint in the lives of people with visual impairments. The most intriguing part was that many people there were very technologically sound — they even had a computer lab where they wrote working code! So as engineers, we started thinking of ways we could empower the visually impaired with technology. These folks couldn’t see the world, but they weren’t blind to technology. They were only lacking solutions.
Initially the focus was on building a standalone hardware device that would help someone with a visual impairment move around by providing alerts about obstacles. The hardware was then enhanced to process images using multiple cameras and sensors. However, as a product, it didn’t strike a chord with our users. By 2014 we started seeing smartphone adoption in the visually impaired community. This made us think more deeply about accessibility, and by the end of 2015, we launched the Eye-D app. Ever since, the apps (Free, Pro, iOS) have evolved to do much more than what we started off with, and now our apps are assisting visually impaired users in 160+ countries in 14 languages.
What does your tech stack look like, and what tools did you find helpful?
The Eye-D Apps are built to provide the same experience regardless of devices. The apps are powered by ML and AI and run with the help of multiple services that we’ve built over time on our servers, which are integrated into the app. The algorithms that power the app have evolved with the help of continuous user feedback.
We do a lot of work using TensorFlow, OpenCV, and similar tools. Since we built the product for people with visual impairments, we’ve tried to ensure that the interface is friendly and accessible for that community. We also make sure that users get the desired result in the most intuitive way possible. Thus, the apps are optimized to work with TalkBack on Android and VoiceOver on iOS.
What was the hardest part of building Eye-D?
The hardest part in developing any product is striking a balance between feasibility and user needs while also ensuring product-market fit. It took us significant time to arrive at what Eye-D is today. Our users have played a vital role in shaping the product, and without their help, it would have been tough. It’s been a rollercoaster ride since inception, and initially when we started working on the apps, limited processing power was a challenge on mobile devices. Internet connectivity was also an issue, but with time, the evolution from 2G to 3G to 4G has helped.
Do you have any advice for other developers who are looking to get started with machine learning?
We’d like to point out a few things that a new developer should keep in mind based on our experiences. First, develop clarity in your thoughts about what exactly you’re planning to achieve with machine learning.
Second, you don’t have to develop your model from scratch each and every time you want to test something. There are countless pretrained models available online that you can adapt via transfer learning to make them suitable for your project.
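The transfer learning approach described above can be sketched in a few lines of TensorFlow. This is a minimal illustration, not Eye-D’s actual code: the backbone choice (MobileNetV2), input size, and number of classes are all hypothetical, and `weights=None` is used here only to keep the example self-contained — in practice you would pass `weights="imagenet"` to reuse the pretrained features.

```python
# Transfer learning sketch: reuse a pretrained backbone, train only a new head.
# Assumes TensorFlow 2.x. All task-specific values below are hypothetical.
import tensorflow as tf

NUM_CLASSES = 5  # hypothetical number of categories for the new task

# Load MobileNetV2 without its original classifier head.
# Use weights="imagenet" in practice to actually reuse learned features.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights=None,
)
base.trainable = False  # freeze the backbone; only the new head will train

# Attach a small classification head for the new task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

With the backbone frozen, only the final dense layer’s weights are updated during training, which is why this works even with a small task-specific dataset.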
Third, if you’re planning to run the model on device, then keep constraints like model size and depth, battery consumption, and processing power in mind. And there will be additional constraints if you’re trying to build something for a specific use case that requires hardware acceleration.
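One common way to address the on-device constraints mentioned above is to convert a trained model to TensorFlow Lite with quantization, which shrinks the model and reduces compute cost. The tiny stand-in model below is purely illustrative; in a real app it would be the trained vision model.

```python
# On-device sketch: convert a Keras model to TensorFlow Lite with
# dynamic-range quantization. Assumes TensorFlow 2.x.
import tensorflow as tf

# A tiny stand-in model; in practice this would be the trained vision model.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights
tflite_bytes = converter.convert()  # flatbuffer to bundle with the mobile app
```

The resulting `.tflite` flatbuffer can then be shipped inside an Android or iOS app and executed with the TensorFlow Lite interpreter.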
Fourth, if you hit a hard wall and feel like the development is going nowhere, ask for feedback from the developer community. Finally, please always keep in mind the end user, as sometimes we tend to complicate things that are often simple.