Inspiration
Social Push There are 43 million blind people worldwide and 1 million in the U.S., alongside 250+ million people living with severe vision impairment. The World Health Organization estimates that these impairments result in $411bn in annual productivity losses, and experts place the market for vision-related assistive technologies at $4.2bn with a projected annual growth rate of 13.1%.
Personal impetus From a personal lens, Aksh grew up in rural India, which has among the highest prevalence of vision-related illness in the world. With this perspective, he connected with Maya, who, along with many others in her family, lives with visual impairment. This project offered them an opportunity to give back to a community that had shaped them.
Market Opportunity Despite the large market, modern assistive tools are primitive and limited. Screen readers, braille displays, and speech recognition software, while useful for textual input, are often bulky or immobile and fail to capture the full spectrum of visual stimuli we are exposed to. SafePath aims to unlock blind people's lives along both of these dimensions: mobility and coverage of visual information.
Timing We've seen two parallel movements in the last two years: the shift towards powerful generative AI systems and increasing investment in more capable edge devices like mobile phones and glasses. The first enables unparalleled generalization of models to images and text they have never seen before. The second is opening new frontiers for a seamless user experience, with our ultimate goal being to deploy our technology onto a frame of glasses.
Why Us? First, we lower blind people's costs drastically, from a few thousand dollars for current products to around $50/month for our app; second, our app provides a mobile solution that lets users engage with their environment seamlessly without needing external help.
What it does
Our backend features our ML suite, ApolloVision, which serves low-latency vision models and a powerful multimodal model to act as a real-time assistant for blind people. It serves two purposes. First, a non-prompt-based mode constantly surveys the environment for incoming dangers and obstacles using a vision model that combines object detection and depth estimation. Second, we allow passthrough to multimodal cloud models, to which users can pose any specific question about their environment and have the AI describe it to them, much like an AI nurse might, except that the app is with the user 24/7. Our edge lies in building a narrow vertical of specialized features that makes this app a go-to product for visually challenged people: real-time object and depth recognition, motion detection, and multimodal Q/A, which together enable low-latency, precise danger detection. We are very excited to continue working on these features post-hackathon, aiming to further enhance safety and accessibility for blind individuals navigating busy environments.
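To make the always-on mode concrete, here is a minimal sketch of how object detection and monocular depth estimation can be combined to flag nearby obstacles from a camera feed. It assumes YOLOv5 and MiDaS loaded via torch.hub; the proximity threshold and warning handling are hypothetical placeholders, not ApolloVision's exact logic.

```python
# Illustrative sketch of an always-on hazard monitor (not the shipped implementation).
import cv2
import torch

detector = torch.hub.load("ultralytics/yolov5", "yolov5s")            # object detection
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()       # monocular depth
midas_tf = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

PROXIMITY_THRESHOLD = 0.7  # hypothetical cutoff on normalized inverse depth (1.0 = closest)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Inverse relative depth: larger values mean the pixel is closer to the camera.
    with torch.no_grad():
        depth = midas(midas_tf(rgb)).squeeze()
    depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-6)
    depth = torch.nn.functional.interpolate(
        depth[None, None], size=rgb.shape[:2], mode="bilinear", align_corners=False
    )[0, 0]

    # Flag detected objects whose median depth suggests they are close to the user.
    for x1, y1, x2, y2, conf, cls in detector(rgb).xyxy[0].tolist():
        box_depth = depth[int(y1):int(y2), int(x1):int(x2)]
        if box_depth.numel() and box_depth.median() > PROXIMITY_THRESHOLD:
            label = detector.names[int(cls)]
            print(f"Warning: {label} close ahead")  # the app would speak this via TTS
```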
How we built it
Our object detection models leverage OpenCV and YOLOv5. Our depth recognition module runs MiDaS v2 to perform monocular depth estimation from a single camera feed. Following insights from Aksh's time at Tesla, we placed a lot of emphasis on working directly off camera feeds, without additional hardware, to keep integration effort and costs for the end user as low as possible. After fine-tuning on standard hardware, ApolloVision was optimized for latency and size and converted into a CoreML model that runs on iOS devices. We currently support deployment on MacBook devices, web environments, and, most importantly, the SafePath app.
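As a rough illustration of that conversion step, the sketch below traces a PyTorch depth model and converts it to CoreML with coremltools. The model choice, input resolution, and FP16 precision are assumptions for illustration rather than ApolloVision's exact export pipeline.

```python
# Minimal sketch of the size/latency optimization step: trace a PyTorch vision model
# and convert it to CoreML for on-device inference.
import torch
import coremltools as ct

model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").eval()
example = torch.rand(1, 3, 256, 256)             # assumed input resolution
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    compute_precision=ct.precision.FLOAT16,       # roughly halves model size
    convert_to="mlprogram",
)
mlmodel.save("DepthEstimator.mlpackage")          # dropped into the iOS project
```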
Our Flutter iOS app has a clean, intuitive user interface that prioritizes accessibility for visually impaired users. Maya drew on personal insights from her family's experiences with visual impairment to ensure our design is inclusive and user-friendly. With text-to-speech (TTS) audio feedback for every button press and bold colors with large interface elements, we make navigation effortless for all users.
Challenges we ran into
While Aksh and Chen were accustomed to ML development on conventional hardware, deployment on edge devices was quite challenging. We had to debug various compilation errors, rewrite parts of the export modules, and optimize for latency, which was a tricky experience.
On the app development front, we optimized for a long-term solution and built our core features in Flutter to ensure both Android and iOS support. With most of our expertise lying in Swift, this was a challenge: integrating Swift with Flutter while ensuring compatibility was quite time-consuming.
Accomplishments that we're proud of
We hit several promising milestones: seeing ApolloVision run on a local MacBook, integrating support for Continuity Camera (which we hope to swap for a dedicated camera down the line), and finally seeing our models run inside the iOS app in a form that is relatively easy to ship to customers.
What we learned
We learned how to optimize ML models for deployment on iOS devices, including reducing model size and optimizing for latency to ensure smooth performance.
What's next for SafePath
We’re really excited to take SafePath forward, toward a future where blind people can wear smart glasses that help them navigate the world, and we would love to explore that next. Beyond the social impact, we ran some preliminary market estimates and are optimistic about the potential from a business standpoint, which would allow our technology to scale and reach new heights. Some details are included below:
With hefty costs averaging around $3,500-$8,000, outdated assistive technologies such as braille computers are still expected to remain key revenue drivers in the current ~$5 billion assistive technology market. With the number of blind people predicted to more than double and the market expected to grow to $13.9 billion by 2030, we believe there is largely untapped potential for disrupting this market.
We’re going to price our product at $50/month, bringing in $600/year per customer. Assuming full market penetration, i.e., capturing the roughly 1 million blind users in the U.S., this would mean annual revenue of $600 million. Further, the CDC anticipates blindness roughly doubling to become one of the top 10 illnesses in the U.S. by 2050, which would grow our potential clientele to around 2 million and increase our revenue to ~$1.2bn/year within the U.S. alone.
Our always-on model runs on the user’s device and incurs no additional cost to us. Our main cost is therefore running inference on our cloud models, which comes out to around $0.002 (0.2 cents) per call. Margins vary with usage: roughly $40/month if a user prompts the assistant 10 times per hour, dropping to about $2.50/month at 50 questions per hour. In a nominal scenario of around 30 questions per hour, the margin is roughly $20/month, or about a 40% profit margin. Scaling up, that is around $240mn/year of profit to capture today and roughly $480mn/year by 2050.
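For transparency, here is the back-of-envelope arithmetic behind those figures. The price and per-call cost come from the estimates above; the roughly 16 active hours per day is an assumption we use to reproduce the margins.

```python
# Back-of-envelope unit economics for SafePath's cloud Q/A feature.
PRICE_PER_MONTH = 50.0          # subscription price from the estimates above
COST_PER_CALL = 0.002           # ~0.2 cents per cloud inference call
ACTIVE_HOURS_PER_DAY = 16       # assumed waking hours the assistant is in use
DAYS_PER_MONTH = 30

def monthly_margin(calls_per_hour: float) -> float:
    calls = calls_per_hour * ACTIVE_HOURS_PER_DAY * DAYS_PER_MONTH
    return PRICE_PER_MONTH - calls * COST_PER_CALL

for rate in (10, 30, 50):
    print(f"{rate:>2} calls/hour -> ${monthly_margin(rate):.2f}/month margin")
# 10 calls/hour -> $40.40/month margin
# 30 calls/hour -> $21.20/month margin  (~40% of the $50 price)
# 50 calls/hour -> $2.00/month margin

US_BLIND_USERS = 1_000_000
annual_revenue = US_BLIND_USERS * PRICE_PER_MONTH * 12   # $600M/yr at full penetration
```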