Our client was a small, government-backed startup looking to extend multi-platform screen reading capabilities to the Kazakh language on consumer devices.
The main difficulty of the project came from the tight deadline, which, this time, ruled out training custom models or even retraining existing ones. The pre-trained Kazakh models on offer were mostly distributed as PyTorch checkpoints, so converting them to more efficient runtime formats, e.g. ONNX and CoreML, proved to be a challenge.
A further problem was that, although open-source application-layer solutions were available as a starting point, many of them relied on outdated build systems and incomplete dependencies, so bringing them back to life also turned out to be difficult.
In the end, our team delivered solutions of good enough quality for Android and Windows, using efficient real-time models. On iOS devices, however, the same model turned out to be too memory-intensive to run as part of the screen reader framework, so we had to fall back to a standalone application.
This was due to an artificial memory limit on AVSpeechSynthesisProviderAudioUnit, which prevents it from using more than around 80MB even when more memory is available on the device. Our team proposed future steps to overcome this in a possible continuation project.
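To give a feel for what a limit of around 80MB means in practice, here is a minimal back-of-the-envelope sketch in Python. It only counts the model weights and ignores activations and runtime overhead. The parameter count is hypothetical and is not the size of the model used in the project, the 80MB figure is read as 80 × 1024 × 1024 bytes as an assumption, and the listed precisions are purely illustrative rather than the steps our team proposed.

```python
# Rough check of whether a model's weights alone fit a memory budget.
# Assumption: "around 80MB" is treated as 80 * 1024 * 1024 bytes.
# Assumption: the parameter count below is hypothetical, for illustration only.

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}
BUDGET_BYTES = 80 * 1024 * 1024


def weights_size_bytes(num_params: int, dtype: str) -> int:
    """Return the size of the weights in bytes for the given precision."""
    return num_params * BYTES_PER_PARAM[dtype]


def fits_budget(num_params: int, dtype: str, budget: int = BUDGET_BYTES) -> bool:
    """True if the weights alone fit the budget (activations are ignored)."""
    return weights_size_bytes(num_params, dtype) <= budget


if __name__ == "__main__":
    hypothetical_params = 30_000_000  # not the project's actual model size
    for dtype in BYTES_PER_PARAM:
        size_mb = weights_size_bytes(hypothetical_params, dtype) / (1024 * 1024)
        verdict = "fits" if fits_budget(hypothetical_params, dtype) else "too large"
        print(f"{dtype}: {size_mb:.1f} MB ({verdict})")
```

Under these assumptions, a model stored in 32-bit floats would have to stay at or below roughly 21 million parameters for its weights alone to fit, before any working memory is counted.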