Our client was an avid user and developer of practical AI applications, to the extent that the cost of inference had become a leading issue for their firm. To tackle it at a more strategic level, they asked us to model how different network topologies would perform given certain GPU parameters.
We set out to recreate popular operations used in AI models, e.g., different kinds of convolutions, and provided the client with a step-by-step explanation of how they are executed at a low level.
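To give a flavour of the starting point for such step-by-step explanations, here is a minimal sketch of a direct 2D convolution in plain Python (our own illustration, not the client deliverable): the naive triple-nested loop whose memory-access pattern and arithmetic intensity one would then analyse before mapping it to an OpenCL kernel.

```python
def conv2d(image, kernel):
    """Naive direct 2D convolution: valid padding, stride 1.

    image and kernel are 2D lists of floats; the output shrinks by
    (kernel size - 1) in each dimension, as with 'valid' convolution.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            # Each output element is a dot product over the kernel window:
            # kh * kw multiply-adds, reading kh * kw input elements.
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    acc += image[y + ky][x + kx] * kernel[ky][kx]
            out[y][x] = acc
    return out
```

The point of writing it out this way is that the loop structure makes the operation's cost visible: each output pixel performs `kh * kw` multiply-adds while re-reading overlapping input windows, which is exactly the kind of detail that determines how the operation behaves on a GPU.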
The resulting demonstration software was written entirely in Python and OpenCL. The performance model also included tooling for predicting inference performance for a given topology, plus a separate tool for measuring the characteristics of a given (OpenCL-capable) GPU.
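The general shape of such a predictor can be sketched as a simple roofline-style estimate; the function names, units, and the two measured GPU characteristics below are our own illustrative assumptions, not the client's actual model. Each layer's time is bounded either by compute throughput or by memory bandwidth, whichever dominates.

```python
def predict_layer_time_ms(flops, bytes_moved, peak_gflops, bandwidth_gbs):
    """Roofline-style lower bound on one layer's execution time (ms).

    flops        -- floating-point operations the layer performs
    bytes_moved  -- bytes read/written to GPU memory
    peak_gflops  -- measured peak compute throughput, in GFLOP/s
    bandwidth_gbs -- measured memory bandwidth, in GB/s
    """
    compute_ms = flops / (peak_gflops * 1e9) * 1e3
    memory_ms = bytes_moved / (bandwidth_gbs * 1e9) * 1e3
    # The layer cannot finish faster than its slower bound allows.
    return max(compute_ms, memory_ms)

def predict_topology_time_ms(layers, peak_gflops, bandwidth_gbs):
    """Sum per-layer estimates over a topology given as (flops, bytes) pairs."""
    return sum(
        predict_layer_time_ms(f, b, peak_gflops, bandwidth_gbs)
        for f, b in layers
    )
```

For example, a 1 GFLOP layer moving 1 MB on a GPU measured at 10,000 GFLOP/s and 500 GB/s is compute-bound (0.1 ms vs. 0.002 ms), so the estimate is 0.1 ms. In practice, real predictions would also account for kernel launch overheads and achievable (rather than peak) utilisation.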
Although the model was designed to offer a certain level of predictive power, its main value lay in the point of view it conveyed. Along with the reports and workshops we delivered, it ultimately served the client's team as an internal educational resource.