I believe one flavor of software that will become more common over time is software that uses one or more fine-tuned pretrained AI models as libraries, providing stochastic reasoning capabilities, and that is intended to run on edge computing devices. I realized that was exactly what I was doing while recently building with MediaPipe and Unity, where the developer downloads the literal bytes of a fine-tuned pretrained model whose purpose is to decipher facial and pose information from image input and offer it to the software developer as structured data. I use it to puppeteer a virtual avatar, but one can easily imagine what affordances other AI models might offer software developers in their applications.
What do I mean when I say “stochastic reasoning”, and as opposed to what? I simply mean that what AI-model-driven libraries like MediaPipe offer is not entirely deterministic in the way traditional libraries are. Hypothetically, there exists an image that is nothing like the data the model was trained on, and the model may or may not be able to correctly parse the facial or pose data from it. That uncertainty adds some randomness to whether your software will work as expected. Other models may report varying degrees of confidence depending on a multitude of factors.
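One way to cope with that uncertainty is to treat every model result as a guess that has to earn its way into your application state. Here is a minimal sketch of that idea in Python. Everything in it is an assumption for illustration: `PoseEstimate`, `choose_pose`, and the `0.6` threshold are hypothetical names and values, not MediaPipe's actual API or output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoseEstimate:
    """Hypothetical structured output from a pose model (not a real MediaPipe type)."""
    head_yaw_degrees: float
    confidence: float  # assumed to be in the range 0.0 to 1.0


def choose_pose(
    estimate: Optional[PoseEstimate],
    last_good: Optional[PoseEstimate],
    min_confidence: float = 0.6,  # assumed threshold; tune per model and use case
) -> Optional[PoseEstimate]:
    """Use the new estimate if it is confident enough, else keep the last good one."""
    if estimate is not None and estimate.confidence >= min_confidence:
        return estimate
    return last_good


# Simulated per-frame results: a good frame, a missed detection,
# a low-confidence guess, and another good frame.
frames = [
    PoseEstimate(10.0, 0.9),
    None,
    PoseEstimate(80.0, 0.2),
    PoseEstimate(12.0, 0.8),
]

last_good = None
applied = []
for estimate in frames:
    last_good = choose_pose(estimate, last_good)
    applied.append(last_good.head_yaw_degrees if last_good else None)

print(applied)  # [10.0, 10.0, 10.0, 12.0]
```

In an avatar-puppeteering setting, this kind of fallback would keep the avatar holding its last trusted pose instead of snapping to a wild guess. Whether holding, smoothing, or hiding the avatar is the right fallback depends on the application.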
I emphasized in my opening statement that the software is intended to run on edge compute platforms. Specifically, I am thinking about software intended to run “locally” and in real time on laptops, desktops, phones, and IoT devices, from traditional smart home appliances to more extravagant robotic devices, be they flying, walking, or rolling. I can imagine a drone being able to identify the trees, plants, and birds it sees through a few low-resource models that have been optimized for the task. Not having to wait for a network round trip to an online API is key to the kinds of software solutions I’m thinking about, which admittedly adds another constraint on when this approach is a good idea. I can’t tell you how many robotics projects I’ve seen with the dreaded OpenAI API delay that has to be edited out before posting on social media platforms.
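To make the real-time constraint concrete, here is a rough sketch of the frame-budget arithmetic in Python. The function names are hypothetical, and the two latency figures are placeholder assumptions chosen only to illustrate the comparison, not measurements of any real model or API.

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def fits_in_frame(latency_ms: float, frames_per_second: float) -> bool:
    """True if a step with this latency can finish within a single frame."""
    return latency_ms <= frame_budget_ms(frames_per_second)


# Placeholder latencies, assumed for illustration only.
ASSUMED_LOCAL_INFERENCE_MS = 15.0
ASSUMED_REMOTE_ROUND_TRIP_MS = 300.0

TARGET_FPS = 30.0  # budget is roughly 33.3 ms per frame

print(fits_in_frame(ASSUMED_LOCAL_INFERENCE_MS, TARGET_FPS))    # True
print(fits_in_frame(ASSUMED_REMOTE_ROUND_TRIP_MS, TARGET_FPS))  # False
```

The point is only the shape of the comparison: at 30 frames per second, anything that takes longer than about 33 ms cannot respond within the frame it was asked about, whatever the actual numbers turn out to be on your hardware and network.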
Through my study of machine learning (special thanks to the HuggingFace and FastAI communities), I’ve begun thinking about what software opportunities exist where using a fine-tuned pretrained model may be the correct tool for the job. I’m still internally sketching out what the parameters might be, but rest assured it’s a fun idea that has recently captured my imagination.
What do you think? Have you been able to find some answers on when to apply a fine-tuned pretrained AI model to a software solution? Have you downloaded a model found on HuggingFace Hub to deploy in one of your software solutions? Any experience with a unique use in either a real-time “game” application or a drone/robotics application? Leave a comment as I’m extremely curious what other people think on the subject.