Uses for Machine Learning Agents 🤖

Clearly I’ve been on HuggingFace (🤗) lately. I wanted to share some of the thoughts I’ve had on the topic of applied machine learning (ML) based on some of my research. I’m trying to think up use cases for applied machine learning solutions.

Inverted Problem Solving of Deep Reinforcement Learning

Something I find fascinating about Deep Reinforcement Learning, an ML technique, is how it inverts the way one would normally describe a solution to a problem.

Normally a programmer would understand the problem, design a solution, and only then describe that solution to the computer. For example, a game agent programmer describes a behavior tree to give a game NPC a set of gameplay characteristics (e.g. a Halo Grunt might flee when its assigned Elite dies). The human solves the problem.

With ML techniques like reinforcement learning there is a different paradigm. Programmers describe a reward function and a set of actions the agent can take, and provide the agent with data about the problem, so that the ML agent can solve the implied problem. Instead of programmers telling the computer how to solve a problem, we let the computer find its own solution.
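To make the inversion concrete, here is a minimal sketch using tabular Q-learning, one simple reinforcement learning method (not deep RL, which would swap the table for a neural network). Everything in it is an assumption for illustration: the toy corridor problem, the names `step`, `train`, and `greedy_policy`, and the hyperparameters. Note that nothing in the code says "walk right"; the programmer only supplies the actions, the reward, and the environment.

```python
import random

# Hypothetical toy problem: a corridor of 5 cells. The agent starts in
# cell 0 and is rewarded only for reaching cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Environment: apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    # Reward function: reaching the goal is good, every other step costs a little.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore a random action
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best learned action for each non-terminal cell."""
    return [
        ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
        for s in range(N_STATES - 1)
    ]
```

Calling `greedy_policy(train())` returns the action the agent ended up preferring in each cell, which it worked out from the reward alone.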

ML Agent on Historic Stock Exchange Data?

I’ve written Python scripts before to back trade (backtest) on historic US stock exchange data. I was limited by my ability to describe trading patterns using what I knew (behavior trees 🌲). I was also slowed by how quickly I could learn and understand stock trading patterns.

Now that I have ML solutions in my toolbox I will revisit my trading programs 🙂.

The reward function is, obviously, to increase the amount of money and to minimize losses. Actions might be to buy and sell stocks, and optionally to buy and sell options. Back trade on either the entire history of stock exchange data or only the last 5 or 10 years, whichever you think would create a more advantageous trader. Allow the stock trading ML agent either to make recommendations, keeping a human in the loop, or, more desirably, to trade on its own with just an audit trail of what happens.
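A minimal sketch of how that description might translate into a back trading loop follows. It assumes one stock, one share per trade, no fees, and a plain list of historic prices. The names `Portfolio`, `apply_action`, and `backtest` are hypothetical, and the `policy` callable stands in for whatever trained agent you plug in. The reward at each step is simply the change in portfolio value, so gains are rewarded and losses are penalized.

```python
from dataclasses import dataclass

# The agent's action set (options trading is left out of this sketch).
BUY, SELL, HOLD = "buy", "sell", "hold"


@dataclass
class Portfolio:
    cash: float
    shares: int = 0

    def value(self, price):
        """Total worth if the held shares were valued at `price`."""
        return self.cash + self.shares * price


def apply_action(portfolio, action, price):
    """Trade one share at `price`; orders that cannot be filled are ignored."""
    if action == BUY and portfolio.cash >= price:
        portfolio.cash -= price
        portfolio.shares += 1
    elif action == SELL and portfolio.shares > 0:
        portfolio.cash += price
        portfolio.shares -= 1


def backtest(prices, policy, starting_cash=100.0):
    """Replay historic prices and return (rewards, audit_trail, portfolio)."""
    portfolio = Portfolio(cash=starting_cash)
    rewards, audit_trail = [], []
    for t in range(len(prices) - 1):
        before = portfolio.value(prices[t])
        # The policy only sees prices up to and including time t.
        action = policy(t, prices[: t + 1], portfolio)
        apply_action(portfolio, action, prices[t])
        after = portfolio.value(prices[t + 1])
        rewards.append(after - before)  # reward: change in portfolio value
        audit_trail.append((t, action, prices[t]))
    return rewards, audit_trail, portfolio
```

In recommendation mode, the same `audit_trail` entries could be shown to a human instead of being executed, which keeps the human in the loop.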

There might also be some controls on an ML trading bot that reacts to live stock exchange data, such as a kill switch that limits exposure if too much money is lost.
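One way such a kill switch could look is sketched below. The `KillSwitch` class and its 10% default threshold are assumptions for illustration, not a recommendation. Once the account value drops below the floor, the switch latches and stays tripped until a human resets it, even if the value recovers.

```python
class KillSwitch:
    """Hypothetical guard that stops trading after too large a loss."""

    def __init__(self, starting_value, max_loss_fraction=0.10):
        # Lowest account value we tolerate before halting all trading.
        self.floor = starting_value * (1.0 - max_loss_fraction)
        self.tripped = False

    def allow(self, current_value):
        """Return True if the bot may keep trading at this account value."""
        if current_value < self.floor:
            self.tripped = True  # latch: stays tripped even if value recovers
        return not self.tripped
```

The live trading loop would call `allow(...)` with the current account value before every order and stop placing orders as soon as it returns `False`.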

This then led me to think about production data from things like maintaining an “always available” business service, such as serving a web service and completing work as asynchronous jobs.

ML Agents on Live Production Data?

A typical generic web service is made up of applications and application runtimes, and oftentimes these solutions involve asynchronous background jobs that work on batches of production data. Each running technology emits a constant stream of logs. These include system logs (systemd), host resource usage, HTTP logs (nginx, Apache, HAProxy, IIS, etc.), application runtime logs (PHP, Node, Python), and application logs (the unique things that you the programmer want your programs to do). Name anything that exists as a part of your solution (email, SMS, printing) and it will produce some form of telemetry. This is not meant to be a comprehensive list of the data that might be available.

The available actions you may want an ML agent to take would be similar to what you’d expect a human system administrator to be able to do: read through logs, check application logs for application runtime abnormalities (sometimes a deployed bug), roll back deployments, restart services on the fleet, restart hosts on the fleet, and get help if things haven’t improved and production conditions continue to be unfavorable.
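The action list above could be sketched as an enumeration plus a small remediation loop. All names here (`Action`, `run_remediation`, `choose_action`, `is_healthy`, `handlers`) are hypothetical. The handlers are placeholders for whatever really restarts a service or pages a human in your environment, and `choose_action` stands in for the trained agent. The loop records each step as a JSON line and calls for help if production is still unhealthy after a few attempts.

```python
import json
from enum import Enum


class Action(Enum):
    READ_LOGS = "read_logs"
    ROLLBACK_DEPLOYMENT = "rollback_deployment"
    RESTART_SERVICE = "restart_service"
    RESTART_HOST = "restart_host"
    GET_HELP = "get_help"


def run_remediation(choose_action, is_healthy, handlers, max_attempts=3):
    """Let the agent act until production is healthy, then escalate if needed.

    choose_action(attempt) -> Action   (the trained agent; an assumption here)
    is_healthy() -> bool               (health check supplied by the caller)
    handlers: dict mapping Action -> zero-argument callable (the side effect)
    Returns the audit trail as a list of JSON strings.
    """
    audit_trail = []
    for attempt in range(1, max_attempts + 1):
        if is_healthy():
            return audit_trail
        action = choose_action(attempt)
        handlers[action]()  # turn the agent's choice into a real side effect
        audit_trail.append(json.dumps({"attempt": attempt, "action": action.value}))
    if not is_healthy():
        # Conditions are still unfavorable: bring a human into the loop.
        handlers[Action.GET_HELP]()
        audit_trail.append(
            json.dumps({"attempt": max_attempts + 1, "action": Action.GET_HELP.value})
        )
    return audit_trail
```

Keeping the side effects behind a `handlers` mapping means the same loop can be run against recorded incidents with harmless stand-in handlers before it is trusted with a live fleet.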

The goal of the solution would be to back train an ML agent to use historic production data to “know” how to handle live production data. As improvements to the ML agent are made, programmers can back train the ML agents with new available actions to determine fitness for duty. The ability for the ML agent to call for help allows for a human to possibly remain in the loop. The ML agent should output its own telemetry (logs & metrics) to allow auditing.

This describes a possible ML solution to the problem of maintaining production.

In my musings I’ve wondered whether an ML agent would shut off access to production for deployments once it learned that changes are what ultimately destabilize the environment. An amusing thought.

Conclusion

There are obvious applications for trained ML agents to solve problems in many different contexts. It’s up to programmers to understand how to describe proper reward functions, translate agent actions into true environment side effects (like restarting a host), and pipe in the appropriate production data. The stock trading ML agent and the system administrator ML agent are two examples we’ve explored together.
