For Pradeep Varakantham, there’s one scene in The Matrix that stands out clearly in his mind. In a dim room, hero Neo lies back in a black leather chair that’s hooked up to a computer. His acquaintance slips a slim disc into the machine, and Neo twitches in pain as the programme it contains — a combat training one with knowledge of jujitsu, tae kwon do, drunken boxing, and the like — is uploaded into his brain. When it’s all over, Neo’s eyes bolt wide open, and he announces in amazement to his mentor Morpheus: “I know kung fu.”
Varakantham, a professor of computer science at Singapore Management University, explains his fascination with the scene: “It’s the sci-fi version of what we want to do with our research — to explore how AI training can help non-expert humans learn something quickly.”
“People have been trying to make AI systems smarter and smarter and we’ve come up with clever algorithms to make them do automated things in a better and better way,” says Varakantham. He cites the example of Google’s AlphaGo computer programme, which “has achieved superhuman performance and beaten all the best Go players on the planet.”
“But these systems were not helping humans,” he says. “So we’ve been thinking how to make that happen, to see whether AI can help improve people’s abilities and teach them how to learn things like Go or chess faster. That’s the question we want to answer.”
To pursue the topic, Varakantham and a collaborator applied for an AI Singapore Research Grant. In 2021, the pair was awarded a four-year grant for their project titled “Trust to Train and Train to Trust: Agent Training Programs for Safety-Critical Environments.”
A sequential approach
Developing an AI-based training system is something that has been a long time coming for Varakantham, who runs the Collaborative, Robust and Explainable AI-based Decision-making Lab (CARE.AI Lab) at SMU. “I’ve been working on the methods and models to address this training problem for 15 years or so,” he says.
The approach Varakantham adopts involves the use of games to train people on new skills through a process called reinforcement learning. This is a step-based approach which involves assessing a person’s performance and training progress at each stage, and using that to tailor the next training phase accordingly.
“You show someone something and you assess whether they have understood it or you look at how they are reacting to it, and then you either explain something new or you go back and explain something again,” explains Varakantham. “So it’s sequential.”
To illustrate his point further, he uses the popular Nintendo game Super Mario. “Imagine the first level is about jumping around, dodging obstacles, gaining awards, and so on. Let’s say you perform poorly. Then the next level should be something that is slightly lower in complexity or focuses on things that you didn’t do very well. But if you do well, then you increase the level.”
“So different levels are being generated based on the ability of the person, and the game is being generated in such a way so as to train you on how to play the game,” he says. In computer science parlance, this involves the use of sequential decision-making models.
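The loop Varakantham describes, in which each round's performance decides the next level, can be sketched in a few lines of Python. This is only a hand-written illustration of the sequential idea, not the team's model and not reinforcement learning: the names `TrainerState`, `next_level` and `run_session`, the 0-to-1 score scale, the pass mark of 0.7 and the 10-level cap are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TrainerState:
    # Hypothetical difficulty scale running from 1 (easiest) to max_level.
    level: int = 1
    max_level: int = 10


def next_level(state: TrainerState, score: float, pass_mark: float = 0.7) -> TrainerState:
    """Step the difficulty up after a good round and down after a poor one.

    `score` is assumed to be a fraction between 0 and 1.
    """
    if score >= pass_mark:
        new_level = min(state.level + 1, state.max_level)
    else:
        new_level = max(state.level - 1, 1)
    return TrainerState(level=new_level, max_level=state.max_level)


def run_session(scores, start_level: int = 1):
    """Return the sequence of levels served for a list of per-round scores."""
    state = TrainerState(level=start_level)
    history = [state.level]
    for score in scores:
        state = next_level(state, score)
        history.append(state.level)
    return history


if __name__ == "__main__":
    # Two good rounds, one poor round, then another good round.
    print(run_session([0.9, 0.9, 0.4, 0.8]))
```

A learned trainer would replace the fixed pass-mark rule with a policy that chooses the next scenario from a richer picture of the player's strengths and weaknesses.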
Using such reinforcement learning-based trainers is “a common method” for teaching, says Varakantham. Plus the field of ‘environment generation’ or ‘level generation’ in the computer science world isn’t exactly new.
“But understanding a person’s ability and then generating a scenario based on that assessed ability, and then doing it continuously — that kind of model hasn’t been done before,” he says. “We are the first people working on this, and that is our contribution.”
Training bears fruit
Varakantham and his team spent the initial stages of the grant refining the models of the AI-based trainer. They then incorporated this into a Super Mario-like game over the next three to four months. In April this year, they invited nearly 250 people to test their game.
The study participants were split into three groups: those in the first group learnt how to play the game using AI-based sequential training techniques; participants in the second group were presented with random training scenarios that didn’t take into account their previous performance; while the third received no training at all. After an hour comprising 10 rounds of training (or no training), participants were tasked with completing new, slightly harder levels to test how good their training was.
The results were incredibly positive. “The training was clearly much better than the other setups — between 20 and 25% better than the random scenario, which in turn was better than no training,” says Varakantham, whose team is currently in the midst of publishing their work.
The findings, he says, were a “pleasant surprise.”
“We thought that when we do experiments with humans, it would be messy or that things would overlap and so on,” he explains. “But here, we see that it was pretty clear in terms of the improvement that we got — the results are actually statistically significant. We’re very happy with that.”
Training ambulance crew
Varakantham and his team are now working on a game to help train people aboard ambulances, which he hopes the Singapore Civil Defence Force (SCDF) will use in the near future. “These are people who have some amount of medical knowledge but not necessarily in an ambulance or emergency response context, so you have to cross-train them to assist the paramedics,” explains Varakantham, who has worked closely with the SCDF for many years.
He describes his new game as akin to training pilots using a flight simulator. Trainees are first given reading materials to learn the theoretical side of things, before being asked to play the game, which is set in the back of an ambulance. The goal is to stabilise the patient before the ambulance arrives at the hospital, and the trainee has to figure out the best course of action given the symptoms the patient presents, such as severe burns, bleeding or cardiac arrest.
“What medical equipment should they use? Where can they find that in the ambulance? When is the right time to use it?” are some of the questions trainees have to consider during the game, says Varakantham.
The AI-based training system offers many benefits. For one, it “reminds people of all the things they have to read and remember in a more accessible and fun manner,” he says. “The system can assess where the knowledge gaps are and generate scenarios to remind trainees what needs to be done in certain instances. So it’s like a trigger.”
For another, the AI trainer can simulate a host of scenarios, more than training with a physical manikin ever could. “Those are all fixed and maybe you have 10 scenarios you can play with,” says Varakantham. By contrast, his game employs data taken from up to 90 emergency scenarios found on the internet, making for varied gameplay.
“This also ensures a trainee has seen all kinds of challenging situations at least once and you’re not generating the same type of situation many times,” he adds. This aspect is crucial as it helps avoid unwanted side effects to the training, for example, “creating a bias so the person always thinks only certain kinds of scenarios occur.”
“If you get 25 sunburn cases and only one cardiac arrest, it’s probably not good as your trainee paramedic might forget how to deal with scenarios they’ve only seen once,” says Varakantham. “So you have to be mindful that there is no lopsided training happening.” To ensure the AI trainer doesn’t generate a particular scenario too often, his team use a method called constrained reinforcement learning.
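The effect of such a constraint can be illustrated with a toy scheduler. The sketch below is not constrained reinforcement learning and is not the team's method. It is a simple greedy rule with a hard cap, and the scenario names, the `weakness` scores, the function names and the cap of four are all hypothetical. It picks the scenario type the trainee is weakest at, unless that type has already been shown the maximum number of times.

```python
from collections import Counter


def pick_scenario(weakness, counts, max_per_type):
    """Choose the scenario type with the highest weakness score that is still under its cap."""
    allowed = [s for s in weakness if counts[s] < max_per_type]
    if not allowed:
        # Every type has hit the cap, so fall back to all types (assumption for this sketch).
        allowed = list(weakness)
    return max(allowed, key=lambda s: weakness[s])


def plan_rounds(weakness, total_rounds=10, max_per_type=4):
    """Build a training schedule in which no scenario type exceeds max_per_type."""
    counts = Counter()
    schedule = []
    for _ in range(total_rounds):
        scenario = pick_scenario(weakness, counts, max_per_type)
        counts[scenario] += 1
        schedule.append(scenario)
    return schedule


if __name__ == "__main__":
    # Hypothetical weakness scores: higher means the trainee needs more practice.
    weakness = {"burns": 0.9, "bleeding": 0.5, "cardiac arrest": 0.7}
    print(Counter(plan_rounds(weakness)))
```

Without the cap, the greedy rule would serve burns in every round. With it, the other scenario types are guaranteed a share of the session. For simplicity the weakness scores stay fixed here, whereas a real trainer would update them after each round.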
In the future, Varakantham wants to extend his AI trainer to team training, in which several people work together in different roles, for instance a paramedic and an assistant working alongside each other in an ambulance. He also plans to incorporate more complex training set-ups into his models, where “the situations are longer and there are a lot more complexities assessing what the person knows.”
Reflecting on his work, Varakantham sees AI-based trainers as having broad applications in the future. “You can use it in many different set-ups, for instance training nurses on the safety of patients or construction workers on how to deal with different scenarios in a safe manner,” he says. “The game itself is a symbolism for many training tasks.”