Some of the most striking developments in AI in recent years have involved training artificially intelligent agents to exceed human performance in games. DeepMind’s triumph in Go is the most famous example, but there have been successes in games as diverse as Starcraft II and chess. Until now, though, agents have had to be trained to play specific games, one at a time. DeepMind just released a paper
on “generally capable agents”, which pushes beyond this: they train agents that learn to play arbitrary games, including games they’ve never encountered before, in an open-ended environment. This video
summarises some of the results.
This is potentially a big deal. Being able to train AI agents with general, adaptable capabilities is likely an important milestone on the path to artificial general intelligence. The paper has generated a lot of chatter in the AI safety world - see this comment thread
for lots of discussion, some quite technical. I thought this analogy
was potentially quite helpful:
This is the GPT-1 of agent/goal-directed AGI; it is the proof of concept. Two more papers down the line… and we’ll have the agent/goal-directed AGI equivalent of GPT-3
(See here for our previous discussions of GPT-2/GPT-3 and its importance in AI)
Not everyone is so impressed. It’s worth reading Rohin Shah’s write-up
in the excellent AI Alignment newsletter. He notes that the capabilities of these agents may be less generalisable outside their specific training environment than they appear.
One thing I found striking as I read the paper was how many training “behaviours” the successful AIs had in common with the top-performing athletes discussed in the section above: the importance of training in multiple sports, the importance of self-play, the lack of immediate progress, and so on. This may be an echo without meaning, but I wonder if there is something interesting there. Certainly, this paper represents a line of research worth keeping an eye on.