AI agents that use tools


Image generated with ChatGPT, with the prompt "Please generate an image of an AI using a mechanical tool, such as a wrench. Please make the robot look rather friendly. Also, please make the image square"

Modern AIs acquire new capabilities by combining tools to perform a complex task, controlling them like an agent. Unlike traditional programming, they define the sequences of actions themselves.

Note

This post was originally published in French as part of my scientific chronicle in Les Echos.

Modern AIs are increasingly using tools. For example, if you ask a conversational AI to solve a complicated equation, the model on its own usually cannot do it. This is not surprising: most equations have no general closed-form solution. But if the AI knows how to use numerical equation-solving routines, it quickly gives us the answer. For example, “Le Chat” from Mistral generates a small program in the Python language that uses its numerical routines to solve our problem. The difficulty here is generating a program that calls the right routines. This ability is an extension of conversational AI models, which answer questions by generating text. Here, the text is computer code rather than English.
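To make this concrete, here is a minimal sketch of the kind of small program involved. It is not what “Le Chat” actually produces: the equation x = cos(x), the `bisect` helper and its parameters are assumptions chosen for illustration, and a generated program would more likely call a library routine than reimplement one.

```python
import math


def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi] by repeatedly halving the interval."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        # Stop when we hit the root exactly or the interval is small enough.
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        # Keep the half of the interval where the sign change occurs.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


# Illustrative equation (an assumption, not from the article): x = cos(x),
# rewritten as cos(x) - x = 0, for which no elementary formula is known.
root = bisect(lambda x: math.cos(x) - x, 0.0, 1.0)
print(root)
```

The model does not solve the equation itself. It only has to write the last two lines correctly, and the numerical routine does the rest.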

By controlling the computer, the AI “acts”. That’s why it is said to be an “agent”. By coupling with other systems, agentic AIs develop new capabilities. The most powerful ones can combine different tools and exploit their complementary strengths. These agentic systems are currently progressing very quickly, but they echo what we have always done in computer science: any complicated system is assembled from multiple routines, each with a specific functionality. Writing a computer program means describing precisely how we are going to call these routines to solve a problem. In a traditional program, however, the programmer has to specify every step, whereas agentic AIs take a given goal and produce the steps themselves. The difficulty then becomes breaking down a task into sub-tasks, which is called planning and is a hard problem in its own right.
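The contrast between the two styles can be sketched in a few lines. Everything here is hypothetical: the three tools, the made-up temperature data, and above all `toy_planner`, a hand-written stand-in for the language model that would choose the next step in a real agent.

```python
from typing import Callable, Optional

# Hypothetical tools: ordinary routines, each with one specific job.
def fetch_temperatures(_: object) -> list[float]:
    return [18.5, 21.0, 19.5]  # made-up data for illustration

def average(values: list[float]) -> float:
    return sum(values) / len(values)

def to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32

TOOLS: dict[str, Callable] = {
    "fetch_temperatures": fetch_temperatures,
    "average": average,
    "to_fahrenheit": to_fahrenheit,
}

# Traditional program: the programmer fixes the sequence of calls.
def traditional() -> float:
    return to_fahrenheit(average(fetch_temperatures(None)))

# Agent style: a planner picks the next tool from the goal and the history.
def toy_planner(goal: str, history: list[str]) -> Optional[str]:
    """Stand-in for the model: returns the next tool name, or None when done."""
    plan = ["fetch_temperatures", "average"]
    if "fahrenheit" in goal.lower():
        plan.append("to_fahrenheit")
    remaining = [name for name in plan if name not in history]
    return remaining[0] if remaining else None

def run_agent(goal: str, max_steps: int = 10) -> tuple[object, list[str]]:
    value: object = None
    history: list[str] = []
    for _ in range(max_steps):
        name = toy_planner(goal, history)
        if name is None:  # the planner considers the goal reached
            break
        value = TOOLS[name](value)  # act by calling the chosen tool
        history.append(name)
    return value, history

result, steps = run_agent("Average temperature in Fahrenheit")
print(result, steps)
```

The loop in `run_agent` never mentions a specific tool. Changing the goal to "Average temperature in Celsius" changes the steps taken without changing the program.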

In modern AIs, these planning skills are learned. The systems improve through trial and error: we give the AI lots of tasks to solve, and the AI tries sequences of sub-tasks, deciding to use one tool or another. If it succeeds in the final task, it learns that this sequence of tool uses was a good one for the task. This is called reinforcement learning. Its main inventors received this year's Turing Award, often called the Nobel Prize of computer science.
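The trial-and-error idea can be illustrated with a deliberately tiny sketch. It is a bandit-style simplification, not how large models are actually trained: the three arithmetic tools, the task (turn 3 into 16 in two steps) and all parameter values are assumptions made for the example.

```python
import itertools
import random

# Hypothetical tools the learner can chain together.
TOOLS = {
    "double": lambda x: x * 2,
    "increment": lambda x: x + 1,
    "square": lambda x: x * x,
}


def run(sequence, start):
    """Apply the tools named in `sequence`, in order, to `start`."""
    value = start
    for name in sequence:
        value = TOOLS[name](value)
    return value


def learn(start=3, target=16, episodes=300, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    sequences = list(itertools.product(TOOLS, repeat=2))
    value = {seq: 0.0 for seq in sequences}  # estimated success rate
    count = {seq: 0 for seq in sequences}    # how often each was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: try a random sequence of tools.
            seq = rng.choice(sequences)
        else:
            # Exploit: pick among the sequences that look best so far.
            best = max(value.values())
            seq = rng.choice([s for s in sequences if value[s] == best])
        # The reward only says whether the final task succeeded.
        reward = 1.0 if run(seq, start) == target else 0.0
        count[seq] += 1
        value[seq] += (reward - value[seq]) / count[seq]  # running average
    return max(value, key=value.get)


best_sequence = learn()
print(best_sequence)
```

The learner is never told which tools to use. It only receives a success signal at the end, and that signal is enough to reinforce the sequence that works.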

Another major driver of progress for agentic AIs is the strong capacity of language models for analogy and associative memory. These language skills enable them to start from problems specified by the user in plain English, with an open vocabulary. They draw their tool-use strategies from a vast knowledge of similar problems, but also know how to adapt these strategies to the intermediate responses of the tools. They can also interact with systems that are much more complex and less predictable than computer routines. For example, an AI can fetch information on the internet, or even ask a human.

Agentic AIs open new perspectives. But they also greatly increase computing costs, as they iterate over sub-tasks. These costs must be kept in mind, as they are an important hurdle to the democratization of AI.

AI chronicles

Find all my AI chronicles here

The goal of these “AI chronicles” is to introduce AI concepts to a broader public, staying at a very high level.
