"What a time to be alive!" exclaims Károly Zsolnai-Fehér in many of his videos on the Two Minute Papers video channel.
Indeed, it is a remarkable time to be alive! For me, the recent developments in AI have been a constant source of inspiration and challenge. One development in particular felt like a trip down memory lane: the research field of "multiagent systems" resurfaced, now powered by LLMs, and moved swiftly from research into practical development. It excites me because I wrote my doctoral dissertation on multiagent systems.
What are multiagent systems?
Intelligent agents represent a significant domain within Artificial Intelligence. Some argue that AI is fundamentally the study and development of intelligent agents. An intelligent agent can be any software, hardware, or a combination thereof, positioned within an environment. It perceives its surroundings via sensors and acts upon them through actuators. The intelligence of an agent, a topic that often sparks debate, is usually measured by a performance metric that the agent aims to maximize.
This definition is deliberately broad, encompassing everything from robots that mow lawns to software daemons mining databases for pertinent texts, illustrating the spectrum of agent use cases.
Moreover, agents come in various forms. Some are purely reactive, responding to stimuli with appropriate actions. Others are goal-driven, capable of formulating, maintaining, and executing complex plans.
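To make the sensor and actuator loop concrete, here is a minimal sketch of a purely reactive agent. Everything in it is invented for illustration: the one-dimensional `LawnWorld`, the `ReactiveMower` class, and the performance metric (the number of patches mowed).

```python
from dataclasses import dataclass


@dataclass
class LawnWorld:
    """A one-dimensional lawn; True means the patch still needs mowing."""
    grass: list
    position: int = 0

    def percept(self):
        # Sensor: the agent only sees the patch it is standing on.
        return self.grass[self.position]

    def apply(self, action):
        # Actuator: the chosen action changes the environment.
        if action == "mow":
            self.grass[self.position] = False
        elif action == "move" and self.position < len(self.grass) - 1:
            self.position += 1


class ReactiveMower:
    """Purely reactive: maps the current percept directly to an action."""

    def act(self, tall_grass):
        return "mow" if tall_grass else "move"


def run(world, agent, steps):
    """Run the perceive-act loop and return the performance metric."""
    mowed = 0
    for _ in range(steps):
        action = agent.act(world.percept())
        if action == "mow":
            mowed += 1
        world.apply(action)
    return mowed


world = LawnWorld(grass=[True, False, True, True])
score = run(world, ReactiveMower(), steps=10)
```

A goal-driven agent would replace the one-line `act` rule with something that keeps state, for example a plan of which patches to visit next.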
The most thrilling development, however, is in multiagent systems (MAS). A MAS consists of a diverse group of agents with varying goals, behaviors, and capabilities. In the cooperative setting, these agents work together toward a common goal in a mutually beneficial manner. Adding a human agent takes this to the pinnacle of human-machine interaction, where individuals or groups enhance their productivity and creativity with Artificial Intelligence.
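As a rough sketch of cooperation between agents with different capabilities, the following passes a task from a human through two agents in turn. The `Agent` class, the `run_pipeline` function, and the two toy skills are hypothetical; real systems use far richer communication protocols.

```python
from collections import deque


class Agent:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill  # function that turns an input string into an output string

    def handle(self, content):
        return self.skill(content)


def run_pipeline(agents, task):
    """Pass a task through the agents in order, logging every message."""
    log = []
    inbox = deque([("human", task)])
    for agent in agents:
        sender, content = inbox.popleft()
        log.append((sender, agent.name, content))
        inbox.append((agent.name, agent.handle(content)))
    return inbox.popleft()[1], log


agents = [
    Agent("researcher", lambda text: text + " | facts: agents perceive and act"),
    Agent("writer", lambda text: text.upper()),
]
result, log = run_pipeline(agents, "explain agents")
```

Each log entry records sender, receiver, and message, which is often the first thing to inspect when a group of agents misbehaves.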
What makes LLM-powered agents tick?
The field of multiagent systems spans decades and traditionally relied on symbolic AI to construct the 'brains' of agents. This approach derives behavior from operations on explicit symbols, and it often required logic programming for theorem proving and logical reasoning.
The landscape has shifted with the advent of LLM-powered agents, which are brimming with promise. Now, Large Language Models serve as the reasoning engines for individual agents. A common setup, for instance, is to call ChatGPT via its API for reasoning tasks. However, I recommend exploring open-source, on-premises language models, which are fortunately now available in various specializations.
Components of LLM-powered agents
In LLM-powered agents, the LLM is central, but not the entirety. Functionality is typically divided into three components: memory, planning, and tool use.
Memory is the simplest component. LLMs possess a working memory, or context, such as ChatGPT's chat history. This context is limited in size, necessitating selective memory retention.
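One simple retention policy is to keep only the newest messages that fit a token budget. The sketch below assumes this policy and counts whitespace-separated words as a crude stand-in for a real tokenizer; `trim_context` is a hypothetical helper, not part of any library.

```python
def trim_context(messages, max_tokens):
    """Keep the newest messages whose combined size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = len(message.split())  # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "user: my name is Ada",
    "assistant: nice to meet you Ada",
    "user: what is a multiagent system",
    "assistant: a group of agents that interact in a shared environment",
]
context = trim_context(history, max_tokens=18)
```

With a budget of 18 words only the last two messages survive, so the user's name falls out of the window. That is the kind of gap a long-term memory is meant to fill.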
A robust long-term memory can be built on a vector database. Imagine a traditional database table storing texts (memories) alongside their embeddings, where each embedding is a vector that gives the text's coordinates in latent space. Because semantically similar texts lie close together in that space, a nearest-neighbor search efficiently retrieves relevant memories, which are then injected into the current prompt.
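The retrieval idea can be sketched in a few lines. Note the assumptions: `embed` here is a toy word-count vector, so it only captures word overlap, whereas a real system would call a learned embedding model and a proper vector database. `MemoryStore` is a hypothetical in-memory stand-in.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class MemoryStore:
    def __init__(self):
        self.rows = []  # (text, embedding) pairs, like rows in a table

    def add(self, text):
        self.rows.append((text, embed(text)))

    def search(self, query, k=1):
        # Rank all stored memories by similarity to the query.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = MemoryStore()
memory.add("the user prefers python for scripting")
memory.add("the meeting with the chemistry team is on friday")
memory.add("the lawn mower robot needs a new battery")

question = "when is the chemistry meeting"
relevant = memory.search(question, k=1)
prompt = "Relevant memory: " + relevant[0] + "\nQuestion: " + question
```

The retrieved memory is prepended to the prompt, so the model can answer from information that no longer fits in its context.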
Planning involves devising a route from the agent's current state to a desired goal state, breaking down large tasks into manageable subgoals. LLMs have demonstrated capability as planners, generating and refining plans until they meet a quality threshold for execution.
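The generate-and-refine loop might look roughly like this. The three helper functions are deterministic stand-ins for LLM calls, and the required steps and the threshold are made up for the example.

```python
REQUIRED_STEPS = ["clarify requirements", "write code", "write tests"]


def propose_plan(goal):
    # Stand-in for an LLM call that drafts a first, incomplete plan.
    return ["write code"]


def score_plan(plan):
    # Stand-in for a critic; returns the fraction of required steps present.
    return sum(step in plan for step in REQUIRED_STEPS) / len(REQUIRED_STEPS)


def refine_plan(plan):
    # Stand-in for an LLM call that repairs the plan: add one missing step.
    for step in REQUIRED_STEPS:
        if step not in plan:
            return sorted(plan + [step], key=REQUIRED_STEPS.index)
    return plan


def plan_until_good(goal, threshold=1.0, max_rounds=5):
    """Refine the plan until it meets the quality threshold or rounds run out."""
    plan = propose_plan(goal)
    for _ in range(max_rounds):
        if score_plan(plan) >= threshold:
            break
        plan = refine_plan(plan)
    return plan


plan = plan_until_good("ship a small command-line tool")
```

The `max_rounds` cap matters in practice: a real model may never reach the threshold, and the loop should still terminate.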
Finally, tool use enhances LLM capabilities by integrating external APIs or tools, such as calculators, to supplement the LLM's inherent functionalities.
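A calculator tool could be wired in along these lines. This is a sketch under assumptions: `fake_llm` stands in for a model that returns a structured tool request, and the request format (a dict with `tool` and `input` keys) is invented here rather than taken from any particular API.

```python
import ast
import operator

OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate +, -, *, / on numbers without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


TOOLS = {"calculator": calculator}


def fake_llm(question):
    # Stand-in for a model that decides to call a tool instead of guessing.
    return {"tool": "calculator", "input": "12 * (3 + 4)"}


def answer(question):
    request = fake_llm(question)
    tool = TOOLS[request["tool"]]  # look up the requested tool by name
    result = tool(request["input"])
    return f"The result is {result}."


reply = answer("What is twelve times the sum of three and four?")
```

Parsing the expression with `ast` instead of calling `eval` keeps the tool from executing arbitrary code that a model might emit.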
Examples
A standout example is the Generative Agents simulation, a virtual sociological experiment with 25 characters, each controlled by an LLM-powered agent with unique traits. The characters live in a sandbox environment. In one scenario, a single agent is given the intention to throw a Valentine's Day party, and the party comes together through dynamic interaction and collaboration: the agents spread invitations, make plans, and show up together.
GPT Engineer is another impressive application: given a natural-language specification, it asks clarifying questions and then generates a codebase. Tools like this can cut the time and cost of tasks that would take much longer by hand, although fully automating LLM-agent-based software engineering remains challenging.
Lastly, the Scientific Discovery Agent in chemistry showcases LLM agents equipped with expert-designed tools for organic synthesis, drug discovery, and materials design, highlighting the potential for innovation in scientific research.
Conclusion
While the concept of utilizing multiagent systems for autonomous problem-solving or enhancing human creativity is not new, the integration of LLMs as reasoning engines is revolutionary. This approach simplifies the design process, making it more accessible and easier to debug, albeit with increased ambiguity in implementation. However, the promising early applications indicate that LLM-powered multiagent systems could significantly impact Artificial Intelligence.
Resources
- LLM Powered Autonomous Agents (blog article): https://lilianweng.github.io/posts/2023-06-23-agent/
- Generative Agents: Interactive Simulacra of Human Behavior: https://arxiv.org/abs/2304.03442
- GPT Engineer: https://github.com/gpt-engineer-org/gpt-engineer
- Emergent autonomous scientific research capabilities of large language models: https://arxiv.org/abs/2304.05332
This article is a guest article, written by Dr. Tristan Behrens. Tristan researches the creative aspects of AI and advises companies on how they can leverage LLMs and NLP. All opinions are his own.