RAG Revolution: Enhancing Language Models with Knowledge Graphs and Multi-Agent Systems

Exploring the Evolution of Retrieval Augmented Generation: From Vector Databases to Knowledge Graphs and Multi-Agent Systems.

Jamie Horsnell

6/26/2024 · 3 min read

Retrieval augmented generation (RAG) was introduced in 2020 and has since become a core pillar of GenAIOps for enterprise applications. It overcomes significant hurdles of language models, such as limited knowledge and hallucination, by grounding the LLM in retrieved facts when generating a response.

It achieves this by chunking and encoding textual data (turning text into numbers so the computer can process it), indexing those chunks for efficient retrieval, encoding the query, and semantically searching the index to fetch relevant passages. The retrieved passages are then combined with the query and passed to the LLM to generate a response.
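As a rough sketch, the retrieval side of this pipeline might look like the following. It uses sentence-transformers as one possible encoder and a simple in-memory index in place of a real vector database; the chunking strategy, model name, and document text are illustrative choices, not a fixed recipe.

```python
# Illustrative RAG retrieval sketch: chunk, embed, index, and retrieve by
# cosine similarity. sentence-transformers is one possible encoder; a real
# system would use a vector database rather than an in-memory matrix.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 200) -> list[str]:
    """Naive fixed-size chunking by word count."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # example model

documents = ["...long source text goes here..."]   # placeholder corpus
chunks = [c for doc in documents for c in chunk(doc)]

# Encode and index the chunks (normalised so a dot product equals cosine similarity).
chunk_vecs = encoder.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

# The retrieved passages are then combined with the query and passed to the
# LLM as grounding context for the final response.
context = "\n\n".join(retrieve("example question about the documents"))
```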

The primary implementation of RAG has been through vector databases, marking a significant advancement for LLM applications. However, RAG using vector databases comes with certain limitations. Converting complex text into single vectors and treating each chunk in isolation can hinder deep reasoning capabilities. Additionally, the reliance on accurately interpreting the semantic intent of the query places greater emphasis on high-quality prompting to generate relevant outputs.

Efforts to address these limitations have included optimizing chunk sizes, rewriting queries, and employing parent-document retrievers. While these strategies have shown incremental improvements, they haven’t led to any substantial breakthroughs.

Knowledge graphs (KGs) address these limitations by capturing the interrelations between chunks of information, allowing for multi-hop reasoning and enabling the system to understand the relationships and context between different pieces of information.
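To make multi-hop reasoning concrete, here is a toy example over a hand-built graph; the entities, relations, and question are made up. A vector search over isolated chunks would struggle to connect the two endpoints if no single chunk mentions both, whereas a graph can follow the chain of edges.

```python
# Tiny illustration of multi-hop reasoning over a hand-built graph; the
# entities and relations are invented for the example.
import networkx as nx

G = nx.DiGraph()
G.add_edge("Acme Ltd", "VisionChip", relation="MANUFACTURES")
G.add_edge("VisionChip", "EU AI Act", relation="REGULATED_BY")

# Two-hop question: which regulations indirectly apply to Acme Ltd?
for _, product, d1 in G.out_edges("Acme Ltd", data=True):
    if d1["relation"] == "MANUFACTURES":
        for _, regulation, d2 in G.out_edges(product, data=True):
            if d2["relation"] == "REGULATED_BY":
                print(f"Acme Ltd -> {product} -> {regulation}")
```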

How?

In a process similar to how vector databases work, knowledge graph construction starts by chunking the initial text, then applies an LLM to identify entities and relationships within those chunks. Entities are recognized as key concepts or objects (nodes) in the text, while relationships describe connections between these entities. After identification, the LLM can be used again to map these entities and relationships onto predefined ontologies, creating a structured representation of knowledge. Because recent models such as GPT-4 can understand the semantics of relationships, this yields far greater accuracy than traditional named-entity recognition (NER) models.
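A minimal sketch of LLM-based triple extraction might look like this, assuming the OpenAI Python client; the prompt wording, model name, and output schema are illustrative examples rather than a fixed recipe.

```python
# Illustrative sketch of LLM-based triple extraction, assuming the OpenAI
# Python client; prompt, model name, and output schema are examples only.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_triples(chunk_text: str) -> list[dict]:
    prompt = (
        "Extract the entities and relationships from the text below. "
        'Respond with JSON of the form {"triples": '
        '[{"subject": "...", "relation": "...", "object": "..."}]}.\n\n'
        "Text:\n" + chunk_text
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)["triples"]

# Each extracted triple becomes an edge in the knowledge graph:
# (subject node) -[relation]-> (object node)
triples = extract_triples("Acme Ltd acquired Widget Co in 2023 for $50m.")
```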

Defining the ontology is a crucial step. An ontology is a structured framework that defines the entities, concepts, and relationships within a specific domain of knowledge. Humans define these foundational ingredients of the ontology or KG, and the LLM then extracts the matching entities and relationships from large volumes of text. For enterprise applications, it is essential to carefully plan which entities and relationships are central to the query problem, as they determine the accuracy and relevance of the generated insights and, ultimately, the effectiveness of the output.
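In practice, an ontology can start as simply as a set of allowed entity types and relation signatures used to validate what the LLM extracts. The sketch below uses a hypothetical supplier-risk domain; the types, relations, and example triple are purely illustrative.

```python
# A toy ontology for a hypothetical supplier-risk domain. The entity types
# and relation signatures would be defined by the humans who own the domain,
# not by the LLM.
ONTOLOGY = {
    "entity_types": {"Company", "Product", "Regulation", "Country"},
    "relations": {
        "SUPPLIES":     ("Company", "Company"),
        "MANUFACTURES": ("Company", "Product"),
        "REGULATED_BY": ("Product", "Regulation"),
        "BASED_IN":     ("Company", "Country"),
    },
}

def is_valid(triple: dict, entity_types: dict[str, str]) -> bool:
    """Keep only triples whose relation and endpoint types fit the ontology."""
    signature = ONTOLOGY["relations"].get(triple["relation"])
    if signature is None:
        return False
    return (entity_types.get(triple["subject"]) == signature[0]
            and entity_types.get(triple["object"]) == signature[1])

# Example: a triple the LLM extracted, plus the types it assigned.
triple = {"subject": "Acme Ltd", "relation": "BASED_IN", "object": "Germany"}
types = {"Acme Ltd": "Company", "Germany": "Country"}
print(is_valid(triple, types))  # True
```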

Once the initial KG is built, you can further enhance its connections through hybrid learning: deepening the understanding of the relationships between nodes and chunks of information using graph machine-learning techniques such as semantic aggregation and hierarchical layering. This allows enterprise AI applications to be more sophisticated, explainable, and contextually relevant.
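One possible flavour of hierarchical layering is community detection: grouping tightly connected entities so retrieval can operate at a coarser level than individual chunks. A sketch using networkx, with a made-up edge list, might look like this:

```python
# Group tightly connected entities into communities as one example of
# "hierarchical layering". The edge list is illustrative.
import networkx as nx

edges = [
    ("Acme Ltd", "VisionChip", "MANUFACTURES"),
    ("VisionChip", "EU AI Act", "REGULATED_BY"),
    ("Acme Ltd", "Widget Co", "SUPPLIES"),
    ("Widget Co", "Germany", "BASED_IN"),
]

G = nx.Graph()
for subject, obj, relation in edges:
    G.add_edge(subject, obj, relation=relation)

# Louvain community detection groups related nodes; each community could
# then be summarised by an LLM and stored as a higher-level node, giving
# the graph a layered structure for retrieval.
communities = nx.community.louvain_communities(G, seed=42)
for i, nodes in enumerate(communities):
    print(f"community {i}: {sorted(nodes)}")
```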

However, knowledge graphs come with issues of their own. Constructing and maintaining them, especially over large datasets, can be extremely difficult and complex. You also have to manage noise, ensuring irrelevant information and redundant data are removed, and accept higher computational costs and slower performance.

Furthermore, using LLMs to recognize entities and map their relationships brings its own set of issues. Hallucinations can occur when the model misidentifies entities or misjudges the strength of complex entity relationships, highlighting the importance of a sophisticated framework for building KGs and keeping a human in the loop to ensure reliability.

Another popular approach to breaking down complex queries and tasks is using multi-agent AI systems. These involve multiple interacting agents, each with specialized tasks or goals, collaborating to solve complex problems or simulate behaviors. The agents can dynamically adjust their actions based on interactions with other agents and the environment, helping to capture cause-and-effect relationships in dynamic, real-world settings.

For example, suppose you want to implement a multi-agent AI system to manage and optimize traffic flow in real time. You would have an agent for each traffic light, an agent for each vehicle, sensor agents, and a central coordination agent. The traffic light agents adjust signal timings based on real-time conditions and coordinate with nearby lights to reduce congestion. Vehicle agents optimize routes by receiving and acting on traffic data. Sensor agents detect incidents such as accidents and communicate with traffic light and vehicle agents to reroute traffic accordingly. The central coordination agent analyzes all of this information to make proactive adjustments to signals and routes before congestion builds.
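A stripped-down skeleton of this kind of system might look like the following; the agent names, messages, and logic are hypothetical, and a real deployment would use an agent framework and live telemetry.

```python
# Stripped-down skeleton of the traffic example: agents exchange messages
# over a shared bus and react on each step. Names, messages, and logic are
# hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    topic: str
    payload: dict

class Agent:
    def __init__(self, name: str):
        self.name = name
        self.inbox: list[Message] = []

    def step(self, bus: "Bus") -> None:
        raise NotImplementedError

class SensorAgent(Agent):
    def step(self, bus):
        # Pretend an incident was detected and broadcast it to other agents.
        bus.publish(Message(self.name, "incident", {"location": "5th & Main"}))

class TrafficLightAgent(Agent):
    def step(self, bus):
        for msg in self.inbox:
            if msg.topic == "incident":
                # React to the incident, e.g. retime signals to divert flow.
                print(f"{self.name}: diverting traffic away from {msg.payload['location']}")
        self.inbox.clear()

class Bus:
    def __init__(self, agents: list[Agent]):
        self.agents = agents

    def publish(self, msg: Message) -> None:
        for agent in self.agents:
            if agent.name != msg.sender:
                agent.inbox.append(msg)

    def run(self, steps: int = 1) -> None:
        for _ in range(steps):
            for agent in self.agents:
                agent.step(self)

Bus([SensorAgent("sensor-1"), TrafficLightAgent("light-12")]).run()
```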

While this approach may not be quite as precise as KGs, multi-agent systems offer more flexibility in handling dynamic, complex environments and are much easier to scale by adding further agents. However, KGs offer more efficient retrieval and enhanced contextual understanding.

There are several hurdles to overcome when using KGs or GraphRAG, but the approach promises a future of more explainable and precise models. Weighing quality against cost, speed, and complexity is essential when determining which RAG technique to use for your enterprise application.