RIG & RAG: Grounding AI in Reality with Super-Trustworthy Data
Google Research has developed a new set of open models, known as DataGemma, that aim to ground large language models (LLMs) in real-world data using Google's Data Commons knowledge graph. DataGemma's primary goal is to improve the factuality and trustworthiness of LLMs by mitigating the risk of hallucinations, which occur when LLMs generate incorrect or misleading information. The models leverage Data Commons' natural language interface to access and incorporate real-world data into LLM responses. This is achieved through two methods: Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG). RIG fine-tunes the model to identify statistics within its responses and annotate them with a call to Data Commons, while RAG retrieves relevant information from Data Commons before the LLM generates text, providing a factual foundation for its response. DataGemma is an important step towards developing more grounded and reliable AI systems.
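To make the difference between the two methods concrete, here is a minimal Python sketch. It is not DataGemma's implementation and does not call the real Data Commons API: the in-memory FAKE_STORE, the lookup helper, and the [DC("query") -> "guess"] annotation format are all hypothetical stand-ins chosen for illustration. The RIG-style function resolves annotations found in an already generated draft, while the RAG-style function retrieves facts first and places them in the prompt.

```python
import re

# Hypothetical stand-in for Data Commons: maps a lowercase
# natural-language query to a statistic. Not real data.
FAKE_STORE = {
    "population of exampleland": "1,000",
}


def lookup(query):
    """Return the stored statistic for a query, or None if unknown."""
    return FAKE_STORE.get(query.strip().lower())


# Hypothetical annotation format assumed for this sketch:
#   [DC("query") -> "model guess"]
ANNOTATION = re.compile(r'\[DC\("([^"]+)"\)\s*->\s*"([^"]*)"\]')


def rig_resolve(draft):
    """RIG-style step: after generation, swap each annotated guess
    for the retrieved value, keeping the guess if nothing is found."""
    def replace(match):
        query, guess = match.group(1), match.group(2)
        found = lookup(query)
        return found if found is not None else guess
    return ANNOTATION.sub(replace, draft)


def rag_prompt(question, queries):
    """RAG-style step: before generation, retrieve facts and
    prepend them to the prompt as context."""
    facts = []
    for q in queries:
        value = lookup(q)
        if value is not None:
            facts.append(f"{q}: {value}")
    context = "\n".join(facts)
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    draft = 'Exampleland has [DC("population of Exampleland") -> "900"] people.'
    print(rig_resolve(draft))
    print(rag_prompt("How many people live in Exampleland?",
                     ["population of Exampleland"]))
```

In this sketch the retrieval store is the same for both paths; only the point at which it is consulted changes, which is the distinction the description above draws between RIG and RAG.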
Read more here: https://research.google/blog/grounding-ai-in-reality-with-a-little-help-from-data-commons/