Retrieval-Augmented Generation (RAG) is an AI architecture that enhances large language models (LLMs) by combining information retrieval at query time with text generation. Instead of relying solely on knowledge fixed during pre-training, a RAG system queries external data sources, retrieves the most relevant content, and inserts it into the model’s prompt context, so that responses are grounded in current, domain-specific material and are more likely to be accurate.
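The query, retrieve, and prompt-assembly flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `DOCUMENTS` list, the helper names (`tokenize`, `cosine`, `retrieve`, `build_prompt`, `answer`), and the prompt wording are all hypothetical. Retrieval here uses simple word-count cosine similarity as a stand-in for the embedding models and vector stores a production system would more likely use, and the language model is passed in as a plain callable rather than tied to any particular API.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory corpus standing in for an external data source.
DOCUMENTS = [
    "The refund window for online orders is 30 days from delivery.",
    "Support is available by email on weekdays from 9am to 5pm.",
    "Gift cards cannot be exchanged for cash.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Place the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query, documents, generate, k=2):
    """Retrieve passages, assemble the prompt, and pass it to the model callable."""
    return generate(build_prompt(query, retrieve(query, documents, k)))
```

To try it without a model, pass a callable that simply echoes the prompt, for example `print(answer("How many days is the refund window?", DOCUMENTS, generate=lambda p: p, k=1))`, and inspect which passage was placed in the context. Keeping `generate` as a parameter separates the retrieval step from the generation step, so either can be swapped out independently.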