Vertex AI RAG Engine: A developer's tool

January 15, 2025

Generative AI and Large Language Models (LLMs) are transforming industries, but two key challenges can hinder enterprise adoption: hallucinations (generating incorrect or nonsensical information) and limited knowledge beyond their training data. Retrieval Augmented Generation (RAG) and grounding offer solutions by connecting LLMs to external data sources, enabling them to access up-to-date information and generate more factual and relevant responses.

This post explores Vertex AI RAG Engine and how it empowers software and AI developers to build robust, grounded generative AI applications.


What is RAG and why do you need it?

RAG retrieves relevant information from a knowledge base and feeds it to an LLM, allowing it to generate more accurate and informed responses. This contrasts with relying solely on the LLM's pre-trained knowledge, which can be outdated or incomplete. RAG is essential for building enterprise-grade Gen AI applications that require:

  • Accuracy: Minimizing hallucinations and ensuring responses are factually grounded.

  • Up-to-date Information: Accessing the latest data and insights.

  • Domain Expertise: Leveraging specialized knowledge bases for specific use cases.
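The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how Vertex AI RAG Engine is implemented: it assumes a tiny in-memory knowledge base, uses bag-of-words cosine similarity in place of a real embedding model, and stops at building the prompt rather than calling an LLM. All names (`retrieve`, `build_prompt`, `KNOWLEDGE_BASE`) and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, knowledge_base, top_k=2):
    """Return up to top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages ahead of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base, made up for illustration only.
KNOWLEDGE_BASE = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available on weekdays from 9am to 5pm.",
    "Annual plans renew automatically unless cancelled.",
]

query = "What is the refund window for annual plans?"
passages = retrieve(query, KNOWLEDGE_BASE)
prompt = build_prompt(query, passages)
print(prompt)
```

The prompt that reaches the model contains only the passages judged relevant to the question, which is what lets the model answer from current, domain-specific data instead of its training data alone.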


RAG vs Grounding vs Search

  • RAG: A technique that retrieves relevant information and provides it to an LLM so it can generate a response. The information can include fresh data, topic and context, or ground truth.

  • Grounding: Ensures the reliability and trustworthiness of AI-generated content by anchoring it to verified sources of information. Grounding may use RAG as a technique.

  • Search: An approach, powered by advanced AI models, that quickly finds and delivers relevant information from a data source in response to text or multimodal queries.


Introducing Vertex AI RAG Engine

Vertex AI RAG Engine is a managed orchestration service that streamlines the complex process of retrieving relevant information and feeding it to an LLM. This allows developers to focus on building their applications rather than managing infrastructure.

Diagram of Vertex RAG architecture

Key advantages of Vertex AI RAG Engine:

  • Ease of Use: Get started quickly with a simple API, enabling rapid prototyping and experimentation.

  • Managed Orchestration: Handles the complexities of data retrieval and LLM integration, freeing developers from infrastructure management.

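  • Customization and Open-Source Support: Choose from a variety of parsing, chunking, annotation, embedding, vector storage, and open-source model options, or customize your own components.

  • High-Quality Google Components: Leverage Google's cutting-edge technology for optimal performance.

  • Integration Flexibility: Connect to vector databases such as Pinecone and Weaviate, or use Vertex AI Vector Search.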
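Chunking is one of the configurable steps named above. As a rough illustration of what a chunking component does, the sketch below splits a document into fixed-size pieces that overlap slightly, so a sentence that straddles a boundary still appears whole in at least one chunk. It is an assumption-laden example, not RAG Engine's own chunker: `chunk_words` is a hypothetical helper, and sizes are counted in words for simplicity, whereas production pipelines commonly count tokens.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into chunks of chunk_size words.

    Consecutive chunks share `overlap` words. Hypothetical helper for
    illustration; not part of any Vertex AI API.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# A synthetic 20-word document: "w0 w1 ... w19".
text = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(text, chunk_size=8, overlap=2)
for chunk in chunks:
    print(chunk)
```

Larger chunks carry more context per retrieved passage, while smaller chunks make retrieval more precise. The overlap trades some index size for fewer facts being cut in half at a boundary.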


Vertex AI RAG: A spectrum of solutions

Google Cloud offers a range of RAG and grounding solutions that cater to different levels of complexity and customization:

  • Vertex AI Search: A fully managed search engine and retriever API ideal for complex enterprise use cases requiring high out-of-the-box quality, scalability, and fine-grained access controls. It simplifies connecting to diverse enterprise data sources and enables searching across multiple sources.

  • Fully DIY RAG: For developers seeking complete control, Vertex AI provides individual component APIs (e.g., Text Embedding API, Ranking API, Grounding on Vertex AI) to build custom RAG pipelines. This approach offers maximum flexibility but requires significant development effort. Use this if you need very specific customizations or want to integrate with existing RAG frameworks.

  • Vertex AI RAG Engine: The sweet spot for developers seeking a balance between ease of use and customization. It empowers rapid prototyping and development without sacrificing flexibility.


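Common industry use cases for RAG Engine:

1. Financial Services: Personalized Investment Advice & Risk Assessment:

Problem: Financial advisors need to quickly synthesize vast amounts of information (client profiles, market data, regulatory filings, and internal research) to tailor investment advice and provide accurate risk assessments. Manually reviewing all of this information is time-consuming and error-prone.

RAG Engine solution: RAG Engine can ingest and index the relevant data sources. Financial advisors can then query the system with a client's specific profile and investment goals. RAG Engine returns a concise, evidence-based response drawn from the relevant documents, including citations that support its recommendations. This improves advisor efficiency, reduces the risk of human error, and makes the advice more personalized. The system can also flag potential conflicts of interest or regulatory violations based on information in the ingested data.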


2. Healthcare: Accelerated Drug Discovery & Personalized Treatment Plans:

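Problem: Drug discovery and personalized medicine rely heavily on analyzing large datasets of clinical trials, research papers, patient records, and genetic information. Sifting through this data to identify potential drug targets, predict patient responses to treatment, or build personalized treatment plans is extremely challenging.

RAG Engine solution: With appropriate privacy and security measures in place, RAG Engine can ingest and index large volumes of biomedical literature and patient data. Researchers can then ask complex questions such as "What are the potential side effects of drug X in patients with genotype Y?" RAG Engine synthesizes relevant information from multiple sources, surfacing insights that a manual search might miss. For clinicians, the engine can help generate suggested personalized treatment plans based on a patient's unique characteristics and medical history, supported by evidence from relevant research.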


3. Legal: Enhanced Due Diligence and Contract Review:

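Problem: Legal professionals spend a great deal of time reviewing documents during due diligence, contract negotiations, and litigation. Finding relevant clauses, identifying potential risks, and ensuring regulatory compliance is time-consuming and requires deep expertise.

RAG Engine solution: RAG Engine can ingest and index legal documents, case law, and regulatory information. Legal professionals can query the system to find specific clauses in contracts, identify potential legal risks, and research relevant precedents. The engine can highlight inconsistencies, potential liabilities, and relevant case law, significantly speeding up the review process and improving accuracy. As a result, legal professionals can complete their work faster, reduce legal risk, and apply their expertise more effectively.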
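All three use cases depend on answers that point back to the documents they draw from. One common way to support this, sketched below under stated assumptions, is to number the retrieved passages in the prompt and then map the `[n]` markers in the model's answer back to source names. The helper names, the sample passages, and the hand-written `answer` string (standing in for model output) are all hypothetical; RAG Engine's own citation mechanism may work differently.

```python
import re


def format_sources(passages):
    """Number each passage so the model can cite it as [1], [2], ..."""
    return "\n".join(
        f"[{i}] ({p['source']}) {p['text']}"
        for i, p in enumerate(passages, start=1)
    )


def cited_sources(answer, passages):
    """Map [n] markers in an answer back to source names.

    Markers that do not correspond to a supplied passage are ignored,
    so a citation the model invented never resolves to a real document.
    """
    numbers = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [passages[n - 1]["source"] for n in numbers if 1 <= n <= len(passages)]


# Hypothetical retrieved passages, made up for illustration only.
passages = [
    {"source": "client_profile.pdf", "text": "The client has a low risk tolerance."},
    {"source": "fund_factsheet.pdf", "text": "Fund A is classified as high volatility."},
]

# Hand-written stand-in for a model response; [7] is a deliberately bogus marker.
answer = "Fund A is a poor fit [2] given the client's low risk tolerance [1] [7]."

print(format_sources(passages))
print(cited_sources(answer, passages))
```

Checking that every marker resolves to a supplied passage gives reviewers a quick way to trace each claim to its evidence and to spot citations that have no source behind them.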


Getting started with Vertex AI RAG Engine

Google provides extensive resources to help you get started, including:


Build grounded generative AI

Vertex AI's RAG Engine and suite of grounding solutions empower developers to build more reliable, factual, and insightful generative AI applications. By leveraging these tools, you can unlock the full potential of LLMs and overcome the challenges of hallucinations and limited knowledge, paving the way for wider enterprise adoption of generative AI. Choose the solution that best fits your needs and start building the next generation of intelligent applications.