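Vertex AI Search Retriever¶
This notebook walks you through setting up a retriever that fetches data from a Vertex AI Search data store.
Prerequisites¶
- Create a Google Cloud project
- Configure a Vertex AI Search data store
- Enable the Vertex AI API
Install the library¶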
%pip install llama-index-retrievers-vertexai-search
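As a quick sanity check after installing, the following minimal sketch tests whether the retriever module can be located. The helper name `is_installed` is hypothetical; the module path is the one imported later in this notebook.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the import path.

    Parent packages of a dotted name are imported as a side effect.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing or malformed.
        return False


# Module path taken from the import statement used later in this notebook.
RETRIEVER_MODULE = "llama_index.retrievers.vertexai_search"

if not is_installed(RETRIEVER_MODULE):
    print(f"{RETRIEVER_MODULE} not found; run the install cell and restart the kernel.")
```

If the check still fails after a successful install, the kernel has probably not been restarted yet.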
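Restart the current runtime¶
To use the newly installed packages in this Jupyter runtime, you must restart the runtime. Running the cell below restarts the current kernel.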
# Colab only
# Automatically restart kernel after installs so that your environment can access the new packages
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
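Authenticate your notebook environment (Colab only)¶
If you are running this notebook on Google Colab, you need to authenticate your environment first. Run the code cell below to do so. This step is not required if you are using Vertex AI Workbench.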
# Colab only
import sys
if "google.colab" in sys.modules:
    from google.colab import auth
    auth.authenticate_user()
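The environment check above can be wrapped in a small helper so that later cells can branch on it. This is a minimal sketch: the function name `running_in_colab` and its optional `modules` parameter (there only to make the check testable) are assumptions, not part of the original notebook.

```python
import sys
from typing import Mapping, Optional


def running_in_colab(modules: Optional[Mapping[str, object]] = None) -> bool:
    """Return True when the google.colab module is loaded in this interpreter."""
    loaded = sys.modules if modules is None else modules
    return "google.colab" in loaded


if running_in_colab():
    print("Colab detected: run the authentication cell.")
else:
    print("Not on Colab: Workbench needs no extra step; on JupyterLab use gcloud auth login.")
```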
# If you're using JupyterLab instance, uncomment and run the below code.
#!gcloud auth login
from llama_index.retrievers.vertexai_search import VertexAISearchRetriever
# Please note it's underscore '_' in vertexai_search
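Set Google Cloud project information and initialize the Vertex AI SDK¶
To get started with Vertex AI, you must have an existing Google Cloud project with the Vertex AI API enabled.
Learn more about setting up a project and a development environment.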
PROJECT_ID = "{your project id}" # @param {type:"string"}
LOCATION = "us-central1" # @param {type:"string"}
import vertexai
vertexai.init(project=PROJECT_ID, location=LOCATION)
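The cells in this notebook ship with template values such as `{your project id}` and `{your id}`, which fail at request time if left unchanged. The sketch below catches that early. The helper `check_filled` is hypothetical, and `my-project` is an invented example value.

```python
import re

# Matches template placeholders such as "{your project id}" or "{your id}".
_PLACEHOLDER = re.compile(r"^\{.*\}$")


def check_filled(**settings: str) -> None:
    """Raise ValueError naming every setting that is empty or still a placeholder."""
    missing = [
        name
        for name, value in settings.items()
        if not value.strip() or _PLACEHOLDER.match(value.strip())
    ]
    if missing:
        raise ValueError(f"Replace the placeholder values for: {', '.join(missing)}")


# Passes silently because both values have been filled in.
check_filled(PROJECT_ID="my-project", LOCATION="us-central1")
```

In the notebook itself, a call such as `check_filled(PROJECT_ID=PROJECT_ID, DATA_STORE_ID=DATA_STORE_ID)` would go just before the retriever is constructed.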
Test a structured data store¶
DATA_STORE_ID = "{your id}" # @param {type:"string"}
LOCATION_ID = "global"
struct_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=1,
)
query = "harry potter"
retrieved_results = struct_retriever.retrieve(query)
print(retrieved_results[0])
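Indexing with `[0]` raises `IndexError` when a query returns nothing, so it can help to summarize the whole result list instead. The sketch below assumes each result exposes a `score` attribute and a `get_content()` method, as LlamaIndex `NodeWithScore` objects do. `summarize_results` is a hypothetical helper, and `FakeResult` is a stand-in that lets the example run without Google Cloud access.

```python
from typing import Any, List, Optional, Sequence


def summarize_results(results: Sequence[Any], max_chars: int = 80) -> List[str]:
    """Format each result as 'rank. score=... text', truncating long content."""
    lines = []
    for rank, item in enumerate(results, start=1):
        score = getattr(item, "score", None)
        text = item.get_content().replace("\n", " ")
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        score_str = "n/a" if score is None else f"{score:.3f}"
        lines.append(f"{rank}. score={score_str} {text}")
    return lines


class FakeResult:
    """Stand-in for a retrieved node; not part of any library."""

    def __init__(self, text: str, score: Optional[float] = None) -> None:
        self._text = text
        self.score = score

    def get_content(self) -> str:
        return self._text


# An empty list yields no lines instead of raising IndexError.
for line in summarize_results([FakeResult("Harry Potter\nand the ...", 0.5)]):
    print(line)
```

With a real retriever, `summarize_results(retrieved_results)` would replace the direct `retrieved_results[0]` access.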
Test an unstructured data store¶
DATA_STORE_ID = "{your id}"
LOCATION_ID = "global"
unstruct_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=0,
)
query = "alphabet 2018 earning"
retrieved_results2 = unstruct_retriever.retrieve(query)
print(retrieved_results2[0])
Test a website data store¶
DATA_STORE_ID = "{your id}"
LOCATION_ID = "global"
website_retriever = VertexAISearchRetriever(
    project_id=PROJECT_ID,
    data_store_id=DATA_STORE_ID,
    location_id=LOCATION_ID,
    engine_data_type=2,
)
query = "what's diamaxol"
retrieved_results3 = website_retriever.retrieve(query)
print(retrieved_results3[0])
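The three retrievers above differ only in `engine_data_type`: the notebook uses 0 for the unstructured store, 1 for the structured store, and 2 for the website store. Named constants make that mapping easier to read. The enum below is a hypothetical convenience, not something the library is known to provide.

```python
from enum import IntEnum


class EngineDataType(IntEnum):
    """Hypothetical names for the integer codes used in this notebook."""

    UNSTRUCTURED = 0
    STRUCTURED = 1
    WEBSITE = 2


# IntEnum members compare equal to plain ints; int() converts them explicitly.
for member in EngineDataType:
    print(member.name, int(member))
```

Passing `int(EngineDataType.STRUCTURED)` as `engine_data_type` keeps the argument a plain integer, which avoids relying on how the retriever validates its inputs.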
Use in a query engine¶
# import modules needed
from llama_index.core import Settings
from llama_index.llms.vertex import Vertex
from llama_index.embeddings.vertex import VertexTextEmbedding
vertex_gemini = Vertex(
    model="gemini-1.5-pro",
    temperature=0,
    context_window=100000,
    additional_kwargs={},
)
# setup the index/query llm
Settings.llm = vertex_gemini
from llama_index.core.query_engine import RetrieverQueryEngine
query_engine = RetrieverQueryEngine.from_args(struct_retriever)
response = query_engine.query("Tell me about harry potter")
print(str(response))