Vespa Vector Store Demo
If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-vector-stores-vespa llama-index pyvespa
Setting up the API key
import os
import openai
os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]
Loading documents and building a VectorStoreIndex
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.vespa import VespaVectorStore
from IPython.display import Markdown, display
Defining some sample data
Let's insert some documents.
from llama_index.core.schema import TextNode

nodes = [
    TextNode(
        text="The Shawshank Redemption",
        metadata={
            "author": "Stephen King",
            "theme": "Friendship",
            "year": 1994,
        },
    ),
    TextNode(
        text="The Godfather",
        metadata={
            "director": "Francis Ford Coppola",
            "theme": "Mafia",
            "year": 1972,
        },
    ),
    TextNode(
        text="Inception",
        metadata={
            "director": "Christopher Nolan",
            "theme": "Fiction",
            "year": 2010,
        },
    ),
    TextNode(
        text="To Kill a Mockingbird",
        metadata={
            "author": "Harper Lee",
            "theme": "Mafia",
            "year": 1960,
        },
    ),
    TextNode(
        text="1984",
        metadata={
            "author": "George Orwell",
            "theme": "Totalitarianism",
            "year": 1949,
        },
    ),
    TextNode(
        text="The Great Gatsby",
        metadata={
            "author": "F. Scott Fitzgerald",
            "theme": "The American Dream",
            "year": 1925,
        },
    ),
    TextNode(
        text="Harry Potter and the Sorcerer's Stone",
        metadata={
            "author": "J.K. Rowling",
            "theme": "Fiction",
            "year": 1997,
        },
    ),
]
Initializing the VespaVectorStore
To make it easier to get started, we provide a templated Vespa application that is deployed automatically when the vector store is initialized.
This is a highly abstracted solution, and you have endless possibilities for customizing the Vespa application to your needs. For now, though, let's keep it simple and initialize with the default template.
from llama_index.core import StorageContext
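
# Initializing VespaVectorStore deploys the templated Vespa application
# (by default to a local Vespa instance, which requires Docker)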
vector_store = VespaVectorStore()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context)
Deleting documents
node_to_delete = nodes[0].node_id
node_to_delete
vector_store.delete(ref_doc_id=node_to_delete)
Querying
from llama_index.core.vector_stores.types import (
    VectorStoreQuery,
    VectorStoreQueryMode,
)
query = VectorStoreQuery(
    query_str="Great Gatsby",
    mode=VectorStoreQueryMode.TEXT_SEARCH,
    similarity_top_k=1,
)
result = vector_store.query(query)
result
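The query returns a VectorStoreQueryResult. As a minimal sketch of inspecting it (the nodes and similarities fields below are part of the llama-index VectorStoreQueryResult dataclass; whether similarities is populated depends on the store, hence the fallbacks):

# VectorStoreQueryResult exposes parallel lists: nodes, similarities, and ids
for node, score in zip(result.nodes or [], result.similarities or []):
    print(f"{score:.3f}  {node.get_content()}")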
As retriever
Default query mode (text search)
retriever = index.as_retriever(vector_store_query_mode="default")
results = retriever.retrieve("Who directed inception?")
display(Markdown(f"**Retrieved nodes:**\n {results}"))
Hybrid query mode (semantic + text search)
retriever = index.as_retriever(vector_store_query_mode="semantic_hybrid")
results = retriever.retrieve("Who wrote Harry Potter?")
display(Markdown(f"**Retrieved nodes:**\n {results}"))
As query engine
query_engine = index.as_query_engine()
response = query_engine.query("Who directed inception?")
display(Markdown(f"**Response:** {response}"))
query_engine = index.as_query_engine(
    vector_store_query_mode="semantic_hybrid", verbose=True
)
response = query_engine.query(
    "When was the book about the wizard boy published and what was it called?"
)
display(Markdown(f"**Response:** {response}"))
display(Markdown(f"**Sources:** {response.source_nodes}"))
Using metadata filters
NOTE: This metadata filtering is performed by llama-index, outside of Vespa. For native and more performant filtering, you should use Vespa's own filtering capabilities; a sketch of what that could look like follows the example below.
See the Vespa documentation for more information.
from llama_index.core.vector_stores import (
    FilterOperator,
    FilterCondition,
    MetadataFilter,
    MetadataFilters,
)

# Define a filter that only allows nodes that have the theme "Fiction"
# OR were published after 1997
filters = MetadataFilters(
    filters=[
        MetadataFilter(key="theme", value="Fiction"),
        MetadataFilter(key="year", value=1997, operator=FilterOperator.GT),
    ],
    condition=FilterCondition.OR,
)

retriever = index.as_retriever(filters=filters)
result = retriever.retrieve("Harry Potter")
display(Markdown(f"**Result:** {result}"))
Abstraction level of this integration
To make it really easy to get started, we provide a template Vespa application that is deployed automatically when the vector store is initialized. This removes some of the complexity of setting up Vespa for the first time, but for serious production use we strongly recommend reading the Vespa documentation and customizing the application to your needs.
The template
The provided Vespa application template looks like this:
from vespa.package import (
    ApplicationPackage,
    Field,
    Schema,
    Document,
    HNSW,
    RankProfile,
    Component,
    Parameter,
    FieldSet,
    GlobalPhaseRanking,
    Function,
)

hybrid_template = ApplicationPackage(
    name="hybridsearch",
    schema=[
        Schema(
            name="doc",
            document=Document(
                fields=[
                    Field(name="id", type="string", indexing=["summary"]),
                    Field(name="metadata", type="string", indexing=["summary"]),
                    Field(
                        name="text",
                        type="string",
                        indexing=["index", "summary"],
                        index="enable-bm25",
                        bolding=True,
                    ),
                    Field(
                        name="embedding",
                        type="tensor<float>(x[384])",
                        indexing=[
                            "input text",
                            "embed",
                            "index",
                            "attribute",
                        ],
                        ann=HNSW(distance_metric="angular"),
                        is_document_field=False,
                    ),
                ]
            ),
            fieldsets=[FieldSet(name="default", fields=["text", "metadata"])],
            rank_profiles=[
                RankProfile(
                    name="bm25",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    functions=[Function(name="bm25sum", expression="bm25(text)")],
                    first_phase="bm25sum",
                ),
                RankProfile(
                    name="semantic",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)",
                ),
                RankProfile(
                    name="fusion",
                    inherits="bm25",
                    inputs=[("query(q)", "tensor<float>(x[384])")],
                    first_phase="closeness(field, embedding)",
                    global_phase=GlobalPhaseRanking(
                        expression="reciprocal_rank_fusion(bm25sum, closeness(field, embedding))",
                        rerank_count=1000,
                    ),
                ),
            ],
        )
    ],
    components=[
        Component(
            id="e5",
            type="hugging-face-embedder",
            parameters=[
                Parameter(
                    "transformer-model",
                    {
                        "url": "https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx"
                    },
                ),
                Parameter(
                    "tokenizer-model",
                    {
                        "url": "https://raw.githubusercontent.com/vespa-engine/sample-apps/master/simple-semantic-search/model/tokenizer.json"
                    },
                ),
            ],
        )
    ],
)
Note that the fields id, metadata, text, and embedding are required for the integration to work.
The schema name must be doc, and the rank profiles must be named bm25, semantic, and fusion.
Other than that, you are free to modify the application as you see fit, for example by switching out the embedding model, adding more fields, or changing the ranking expressions.
For more details, see the Pyvespa example notebook on hybrid search.
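As a concrete example of such a modification, the sketch below adds a filterable attribute field to the template and passes the modified package to the vector store. This is a hypothetical illustration: it continues from the hybrid_template definition above and assumes the VespaVectorStore constructor accepts an application_package argument for supplying a custom application:

from vespa.package import Field

# Add a filterable attribute field to the template's "doc" schema
hybrid_template.get_schema("doc").add_fields(
    Field(name="year", type="int", indexing=["attribute", "summary"])
)

# Assumption: a custom application package can be passed at initialization
vector_store = VespaVectorStore(application_package=hybrid_template)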