import os
import openai
os.environ["OPENAI_API_KEY"] = "sk-..."
openai.api_key = os.environ["OPENAI_API_KEY"]
Setup¶
In this notebook, we will use two very similar pages from our documentation, each stored in a separate index.
from llama_index.core import SimpleDirectoryReader
documents_1 = SimpleDirectoryReader(
input_files=["../../community/integrations/vector_stores.md"]
).load_data()
documents_2 = SimpleDirectoryReader(
input_files=["../../module_guides/storing/vector_stores.md"]
).load_data()
from llama_index.core import VectorStoreIndex
index_1 = VectorStoreIndex.from_documents(documents_1)
index_2 = VectorStoreIndex.from_documents(documents_2)
Fuse the Indexes!¶
In this step, we fuse our indexes into a single retriever. This retriever will also augment the original query by generating additional queries related to it, and will aggregate the results retrieved for all of them.
This setup will run four queries in total: one with the original question, plus three generated related queries.
By default, the following prompt template is used to generate the additional queries:
QUERY_GEN_PROMPT = (
"You are a helpful assistant that generates multiple search queries based on a "
"single input query. Generate {num_queries} search queries, one on each line, "
"related to the following input query:\n"
"Query: {query}\n"
"Queries:\n"
)
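For illustration, formatting this template shows exactly what the LLM receives. The variable names come from the template above; the example query and the formatted `prompt` variable are ours:

```python
# The default query-generation template, copied from above.
QUERY_GEN_PROMPT = (
    "You are a helpful assistant that generates multiple search queries based on a "
    "single input query. Generate {num_queries} search queries, one on each line, "
    "related to the following input query:\n"
    "Query: {query}\n"
    "Queries:\n"
)

# With num_queries=4 on the retriever, the original query counts as one of the
# four, so the template is asked for three additional queries.
prompt = QUERY_GEN_PROMPT.format(
    num_queries=3,
    query="How do I setup a chroma vector store?",
)
print(prompt)
```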
from llama_index.core.retrievers import QueryFusionRetriever
retriever = QueryFusionRetriever(
[index_1.as_retriever(), index_2.as_retriever()],
similarity_top_k=2,
num_queries=4, # set this to 1 to disable query generation
use_async=True,
verbose=True,
# query_gen_prompt="...", # we could override the query generation prompt here
)
# apply nested async to run in a notebook
import nest_asyncio
nest_asyncio.apply()
nodes_with_scores = retriever.retrieve("How do I setup a chroma vector store?")
Generated queries:
1. What are the steps to set up a chroma vector store?
2. Best practices for configuring a chroma vector store
3. Troubleshooting common issues when setting up a chroma vector store
for node in nodes_with_scores:
print(f"Score: {node.score:.2f} - {node.text[:100]}...")
Score: 0.78 - # Vector Stores Vector stores contain embedding vectors of ingested document chunks (and sometimes ...
Score: 0.78 - # Using Vector Stores LlamaIndex offers multiple integration points with vector stores / vector dat...
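Under the hood, the fusion retriever merges the ranked result lists coming back from each retriever for each query. One common merging strategy is reciprocal rank fusion, which `QueryFusionRetriever` supports as one of its modes; below is a minimal sketch of the idea (the function and variable names are illustrative, not the library's internals):

```python
# Sketch of reciprocal rank fusion (RRF): merge several ranked lists of
# document ids into a single scored ranking.
def reciprocal_rank_fusion(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked):
            # Each list contributes 1 / (k + rank); the constant k dampens
            # the influence of top ranks so no single list dominates.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Two retrievers return overlapping results in different orders; documents
# ranked highly by both lists float to the top of the fused ranking.
fused = reciprocal_rank_fusion([["a", "b", "c"], ["a", "d", "b"]])
print(fused)
```

Documents that appear in several lists accumulate score from each, which is why a result retrieved for both the original and a generated query tends to outrank one retrieved only once.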
Use in a Query Engine!¶
Now, we can plug our retriever into a query engine to synthesize natural-language responses.
from llama_index.core.query_engine import RetrieverQueryEngine
query_engine = RetrieverQueryEngine.from_args(retriever)
response = query_engine.query(
"How do I setup a chroma vector store? Can you give an example?"
)
Generated queries:
1. How to set up a chroma vector store?
2. Step-by-step guide for creating a chroma vector store.
3. Examples of chroma vector store setups and configurations.
from llama_index.core.response.notebook_utils import display_response
display_response(response)
Final Response: To set up a Chroma vector store, you need to follow these steps:
- Import the necessary libraries:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
- Create a Chroma client:
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")
- Construct the vector store:
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
Here's an example of how to set up a Chroma vector store using the above steps:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
# Creating a Chroma client
# EphemeralClient operates purely in-memory, PersistentClient will also save to disk
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")
# construct vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
This example demonstrates how to create a Chroma client, create a collection named "quickstart", and then construct a Chroma vector store using that collection.