Frequently Asked Questions (FAQ)#
Tip
If you haven't already, install LlamaIndex and complete the starter tutorial. If you run into terms you don't recognize, check out the high-level concepts.
In this section, we start with the code you wrote for the starter example and show you the most common ways you might want to customize it for your use case:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
"我需要将文档分割为更小的片段"#
# Global settings
from llama_index.core import Settings
Settings.chunk_size = 512
# Local settings
from llama_index.core.node_parser import SentenceSplitter
index = VectorStoreIndex.from_documents(
documents, transformations=[SentenceSplitter(chunk_size=512)]
)
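Consecutive chunks also overlap a little by default, so context isn't lost at chunk boundaries. If you want to control the overlap as well, both the global settings object and SentenceSplitter accept it as a parameter; a minimal sketch, continuing the snippet above:
# Global setting: overlap (in tokens) between consecutive chunks
Settings.chunk_overlap = 50
# Local setting: the splitter takes the same parameter directly
index = VectorStoreIndex.from_documents(
    documents, transformations=[SentenceSplitter(chunk_size=512, chunk_overlap=50)]
)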
"我想使用不同的向量数据库"#
First, install the vector store you want to use. For example, to use Chroma as the vector store, you can install it via pip:
pip install llama-index-vector-stores-chroma
More integrations are listed on LlamaHub.
Then, use it in your code:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext
chroma_client = chromadb.PersistentClient()
chroma_collection = chroma_client.create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
StorageContext defines the storage backend for where the documents, embeddings, and indexes are stored. You can learn more about storage and how to customize it.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
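Because the embeddings are now persisted in Chroma, you don't have to re-ingest the documents on every run. A minimal sketch of reconnecting to the same collection and rebuilding the index directly from the vector store, assuming the "quickstart" collection created above already exists:
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# Re-open the collection that was populated earlier
chroma_client = chromadb.PersistentClient()
chroma_collection = chroma_client.get_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Build the index from the existing vector store, skipping document loading
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine()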
"查询时需要获取更多上下文"#
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What did the author do growing up?")
print(response)
as_query_engine builds a default retriever and query engine on top of the index. You can configure the retriever and query engine by passing in keyword arguments. Here, we configure the retriever to return the 5 most similar documents instead of the default of 2. You can learn more about retrievers and query engines.
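If keyword arguments aren't flexible enough, you can also assemble the same pieces explicitly; a minimal sketch of the equivalent lower-level composition, reusing the index from the snippet above:
from llama_index.core.query_engine import RetrieverQueryEngine

# Build a retriever from the index, then wrap it in a query engine
retriever = index.as_retriever(similarity_top_k=5)
query_engine = RetrieverQueryEngine.from_args(retriever)
response = query_engine.query("What did the author do growing up?")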
"我想使用不同的LLM模型"#
# Global settings
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
Settings.llm = Ollama(
model="mistral",
request_timeout=60.0,
# Manually set the context window to limit memory usage
context_window=8000,
)
# Local settings
index.as_query_engine(
llm=Ollama(
model="mistral",
request_timeout=60.0,
# Manually set the context window to limit memory usage
context_window=8000,
)
)
"我想使用不同的响应模式"#
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What did the author do growing up?")
print(response)
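tree_summarize builds a bottom-up summary tree over all retrieved chunks. response_mode accepts several other values as well, among them "compact" (the default) and "refine"; a quick sketch, reusing the index built above:
# "compact" (the default) stuffs as many retrieved chunks as fit into each LLM call
query_engine = index.as_query_engine(response_mode="compact")
# "refine" makes one LLM call per chunk, refining the answer as it goes
query_engine = index.as_query_engine(response_mode="refine")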
"我需要流式传输响应结果"#
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What did the author do growing up?")
response.print_response_stream()
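print_response_stream() prints the tokens to stdout as they arrive. If you need the tokens programmatically instead (to forward them to a UI, say), the streaming response also exposes a token generator; a minimal sketch:
# Iterate over the generated tokens as they arrive
for token in response.response_gen:
    print(token, end="", flush=True)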
You can learn more about streaming responses.
"我需要聊天机器人而非问答模式"#
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()
response = chat_engine.chat("What did the author do growing up?")
print(response)
response = chat_engine.chat("Oh interesting, tell me more.")
print(response)
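as_chat_engine also accepts a chat_mode argument that controls how the chat history is combined with retrieval; for example, "condense_question" rewrites each message into a standalone query before retrieving. A minimal sketch, reusing the index built above:
# Rewrite each user message into a standalone question before retrieval
chat_engine = index.as_chat_engine(chat_mode="condense_question")
# Chat engines support streaming too
streaming_response = chat_engine.stream_chat("What did the author do growing up?")
streaming_response.print_response_stream()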
You can learn more about chat engines.
Next Steps#
- Want a thorough walkthrough of (almost) everything you can configure? Get started with Understanding LlamaIndex.
- Want to dig deeper into specific modules? Check out the component guides.