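How to use UpTrain with LlamaIndex¶
Problem: There are two main problems:
- The data most large language models are trained on is not representative of the data they meet in real applications. This mismatch between the training and test distributions can lead to poor performance.
- The output of large language models is not always reliable. A response may be irrelevant to the prompt, miss the expected tone or context, or even contain offensive content.
Solution: Each problem is addressed by a different tool, and we show how to use the two together:
- LlamaIndex addresses the first problem by supporting retrieval-augmented generation (RAG) with a retriever fine-tuned on your own data. You fine-tune a retriever on your data and then use that retriever to perform RAG.
- UpTrain addresses the second problem by letting you evaluate the generated responses. This helps ensure that responses are relevant to the prompt, match the expected tone or context, and contain no offensive content.
Install UpTrain and LlamaIndex¶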
%pip install -qU uptrain llama-index
Note: you may need to restart the kernel to use updated packages.
Import required libraries¶
import httpx
import os
import openai
import pandas as pd
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from uptrain import Evals, EvalLlamaIndex, Settings as UpTrainSettings
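Create the dataset folder for the query engine¶
You can use any documents you have for this. In this tutorial we use data about New York City extracted from Wikipedia. We add only one document to the folder, but you can add as many as you like.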
url = "https://uptrain-assets.s3.ap-south-1.amazonaws.com/data/nyc_text.txt"
if not os.path.exists("nyc_wikipedia"):
    os.makedirs("nyc_wikipedia")
dataset_path = os.path.join("./nyc_wikipedia", "nyc_text.txt")
if not os.path.exists(dataset_path):
    r = httpx.get(url)
    with open(dataset_path, "wb") as f:
        f.write(r.content)
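Before building an index, it can help to confirm that the dataset folder really contains readable data. The following is a minimal sketch and not part of the original notebook; `list_nonempty_files` is a hypothetical helper name introduced here for illustration.

```python
import os


def list_nonempty_files(folder):
    """Return sorted paths of regular files in `folder` that hold at least one byte."""
    if not os.path.isdir(folder):
        return []  # folder missing: nothing to index
    paths = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip sub-directories and zero-byte files (e.g. a failed download).
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            paths.append(path)
    return paths
```

Calling `list_nonempty_files("nyc_wikipedia")` after the download step should return a list that includes the path to `nyc_text.txt`; an empty list suggests the download did not succeed.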
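Create the list of queries¶
Before we can generate responses, we need a set of queries. Because the query engine is built on New York City data, we create a list of queries about New York City.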
data = [
    {"question": "What is the population of New York City?"},
    {"question": "What is the area of New York City?"},
    {"question": "What is the largest borough in New York City?"},
    {"question": "What is the average temperature in New York City?"},
    {"question": "What is the main airport in New York City?"},
    {"question": "What is the famous landmark in New York City?"},
    {"question": "What is the official language of New York City?"},
    {"question": "What is the currency used in New York City?"},
    {"question": "What is the time zone of New York City?"},
    {"question": "What is the famous sports team in New York City?"},
]
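If your questions start out as plain strings (for example, read from a file), they need to be wrapped into the list-of-dicts shape shown above. The sketch below is one way to do that; `build_eval_data` is a hypothetical helper, not part of UpTrain or LlamaIndex.

```python
def build_eval_data(questions):
    """Wrap plain question strings in the {"question": ...} shape used for `data`."""
    data = []
    for q in questions:
        text = q.strip()
        if not text:
            continue  # skip blank entries
        data.append({"question": text})
    return data


questions = [
    "What is the population of New York City?",
    "What is the area of New York City?",
]
data_from_strings = build_eval_data(questions)
```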
This notebook uses the OpenAI API to generate text for prompts and to create the vector store index, so set openai.api_key to your OpenAI API key.
openai.api_key = "sk-************************" # your OpenAI API key
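Hard-coding a key in a notebook makes it easy to leak. As an alternative sketch, the key can be read from an environment variable; `get_api_key` is a hypothetical helper, and using `OPENAI_API_KEY` as the variable name is an assumption about how your environment is configured.

```python
import os


def get_api_key(env_var="OPENAI_API_KEY"):
    """Read an API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the notebook."
        )
    return key
```

With this in place, the assignment above could become `openai.api_key = get_api_key()`.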
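Create a query engine using LlamaIndex¶
Let's create a vector store index with LlamaIndex and then use it as a query engine to retrieve relevant sections from the documents.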
Settings.chunk_size = 512
documents = SimpleDirectoryReader("./nyc_wikipedia/").load_data()
vector_index = VectorStoreIndex.from_documents(
    documents,
)
query_engine = vector_index.as_query_engine()
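To sanity-check the query engine before evaluating it, you can run the questions through it directly. The sketch below assumes only that the engine exposes a `query(str)` method whose result converts to text with `str()`; `collect_responses` and `EchoEngine` are hypothetical names, and `EchoEngine` is a stand-in so the example runs without an API key.

```python
def collect_responses(engine, rows):
    """Ask `engine` each question and return rows with the answer text attached."""
    out = []
    for row in rows:
        response = engine.query(row["question"])
        out.append({"question": row["question"], "response": str(response)})
    return out


class EchoEngine:
    """Stand-in engine used only to make this sketch runnable offline."""

    def query(self, question):
        return f"(stub answer to: {question})"


sample = [{"question": "What is the area of New York City?"}]
print(collect_responses(EchoEngine(), sample))
```

In the notebook, you would pass `query_engine` and `data` in place of the stand-in engine and the sample list.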
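Setup¶
UpTrain provides:
- Dashboards with advanced drill-down and filtering options
- Insights and common topics among failing cases
- Observability and real-time monitoring of production data
- Regression testing through seamless integration with your CI/CD pipelines
You can choose between the following two ways of evaluating with UpTrain:
Option 1: Evaluate using UpTrain's open-source software (OSS)¶
You can use the open-source evaluation service to evaluate your model. In this case, you need to provide an OpenAI API key, which you can obtain from OpenAI.
To view the evaluation results in the UpTrain dashboard, set it up by running the following commands in your terminal: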
git clone https://github.com/uptrain-ai/uptrain
cd uptrain
bash run_uptrain.sh
This starts the UpTrain dashboard on your local machine. You can access it at http://localhost:3000/dashboard.
Note: project_name is the name under which all evaluation results appear in the UpTrain dashboard.
settings = UpTrainSettings(
    openai_api_key=openai.api_key,
)
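Create the EvalLlamaIndex object¶
Now that the query engine exists, we can use it to create an EvalLlamaIndex object. This object generates a response for each query, and its evaluate method then runs the selected checks on those responses.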
llamaindex_object = EvalLlamaIndex(
    settings=settings, query_engine=query_engine
)
results = llamaindex_object.evaluate(
    # Adding project and evaluation names allows you to track the results in the UpTrain dashboard
    project_name="uptrain-llama-index",
    evaluation_name="nyc_wikipedia",
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.RESPONSE_CONCISENESS],
)
pd.DataFrame(results)
| | question | response | context | score_context_relevance | explanation_context_relevance | score_response_conciseness | explanation_response_conciseness |
|---|---|---|---|---|---|---|---|
| 0 | What is the population of New York City? | The population of New York City is 8,804,190 a... | === Population density ===\n\nIn 2020, the cit... | None | None | None | None |
| 1 | What is the area of New York City? | New York City has a total area of 468.484 squa... | Some of the natural relief in topography has b... | None | None | None | None |
| 2 | What is the largest borough in New York City? | Queens is the largest borough in New York City. | ==== Brooklyn ====\nBrooklyn (Kings County), o... | None | None | None | None |
| 3 | What is the average temperature in New York City? | The average temperature in New York City is 33... | Similarly, readings of 0 °F (−18 °C) are also ... | None | None | None | None |
| 4 | What is the main airport in New York City? | John F. Kennedy International Airport | along the Northeast Corridor, and long-distanc... | None | None | None | None |
| 5 | What is the famous landmark in New York City? | The famous landmark in New York City is the St... | The settlement was named New Amsterdam (Dutch:... | None | None | None | None |
| 6 | What is the official language of New York City? | As many as 800 languages are spoken in New Yor... | === Accent and dialect ===\n\nThe New York are... | None | None | None | None |
| 7 | What is the currency used in New York City? | The currency used in New York City is the US D... | === Real estate ===\n\nReal estate is a major ... | None | None | None | None |
| 8 | What is the time zone of New York City? | Eastern Standard Time (EST) | According to the New York City Comptroller, wo... | None | None | None | None |
| 9 | What is the famous sports team in New York City? | The famous sports team in New York City is the... | ==== Soccer ====\nIn soccer, New York City is ... | None | None | None | None |
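In the table above, the score and explanation columns show None for every row. A small check like the one below can flag such rows before they are mistaken for real results. It is a sketch that assumes `results` is a list of dicts keyed by the column names shown in the table; `unscored_rows` is a hypothetical helper.

```python
def unscored_rows(results, score_keys):
    """Return indices of rows where any of the given score fields is None or absent."""
    return [
        i
        for i, row in enumerate(results)
        if any(row.get(key) is None for key in score_keys)
    ]


example_results = [
    {"question": "q0", "score_context_relevance": None},
    {"question": "q1", "score_context_relevance": 1.0},
]
missing = unscored_rows(example_results, ["score_context_relevance"])
```

In the notebook, you would call it with `results` and both score column names.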
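Option 2: Evaluate using UpTrain's managed service and dashboards¶
Alternatively, you can use UpTrain's managed service to evaluate your model. Creating a free UpTrain account gives you trial credits. If you need more trial credits, you can book a call with the UpTrain maintainers.
The benefits of using the managed service are:
- You do not need to set up the UpTrain dashboard on your local machine.
- You get access to many LLMs without needing their API keys.
Once the evaluations finish, you can view the results in the UpTrain dashboard at https://dashboard.uptrain.ai/dashboard.
Note: project_name is the project name shown in the UpTrain dashboard, and all related evaluation results are grouped under that project.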
UPTRAIN_API_KEY = "up-**********************" # your UpTrain API key
# We use `uptrain_access_token` parameter instead of 'openai_api_key' in settings in this case
settings = UpTrainSettings(
    uptrain_access_token=UPTRAIN_API_KEY,
)
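Create the EvalLlamaIndex object¶
Now that the query engine exists, we can use it to create an EvalLlamaIndex object. This object generates a response for each query, and its evaluate method then runs the selected checks on those responses.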
llamaindex_object = EvalLlamaIndex(
    settings=settings, query_engine=query_engine
)
results = llamaindex_object.evaluate(
    # Adding project and evaluation names allows you to track the results in the UpTrain dashboard
    project_name="uptrain-llama-index",
    evaluation_name="nyc_wikipedia",
    data=data,
    checks=[Evals.CONTEXT_RELEVANCE, Evals.RESPONSE_CONCISENESS],
)
2024-01-23 18:36:57.815 | INFO | uptrain.framework.remote:log_and_evaluate:507 - Sending evaluation request for rows 0 to <50 to the Uptrain server
pd.DataFrame(results)
| | question | response | context | score_context_relevance | explanation_context_relevance | score_response_conciseness | explanation_response_conciseness |
|---|---|---|---|---|---|---|---|
| 0 | What is the population of New York City? | The population of New York City is 8,804,190 a... | New York, often called New York City or NYC, i... | 1.0 | The question asks for the population of New Yo... | 1.0 | The question asks for the population of New Yo... |
| 1 | What is the area of New York City? | The area of New York City is 468.484 square mi... | New York, often called New York City or NYC, i... | 1.0 | Step 1: The question asks for the area of New ... | 1.0 | The question asks for the area of New York Cit... |
| 2 | What is the largest borough in New York City? | Queens is the largest borough in New York City. | ==== Brooklyn ====\nBrooklyn (Kings County), o... | 0.5 | Step 1: The question is asking for the largest... | 1.0 | The question asks for the largest borough in N... |
| 3 | What is the average temperature in New York City? | The average temperature in New York City is 57... | Similarly, readings of 0 °F (−18 °C) are also ... | 0.5 | The question asks for the average temperature ... | 1.0 | The question asks for the average temperature ... |
| 4 | What is the main airport in New York City? | The main airport in New York City is John F. K... | along the Northeast Corridor, and long-distanc... | 1.0 | The question is "What is the main airport in N... | 1.0 | The question asks for the main airport in New ... |
| 5 | What is the famous landmark in New York City? | The famous landmark in New York City is the Em... | A record 66.6 million tourists visited New Yor... | 1.0 | The question asks for the famous landmark in N... | 1.0 | The question asks for the famous landmark in N... |
| 6 | What is the official language of New York City? | The official language of New York City is not ... | === Accent and dialect ===\n\nThe New York are... | 0.0 | The question is asking for the official langua... | 0.0 | The question asks for the official language of... |
| 7 | What is the currency used in New York City? | The currency used in New York City is the Unit... | === Real estate ===\n\nReal estate is a major ... | 0.0 | The question is "What is the currency used in ... | 1.0 | The question asks specifically for the currenc... |
| 8 | What is the time zone of New York City? | Eastern Standard Time (EST) | According to the New York City Comptroller, wo... | 0.0 | The question is "What is the time zone of New ... | 1.0 | The question asks for the time zone of New Yor... |
| 9 | What is the famous sports team in New York City? | The famous sports team in New York City is the... | ==== Baseball ====\nNew York has been describe... | 1.0 | The question asks for the famous sports team i... | 1.0 | The question asks for the famous sports team i... |
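Once scores are available, a short summary makes weak spots easier to see than scanning the table. The sketch below assumes `results` is a list of dicts keyed by the column names shown above; `summarize_scores` and its `threshold` parameter are hypothetical, and the default of 0.5 is an arbitrary illustration.

```python
def summarize_scores(results, score_key, threshold=0.5):
    """Return the mean of `score_key` and the questions scoring below `threshold`."""
    scored = [row for row in results if row.get(score_key) is not None]
    if not scored:
        return {"mean": None, "low": []}  # nothing was scored
    mean = sum(row[score_key] for row in scored) / len(scored)
    low = [row["question"] for row in scored if row[score_key] < threshold]
    return {"mean": mean, "low": low}


example_results = [
    {"question": "a", "score_context_relevance": 1.0},
    {"question": "b", "score_context_relevance": 0.0},
    {"question": "c", "score_context_relevance": None},
]
summary = summarize_scores(example_results, "score_context_relevance")
```

Applied to the table above with `score_context_relevance`, this gives a mean of 0.6 and flags the official language, currency, and time zone questions, whose retrieved context scored 0.0.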

