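NVIDIA NIM microservices¶
The llama-index-postprocessor-nvidia-rerank package contains LlamaIndex integrations for building applications with models on NVIDIA NIM inference microservices. NIM supports models across domains, including chat, embedding, and re-ranking models, from the community as well as from NVIDIA. These models are optimized by NVIDIA for best performance on NVIDIA accelerated infrastructure and are deployed as a NIM: an easy-to-use, prebuilt container that can be deployed anywhere on NVIDIA accelerated infrastructure with a single command.
NVIDIA-hosted deployments of NIMs are available to test on the NVIDIA API catalog. After testing, enterprises can export NIMs from the NVIDIA API catalog under an NVIDIA AI Enterprise license and run them on-premises or in the cloud, which gives them ownership and full control of their IP and AI applications.
NIMs are packaged as container images on a per-model basis and are distributed as NGC container images through the NVIDIA NGC Catalog. At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.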
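NVIDIA's Rerank connector¶
This example shows how to use LlamaIndex, via the NVIDIARerank class, to interact with the supported NVIDIA Retrieval QA Ranking Model for retrieval-augmented generation.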
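Combining results from multiple sources¶
Consider a pipeline with data from a semantic store, such as VectorStoreIndex, as well as a BM25 store.
Each store is queried independently and returns the results that it considers highly relevant. Working out the overall relevance of those results across stores is where reranking comes into play.
Follow the Advanced - Hybrid Retriever + Re-Ranking use case, substituting NVIDIARerank for the reranker used there.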
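The fusion step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the LlamaIndex or NVIDIA implementation: `merge_candidates`, `toy_overlap_score`, and `rerank` are hypothetical helpers, the candidate texts are made up, and the word-overlap scorer only stands in for a real ranking model such as the one behind NVIDIARerank.

```python
from typing import Callable, List, Tuple


def merge_candidates(*result_lists: List[str]) -> List[str]:
    """Union candidates from several retrievers, keeping first-seen order."""
    merged: List[str] = []
    seen = set()
    for results in result_lists:
        for text in results:
            if text not in seen:
                seen.add(text)
                merged.append(text)
    return merged


def toy_overlap_score(query: str, text: str) -> float:
    """Stand-in relevance score: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up results from two independent stores (illustrative only)
vector_hits = ["housing units gained in the Mission", "weather in the Mission"]
bm25_hits = ["weather in the Mission", "net gain in housing units 2021"]

merged = merge_candidates(vector_hits, bm25_hits)
top = rerank("net gain in housing units", merged, toy_overlap_score, top_n=2)
for text, score in top:
    print(f"{score:.2f}  {text}")
```

The point of the sketch is the shape of the pipeline: each store contributes candidates on its own scale, and a single scorer applied to the merged, de-duplicated pool produces one comparable ranking.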
Installation¶
%pip install --upgrade --quiet llama-index-postprocessor-nvidia-rerank llama-index-llms-nvidia llama-index-embeddings-nvidia llama-index-readers-file
import getpass
import os
# del os.environ['NVIDIA_API_KEY'] ## delete key and reset
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith(
        "nvapi-"
    ), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key
Working with the API Catalog¶
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.nvidia import NVIDIAEmbedding
from llama_index.llms.nvidia import NVIDIA
from llama_index.postprocessor.nvidia_rerank import NVIDIARerank

# Keep the 4 highest-ranked nodes after reranking
reranker = NVIDIARerank(top_n=4)
!mkdir -p data
!wget "https://www.dropbox.com/scl/fi/p33j9112y0ysgwg77fdjz/2021_Housing_Inventory.pdf?rlkey=yyok6bb18s5o31snjd2dxkxz3&dl=0" -O "data/housing_data.pdf"
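# Split documents into chunks of up to 500 tokens
Settings.text_splitter = SentenceSplitter(chunk_size=500)
documents = SimpleDirectoryReader("./data").load_data()

# Embed the chunks with an NVIDIA embedding model; over-long inputs are truncated at the end
Settings.embed_model = NVIDIAEmbedding(model="NV-Embed-QA", truncate="END")
index = VectorStoreIndex.from_documents(documents)

# Use the default NVIDIA-hosted LLM for answer synthesis
Settings.llm = NVIDIA()

# Retrieve 20 candidates, then let the reranker keep the best 4
query_engine = index.as_query_engine(
    similarity_top_k=20, node_postprocessors=[reranker]
)
response = query_engine.query(
    "What was the net gain in housing units in the Mission in 2021?"
)
print(response)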
The net gain in housing units in the Mission in 2021 was not specified in the provided context information.
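When an answer like the one above reports that the context lacks the information, it can help to look at which chunks survived reranking. The following is a small sketch, assuming the response object exposes `source_nodes` whose items carry a `score` attribute and a `node` with a `get_content()` method. `summarize_sources` is a hypothetical helper, and the stand-in objects below only mimic that shape so the snippet runs without any service.

```python
from types import SimpleNamespace
from typing import Iterable, List


def summarize_sources(source_nodes: Iterable, width: int = 60) -> List[str]:
    """Return one line per node: rank, score, and a whitespace-normalized preview."""
    lines = []
    for rank, item in enumerate(source_nodes, start=1):
        text = " ".join(item.node.get_content().split())
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"{rank}. score={score} {text[:width]}")
    return lines


# Stand-in objects with placeholder text (not real retrieval results)
fake_nodes = [
    SimpleNamespace(
        score=0.91,
        node=SimpleNamespace(get_content=lambda: "first chunk\nof text"),
    ),
    SimpleNamespace(
        score=None,
        node=SimpleNamespace(get_content=lambda: "second chunk"),
    ),
]

for line in summarize_sources(fake_nodes):
    print(line)
```

In the notebook, the same helper would be called as `summarize_sources(response.source_nodes)`. If the relevant passage is missing from that list, raising `similarity_top_k` or `top_n` is one thing to try.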
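Working with NVIDIA NIMs¶
In addition to connecting to hosted NVIDIA NIMs, this connector can be used to connect to local microservice instances. This lets you run your application locally when necessary.
For instructions on how to set up local microservice instances, see https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/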
from llama_index.postprocessor.nvidia_rerank import NVIDIARerank

# connect to a rerank NIM running at localhost:1976
reranker = NVIDIARerank(base_url="http://localhost:1976/v1")
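One way to switch between the hosted service and a local NIM without editing code is to read the endpoint from the environment. The sketch below is an assumption-laden illustration: `reranker_kwargs` is a hypothetical helper and `RERANK_NIM_BASE_URL` is a made-up variable name, not something the connector reads on its own. It only builds the `top_n` and `base_url` keyword arguments already used in this example.

```python
import os
from typing import Dict, Mapping, Optional


def reranker_kwargs(
    env: Optional[Mapping[str, str]] = None, top_n: int = 4
) -> Dict[str, object]:
    """Build keyword arguments for the reranker from environment settings."""
    env = os.environ if env is None else env
    kwargs: Dict[str, object] = {"top_n": top_n}
    # RERANK_NIM_BASE_URL is a hypothetical variable name chosen for this sketch
    base_url = env.get("RERANK_NIM_BASE_URL", "").strip()
    if base_url:
        # Point at the local NIM; otherwise the hosted default is used
        kwargs["base_url"] = base_url.rstrip("/")
    return kwargs


print(reranker_kwargs({"RERANK_NIM_BASE_URL": "http://localhost:1976/v1"}))
```

The result would then be passed on as `NVIDIARerank(**reranker_kwargs())`, so that unsetting the variable falls back to the hosted endpoint.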