IBM watsonx.ai¶
WatsonxLLM is a wrapper for IBM watsonx.ai foundation models.
These examples show how to interact with watsonx.ai models through the LlamaIndex LLMs API.
Setup¶
Install the llama-index-llms-ibm package:
!pip install -qU llama-index-llms-ibm
The cell below defines the credentials required to work with watsonx foundation model inferencing.
Action: Provide your IBM Cloud user API key. For details, see Managing user API keys.
import os
from getpass import getpass
watsonx_api_key = getpass()
os.environ["WATSONX_APIKEY"] = watsonx_api_key
Additionally, you can pass extra secrets as environment variables:
import os
os.environ["WATSONX_URL"] = "your service instance url"
os.environ["WATSONX_TOKEN"] = "your token for accessing the CPD cluster"
os.environ["WATSONX_PASSWORD"] = "your password for accessing the CPD cluster"
os.environ["WATSONX_USERNAME"] = "your username for accessing the CPD cluster"
os.environ[
"WATSONX_INSTANCE_ID"
] = "your instance_id for accessing the CPD cluster"
temperature = 0.5
max_new_tokens = 50
additional_params = {
"decoding_method": "sample",
"min_new_tokens": 1,
"top_k": 50,
"top_p": 1,
}
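The sampling parameters above shape how the next token is chosen when decoding_method is "sample": top_k keeps only the k most likely tokens, and top_p keeps the smallest set of tokens whose cumulative probability reaches p. A rough, purely illustrative stdlib-only sketch of that filtering (not the service's actual implementation):

```python
def filter_top_k_top_p(probs, top_k=50, top_p=1.0):
    """Keep the top_k most likely tokens, then the smallest prefix of them
    whose cumulative probability reaches top_p, and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution, for illustration only.
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
print(filter_top_k_top_p(probs, top_k=3, top_p=0.9))
```

With top_p=1 (as in the parameters above), nucleus filtering is effectively disabled and only top_k applies.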
Initialize the WatsonxLLM class with the previously set parameters.
Note:
- To provide context for the API call, you must pass the project_id or space_id. To find your project or space ID, open your project or space, go to the Manage tab, and click General. For more information, see Project documentation or Deployment space documentation.
- Depending on the region of your provisioned service instance, use one of the URLs listed in watsonx.ai API authentication.
This example uses the project_id and the Dallas URL.
You need to specify the model_id that will be used for inferencing. A list of all available models can be found in Supported foundation models.
from llama_index.llms.ibm import WatsonxLLM
watsonx_llm = WatsonxLLM(
model_id="ibm/granite-13b-instruct-v2",
url="https://us-south.ml.cloud.ibm.com",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Alternatively, you can use Cloud Pak for Data credentials. For details, see watsonx.ai software setup.
watsonx_llm = WatsonxLLM(
model_id="ibm/granite-13b-instruct-v2",
url="PASTE YOUR URL HERE",
username="PASTE YOUR USERNAME HERE",
password="PASTE YOUR PASSWORD HERE",
instance_id="openshift",
version="4.8",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Instead of model_id, you can also pass the deployment_id of a previously tuned model. The entire model tuning workflow is described in Working with TuneExperiment and PromptTuner.
watsonx_llm = WatsonxLLM(
deployment_id="PASTE YOUR DEPLOYMENT_ID HERE",
url="https://us-south.ml.cloud.ibm.com",
project_id="PASTE YOUR PROJECT_ID HERE",
temperature=temperature,
max_new_tokens=max_new_tokens,
additional_params=additional_params,
)
Create a completion¶
Call the model directly with a string prompt:
response = watsonx_llm.complete("What is a Generative AI?")
print(response)
A generative AI is a computer program that can create new text, images, or other types of content. These programs are trained on large datasets of existing content, and they use that data to generate new content that is similar to the training data.
From the CompletionResponse, you can also retrieve the service's raw response:
print(response.raw)
{'model_id': 'ibm/granite-13b-instruct-v2', 'created_at': '2024-05-20T07:11:57.984Z', 'results': [{'generated_text': 'A generative AI is a computer program that can create new text, images, or other types of content. These programs are trained on large datasets of existing content, and they use that data to generate new content that is similar to the training data.', 'generated_token_count': 50, 'input_token_count': 7, 'stop_reason': 'max_tokens', 'seed': 494448017}]}
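The raw response is a plain dict, so its fields can be inspected directly. A minimal sketch using the shape of the example output printed above (the field names are taken from that example):

```python
# Dict mirroring the raw response shape shown above (content abbreviated).
raw = {
    "model_id": "ibm/granite-13b-instruct-v2",
    "results": [
        {
            "generated_text": "A generative AI is a computer program...",
            "generated_token_count": 50,
            "input_token_count": 7,
            "stop_reason": "max_tokens",
        }
    ],
}

result = raw["results"][0]
print(result["stop_reason"])  # "max_tokens" means generation hit the token limit
total_tokens = result["input_token_count"] + result["generated_token_count"]
print(total_tokens)  # 57
```

A stop_reason of max_tokens, as here, suggests raising max_new_tokens if the completion looks truncated.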
You can also call the model with a prompt template:
from llama_index.core import PromptTemplate
template = "What is {object} and how does it work?"
prompt_template = PromptTemplate(template=template)
prompt = prompt_template.format(object="a loan")
response = watsonx_llm.complete(prompt)
print(response)
A loan is a sum of money that is borrowed to buy something, such as a house or a car. The borrower must repay the loan plus interest. The interest is a fee charged for using the money. The interest rate is the amount of
Calling chat with a list of messages¶
Create chat completions by providing a list of messages:
from llama_index.core.llms import ChatMessage
messages = [
ChatMessage(role="system", content="You are an AI assistant"),
ChatMessage(role="user", content="Who are you?"),
]
response = watsonx_llm.chat(
messages, max_new_tokens=20, decoding_method="greedy"
)
print(response)
assistant: I am an AI assistant.
Note that we changed the max_new_tokens parameter to 20 and set the decoding_method parameter to greedy.
Streaming the model output¶
Stream the model's response:
for chunk in watsonx_llm.stream_complete(
"Describe your favorite city and why it is your favorite."
):
print(chunk.delta, end="")
I like New York because it is the city of dreams. You can achieve anything you want here.
Similarly, to stream chat completions, use the following code:
messages = [
ChatMessage(role="system", content="You are an AI assistant"),
ChatMessage(role="user", content="Who are you?"),
]
for chunk in watsonx_llm.stream_chat(messages):
print(chunk.delta, end="")
I am an AI assistant.
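Each streamed chunk carries only a partial string in its delta attribute, so concatenating the deltas reconstructs the full completion. A stdlib-only sketch of that accumulation pattern, using a stub generator in place of watsonx_llm.stream_chat (the stub and its text are invented for illustration; the real method yields chunks with the same .delta attribute):

```python
from types import SimpleNamespace


def fake_stream():
    # Stand-in for watsonx_llm.stream_chat(messages): yields delta chunks.
    for piece in ["I am ", "an AI ", "assistant."]:
        yield SimpleNamespace(delta=piece)


# Accumulate deltas to keep the full text while still printing incrementally.
full_text = ""
for chunk in fake_stream():
    print(chunk.delta, end="")
    full_text += chunk.delta
```

This is useful when you want to display tokens as they arrive but still need the complete response afterwards, e.g. to log it or append it to the message history.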