RunGPT¶
RunGPT is an open-source, cloud-native framework for serving large multimodal models (LMMs). It is designed to simplify the deployment and management of large language models on distributed GPU clusters, and aims to be a centralized, accessible one-stop solution that brings together techniques for optimizing large multimodal models and makes them easy for everyone to use. RunGPT currently supports a range of large language models, including LLaMA, Pythia, StableLM, Vicuna, and MOSS, as well as large multimodal models such as MiniGPT-4 and OpenFlamingo.
Setup¶
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]:
%pip install llama-index-llms-rungpt
In [ ]:
!pip install llama-index
You need to install the rungpt package in your Python environment with pip install.
In [ ]:
!pip install rungpt
After a successful installation, models supported by RunGPT can be deployed with a single command. This downloads the target language model from an open-source platform and deploys it as a service on a local port, which can be queried over HTTP or gRPC. We recommend not running this command inside a Jupyter notebook, but in a command-line terminal instead.
In [ ]:
!rungpt serve decapoda-research/llama-7b-hf --precision fp16 --device_map balanced
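Once `rungpt serve` is running, the model is exposed on a local port (51002 below is an assumption matching the LlamaIndex wrapper's default endpoint; use whatever port your RunGPT version reports). A quick standard-library check confirms something is actually listening before you start sending requests:

```python
import socket

def service_is_up(host="localhost", port=51002, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False

# Reports False (instead of raising) while the server is still starting up.
print(service_is_up())
```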
In [ ]:
from llama_index.llms.rungpt import RunGptLLM

llm = RunGptLLM()
prompt = "What public transportation might be available in a city?"
response = llm.complete(prompt)
In [ ]:
print(response)
I don't want to go to work, so what should I do? I have a job interview on Monday. What can I wear that will make me look professional but not too stuffy or boring?
Call chat with a list of messages¶
In [ ]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM

messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
llm = RunGptLLM()
response = llm.chat(messages=messages, temperature=0.8, max_tokens=15)
In [ ]:
print(response)
Streaming¶
Using stream_complete endpoint
In [ ]:
prompt = "What public transportation might be available in a city?"
response = RunGptLLM().stream_complete(prompt)
for item in response:
    print(item.text)
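stream_complete yields incremental response objects; in llama_index each chunk carries the text accumulated so far in `.text` and the newly generated piece in `.delta` (attribute names per llama_index's CompletionResponse; verify against your installed version). The consumption pattern can be sketched with a stand-in generator, so it runs without a server:

```python
from dataclasses import dataclass

# Minimal stand-in for a streamed completion chunk (illustration only;
# the real object is llama_index's CompletionResponse).
@dataclass
class Chunk:
    text: str   # full text accumulated so far
    delta: str  # newly generated piece

def fake_stream(pieces):
    acc = ""
    for p in pieces:
        acc += p
        yield Chunk(text=acc, delta=p)

# Print only the new text of each chunk; keep the final full string.
final = ""
for chunk in fake_stream(["Buses, ", "trams, ", "and subways."]):
    print(chunk.delta, end="")
    final = chunk.text
print()
```

Printing `item.text` (as in the cell above) prints growing prefixes; printing `item.delta` prints each piece exactly once.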
Using stream_chat endpoint
In [ ]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM

messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
response = RunGptLLM().stream_chat(messages=messages)
In [ ]:
for item in response:
    print(item.message)