RunGPT¶
RunGPT is an open-source, cloud-native framework for serving large multimodal models (LMMs). It is designed to simplify the deployment and management of large language models on distributed GPU clusters, and aims to be a centralized, accessible one-stop solution that brings together techniques for optimizing large multimodal models and makes them easy for everyone to use. RunGPT currently supports a range of large language models, including LLaMA, Pythia, StableLM, Vicuna, and MOSS, as well as large multimodal models such as MiniGPT-4 and OpenFlamingo.
Setup¶
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
In [ ]:
%pip install llama-index-llms-rungpt
In [ ]:
!pip install llama-index
You need to install the rungpt package in your Python environment with pip install.
In [ ]:
!pip install rungpt
After a successful installation, models supported by RunGPT can be deployed with a single command. This downloads the target language model from an open-source platform and deploys it as a service on a local port, which can be queried over HTTP or gRPC. We recommend not running this command inside a Jupyter notebook, but in a command-line terminal instead.
In [ ]:
!rungpt serve decapoda-research/llama-7b-hf --precision fp16 --device_map balanced
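Once `rungpt serve` is running, the model is exposed on a local port (51002 below is an assumption matching the LlamaIndex wrapper's default endpoint; use whatever port your RunGPT version reports). A quick standard-library check confirms something is actually listening before you start sending requests:

```python
import socket

def service_is_up(host="localhost", port=51002, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False

# Reports False (instead of raising) while the server is still starting up.
print(service_is_up())
```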
In [ ]:
from llama_index.llms.rungpt import RunGptLLM

llm = RunGptLLM()
prompt = "What public transportation might be available in a city?"
response = llm.complete(prompt)
In [ ]:
print(response)
I don't want to go to work, so what should I do? I have a job interview on Monday. What can I wear that will make me look professional but not too stuffy or boring?
Call chat with a list of messages¶
In [ ]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM

messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
llm = RunGptLLM()
response = llm.chat(messages=messages, temperature=0.8, max_tokens=15)
In [ ]:
print(response)
Streaming¶
Using stream_complete endpoint
In [ ]:
prompt = "What public transportation might be available in a city?"
response = RunGptLLM().stream_complete(prompt)
for item in response:
    print(item.text)
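stream_complete yields incremental response objects; in llama_index each chunk carries the text accumulated so far in `.text` and the newly generated piece in `.delta` (attribute names per llama_index's CompletionResponse; verify against your installed version). The consumption pattern can be sketched with a stand-in generator, so it runs without a server:

```python
from dataclasses import dataclass

# Minimal stand-in for a streamed completion chunk (illustration only;
# the real object is llama_index's CompletionResponse).
@dataclass
class Chunk:
    text: str   # full text accumulated so far
    delta: str  # newly generated piece

def fake_stream(pieces):
    acc = ""
    for p in pieces:
        acc += p
        yield Chunk(text=acc, delta=p)

# Print only the new text of each chunk; keep the final full string.
final = ""
for chunk in fake_stream(["Buses, ", "trams, ", "and subways."]):
    print(chunk.delta, end="")
    final = chunk.text
print()
```

Printing `item.text` (as in the cell above) prints growing prefixes; printing `item.delta` prints each piece exactly once.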
Using stream_chat endpoint
In [ ]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.rungpt import RunGptLLM

messages = [
    ChatMessage(
        role=MessageRole.USER,
        content="Now, I want you to do some math for me.",
    ),
    ChatMessage(
        role=MessageRole.ASSISTANT, content="Sure, I would like to help you."
    ),
    ChatMessage(
        role=MessageRole.USER,
        content="How many points determine a straight line?",
    ),
]
response = RunGptLLM().stream_chat(messages=messages)
In [ ]:
for item in response:
    print(item.message)