简单可组合内存¶

注意： 此内存示例已弃用，推荐使用更新且更灵活的 Memory 类。详见最新文档。

本笔记本将演示如何向智能体注入多个记忆源。具体而言，我们使用由 primary_memory（主记忆）和若干潜在次级记忆源（存储在 secondary_memory_sources 中）组成的 SimpleComposableMemory。核心区别在于：primary_memory 将作为智能体的主要聊天缓冲区，而来自 secondary_memory_sources 的任何检索消息仅会注入到系统提示消息中。

多记忆源适用于需要结合长期记忆（如 VectorMemory）与默认 ChatMemoryBuffer 的场景。通过本笔记本您将看到，使用 SimpleComposableMemory 可高效地将长期记忆中的目标消息"加载"到主内存（即 ChatMemoryBuffer）中。

`SimpleComposableMemory` 如何工作？¶

我们从 SimpleComposableMemory 的基本用法开始。这里我们构建了一个 VectorMemory 以及一个默认的 ChatMemoryBuffer。VectorMemory 将作为次要记忆源，而 ChatMemoryBuffer 则作为主要记忆源。要实例化一个 SimpleComposableMemory 对象，我们需要提供一个 primary_memory 参数以及（可选的）一个 secondary_memory_sources 列表。

SimpleComposableMemoryIllustration

In [ ]:

Copied!





from llama_index.core.memory import (
    VectorMemory,
    SimpleComposableMemory,
    ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding

vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 1},
)

# let's set some initial messages in our secondary vector memory
msgs = [
    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
    ChatMessage.from_str("Bob likes burgers.", "user"),
    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
    ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)
from llama_index.core.memory import (
    VectorMemory,
    SimpleComposableMemory,
    ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding

vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 1},
)

# let's set some initial messages in our secondary vector memory
msgs = [
    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
    ChatMessage.from_str("Bob likes burgers.", "user"),
    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
    ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)

In [ ]:

Copied!

composable_memory.primary_memory
composable_memory.primary_memory

Out[ ]:

ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))

In [ ]:

Copied!

composable_memory.secondary_memory_sources
composable_memory.secondary_memory_sources

Out[ ]:

[VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]

将消息 `put()` 存入内存¶

由于 SimpleComposableMemory 本身是 BaseMemory 的子类，我们添加消息的方式与其他内存模块相同。需要注意的是，对于 SimpleComposableMemory，调用 .put() 实际上会触发所有内存源的 .put() 方法。换句话说，消息会被同时添加到 primary 和 secondary 存储源中。

In [ ]:

Copied!





msgs = [
    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
    ChatMessage.from_str("Jerry likes juice.", "user"),
]
msgs = [
    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
    ChatMessage.from_str("Jerry likes juice.", "user"),
]

In [ ]:

Copied!

# load into all memory sources modules"
for m in msgs:
    composable_memory.put(m)
# load into all memory sources modules"
for m in msgs:
    composable_memory.put(m)

从内存中`get()`消息¶

当调用 .get() 方法时，我们会同样执行 primary 内存和所有 secondary 来源的 .get() 方法。这将生成一系列消息列表，我们需要将这些消息"组合"成一组合理的统一消息（以便传递给下游的智能体）。在此过程中通常需要特别注意，确保最终的消息序列既合理又符合大语言模型提供商的聊天API规范。

对于 SimpleComposableMemory，我们会将 secondary 来源的消息注入到 primary 内存的系统消息中。primary 来源的其余消息历史保持不变，这种组合方式就是最终返回的结果。

In [ ]:

Copied!

msgs = composable_memory.get("What does Bob like?")
msgs
msgs = composable_memory.get("What does Bob like?")
msgs

Out[ ]:

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

In [ ]:

Copied!

# see the memory injected into the system message of the primary memory
print(msgs[0])
# see the memory injected into the system message of the primary memory
print(msgs[0])

system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Bob likes burgers.
	ASSISTANT: Indeed, Bob likes apples.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.

连续调用 `get()` 方法¶

连续调用 get() 会直接替换系统提示中已加载的 secondary 内存消息。

In [ ]:

Copied!

msgs = composable_memory.get("What does Alice like?")
msgs
msgs = composable_memory.get("What does Alice like?")
msgs

Out[ ]:

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

In [ ]:

Copied!

# see the memory injected into the system message of the primary memory
print(msgs[0])
# see the memory injected into the system message of the primary memory
print(msgs[0])

system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Alice likes apples.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.

如果 `get()` 检索到已存在于 `primary` 内存中的 `secondary` 消息会怎样？¶

当从 secondary 内存检索到的消息已存在于 primary 内存时，这些冗余的次要消息将不会被添加到系统消息中。在以下示例中，消息"Jerry likes juice."已被 put 到所有内存源，因此系统消息不会发生改变。

In [ ]:

Copied!

msgs = composable_memory.get("What does Jerry like?")
msgs
msgs = composable_memory.get("What does Jerry like?")
msgs

Out[ ]:

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

如何 `reset` 内存¶

与其他方法 put() 和 get() 类似，调用 reset() 会同时在 primary 和 secondary 存储源上执行 reset()。若只需重置 primary，则应仅调用其自身的 reset() 方法。

`reset()` 仅重置主内存¶

In [ ]:

Copied!

composable_memory.primary_memory.reset()
composable_memory.primary_memory.reset()

In [ ]:

Copied!

composable_memory.primary_memory.get()
composable_memory.primary_memory.get()

Out[ ]:

[]

In [ ]:

Copied!

composable_memory.secondary_memory_sources[0].get("What does Alice like?")
composable_memory.secondary_memory_sources[0].get("What does Alice like?")

Out[ ]:

[ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]

重置所有内存源 `reset()`¶

In [ ]:

Copied!

composable_memory.reset()
composable_memory.reset()

In [ ]:

Copied!

composable_memory.primary_memory.get()
composable_memory.primary_memory.get()

Out[ ]:

[]

In [ ]:

Copied!

composable_memory.secondary_memory_sources[0].get("What does Alice like?")
composable_memory.secondary_memory_sources[0].get("What does Alice like?")

Out[ ]:

[]

在智能体中使用 `SimpleComposableMemory`¶

这里我们将使用一个带有智能体的 SimpleComposableMemory，并演示如何利用次要的长期记忆存储源，将某个智能体对话中的消息作为另一个智能体会话对话的一部分来使用。

In [ ]:

Copied!

from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent

import nest_asyncio

nest_asyncio.apply()
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent

import nest_asyncio

nest_asyncio.apply()

定义内存模块¶

In [ ]:

Copied!





vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 2},
)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)
vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 2},
)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)

定义我们的智能体¶

In [ ]:

Copied!

def multiply(a: int, b: int) -> int:
    """Multiply two integers and returns the result integer"""
    return a * b

def mystery(a: int, b: int) -> int:
    """Mystery function on two numbers"""
    return a**2 - b**2

multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
def multiply(a: int, b: int) -> int:
    """Multiply two integers and returns the result integer"""
    return a * b

def mystery(a: int, b: int) -> int:
    """Mystery function on two numbers"""
    return a**2 - b**2

multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [ ]:

Copied!





llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool],
    llm=llm,
    memory=composable_memory,
    verbose=True,
)
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool],
    llm=llm,
    memory=composable_memory,
    verbose=True,
)

执行某些函数调用¶

当调用 .chat() 方法时，消息会被存入可组合内存。根据前文所述，这意味着所有消息会同时存入 primary 和 secondary 两个存储源。

In [ ]:

Copied!

response = agent.chat("What is the mystery function on 5 and 6?")
response = agent.chat("What is the mystery function on 5 and 6?")

Added user message to memory: What is the mystery function on 5 and 6?
=== Calling Function ===
Calling function: mystery with args: {"a": 5, "b": 6}
=== Function Output ===
-11
=== LLM Response ===
The mystery function on 5 and 6 returns -11.

In [ ]:

Copied!

response = agent.chat("What happens if you multiply 2 and 3?")
response = agent.chat("What happens if you multiply 2 and 3?")

Added user message to memory: What happens if you multiply 2 and 3?
=== Calling Function ===
Calling function: multiply with args: {"a": 2, "b": 3}
=== Function Output ===
6
=== LLM Response ===
If you multiply 2 and 3, the result is 6.

新建代理会话¶

既然我们已经将消息添加到 vector_memory 中，现在可以观察使用该内存与不使用时的效果差异。具体而言，我们会要求新代理"回忆"函数调用的输出结果，而非重新计算。

一个没有过往记忆的智能体¶

In [ ]:

Copied!





llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_without_memory = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool], llm=llm, verbose=True
)
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_without_memory = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool], llm=llm, verbose=True
)

In [ ]:

Copied!

response = agent_without_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
response = agent_without_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.

拥有过往记忆的智能体¶

我们观察到，无法访问过往记忆的智能体无法完成任务。接下来这个智能体将实际接收我们之前的长期记忆（即 vector_memory）。请注意，我们甚至使用了全新的 ChatMemoryBuffer，这意味着该智能体不存在任何 chat_history。尽管如此，它仍能从长期记忆中检索所需的过往对话记录。

In [ ]:

Copied!





llm = OpenAI(model="gpt-3.5-turbo-0613")

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # using a copy here for illustration purposes
        # later will use original vector_memory again
    ],
)

agent_with_memory = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool],
    llm=llm,
    memory=composable_memory,
    verbose=True,
)
llm = OpenAI(model="gpt-3.5-turbo-0613")

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # using a copy here for illustration purposes
        # later will use original vector_memory again
    ],
)

agent_with_memory = FunctionCallingAgent.from_tools(
    [multiply_tool, mystery_tool],
    llm=llm,
    memory=composable_memory,
    verbose=True,
)

In [ ]:

Copied!

agent_with_memory.chat_history  # an empty chat history
agent_with_memory.chat_history  # an empty chat history

Out[ ]:

[]

In [ ]:

Copied!

response = agent_with_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
response = agent_with_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
The output of the mystery function on 5 and 6 is -11.

In [ ]:

Copied!

response = agent_with_memory.chat(
    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
)
response = agent_with_memory.chat(
    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
)

Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
=== LLM Response ===
The output of the multiply function on 2 and 3 is 6.

In [ ]:

Copied!

agent_with_memory.chat_history
agent_with_memory.chat_history

Out[ ]:

[ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}),
 ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}),
 ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]

`.chat(user_input)` 的底层运行机制¶

在底层实现中，.chat(user_input) 调用实际上会以 user_input 作为参数调用内存的 .get() 方法。正如我们在前一节了解到的，这将最终返回 primary 和所有 secondary 内存源的组合结果。这些组合后的消息会作为聊天历史记录传递给 LLM 的聊天 API。

In [ ]:

Copied!





composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # copy for illustrative purposes to explain what
        # happened under the hood from previous subsection
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)
composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        vector_memory.copy(
            deep=True
        )  # copy for illustrative purposes to explain what
        # happened under the hood from previous subsection
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)

In [ ]:

Copied!

agent_with_memory.memory.get(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
agent_with_memory.memory.get(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Out[ ]:

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: What is the mystery function on 5 and 6?\n\tASSISTANT: None\n\tTOOL: -11\n\tASSISTANT: The mystery function on 5 and 6 returns -11.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={})]

In [ ]:

Copied!





print(
    agent_with_memory.memory.get(
        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
    )[0]
)
print(
    agent_with_memory.memory.get(
        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
    )[0]
)

system: You are a helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: What is the mystery function on 5 and 6?
	ASSISTANT: None
	TOOL: -11
	ASSISTANT: The mystery function on 5 and 6 returns -11.

=====End of relevant messages from memory source 1======

This is the end of the retrieved message dialogues.

简单可组合内存¶

SimpleComposableMemory 如何工作？¶

将消息 put() 存入内存¶

从内存中get()消息¶

连续调用 get() 方法¶

如果 get() 检索到已存在于 primary 内存中的 secondary 消息会怎样？¶

如何 reset 内存¶

reset() 仅重置主内存¶

重置所有内存源 reset()¶

在智能体中使用 SimpleComposableMemory¶

定义内存模块¶

定义我们的智能体¶

执行某些函数调用¶

新建代理会话¶

一个没有过往记忆的智能体¶

拥有过往记忆的智能体¶

.chat(user_input) 的底层运行机制¶

`SimpleComposableMemory` 如何工作？¶

将消息 `put()` 存入内存¶

从内存中`get()`消息¶

连续调用 `get()` 方法¶

如果 `get()` 检索到已存在于 `primary` 内存中的 `secondary` 消息会怎样？¶

如何 `reset` 内存¶

`reset()` 仅重置主内存¶

重置所有内存源 `reset()`¶

在智能体中使用 `SimpleComposableMemory`¶

`.chat(user_input)` 的底层运行机制¶