简单可组合内存¶
注意: 此内存示例已弃用,推荐使用更新且更灵活的 Memory
类。详见最新文档。
本笔记本将演示如何向智能体注入多个记忆源。具体而言,我们使用由 primary_memory
(主记忆)和若干潜在次级记忆源(存储在 secondary_memory_sources
中)组成的 SimpleComposableMemory
。核心区别在于:primary_memory
将作为智能体的主要聊天缓冲区,而来自 secondary_memory_sources
的任何检索消息仅会注入到系统提示消息中。
多记忆源适用于需要结合长期记忆(如 VectorMemory
)与默认 ChatMemoryBuffer
的场景。通过本笔记本您将看到,使用 SimpleComposableMemory
可高效地将长期记忆中的目标消息"加载"到主内存(即 ChatMemoryBuffer
)中。
SimpleComposableMemory
如何工作?¶
我们从 SimpleComposableMemory
的基本用法开始。这里我们构建了一个 VectorMemory
以及一个默认的 ChatMemoryBuffer
。VectorMemory
将作为次要记忆源,而 ChatMemoryBuffer
则作为主要记忆源。要实例化一个 SimpleComposableMemory
对象,我们需要提供一个 primary_memory
参数以及(可选的)一个 secondary_memory_sources
列表。
from llama_index.core.memory import (
VectorMemory,
SimpleComposableMemory,
ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding
vector_memory = VectorMemory.from_defaults(
vector_store=None, # leave as None to use default in-memory vector store
embed_model=OpenAIEmbedding(),
retriever_kwargs={"similarity_top_k": 1},
)
# let's set some initial messages in our secondary vector memory
msgs = [
ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
ChatMessage.from_str("Bob likes burgers.", "user"),
ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)
chat_memory_buffer = ChatMemoryBuffer.from_defaults()
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=chat_memory_buffer,
secondary_memory_sources=[vector_memory],
)
composable_memory.primary_memory
ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))
composable_memory.secondary_memory_sources
[VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]
将消息 put()
存入内存¶
由于 SimpleComposableMemory
本身是 BaseMemory
的子类,我们添加消息的方式与其他内存模块相同。需要注意的是,对于 SimpleComposableMemory
,调用 .put()
实际上会触发所有内存源的 .put()
方法。换句话说,消息会被同时添加到 primary
和 secondary
存储源中。
msgs = [
ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
ChatMessage.from_str("Jerry likes juice.", "user"),
]
# load into all memory sources modules"
for m in msgs:
composable_memory.put(m)
从内存中get()
消息¶
当调用 .get()
方法时,我们会同样执行 primary
内存和所有 secondary
来源的 .get()
方法。这将生成一系列消息列表,我们需要将这些消息"组合"成一组合理的统一消息(以便传递给下游的智能体)。在此过程中通常需要特别注意,确保最终的消息序列既合理又符合大语言模型提供商的聊天API规范。
对于 SimpleComposableMemory
,我们会将 secondary
来源的消息注入到 primary
内存的系统消息中。primary
来源的其余消息历史保持不变,这种组合方式就是最终返回的结果。
msgs = composable_memory.get("What does Bob like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
# see the memory injected into the system message of the primary memory
print(msgs[0])
system: You are a REALLY helpful assistant. Below are a set of relevant dialogues retrieved from potentially several memory sources: =====Relevant messages from memory source 1===== USER: Bob likes burgers. ASSISTANT: Indeed, Bob likes apples. =====End of relevant messages from memory source 1====== This is the end of the retrieved message dialogues.
连续调用 get()
方法¶
连续调用 get()
会直接替换系统提示中已加载的 secondary
内存消息。
msgs = composable_memory.get("What does Alice like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
# see the memory injected into the system message of the primary memory
print(msgs[0])
system: You are a REALLY helpful assistant. Below are a set of relevant dialogues retrieved from potentially several memory sources: =====Relevant messages from memory source 1===== USER: Alice likes apples. =====End of relevant messages from memory source 1====== This is the end of the retrieved message dialogues.
如果 get()
检索到已存在于 primary
内存中的 secondary
消息会怎样?¶
当从 secondary
内存检索到的消息已存在于 primary
内存时,这些冗余的次要消息将不会被添加到系统消息中。在以下示例中,消息"Jerry likes juice."已被 put
到所有内存源,因此系统消息不会发生改变。
msgs = composable_memory.get("What does Jerry like?")
msgs
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
如何 reset
内存¶
与其他方法 put()
和 get()
类似,调用 reset()
会同时在 primary
和 secondary
存储源上执行 reset()
。若只需重置 primary
,则应仅调用其自身的 reset()
方法。
reset()
仅重置主内存¶
composable_memory.primary_memory.reset()
composable_memory.primary_memory.get()
[]
composable_memory.secondary_memory_sources[0].get("What does Alice like?")
[ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]
重置所有内存源 reset()
¶
composable_memory.reset()
composable_memory.primary_memory.get()
[]
composable_memory.secondary_memory_sources[0].get("What does Alice like?")
[]
在智能体中使用 SimpleComposableMemory
¶
这里我们将使用一个带有智能体的 SimpleComposableMemory
,并演示如何利用次要的长期记忆存储源,将某个智能体对话中的消息作为另一个智能体会话对话的一部分来使用。
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent
import nest_asyncio
nest_asyncio.apply()
定义内存模块¶
vector_memory = VectorMemory.from_defaults(
vector_store=None, # leave as None to use default in-memory vector store
embed_model=OpenAIEmbedding(),
retriever_kwargs={"similarity_top_k": 2},
)
chat_memory_buffer = ChatMemoryBuffer.from_defaults()
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=chat_memory_buffer,
secondary_memory_sources=[vector_memory],
)
定义我们的智能体¶
def multiply(a: int, b: int) -> int:
"""Multiply two integers and returns the result integer"""
return a * b
def mystery(a: int, b: int) -> int:
"""Mystery function on two numbers"""
return a**2 - b**2
multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool],
llm=llm,
memory=composable_memory,
verbose=True,
)
执行某些函数调用¶
当调用 .chat()
方法时,消息会被存入可组合内存。根据前文所述,这意味着所有消息会同时存入 primary
和 secondary
两个存储源。
response = agent.chat("What is the mystery function on 5 and 6?")
Added user message to memory: What is the mystery function on 5 and 6? === Calling Function === Calling function: mystery with args: {"a": 5, "b": 6} === Function Output === -11 === LLM Response === The mystery function on 5 and 6 returns -11.
response = agent.chat("What happens if you multiply 2 and 3?")
Added user message to memory: What happens if you multiply 2 and 3? === Calling Function === Calling function: multiply with args: {"a": 2, "b": 3} === Function Output === 6 === LLM Response === If you multiply 2 and 3, the result is 6.
新建代理会话¶
既然我们已经将消息添加到 vector_memory
中,现在可以观察使用该内存与不使用时的效果差异。具体而言,我们会要求新代理"回忆"函数调用的输出结果,而非重新计算。
一个没有过往记忆的智能体¶
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_without_memory = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool], llm=llm, verbose=True
)
response = agent_without_memory.chat(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute. === LLM Response === I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.
拥有过往记忆的智能体¶
我们观察到,无法访问过往记忆的智能体无法完成任务。接下来这个智能体将实际接收我们之前的长期记忆(即 vector_memory
)。请注意,我们甚至使用了全新的 ChatMemoryBuffer
,这意味着该智能体不存在任何 chat_history
。尽管如此,它仍能从长期记忆中检索所需的过往对话记录。
llm = OpenAI(model="gpt-3.5-turbo-0613")
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=ChatMemoryBuffer.from_defaults(),
secondary_memory_sources=[
vector_memory.copy(
deep=True
) # using a copy here for illustration purposes
# later will use original vector_memory again
],
)
agent_with_memory = FunctionCallingAgent.from_tools(
[multiply_tool, mystery_tool],
llm=llm,
memory=composable_memory,
verbose=True,
)
agent_with_memory.chat_history # an empty chat history
[]
response = agent_with_memory.chat(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute. === LLM Response === The output of the mystery function on 5 and 6 is -11.
response = agent_with_memory.chat(
"What was the output of the multiply function on 2 and 3 again? Don't recompute."
)
Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute. === LLM Response === The output of the multiply function on 2 and 3 is 6.
agent_with_memory.chat_history
[ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}), ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]
.chat(user_input)
的底层运行机制¶
在底层实现中,.chat(user_input)
调用实际上会以 user_input
作为参数调用内存的 .get()
方法。正如我们在前一节了解到的,这将最终返回 primary
和所有 secondary
内存源的组合结果。这些组合后的消息会作为聊天历史记录传递给 LLM 的聊天 API。
composable_memory = SimpleComposableMemory.from_defaults(
primary_memory=ChatMemoryBuffer.from_defaults(),
secondary_memory_sources=[
vector_memory.copy(
deep=True
) # copy for illustrative purposes to explain what
# happened under the hood from previous subsection
],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)
agent_with_memory.memory.get(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)
[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: What is the mystery function on 5 and 6?\n\tASSISTANT: None\n\tTOOL: -11\n\tASSISTANT: The mystery function on 5 and 6 returns -11.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={})]
print(
agent_with_memory.memory.get(
"What was the output of the mystery function on 5 and 6 again? Don't recompute."
)[0]
)
system: You are a helpful assistant. Below are a set of relevant dialogues retrieved from potentially several memory sources: =====Relevant messages from memory source 1===== USER: What is the mystery function on 5 and 6? ASSISTANT: None TOOL: -11 ASSISTANT: The mystery function on 5 and 6 returns -11. =====End of relevant messages from memory source 1====== This is the end of the retrieved message dialogues.