Advanced Prompt Techniques (Variable Mappings and Functions)
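This notebook shows some advanced prompt techniques. They let you define more customized and expressive prompts, reuse existing prompts, and express certain operations in fewer lines of code.
We will demonstrate the following features:
- Partial formatting
- Prompt template variable mappings
- Prompt function mappings
- Dynamic few-shot examples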
In [ ]:
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-openai
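1. Partial Formatting
Partial formatting (partial_format) lets you fill in some of a prompt's variables now and leave the rest to be filled in later.
This is convenient because you do not have to carry every required prompt variable all the way to the format call; you can partially format the prompt as each variable becomes available.
This operation creates a copy of the prompt template.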
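To make the copy semantics concrete, here is a minimal, dependency-free sketch. MiniTemplate is a hypothetical stand-in written for illustration, not LlamaIndex's implementation; it only assumes that placeholders use the {{ name }} form seen in this notebook.

```python
import re
from dataclasses import dataclass, field


@dataclass
class MiniTemplate:
    """Hypothetical stand-in for a prompt template (not a LlamaIndex class)."""

    template: str
    kwargs: dict = field(default_factory=dict)

    def partial_format(self, **kwargs):
        # Return a new object; the original template is left unchanged.
        return MiniTemplate(self.template, {**self.kwargs, **kwargs})

    def format(self, **kwargs):
        # Values stored by partial_format are merged with the ones given now.
        values = {**self.kwargs, **kwargs}
        return re.sub(
            r"\{\{\s*(\w+)\s*\}\}",
            lambda m: str(values[m.group(1)]),
            self.template,
        )


base = MiniTemplate("Answer in the style of {{ tone_name }}: {{ query_str }}")
partial = base.partial_format(tone_name="Shakespeare")

print(base.kwargs)  # the original stores nothing
print(partial.kwargs)  # the copy remembers tone_name
print(partial.format(query_str="How many params does llama 2 have"))
```

Because partial_format returns a new object in this sketch, the same base template can be specialized several ways without the variants interfering with each other.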
In [ ]:
from llama_index.core.prompts import RichPromptTemplate
qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ context_str }}
---------------------
Given the context information and not prior knowledge, answer the query.
Please write the answer in the style of {{ tone_name }}
Query: {{ query_str }}
Answer: \
"""
prompt_tmpl = RichPromptTemplate(qa_prompt_tmpl_str)
In [ ]:
partial_prompt_tmpl = prompt_tmpl.partial_format(tone_name="Shakespeare")
In [ ]:
partial_prompt_tmpl.kwargs
Out[ ]:
{'tone_name': 'Shakespeare'}
In [ ]:
fmt_prompt = partial_prompt_tmpl.format(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)
Context information is below.
---------------------
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters
---------------------
Given the context information and not prior knowledge, answer the query.
Please write the answer in the style of Shakespeare
Query: How many params does llama 2 have
Answer:
We can also use format_messages to format the prompt into ChatMessage objects.
In [ ]:
fmt_prompt = partial_prompt_tmpl.format_messages(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)
[ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Context information is below.'), TextBlock(block_type='text', text='---------------------'), TextBlock(block_type='text', text='In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters'), TextBlock(block_type='text', text='---------------------'), TextBlock(block_type='text', text='Given the context information and not prior knowledge, answer the query.'), TextBlock(block_type='text', text='Please write the answer in the style of Shakespeare'), TextBlock(block_type='text', text='Query: How many params does llama 2 have'), TextBlock(block_type='text', text='Answer:')])]
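2. Prompt Template Variable Mappings
Template variable mappings let you specify a mapping from the "expected" prompt keys (e.g. context_str and query_str for response synthesis) to the keys actually used in your template.
This lets you reuse existing string templates without tediously rewriting their template variables.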
In [ ]:
from llama_index.core.prompts import RichPromptTemplate
# NOTE: here notice we use `my_context` and `my_query` as template variables
qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ my_context }}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {{ my_query }}
Answer: \
"""
template_var_mappings = {"context_str": "my_context", "query_str": "my_query"}
prompt_tmpl = RichPromptTemplate(
    qa_prompt_tmpl_str, template_var_mappings=template_var_mappings
)
In [ ]:
fmt_prompt = prompt_tmpl.format(
    context_str="In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters",
    query_str="How many params does llama 2 have",
)
print(fmt_prompt)
Context information is below.
---------------------
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters
---------------------
Given the context information and not prior knowledge, answer the query.
Query: How many params does llama 2 have
Answer:
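The renaming step behind a variable mapping can be sketched without the library. The helper format_with_mappings below is hypothetical and only illustrates the idea under the assumption that a mapping is a plain dict from expected key to template key; LlamaIndex's internals may differ.

```python
import re


def format_with_mappings(template, template_var_mappings, **kwargs):
    """Rename expected keys to the template's own variable names, then fill."""
    # Keys without a mapping entry are passed through unchanged.
    renamed = {template_var_mappings.get(k, k): v for k, v in kwargs.items()}
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(renamed[m.group(1)]),
        template,
    )


template = "Context: {{ my_context }}\nQuery: {{ my_query }}"
mappings = {"context_str": "my_context", "query_str": "my_query"}

result = format_with_mappings(
    template,
    mappings,
    context_str="Llama 2 ranges from 7B to 70B.",
    query_str="How big is it?",
)
print(result)
```

The caller keeps using the expected names (context_str, query_str), so a template written with different variable names can be dropped in without editing it.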
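3. Prompt Function Mappings
You can also pass functions as template variables instead of fixed values.
This lets you dynamically inject variables at query time whose values depend on other values.
Below are some basic examples. More advanced examples (e.g. few-shot examples) are covered in the Prompt Engineering for RAG guide.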
In [ ]:
from llama_index.core.prompts import RichPromptTemplate
qa_prompt_tmpl_str = """\
Context information is below.
---------------------
{{ context_str }}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {{ query_str }}
Answer: \
"""
def format_context_fn(**kwargs):
    # format context with bullet points
    context_list = kwargs["context_str"].split("\n\n")
    fmtted_context = "\n\n".join([f"- {c}" for c in context_list])
    return fmtted_context
prompt_tmpl = RichPromptTemplate(
    qa_prompt_tmpl_str, function_mappings={"context_str": format_context_fn}
)
In [ ]:
# Paragraphs are separated by blank lines so format_context_fn can split on "\n\n"
context_str = """\
In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.

Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.
"""
fmt_prompt = prompt_tmpl.format(
    context_str=context_str, query_str="How many params does llama 2 have"
)
print(fmt_prompt)
Context information is below.
---------------------
- In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.
- Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.
- Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models.
---------------------
Given the context information and not prior knowledge, answer the query.
Query: How many params does llama 2 have
Answer:
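As a rough illustration of the mechanism, the sketch below resolves function-mapped variables before filling the template. format_with_functions and bullet_context are hypothetical names, and the assumption that each function receives all caller-supplied keyword arguments mirrors the **kwargs signature used in this notebook rather than any documented internal behavior.

```python
import re


def format_with_functions(template, function_mappings, **kwargs):
    """Fill {{ name }} placeholders, computing mapped variables with functions."""
    values = dict(kwargs)
    for name, fn in function_mappings.items():
        # Each function receives every caller-supplied kwarg and returns
        # the final string for its variable.
        values[name] = fn(**kwargs)
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(values[m.group(1)]),
        template,
    )


def bullet_context(**kwargs):
    # Turn blank-line-separated paragraphs into bullet points.
    paragraphs = kwargs["context_str"].split("\n\n")
    return "\n".join(f"- {p}" for p in paragraphs)


result = format_with_functions(
    "{{ context_str }}\nQuery: {{ query_str }}",
    {"context_str": bullet_context},
    context_str="Llama 2 has 7B to 70B parameters.\n\nLlama 2-Chat targets dialogue.",
    query_str="How many params?",
)
print(result)
```

A mapped function may read any of the other variables (for example query_str), which is what makes query-dependent injection possible.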
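4. Dynamic Few-Shot Examples
With function mappings, you can also dynamically inject few-shot examples based on other prompt variables.
The example below uses a vector store to dynamically inject text-to-SQL few-shot examples based on the query.
First, we define a text-to-SQL prompt template.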
In [ ]:
text_to_sql_prompt_tmpl_str = """\
You are a SQL expert. You are given a natural language query, and your job is to convert it into a SQL query.
Here are some examples of how you should convert natural language to SQL:
<examples>
{{ examples }}
</examples>
Now it's your turn.
Query: {{ query_str }}
SQL:
"""
Given this prompt template, we define and index some text-to-SQL few-shot examples.
In [ ]:
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
In [ ]:
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.schema import TextNode
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
# Set global default LLM and embed model
Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
# Setup few-shot examples
example_nodes = [
    TextNode(
        text="Query: How many params does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_params;"
    ),
    TextNode(
        text="Query: How many layers does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_layers;"
    ),
]
# Create index
index = VectorStoreIndex(nodes=example_nodes)
# Create retriever
retriever = index.as_retriever(similarity_top_k=1)
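With the retriever in place, we can create a prompt template with a function mapping that dynamically injects few-shot examples based on the query.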
In [ ]:
from llama_index.core.prompts import RichPromptTemplate
def get_examples_fn(**kwargs):
    query = kwargs["query_str"]
    examples = retriever.retrieve(query)
    return "\n\n".join(node.text for node in examples)
prompt_tmpl = RichPromptTemplate(
    text_to_sql_prompt_tmpl_str,
    function_mappings={"examples": get_examples_fn},
)
In [ ]:
prompt = prompt_tmpl.format(
    query_str="What are the number of parameters in the llama 2 model?"
)
print(prompt)
You are a SQL expert. You are given a natural language query, and your job is to convert it into a SQL query.
Here are some examples of how you should convert natural language to SQL:
<examples>
Query: How many params does llama 2 have?
SQL: SELECT COUNT(*) FROM llama_2_params;
</examples>
Now it's your turn.
Query: What are the number of parameters in the llama 2 model?
SQL:
In [ ]:
response = Settings.llm.complete(prompt)
print(response.text)
SELECT COUNT(*) FROM llama_2_params;
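The same pattern can be tried without an API key by swapping the vector retriever for a toy one. The sketch below is an assumption-laden stand-in: EXAMPLES, tokenize, and retrieve are hypothetical names, and ranking by shared words is a crude substitute for embedding similarity (it would not, for instance, match "parameters" to "params").

```python
import re

# Few-shot examples, mirroring the ones indexed above.
EXAMPLES = [
    "Query: How many params does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_params;",
    "Query: How many layers does llama 2 have?\nSQL: SELECT COUNT(*) FROM llama_2_layers;",
]


def tokenize(text):
    # Lowercased alphanumeric tokens; "llama_2_layers" yields llama, 2, layers.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, top_k=1):
    # Rank examples by how many tokens they share with the query.
    query_tokens = tokenize(query)
    ranked = sorted(
        EXAMPLES,
        key=lambda example: len(query_tokens & tokenize(example)),
        reverse=True,
    )
    return ranked[:top_k]


def get_examples_fn(**kwargs):
    # Same shape as the function mapping above: read query_str, return text.
    return "\n\n".join(retrieve(kwargs["query_str"]))


selected = get_examples_fn(query_str="How many layers are in llama 2?")
print(selected)
```

Because get_examples_fn keeps the same signature, it could be passed in function_mappings in place of the retriever-backed version when experimenting offline.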