FlagEmbeddingReranker
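Reranking can speed up an LLM query without sacrificing accuracy (and may in fact improve it). It does so by pruning irrelevant nodes from the context before the context is passed to the LLM.
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.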
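To make the pruning step concrete, here is a minimal, self-contained sketch of the idea: score every retrieved candidate against the query, then keep only the best `top_n`. The helper names `rerank_top_n` and `word_overlap` are hypothetical and not part of LlamaIndex, and the word-overlap score is only a toy stand-in for a real reranker model such as the one used later in this notebook.

```python
from typing import Callable, List, Tuple


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score (an assumption for illustration only):
    the fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank_top_n(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Score each candidate against the query and keep the top_n best."""
    scored = [(text, score_fn(query, text)) for text in candidates]
    # Highest score first; Python's sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


candidates = [
    "I applied to three grad schools",
    "Lisp was interesting for its own sake",
    "We got 225 applications for the program",
    "The author wanted to study AI in grad school",
]
kept = rerank_top_n("grad school", candidates, word_overlap, top_n=2)
for text, score in kept:
    print(f"{score:.2f}  {text}")
```

Only the two highest-scoring candidates survive, so a downstream LLM call would see a shorter context.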
In [ ]:
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install llama-index-postprocessor-flag-embedding-reranker
In [ ]:
!pip install llama-index
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
In [ ]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
Download Data
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
In [ ]:
import os
OPENAI_API_KEY = "sk-"
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
In [ ]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
In [ ]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)
In [ ]:
# build index
index = VectorStoreIndex.from_documents(documents=documents)
In [ ]:
from llama_index.postprocessor.flag_embedding_reranker import (
    FlagEmbeddingReranker,
)
rerank = FlagEmbeddingReranker(model="BAAI/bge-reranker-large", top_n=5)
First, we try with reranking. We time the query to see how long it takes to process the output from the retrieved context.
In [ ]:
from time import time
In [ ]:
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[rerank]
)
now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 5.37s
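The timing pattern used in these cells can also be wrapped in a small reusable helper. The sketch below is only an illustration: `timed` is a hypothetical name, and it uses `time.perf_counter`, a monotonic clock intended for measuring intervals, instead of the `time()` call used in this notebook. The `sum(...)` line is a placeholder for a call such as `query_engine.query(...)`.

```python
from contextlib import contextmanager
from time import perf_counter


@contextmanager
def timed(label: str, results: dict):
    """Store the wall-clock seconds spent inside the with-block in results[label]."""
    start = perf_counter()
    try:
        yield
    finally:
        # Recorded even if the block raises, so failed runs are still timed.
        results[label] = perf_counter() - start


timings = {}
with timed("demo", timings):
    sum(range(100_000))  # placeholder for query_engine.query(...)
print(f"Elapsed: {timings['demo']:.2f}s")
```

Collecting the results in a dictionary makes it easy to compare several runs side by side, for example one entry with reranking and one without.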
In [ ]:
print(response)
The author applied to three grad schools: MIT, Yale, and Harvard. The reason for applying to these schools was because they were renowned for AI at the time and the author wanted to pursue a career in artificial intelligence.
In [ ]:
print(response.get_formatted_sources(length=200))
> Source (Doc id: f7e7f522-40ae-416a-917e-a70e59979105): I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious...
> Source (Doc id: df6c6b73-b488-4506-9ab1-ae5e8d499d44): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...
> Source (Doc id: 8ee64ca0-3a8d-49d2-a41d-cbf1e10216fd): [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who'd already graduated, or were about to that spring. Already this S...
> Source (Doc id: e95b6077-628a-4422-baad-765638cb6978): It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now when I was tired of walking, all I had to do was raise my hand, and (unless it ...
> Source (Doc id: 6c54f961-c5ff-466e-861a-66f5c1c25e36): I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in t...
Next, we try without reranking.
In [ ]:
query_engine = index.as_query_engine(similarity_top_k=10)
now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
Elapsed: 10.35s
In [ ]:
print(response)
The author applied to three grad schools: MIT, Yale, and Harvard. They chose these schools based on their strong reputations in the field of AI at the time. Additionally, Harvard was appealing because it was where Bill Woods, the inventor of the parser used in the author's SHRDLU clone, was located.
In [ ]:
print(response.get_formatted_sources(length=200))
> Source (Doc id: f7e7f522-40ae-416a-917e-a70e59979105): I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious...
> Source (Doc id: 6c54f961-c5ff-466e-861a-66f5c1c25e36): I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI. AI was in the air in t...
> Source (Doc id: d258db84-0975-4de0-a19b-752f529d9e5a): What I Worked On February 2021 Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed...
> Source (Doc id: 04582ebe-239a-432a-9304-611676593c66): It's not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite its current lack of prestige, it's a sign both that there's something real t...
> Source (Doc id: 8ee64ca0-3a8d-49d2-a41d-cbf1e10216fd): [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who'd already graduated, or were about to that spring. Already this S...
> Source (Doc id: d46b4c41-05f8-4492-b978-0ce1863a0f00): Now that I could write essays again, I wrote a bunch about topics I'd had stacked up. I kept writing essays through 2020, but I also started to think about other things I could work on. How should ...
> Source (Doc id: df6c6b73-b488-4506-9ab1-ae5e8d499d44): So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association ...
> Source (Doc id: d91c08cf-6f7d-4ac5-8cf0-d8bcba4e77ff): It was missing a lot of things you'd want in a programming language. So these had to be added, and when they were, they weren't defined using McCarthy's original axiomatic approach. That wouldn't h...
> Source (Doc id: e95b6077-628a-4422-baad-765638cb6978): It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now when I was tired of walking, all I had to do was raise my hand, and (unless it ...
> Source (Doc id: 027ba923-2307-4e28-8e6b-53be8e4db8ec): But Interleaf still had a few years to live yet. [5] Interleaf had done something pretty bold. Inspired by Emacs, they'd added a scripting language, and even made the scripting language a dialect ...
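As we can see, the query engine with reranking produced a more concise answer in roughly half the time (5.37s vs. 10.35s). Both responses are essentially correct, but the query engine without reranking included extra, less relevant information, an effect that can be attributed to "pollution of the context window".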
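The two source listings can also be compared directly. The sketch below copies the Doc id prefixes from the outputs shown in this notebook (they are specific to this run) and checks which of the 10 retrieved nodes the reranker kept. The variable names are illustrative only.

```python
# Doc id prefixes copied from the two source listings in this notebook.
with_rerank = ["f7e7f522", "df6c6b73", "8ee64ca0", "e95b6077", "6c54f961"]
without_rerank = [
    "f7e7f522", "6c54f961", "d258db84", "04582ebe", "8ee64ca0",
    "d46b4c41", "df6c6b73", "d91c08cf", "e95b6077", "027ba923",
]

# Nodes from the plain top-10 retrieval that survived reranking, and those that did not.
kept = [doc_id for doc_id in without_rerank if doc_id in with_rerank]
dropped = [doc_id for doc_id in without_rerank if doc_id not in with_rerank]

# Ratio of the two elapsed times reported in this notebook.
speedup = 10.35 / 5.37

print(f"kept {len(kept)} of {len(without_rerank)} retrieved nodes")
print(f"dropped: {dropped}")
print(f"speedup: {speedup:.2f}x")
```

In this run, every node in the reranked listing also appears in the plain top-10 listing, so the reranker reordered and halved the same candidate set instead of retrieving different text. A single timed run is noisy, so the ratio is indicative only.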