Google Drive Reader
Demonstrates our Google Drive data connector.
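Prerequisites
Follow these steps to set up your environment:
- Enable the Google Drive API in your GCP project.
- Configure an OAuth consent screen for your GCP project.
  - If you are not in a Google Workspace, setting it to "External" is fine.
- Create client credentials for your application (this notebook).
  - Make sure to use "Desktop app" as the application type.
- Move these client credentials to the directory this notebook is in, and name the file "credentials.json".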
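Before running the reader, it can help to confirm that the credentials file is the right kind. The sketch below is a hypothetical helper (`client_type` is not part of LlamaIndex) and assumes the usual layout of a downloaded OAuth client file, where a "Desktop app" client stores its settings under a top-level `installed` key and a web client under `web`. The demo writes a throwaway file with placeholder values rather than real credentials.

```python
import json
import tempfile
from pathlib import Path


def client_type(credentials_path):
    """Return the top-level key of an OAuth client file ('installed' for desktop apps)."""
    data = json.loads(Path(credentials_path).read_text(encoding="utf-8"))
    for key in ("installed", "web"):
        if key in data:
            return key
    raise ValueError("Unrecognized OAuth client file layout")


# Demo with a throwaway file (placeholder values, not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "credentials.json"
    demo_path.write_text(
        json.dumps({"installed": {"client_id": "placeholder"}}), encoding="utf-8"
    )
    demo_type = client_type(demo_path)

print(demo_type)
```

In the notebook you would call `client_type("credentials.json")` instead; a result other than `installed` suggests the client was not created as a "Desktop app".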
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index llama-index-readers-google
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
from llama_index.core import SummaryIndex
from llama_index.readers.google import GoogleDriveReader
from IPython.display import Markdown, display
Choose Folder to Read
You can find a folder ID by navigating to the folder in Google Drive and copying the last part of the URL.
For example, for the URL https://drive.google.com/drive/u/0/folders/abcdefgh12345678, the folder ID is abcdefgh12345678.
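The extraction described above can be sketched with the standard library. `extract_folder_id` is a hypothetical helper, not part of the reader, and it assumes the URL follows the `.../folders/<id>` shape shown in the example; any query string such as a sharing parameter is ignored.

```python
from urllib.parse import urlparse


def extract_folder_id(url):
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"Not a Drive folder URL: {url}")
    idx = segments.index("folders")
    if idx + 1 >= len(segments):
        raise ValueError(f"No folder ID found in: {url}")
    return segments[idx + 1]


example_id = extract_folder_id(
    "https://drive.google.com/drive/u/0/folders/abcdefgh12345678"
)
print(example_id)
```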
# Replace the placeholder with your chosen folder ID (a single string)
folder_id = "<your_folder_id>"
# Make sure credentials.json exists in the current directory (data_connectors)
documents = GoogleDriveReader().load_data(folder_id=folder_id)
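After loading, a quick sanity check on the result can catch an empty or unexpected folder early. The sketch below is a hypothetical helper that relies only on each document exposing `.text` and `.metadata`. The default metadata key `"mime type"` is an assumption about what the reader records, so adjust it to whatever keys your documents actually carry. Stand-in objects are used here so the example runs without Drive access.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_documents(documents, metadata_key="mime type"):
    """Count documents, total characters, and values of one metadata field."""
    total_chars = sum(len(doc.text) for doc in documents)
    by_key = Counter(
        doc.metadata.get(metadata_key, "unknown") for doc in documents
    )
    return {
        "count": len(documents),
        "total_chars": total_chars,
        "by_key": dict(by_key),
    }


# Stand-in objects so the sketch runs without Drive access.
sample = [
    SimpleNamespace(text="hello", metadata={"mime type": "text/plain"}),
    SimpleNamespace(text="world!", metadata={}),
]
summary = summarize_documents(sample)
print(summary)
```

In the notebook, `summarize_documents(documents)` would be called on the list returned by `load_data`.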
index = SummaryIndex.from_documents(documents)
# Set Logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("<query_text>")
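The comment in the cell above mentions DEBUG logging. One way to switch it on, assuming the root logger is the one configured with `logging.basicConfig` earlier in the notebook, is to raise its verbosity before running the query:

```python
import logging

# Lower the root logger's threshold so DEBUG records are emitted as well.
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)

debug_enabled = root_logger.isEnabledFor(logging.DEBUG)
print(debug_enabled)
```

Calling `root_logger.setLevel(logging.INFO)` afterwards restores the level set at the top of the notebook.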
display(Markdown(f"<b>{response}</b>"))