AnalyticDB¶
AnalyticDB for PostgreSQL is a massively parallel processing (MPP) data warehouse service designed for online analysis of massive data volumes.

To run this notebook, you need an AnalyticDB for PostgreSQL instance deployed in the cloud (available from the Alibaba Cloud purchase page). After creating the instance, create an administrator account via the API or through the "Account Management" section on the instance detail page.

Make sure llama-index is installed:
In [ ]:
%pip install llama-index-vector-stores-analyticdb
In [ ]:
!pip install llama-index
Provide the parameters:¶
In [ ]:
import getpass

# Alibaba Cloud RAM AccessKey ID and secret:
alibaba_cloud_ak = ""
alibaba_cloud_sk = ""

# Instance information:
region_id = "cn-hangzhou"  # region ID of the specific instance
instance_id = "gp-xxxx"  # AnalyticDB instance ID
account = "test_account"  # account created via the API or "Account Management" on the instance detail page
account_password = getpass.getpass("AnalyticDB account password: ")  # prompt instead of hard-coding the password
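Rather than hard-coding the access key and secret in the notebook, you can read them from environment variables. A minimal sketch; the variable names below are illustrative assumptions, not names required by llama-index or AnalyticDB:

```python
import os

# Illustrative: pull credentials from the environment instead of hard-coding
# them. These environment variable names are assumptions for this example.
alibaba_cloud_ak = os.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID", "")
alibaba_cloud_sk = os.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET", "")
region_id = os.getenv("ADB_REGION_ID", "cn-hangzhou")  # falls back to a default
instance_id = os.getenv("ADB_INSTANCE_ID", "gp-xxxx")
```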
Import the required packages:¶
In [ ]:
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
StorageContext,
)
from llama_index.vector_stores.analyticdb import AnalyticDBVectorStore
Load the sample data:¶
In [ ]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
Read the data:¶
In [ ]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
print(f"Total documents: {len(documents)}")
print(f"First document, id: {documents[0].doc_id}")
print(f"First document, hash: {documents[0].hash}")
print(
"First document, text"
f" ({len(documents[0].text)} characters):\n{'='*20}\n{documents[0].text[:360]} ..."
)
Create the AnalyticDB vector store object:¶
In [ ]:
analytic_db_store = AnalyticDBVectorStore.from_params(
access_key_id=alibaba_cloud_ak,
access_key_secret=alibaba_cloud_sk,
region_id=region_id,
instance_id=instance_id,
account=account,
account_password=account_password,
namespace="llama",
collection="llama",
metrics="cosine",
embedding_dimension=1536,
)
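The `metrics="cosine"` argument means nearest-neighbor search ranks stored embeddings by cosine similarity to the query embedding (and `embedding_dimension=1536` must match the dimension of your embedding model). As a quick illustration of what that metric measures, in plain Python and not part of the AnalyticDB API:

```python
import math

# Cosine similarity: the cosine of the angle between two vectors, so it
# depends on direction only, not magnitude. 1.0 means identical direction,
# 0.0 means orthogonal.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```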
Build the index from the documents:¶
In [ ]:
storage_context = StorageContext.from_defaults(vector_store=analytic_db_store)
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
Query using the index:¶
In [ ]:
query_engine = index.as_query_engine()
response = query_engine.query("Why did the author choose to work on AI?")
print(response.response)
Delete the collection:¶
In [ ]:
analytic_db_store.delete_collection()