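Building a Browser Agent with AgentQL¶
AgentQL tools provide web interaction and structured data extraction from any web page, using either an AgentQL query or a natural language prompt. AgentQL can be used across multiple languages and web pages, and its queries are designed to keep working over time and when pages change.
This tutorial shows you how to:
- Create a browser agent with AgentQL tools and LlamaIndex
- Browse the Internet with AgentQL tools
- Scrape content from the Internet with AgentQL tools
Overview¶
AgentQL provides three function tools. The first does not require a browser and relies on the REST API:
- extract_web_data_with_rest_api: extracts structured data as JSON from a web page given its URL, using either an AgentQL query or a natural language description of the data.
The other two tools must be used with a Playwright browser or a remote browser instance connected through the Chrome DevTools Protocol (CDP):
- extract_web_data_from_browser: extracts structured data as JSON from the active web page in a browser, using either an AgentQL query or a natural language description.
- get_web_element_from_browser: finds a web element on the active web page in a browser from a natural language description and returns its CSS selector for further interaction.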
Tool Features¶
| Tool | Web Data Extraction | Web Element Extraction | Use With Local Browser |
|---|---|---|---|
| extract_web_data_with_rest_api | ✅ | ❌ | ❌ |
| extract_web_data_from_browser | ✅ | ❌ | ✅ |
| get_web_element_from_browser | ❌ | ✅ | ✅ |
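As a quick illustration, the feature table above can be written as a small lookup that answers "which tool fits my situation?". This is only a sketch: the dictionary keys (`data`, `element`, `local_browser`) and the helper name `tools_for` are hypothetical and are not part of AgentQL or LlamaIndex.

```python
# Capability matrix transcribed from the feature table above.
# The short keys are illustrative names chosen for this sketch.
TOOL_FEATURES = {
    "extract_web_data_with_rest_api": {"data": True, "element": False, "local_browser": False},
    "extract_web_data_from_browser": {"data": True, "element": False, "local_browser": True},
    "get_web_element_from_browser": {"data": False, "element": True, "local_browser": True},
}


def tools_for(need: str, local_browser: bool) -> list[str]:
    """Return the tools that provide `need` ("data" or "element")
    and match the given local-browser constraint."""
    return [
        name
        for name, features in TOOL_FEATURES.items()
        if features[need] and features["local_browser"] == local_browser
    ]


print(tools_for("data", local_browser=False))
print(tools_for("element", local_browser=True))
```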
Setup¶
%pip install llama-index-tools-agentql llama-index-tools-playwright llama-index
Credentials¶
To use the AgentQL tools, get your own API key from the AgentQL Dev Portal and set the AGENTQL_API_KEY environment variable:
import os
os.environ["AGENTQL_API_KEY"] = "YOUR_AGENTQL_API_KEY"
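If you would rather not hard-code the key in a notebook cell, one option is to read it from the environment and prompt for it only when it is missing. The sketch below uses only the Python standard library; the helper name `ensure_api_key` is hypothetical.

```python
import os
from getpass import getpass


def ensure_api_key(name: str = "AGENTQL_API_KEY", env=os.environ, ask=getpass) -> str:
    """Return the key stored under `name`, prompting once (without echo)
    and saving it into `env` if it is not set yet."""
    key = env.get(name, "").strip()
    if not key:
        key = ask(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} must not be empty")
        env[name] = key
    return key


# In a notebook you would simply call: ensure_api_key()
```

The `env` and `ask` parameters exist only so that the helper can be exercised without touching the real environment or waiting for keyboard input.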
Set up the Playwright browser and AgentQL tools¶
To run this notebook, install the Playwright browsers and configure the asyncio event loop for Jupyter Notebook.
!playwright install
# This import is required only for Jupyter notebooks, since they already run their own event loop
import nest_asyncio
nest_asyncio.apply()
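Outside of a notebook, no event loop is running yet and top-level `await` is not available, so `nest_asyncio` is not needed. A common pattern, sketched here, is to move the awaited calls from this tutorial into an `async` function and start it with `asyncio.run`. The body of `main` is a placeholder, not real AgentQL code.

```python
import asyncio


async def main() -> str:
    # Place the awaited tool calls from this tutorial here, for example:
    # result = await agentql_rest_api_tool.extract_web_data_with_rest_api(...)
    await asyncio.sleep(0)  # stand-in for real asynchronous work
    return "done"


if __name__ == "__main__":
    # asyncio.run creates the event loop, runs main() to completion, and closes the loop.
    print(asyncio.run(main()))
```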
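Instantiation¶
AgentQLRestAPIToolSpec¶
AgentQLRestAPIToolSpec provides the extract_web_data_with_rest_api function tool.
You can instantiate AgentQLRestAPIToolSpec with the following parameters:
- timeout: The number of seconds to wait for a request before timing out. Increase it if data extraction times out. Defaults to 900.
- is_stealth_mode_enabled: Whether to enable experimental anti-bot evasion strategies. This feature may not work on every website at all times, and data extraction may take longer with stealth mode enabled. Defaults to False.
- wait_for: The number of seconds to wait for the page to load before extracting data. Defaults to 0.
- is_scroll_to_bottom_enabled: Whether to scroll to the bottom of the page before extracting data. Defaults to False.
- mode: "standard" uses deep data analysis, while "fast" trades some depth of analysis for speed and is adequate for most use cases. See the AgentQL guide on modes for details. Defaults to "fast".
- is_screenshot_enabled: Whether to take a screenshot before extracting data. The screenshot is returned in 'metadata' as a Base64 string. Defaults to False.
AgentQLRestAPIToolSpec uses the AgentQL REST API. For more details about the parameters, see the API reference documentation.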
from llama_index.tools.agentql import AgentQLRestAPIToolSpec
agentql_rest_api_tool = AgentQLRestAPIToolSpec()
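To make the parameters above concrete, here is a sketch that builds keyword arguments for `AgentQLRestAPIToolSpec` from two coarse hints. The parameter names and default values come from the list above; the helper names (`rest_api_options`, `make_rest_api_tool`) and the values chosen for a slow page (`wait_for=5`, scrolling, `"standard"` mode) are illustrative assumptions, not recommendations from AgentQL.

```python
def rest_api_options(slow_page: bool = False, needs_stealth: bool = False) -> dict:
    """Build keyword arguments for AgentQLRestAPIToolSpec."""
    # Start from the documented defaults.
    options = {
        "timeout": 900,
        "is_stealth_mode_enabled": needs_stealth,
        "wait_for": 0,
        "is_scroll_to_bottom_enabled": False,
        "mode": "fast",
        "is_screenshot_enabled": False,
    }
    if slow_page:
        # Assumed tuning for pages that load content lazily.
        options.update(wait_for=5, is_scroll_to_bottom_enabled=True, mode="standard")
    return options


def make_rest_api_tool(**hints):
    """Instantiate the tool spec; requires llama-index-tools-agentql to be installed."""
    from llama_index.tools.agentql import AgentQLRestAPIToolSpec

    return AgentQLRestAPIToolSpec(**rest_api_options(**hints))


print(rest_api_options(slow_page=True))
```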
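AgentQLBrowserToolSpec¶
AgentQLBrowserToolSpec provides two tools: extract_web_data_from_browser and get_web_element_from_browser.
This tool spec can be instantiated with the following parameters:
- async_browser: An async Playwright browser instance.
- timeout_for_data: The number of seconds to wait for an extract-data request before timing out. Defaults to 900.
- timeout_for_element: The number of seconds to wait for a get-element request before timing out. Defaults to 900.
- wait_for_network_idle: Whether to wait until the network is fully idle before acting. Defaults to True.
- include_hidden_for_data: Whether to take visually hidden elements on the page into account when extracting data. Defaults to True.
- include_hidden_for_element: Whether to take visually hidden elements on the page into account when getting an element. Defaults to False.
- mode: "standard" uses deep data analysis, while "fast" trades some depth of analysis for speed and is adequate for most use cases. See the AgentQL guide on modes for details. Defaults to "fast".
AgentQLBrowserToolSpec uses the AgentQL SDK. You can find more details about the parameters and functions in the SDK API reference.
Note: To instantiate AgentQLBrowserToolSpec, you need to provide a browser instance. You can create one with the create_async_playwright_browser utility method from LlamaIndex's Playwright ToolSpec.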
from llama_index.tools.playwright.base import PlaywrightToolSpec
from llama_index.tools.agentql import AgentQLBrowserToolSpec
async_browser = await PlaywrightToolSpec.create_async_playwright_browser()
agentql_browser_tool = AgentQLBrowserToolSpec(async_browser=async_browser)
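The overview mentions that the browser tools also work with a remote browser connected through CDP. A minimal sketch of that setup uses Playwright's `connect_over_cdp`. The endpoint `http://localhost:9222` is an assumption (for example, a Chrome instance started with `--remote-debugging-port=9222`), and the helper name `connect_remote_browser` is hypothetical.

```python
async def connect_remote_browser(cdp_url: str = "http://localhost:9222"):
    """Attach to an already running Chromium-based browser over CDP.

    Returns the Playwright driver together with the browser so the caller
    can later call `await browser.close()` and `await playwright.stop()`.
    """
    # Imported lazily so this module can be loaded without Playwright installed.
    from playwright.async_api import async_playwright

    playwright = await async_playwright().start()
    browser = await playwright.chromium.connect_over_cdp(cdp_url)
    return playwright, browser


# In a notebook cell (assumed usage):
# playwright, async_browser = await connect_remote_browser()
# agentql_browser_tool = AgentQLBrowserToolSpec(async_browser=async_browser)
```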
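Invoking the AgentQL tools¶
extract_web_data_with_rest_api¶
Under the hood, this tool uses AgentQL's REST API and sends the URL of a publicly available web page to AgentQL's endpoint. It does not work with private pages or logged-in sessions; use extract_web_data_from_browser for those cases.
- url: The URL of the web page to extract data from.
- query: The AgentQL query to execute. Use it when you want to extract data in a structure you define. Learn more about how to write an AgentQL query in the docs.
- prompt: A natural language description of the data to extract from the page. AgentQL infers the data structure from your prompt.
Note: You must define either query or prompt to use AgentQL.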
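The "either query or prompt" rule can be checked before a request is sent. The helper below is a hypothetical sketch, not part of the AgentQL tool spec; preferring `query` when both are supplied is an assumption made here for simplicity.

```python
from typing import Optional


def extraction_args(url: str, query: Optional[str] = None, prompt: Optional[str] = None) -> dict:
    """Validate the inputs and return keyword arguments for the extraction tool."""
    if not query and not prompt:
        raise ValueError("define either `query` or `prompt`")
    args = {"url": url}
    if query:
        # Assumption of this sketch: an explicit query wins over a prompt.
        args["query"] = query
    else:
        args["prompt"] = prompt
    return args


print(extraction_args("https://www.agentql.com/blog", query="{ posts[] { title url author date }}"))
```

The result can then be unpacked into the call, for example `await agentql_rest_api_tool.extract_web_data_with_rest_api(**extraction_args(...))`.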
# You can invoke the tool with either a query or a prompt
# await agentql_rest_api_tool.extract_web_data_with_rest_api(
# url="https://www.agentql.com/blog",
# prompt="the blog posts with title, url, author and publication date",
# )
await agentql_rest_api_tool.extract_web_data_with_rest_api(
url="https://www.agentql.com/blog",
query="{ posts[] { title url author date }}",
)
{'data': {'posts': [{'title': 'AgentQL MCP Server: Structured Web Data for Claude, Cursor, Windsurf, and more', 'url': 'https://www.agentql.com/blog/2025-mcp-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 12, 2025'}, {'title': 'Dify + AgentQL: Build AI Apps with Live Web Data, No Code Needed', 'url': 'https://www.agentql.com/blog/2025-dify-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 11, 2025'}, {'title': 'Zapier + AgentQL: No-Code Web Data for Smarter Workflows', 'url': 'https://www.agentql.com/blog/2025-zapier-integration', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 10, 2025'}, {'title': 'Something is coming.', 'url': 'https://www.agentql.com/blog/2025-iw-teaser', 'author': 'Rachel-Lee Nabors', 'date': 'Mar 7, 2025'}, {'title': 'Automated web application testing with AI and Playwright', 'url': 'https://www.agentql.com/blog/2025-automated-testing-web-ai-playwright', 'author': 'Vladimir de Turckheim', 'date': 'Feb 26, 2025'}]}, 'metadata': {'request_id': '5a43ab86-f68b-4470-bca9-ab51a791041a', 'generated_query': None, 'screenshot': None}}
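The response above is a dictionary with a `data` part and a `metadata` part. The sketch below separates the two and decodes the screenshot when one is present, since it is returned as a Base64 string. The helper name `split_response` is hypothetical, and the sample is a shortened copy of the output above.

```python
import base64


def split_response(response: dict):
    """Return (data, request_id, screenshot_bytes) from a REST API tool response."""
    data = response.get("data") or {}
    metadata = response.get("metadata") or {}
    encoded = metadata.get("screenshot")
    # The screenshot is only present when is_screenshot_enabled=True.
    screenshot = base64.b64decode(encoded) if encoded else None
    return data, metadata.get("request_id"), screenshot


# Shortened copy of the response shown above.
sample = {
    "data": {
        "posts": [
            {
                "title": "Something is coming.",
                "url": "https://www.agentql.com/blog/2025-iw-teaser",
                "author": "Rachel-Lee Nabors",
                "date": "Mar 7, 2025",
            }
        ]
    },
    "metadata": {
        "request_id": "5a43ab86-f68b-4470-bca9-ab51a791041a",
        "generated_query": None,
        "screenshot": None,
    },
}

data, request_id, screenshot = split_response(sample)
for post in data.get("posts", []):
    print(post["date"], post["title"])
```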
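Stealth Mode¶
AgentQL provides experimental anti-bot evasion strategies to avoid detection by anti-bot services.
Note: Stealth mode is experimental and may not work on every website at all times. Data extraction may take longer to complete than in non-stealth mode.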
# agentql_rest_api_tool = AgentQLRestAPIToolSpec(is_stealth_mode_enabled=True)
await agentql_rest_api_tool.extract_web_data_with_rest_api(
url="https://www.patagonia.com/shop/web-specials/womens",
query="{ items[] { name price}}",
)
{'data': {'items': [{'name': "W's Recycled Down Sweater™ Parka - Pitch Blue (PIBL) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Shelter Brown (SHBN) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Pine Needle Green (PNGR) (28460)", 'price': 178.99}, {'name': "W's Recycled Down Sweater™ Parka - Burnished Red (BURR) (28460)", 'price': 178.99}, {'name': "W's Nano Puff® Jacket - Burnished Red (BURR) (84217)", 'price': 118.99}, {'name': "W's Nano Puff® Jacket - Pine Needle Green (PNGR) (84217)", 'price': 118.99}, {'name': "W's Powder Town Jacket - Vivid Apricot (VAPC) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Pine Needle Green (PNGR) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Dulse Mauve (DLMA) (31635)", 'price': 208.99}, {'name': "W's Powder Town Jacket - Smolder Blue w/Dulse Mauve (SBMA) (31635)", 'price': 208.99}, {'name': "W's Powder Town Pants - Pine Needle Green (PNGR) (31645)", 'price': 148.99}, {'name': "W's Powder Town Pants - Thermal Blue (TMBL) (31645)", 'price': 173.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Dulse Mauve (DLMA) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Synched Flight Small: Natural (SYNL) (25455)", 'price': 96.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Thermal Blue (TMBL) (25455)", 'price': 82.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Across Oceans: Pitch Blue (ASPH) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Terra Pink (TRPI) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Small Currents: Natural (SCNL) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Nickel w/Vivid Apricot (NLVA) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® Snap-T® Pullover - Echo Purple (ECPU) (25455)", 'price': 68.99}, {'name': "W's Lightweight Synchilla® 
Snap-T® Pullover - Oatmeal Heather w/Vessel Blue (OHVL) (25455)", 'price': 68.99}, {'name': "W's Down Sweater™ - Seabird Grey (SBDY) (84684)", 'price': 166.99}, {'name': "W's Pine Bank 3-in-1 Parka - Shelter Brown (SHBN) (21025)", 'price': 273.99}, {'name': "W's Pine Bank 3-in-1 Parka - Pitch Blue (PIBL) (21025)", 'price': 328.99}, {'name': "W's Pine Bank 3-in-1 Parka - Burnished Red (BURR) (21025)", 'price': 273.99}, {'name': "W's Pine Bank 3-in-1 Parka - Pine Needle Green (PNGR) (21025)", 'price': 273.99}, {'name': "W's SnowDrifter Jacket - Vessel Blue (VSLB) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Dulse Mauve (DLMA) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Vivid Apricot (VAPC) (30071)", 'price': 268.99}, {'name': "W's SnowDrifter Jacket - Thermal Blue (TMBL) (30071)", 'price': 268.99}, {'name': "W's Re-Tool Half-Snap Pullover - Burnished Red (BURR) (26465)", 'price': 78.99}, {'name': "W's Re-Tool Half-Snap Pullover - Vessel Blue (VSLB) (26465)", 'price': 94.99}, {'name': "W's Re-Tool Half-Snap Pullover - Dulse Mauve (DLMA) (26465)", 'price': 78.99}, {'name': "W's Re-Tool Half-Snap Pullover - Shelter Brown (SHBN) (26465)", 'price': 78.99}, {'name': "W's Insulated Storm Shift Jacket - Dulse Mauve (DLMA) (31835)", 'price': 383.99}, {'name': "W's Insulated Storm Shift Jacket - Pine Needle Green (PNGR) (31835)", 'price': 328.99}, {'name': "W's SnowDrifter Bibs - Black (BLK) (30081)", 'price': 238.99}, {'name': "W's SnowDrifter Bibs - Smolder Blue (SMDB) (30081)", 'price': 278.99}, {'name': "W's SnowDrifter Bibs - Dulse Mauve (DLMA) (30081)", 'price': 238.99}, {'name': "W's SnowDrifter Bibs - Pine Needle Green (PNGR) (30081)", 'price': 238.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Chevron Cable: Natural (CHNL) (51025)", 'price': 73.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Only Earth: Beeswax Tan (OETN) (51025)", 'price': 103.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - 
Snowdrift: Thermal Blue (SDTL) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Ridge: Pine Needle Green (RPNG) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Chevron Cable: Madder Red (CHMR) (51025)", 'price': 88.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Smolder Blue (SMDB) (51025)", 'price': 73.99}, {'name': "W's Recycled Wool-Blend Crewneck Sweater - Fireside: Shelter Brown (FISN) (51025)", 'price': 73.99}, {'name': "W's Micro D® Joggers - Synched Flight Small: Natural (SYNL) (22020)", 'price': 48.99}, {'name': "W's Micro D® Joggers - Endless Blue (ENLB) (22020)", 'price': 58.99}, {'name': "W's Micro D® Joggers - Small Currents: Natural (SCNL) (22020)", 'price': 48.99}, {'name': "W's Better Sweater® 1/4-Zip - Stormy Mauve (STMA) (25618)", 'price': 68.99}, {'name': "W's Better Sweater® 1/4-Zip - Dulse Mauve (DLMA) (25618)", 'price': 82.99}, {'name': "W's Better Sweater® 1/4-Zip - Torrey Pine Green (TPGN) (25618)", 'price': 82.99}, {'name': "W's Better Sweater® 1/4-Zip - Nouveau Green (NUVG) (25618)", 'price': 68.99}, {'name': "W's Better Sweater® 1/4-Zip - Raptor Brown (RPBN) (25618)", 'price': 68.99}, {'name': "W's Insulated Powder Town Pants - Black (BLK) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Smolder Blue (SMDB) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Dulse Mauve (DLMA) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Vivid Apricot (VAPC) (31185)", 'price': 160.99}, {'name': "W's Insulated Powder Town Pants - Across Oceans: Smolder Blue (ASBE) (31185)", 'price': 160.99}, {'name': 'Atom Sling 8L - Vessel Blue (VSLB) (48262)', 'price': 44.99}, {'name': 'Atom Sling 8L - Buckhorn Green (BUGR) (48262)', 'price': 44.99}, {'name': 'Atom Sling 8L - Dulse Mauve (DLMA) (48262)', 'price': 44.99}, {'name': "W's Classic Retro-X® Jacket - Natural w/Smolder Blue (NTSB) (23074)", 'price': 136.99}, {'name': "W's 
Classic Retro-X® Jacket - Nest Brown w/Dulse Mauve (NBDU) (23074)", 'price': 113.99}, {'name': "W's Classic Retro-X® Jacket - Small Currents: Natural (SCNL) (23074)", 'price': 113.99}, {'name': "W's Los Gatos 1/4-Zip - Salt Grey (SGRY) (25236)", 'price': 53.99}, {'name': "W's Los Gatos 1/4-Zip - Dulse Mauve (DLMA) (25236)", 'price': 64.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Nest Brown (NESB) (75100)", 'price': 68.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Pitch Blue (PIBL) (75100)", 'price': 68.99}, {'name': "W's Stand Up® Cropped Corduroy Overalls - Beeswax Tan (BWX) (75100)", 'price': 68.99}, {'name': "W's Synchilla® Jacket - Oatmeal Heather w/Natural (OTNL) (22955)", 'price': 88.99}, {'name': "W's Synchilla® Jacket - Black (BLK) (22955)", 'price': 73.99}, {'name': "W's Synchilla® Jacket - Pitch Blue (PIBL) (22955)", 'price': 73.99}, {'name': "W's Synchilla® Jacket - Beeswax Tan (BWX) (22955)", 'price': 73.99}, {'name': "W's Insulated Powder Town Jacket - Vivid Apricot (VAPC) (31200)", 'price': 238.99}, {'name': "W's Insulated Powder Town Jacket - Black (BLK) (31200)", 'price': 278.99}, {'name': "W's Insulated Powder Town Jacket - Across Oceans: Smolder Blue (ASBE) (31200)", 'price': 238.99}, {'name': "W's Powder Town Bibs - Smolder Blue (SMDB) (31650)", 'price': 178.99}, {'name': "W's Powder Town Bibs - Dulse Mauve (DLMA) (31650)", 'price': 208.99}, {'name': "W's Powder Town Bibs - Pine Needle Green (PNGR) (31650)", 'price': 178.99}, {'name': "W's Powder Town Bibs - Seabird Grey (SBDY) (31650)", 'price': 178.99}, {'name': "W's Retro Pile Marsupial - Thermal Blue (TMBL) (22835)", 'price': 73.99}, {'name': "W's Retro Pile Marsupial - Shroom Taupe (STPE) (22835)", 'price': 88.99}, {'name': "W's Retro Pile Marsupial - Shelter Brown (SHBN) (22835)", 'price': 73.99}, {'name': "W's Cord Fjord Coat - Dulse Mauve (DLMA) (26881)", 'price': 163.99}, {'name': "W's Cord Fjord Coat - Shelter Brown (SHBN) (26881)", 'price': 163.99}, {'name': 
"W's Regenerative Organic Certified® Cotton Essential Top - Thermal Blue (TMBL) (42171)", 'price': 41.99}, {'name': "W's Regenerative Organic Certified® Cotton Essential Top - Pine Needle Green (PNGR) (42171)", 'price': 41.99}, {'name': "W's Lonesome Mesa Long Coat - Pitch Blue (PIBL) (26655)", 'price': 148.99}, {'name': "W's Lonesome Mesa Long Coat - Pine Needle Green (PNGR) (26655)", 'price': 148.99}]}, 'metadata': {'request_id': '0016c761-92c1-47b5-9b8f-f71f9727d58d', 'generated_query': None, 'screenshot': None}}
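extract_web_data_from_browser¶
- query: The AgentQL query to execute. Use it when you want to extract data in a structure you define. Learn more about how to write an AgentQL query in the docs.
- prompt: A natural language description of the data to extract from the page. AgentQL infers the data structure from your prompt.
Note: You must define either query or prompt to use AgentQL.
To extract data, first navigate to the target web page with the navigate_to tool from LlamaIndex's Playwright ToolSpec.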
playwright_tool = PlaywrightToolSpec(async_browser=async_browser)
await playwright_tool.navigate_to("https://www.agentql.com/blog")
# You can invoke the tool with either a query or a prompt
# await agentql_browser_tool.extract_web_data_from_browser(
# query="{ posts[] { title url }}",
# )
await agentql_browser_tool.extract_web_data_from_browser(
prompt="the blog posts with title and url",
)
/Users/jisonz/Library/Caches/pypoetry/virtualenvs/llama-index-AJEGkUS0-py3.13/lib/python3.13/site-packages/agentql/_core/_utils.py:167: UserWarning: 🚨 The function get_data_by_prompt_experimental is experimental and may not work as expected 🚨
warnings.warn(
{'blog_post': [{'title': 'AgentQL MCP Server: Structured Web Data for Claude, Cursor, Windsurf, and more', 'url': 'https://www.agentql.com/blog/2025-mcp-integration'}, {'title': 'Dify + AgentQL: Build AI Apps with Live Web Data, No Code Needed', 'url': 'https://www.agentql.com/blog/2025-dify-integration'}, {'title': 'Zapier + AgentQL: No-Code Web Data for Smarter Workflows', 'url': 'https://www.agentql.com/blog/2025-zapier-integration'}, {'title': 'Something is coming.', 'url': 'https://www.agentql.com/blog/2025-iw-teaser'}, {'title': 'Automated web application testing with AI and Playwright', 'url': 'https://www.agentql.com/blog/2025-automated-testing-web-ai-playwright'}]}
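When a prompt is used, AgentQL infers the structure, so the top-level key (here `blog_post`) is not known in advance. A defensive way to consume such a result, sketched below with a hypothetical helper `first_list`, is to look for the first list-valued entry instead of hard-coding the key. The sample is a shortened copy of the output above.

```python
def first_list(result: dict):
    """Return (key, items) for the first list-valued entry, or (None, [])."""
    for key, value in result.items():
        if isinstance(value, list):
            return key, value
    return None, []


# Shortened copy of the prompt-based result shown above.
sample = {
    "blog_post": [
        {
            "title": "Something is coming.",
            "url": "https://www.agentql.com/blog/2025-iw-teaser",
        }
    ]
}

key, posts = first_list(sample)
print(key, [post["title"] for post in posts])
```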
get_web_element_from_browser¶
- prompt: A natural language description of the web element to find on the page.
await playwright_tool.navigate_to("https://www.agentql.com/blog")
print(await playwright_tool.get_current_page())
next_page_button = await agentql_browser_tool.get_web_element_from_browser(
prompt="The next page navigation button",
)
next_page_button
https://www.agentql.com/blog
"[tf623_id='1111']"
Click on the element and check the URL again:
await playwright_tool.click(next_page_button)
"Clicked element '[tf623_id='1111']'"
print(await playwright_tool.get_current_page())
https://www.agentql.com/blog/page/2
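The find, click, and check-URL steps above can be repeated to walk through several pages. The loop below is a sketch with injected callables; `FakeBlog` is a stand-in so the example runs without a browser. What the real tool returns when no matching element exists is not covered here, so stopping on an empty selector or an unchanged URL is an assumption.

```python
import asyncio


async def follow_next_pages(find_element, click, current_url, max_pages: int = 3) -> list:
    """Click the "next page" element up to `max_pages` times and return the visited URLs."""
    visited = [await current_url()]
    for _ in range(max_pages):
        selector = await find_element("The next page navigation button")
        if not selector:
            break  # assumed signal that there is no further page
        await click(selector)
        url = await current_url()
        if url == visited[-1]:
            break  # the click did not navigate anywhere
        visited.append(url)
    return visited


class FakeBlog:
    """In-memory stand-in for the browser tools, used only for this dry run."""

    def __init__(self):
        self.urls = ["https://www.agentql.com/blog", "https://www.agentql.com/blog/page/2"]
        self.index = 0

    async def find(self, prompt: str):
        return "[tf623_id='1111']" if self.index < len(self.urls) - 1 else None

    async def click(self, selector: str):
        self.index += 1

    async def url(self):
        return self.urls[self.index]


blog = FakeBlog()
visited = asyncio.run(follow_next_pages(blog.find, blog.click, blog.url))
print(visited)
```

With the real tools, the three callables could be `lambda p: agentql_browser_tool.get_web_element_from_browser(prompt=p)`, `playwright_tool.click`, and `playwright_tool.get_current_page`.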
Using AgentQL tools with an agent¶
To get started, you will need an OpenAI API key.
# set your openai key, if using openai
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
from llama_index.core.agent import FunctionCallingAgent
from llama_index.llms.openai import OpenAI
# We add playwright's click, get_current_page, and navigate_to tools to the agent along with agentql tools
playwright_tool = PlaywrightToolSpec(async_browser=async_browser)
playwright_tool_list = playwright_tool.to_tool_list()
playwright_agent_tool_list = [
tool
for tool in playwright_tool_list
if tool.metadata.name in ["click", "get_current_page", "navigate_to"]
]
agent = FunctionCallingAgent.from_tools(
playwright_agent_tool_list + agentql_browser_tool.to_tool_list(),
llm=OpenAI(model="gpt-4o"),
)
print(
agent.chat(
"""
Navigate to https://blog.samaltman.com/archive,
Find blog posts titled "What I wish someone had told me", click on the link,
Extract the blog text and number of views.
"""
)
)
I have extracted the blog post titled "What I wish someone had told me" along with the number of views. Here are the details: **Blog Text:** > Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started. Cohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time. It is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people. Incentives are superpowers; set them carefully. Concentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think. Communicate clearly and concisely. Fight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together. Outcomes are what count; don’t let good process excuse bad results. Spend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence. Superstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization. Fast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. Plans should be measured in decades, execution should be measured in weeks. Don’t fight the business equivalent of the laws of physics. Inspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk. Scale often has surprising emergent properties. Compounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale. Get back up and keep going. 
Working with great people is one of the best parts of life. **Number of Views:** 531,222
Using Playwright tools within an agent workflow¶
from llama_index.llms.openai import OpenAI
from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.core.agent.workflow import (
AgentInput,
AgentOutput,
ToolCall,
ToolCallResult,
AgentStream,
)
playwright_tool_list = playwright_tool.to_tool_list()
playwright_agent_tool_list = [
tool
for tool in playwright_tool_list
if tool.metadata.name in ["click", "get_current_page", "navigate_to"]
]
llm = OpenAI(model="gpt-4o")
workflow = AgentWorkflow.from_tools_or_functions(
playwright_agent_tool_list + agentql_browser_tool.to_tool_list(),
llm=llm,
system_prompt="You are a helpful assistant that can do browser automation, data extraction and text summarization",
)
handler = workflow.run(
user_msg="""
Navigate to https://blog.samaltman.com/archive,
Find blog posts titled "What I wish someone had told me", click on the link,
Detect if the webpage has navigated to the blog post,
then extract the blog text and number of views.
"""
)
async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="", flush=True)
    elif isinstance(event, ToolCallResult):
        print(event.tool_name)  # the tool name
        print(event.tool_kwargs)  # the tool kwargs
        print(event.tool_output)  # the tool output
navigate_to {'url': 'https://blog.samaltman.com/archive'} Navigating to https://blog.samaltman.com/archive returned status code 200 get_web_element_from_browser {'prompt': "blog post titled 'What I wish someone had told me'"} [tf623_id='1849'] click {'selector': "[tf623_id='1849']"} Clicked element '[tf623_id='1849']' get_current_page {} https://blog.samaltman.com/what-i-wish-someone-had-told-me extract_web_data_from_browser {'prompt': 'Extract the blog text and number of views from the page.'} {'blog_post_text': 'Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started.\nCohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time.\nIt is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people.\nIncentives are superpowers; set them carefully.\nConcentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think.\nCommunicate clearly and concisely.\nFight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together.\nOutcomes are what count; don’t let good process excuse bad results.\nSpend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence.\nSuperstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization.\nFast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. 
Plans should be measured in decades, execution should be measured in weeks.\nDon’t fight the business equivalent of the laws of physics.\nInspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk.\nScale often has surprising emergent properties.\nCompounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale.\nGet back up and keep going.\nWorking with great people is one of the best parts of life.', 'views_count': 531223} I have navigated to the blog post titled "What I Wish Someone Had Told Me" and extracted the following information: **Blog Text:** Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started. Cohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time. It is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people. Incentives are superpowers; set them carefully. Concentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think. Communicate clearly and concisely. Fight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together. Outcomes are what count; don’t let good process excuse bad results. Spend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence. Superstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization. 
Fast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. Plans should be measured in decades, execution should be measured in weeks. Don’t fight the business equivalent of the laws of physics. Inspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk. Scale often has surprising emergent properties. Compounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale. Get back up and keep going. Working with great people is one of the best parts of life. **Number of Views:** 531,223
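The streamed output above interleaves tool calls with the model's text. For debugging, it can help to keep only the sequence of completed tool calls. The sketch below does this with duck typing: it keeps events that carry both `tool_name` and `tool_output`, which is how this example recognizes a tool result. The `SimpleNamespace` objects are stand-ins for the workflow events, with values copied from the run above, and the helper name `tool_trace` is hypothetical.

```python
from types import SimpleNamespace


def tool_trace(events) -> list:
    """Return (tool_name, tool_kwargs) for every event that looks like a tool result."""
    trace = []
    for event in events:
        name = getattr(event, "tool_name", None)
        # Require tool_output so that pending tool calls are not counted twice.
        if name is not None and hasattr(event, "tool_output"):
            trace.append((name, getattr(event, "tool_kwargs", {})))
    return trace


# Stand-in events mirroring the first steps of the run shown above.
events = [
    SimpleNamespace(delta="I have navigated"),  # text chunk, ignored
    SimpleNamespace(  # pending call without output, ignored
        tool_name="navigate_to",
        tool_kwargs={"url": "https://blog.samaltman.com/archive"},
    ),
    SimpleNamespace(
        tool_name="navigate_to",
        tool_kwargs={"url": "https://blog.samaltman.com/archive"},
        tool_output="Navigating to https://blog.samaltman.com/archive returned status code 200",
    ),
    SimpleNamespace(
        tool_name="click",
        tool_kwargs={"selector": "[tf623_id='1849']"},
        tool_output="Clicked element '[tf623_id='1849']'",
    ),
]

for name, kwargs in tool_trace(events):
    print(name, kwargs)
```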