Miguel/observe a11y #412
🦋 Changeset detected. Latest commit: 0861e9c. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
lib/handlers/observeHandler.ts
Outdated
(fullPage: boolean) =>
fullPage ? window.processAllOfDom() : window.processDom([]),
fullPage,
const cdpClient = await this.stagehandPage.context.newCDPSession(
Can we put raw CDP as a stagehandPage function? I wanna do something like this.stagehandPage.sendCDP(cmd, args) instead.
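A minimal sketch of what such a wrapper might look like. `CDPSender`, `CDPSessionLike`, and `openSession` are hypothetical names used only for illustration and are not part of this PR; the sketch assumes the underlying session exposes `send(method, params)`, as Playwright's `CDPSession` does.

```typescript
// Minimal shape of a CDP session. Playwright's CDPSession.send(method, params)
// fits this interface; everything else here is a hypothetical illustration.
interface CDPSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical wrapper: opens one CDP session lazily and reuses it for every call.
class CDPSender {
  private session: Promise<CDPSessionLike> | null = null;
  private readonly openSession: () => Promise<CDPSessionLike>;

  constructor(openSession: () => Promise<CDPSessionLike>) {
    this.openSession = openSession;
  }

  // Sends a raw CDP command and returns the result cast to the caller's type.
  async sendCDP<T = unknown>(
    cmd: string,
    args: Record<string, unknown> = {},
  ): Promise<T> {
    if (this.session === null) {
      // Cache the promise so concurrent callers share a single session.
      this.session = this.openSession();
    }
    const session = await this.session;
    return (await session.send(cmd, args)) as T;
  }
}
```

With Playwright, `openSession` could plausibly be `() => page.context().newCDPSession(page)`, so call sites shrink to something like `sendCDP("Accessibility.getFullAXTree")` instead of managing a session in each handler.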
🚀 🚢
why
Adds a new format for capturing website context using accessibility (a11y) trees. This opens up a new way of processing context for web agents that makes fuller use of CDP/Playwright. By using the structured, semantic data in a11y trees, this approach aims to improve interaction fidelity, reduce token cost, speed up inference, and give LLMs better contextual awareness when they perform tasks in web-based environments through text.
what changed
For observe tasks, context can now optionally be built from the a11y tree by setting the useAccessibilityTree flag. This changes how the DOM is processed. The existing DOM function is still used for backward compatibility with selector maps, but I also include a backendNodeId, which offers another way to interact with elements directly through CDP.
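To illustrate the idea (this is not the code in this PR), here is a rough sketch of how a flat list of a11y nodes, shaped like the `nodes` array returned by CDP's `Accessibility.getFullAXTree`, could be turned into a compact text outline that keeps each element's backend node ID. `AXNodeLike` and `formatAXTree` are hypothetical names, and the output format is an assumption chosen for readability.

```typescript
// Subset of the CDP AXNode fields returned by Accessibility.getFullAXTree.
interface AXValueLike {
  value?: unknown;
}

interface AXNodeLike {
  nodeId: string;
  ignored?: boolean;
  role?: AXValueLike;
  name?: AXValueLike;
  childIds?: string[];
  backendDOMNodeId?: number;
}

// Hypothetical helper: renders the flat node list as an indented outline,
// one line per non-ignored node, in the form "[backendNodeId] role: name".
function formatAXTree(nodes: AXNodeLike[]): string {
  const byId = new Map<string, AXNodeLike>();
  const childIds = new Set<string>();
  for (const node of nodes) {
    byId.set(node.nodeId, node);
    for (const child of node.childIds ?? []) childIds.add(child);
  }

  const lines: string[] = [];
  const visit = (id: string, depth: number): void => {
    const node = byId.get(id);
    if (!node) return;
    let childDepth = depth;
    if (!node.ignored) {
      const role = String(node.role?.value ?? "unknown");
      const name = String(node.name?.value ?? "");
      const backendId = node.backendDOMNodeId ?? "?";
      const label = name ? `${role}: ${name}` : role;
      lines.push(`${"  ".repeat(depth)}[${backendId}] ${label}`);
      childDepth = depth + 1;
    }
    // Ignored nodes are skipped, but their children are kept at the same depth.
    for (const child of node.childIds ?? []) visit(child, childDepth);
  };

  // Roots are the nodes that never appear as another node's child.
  for (const node of nodes) {
    if (!childIds.has(node.nodeId)) visit(node.nodeId, 0);
  }
  return lines.join("\n");
}
```

An outline like this is usually much shorter than serialized DOM, and the bracketed ID can be passed back to CDP commands that accept a `backendNodeId`, such as `DOM.resolveNode`.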
Sample usage:
Sample output:
test plan