Abilities v1: Complete BrowserOps
markokraemer commented
GOAL: Finish Browser Automation / Interaction so that we can fully implement the Execution Agent and have it perform "Flow-Engineering". Mirko needs to be able to use the web browser just like a human developer, and that is exactly the goal of this ability: full browser interaction & automation.
Full Browser Interaction & automation
- https://www.youtube.com/watch?v=Yidy_ePo7pE&t=73s Watch the demo to understand what the Agent needs as an ability
- https://youtu.be/qBs_50SzyBQ?si=ukoT2YM-E-ORVOUX&t=18 It can also solve CAPTCHAs
- The code in the video is open source & we can reuse it in our own BrowserOps ability module: https://github.com/VRSEN/agency-swarm/tree/main/agency_swarm/agents/browsing/tools
- A possible alternative to explore: https://github.com/Skyvern-AI/skyvern
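
To make the scope a bit more concrete, here is a rough sketch of what the ability's surface could look like. Everything in it is an assumption for discussion: the names `BrowserOps`, `BrowserBackend`, `FakeBackend`, and the action names are hypothetical and are not taken from agency-swarm or Skyvern. The idea is that the Execution Agent calls one entry point with an action name, and the real browser driver sits behind a small backend interface so it can be swapped.

```python
from dataclasses import dataclass, field
from typing import Protocol


class BrowserBackend(Protocol):
    """Hypothetical driver interface; a real backend would wrap an actual browser driver."""

    def goto(self, url: str) -> None: ...
    def click(self, selector: str) -> None: ...
    def type_text(self, selector: str, text: str) -> None: ...
    def page_text(self) -> str: ...


@dataclass
class FakeBackend:
    """In-memory stand-in so the sketch runs without a real browser."""

    url: str = ""
    log: list = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.url = url
        self.log.append(("goto", url))

    def click(self, selector: str) -> None:
        self.log.append(("click", selector))

    def type_text(self, selector: str, text: str) -> None:
        self.log.append(("type", selector, text))

    def page_text(self) -> str:
        return f"<fake page at {self.url}>"


class BrowserOps:
    """Hypothetical ability: maps agent-issued action names to backend calls."""

    def __init__(self, backend: BrowserBackend) -> None:
        self.backend = backend
        self._handlers = {
            "goto": backend.goto,
            "click": backend.click,
            "type": backend.type_text,
            "read": backend.page_text,
        }

    def run(self, action: str, **params) -> str:
        if action not in self._handlers:
            raise ValueError(f"unknown browser action: {action}")
        result = self._handlers[action](**params)
        # Actions without a return value report a plain "ok" back to the agent.
        return result if isinstance(result, str) else "ok"


if __name__ == "__main__":
    ops = BrowserOps(FakeBackend())
    ops.run("goto", url="https://example.com")
    ops.run("type", selector="#q", text="hello")
    ops.run("click", selector="#submit")
    print(ops.run("read"))
```

Keeping the backend behind an interface like this would let us try the agency-swarm tools or Skyvern without changing how the Execution Agent calls the ability.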