/bot

GPT-powered bot that can automate complex online tasks using both the web browser and API calls.


Announcement: Fediverse, Twitter, Show HN


This tool gives AI the three big things it currently lacks: Identity, Memory, and Agency. It is a personal assistant bot that runs locally and understands tasks described in natural language. Using a human-reviewed library of composable tasks, it can perform complex online tasks across multiple websites and webapps by driving a browser on your local machine, or by calling APIs when they are available.

The tasks are described in natural language. For example:

Register myshop.com on Google Domains. Then set up a Shopify website on it. Then post a tweet about it.

This is somewhat similar to LangChain agents, but this tool operates by controlling a real browser with puppeteer scripts. It can complete signup or login with OTP (SMS or email) and 2FA tokens, retain and recall data from persistent storage, and even make payments. The aim is for this tool to become a true personal assistant for everyone. Because it uses the web browser like a human would, it doesn't need API access to websites and webapps, though it does use APIs for many popular sites.

This tool can be launched against either your primary Chrome instance or puppeteer's sandboxed Chromium instance. It can be invoked either as a command-line tool or from your custom JavaScript code.

There are two kinds of bots. LocalBot stores its data in a local .json file. CloudBot uses remote storage, which is useful for sharing bot instances within your org. When launched from the command line, both kinds of bots operate using the local web browser. No data is sent to any remote servers.
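To make the idea of a local .json store concrete, here is a minimal sketch of a file-backed key-value store keyed by bot ID. The `JsonStore` class, its methods, and the one-file-per-bot layout are assumptions made for illustration; they are not the actual LocalBot implementation or its on-disk format.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper, not part of @aicombinator/bot: a tiny key-value store
// persisted as one JSON file per bot ID.
class JsonStore {
  constructor(dataDir, botId) {
    this.file = path.join(dataDir, `${botId}.json`);
    fs.mkdirSync(dataDir, { recursive: true });
  }

  // Read the whole file, or start empty if it does not exist yet.
  load() {
    if (!fs.existsSync(this.file)) return {};
    return JSON.parse(fs.readFileSync(this.file, 'utf8'));
  }

  get(key) {
    return this.load()[key];
  }

  // Rewrite the file on every update so the data survives restarts.
  set(key, value) {
    const data = this.load();
    data[key] = value;
    fs.writeFileSync(this.file, JSON.stringify(data, null, 2));
  }
}

// Demo in a temporary directory so nothing in your project is touched.
const demoStore = new JsonStore(path.join(os.tmpdir(), 'aicombinator-demo'), 'mybot1');
demoStore.set('twitter_username', 'foo');
console.log(demoStore.get('twitter_username')); // prints "foo"
```

Rereading the file on every access keeps the sketch simple; a real store would likely cache the data in memory and guard against concurrent writers.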

Currently, only the high-level script is generated by AI at run-time. The code for individual tasks can be human-coded and human-reviewed for high robustness and correctness. The API pieces are imported from ActivePieces.
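One way to picture the split between the generated high-level script and the reviewed task code is a plan of steps dispatched to a registry of recipes. The sketch below is an illustration under that assumption: the `runPlan` function, the plan shape, and the stub recipes are hypothetical, and only the `recipe(bot, params)` calling convention mirrors the usage shown in this README.

```javascript
// Hypothetical stand-ins for recipes; real ones would drive a browser or call an API.
const recipes = {
  twitter: {
    post: async (bot, params) => `posted: ${params.message}`,
  },
  hackernews: {
    fetch_top_stories: async (bot, params) =>
      Array.from({ length: params.number_of_stories }, (_, i) => `story ${i + 1}`),
  },
};

// Only look at a registry's own keys, never at inherited ones such as "constructor".
const own = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Run each step of a plan in order, failing early on unknown recipes.
async function runPlan(bot, plan, registry = recipes) {
  const results = [];
  for (const step of plan) {
    const known = own(registry, step.site) && own(registry[step.site], step.task);
    const task = known ? registry[step.site][step.task] : undefined;
    if (typeof task !== 'function') {
      throw new Error(`Unknown recipe: ${step.site}.${step.task}`);
    }
    results.push(await task(bot, step.params || {}));
  }
  return results;
}

// A plan like this is what a language model could be asked to produce.
const examplePlan = [
  { site: 'twitter', task: 'post', params: { message: 'hello' } },
  { site: 'hackernews', task: 'fetch_top_stories', params: { number_of_stories: 1 } },
];
runPlan({}, examplePlan).then((results) => console.log(results));
```

Because the plan can only name recipes that already exist in the registry, the generated part stays small and every action it triggers is code a human has reviewed.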

Install

Requires Node.js version 18 or higher:

npm install -g @aicombinator/bot

Usage

On the command line:

Set OpenAI's API key as an environment variable: export OPENAI_API_KEY=<api_key>

See the list of available apps: bot --list-apps

Describe your task in natural language: bot "pair device on google sms. Then post on twitter with username foo and password bar and message hello world from aicombinator. Then fetch the top story from hacker news"

If OPENAI_API_KEY is not available, the command will be parsed with a strict syntax (until open-source LLMs arrive):

bot <task> on|to|at|from <site> with <param1_name> <param1_value> and <param2_name> <param2_value>
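The strict syntax above can be parsed with a small regular expression. The following is a rough sketch, not the tool's actual parser: the `parseStrict` function is hypothetical, and joining multi-word task and site names with underscores is an assumption based on identifiers such as `google_sms.pair_device` used elsewhere in this README.

```javascript
// Hypothetical parser for: <task> on|to|at|from <site> with <name> <value> and <name> <value>
function parseStrict(command) {
  const match = command
    .trim()
    .match(/^(.+?)\s+(?:on|to|at|from)\s+(.+?)(?:\s+with\s+(.+))?$/i);
  if (!match) {
    throw new Error(`Cannot parse command: ${command}`);
  }
  const [, task, site, rawParams] = match;

  const params = {};
  if (rawParams) {
    // Each "and"-separated chunk is "<name> <value...>"; the value may contain spaces.
    for (const pair of rawParams.split(/\s+and\s+/i)) {
      const [name, ...valueWords] = pair.trim().split(/\s+/);
      params[name] = valueWords.join(' ');
    }
  }

  // Assumption: multi-word names map to snake_case identifiers.
  const toId = (words) => words.trim().toLowerCase().replace(/\s+/g, '_');
  return { task: toId(task), site: toId(site), params };
}

console.log(parseStrict('post on twitter with username foo and message hello world'));
// { task: 'post', site: 'twitter', params: { username: 'foo', message: 'hello world' } }
```

A value that itself contains the word "and" would be split incorrectly by this sketch, which is one reason a strict syntax is less forgiving than the natural-language mode.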

Any required params that are not given on the command line are also looked up in environment variables and the bot's storage.
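A parameter lookup with fallbacks could look roughly like the sketch below. The `resolveParams` function is hypothetical, and the precedence (explicit value, then environment variable, then stored value) as well as the upper-case environment variable names are assumptions; the README does not specify the order the tool actually uses.

```javascript
// Hypothetical resolver: explicit params win, then environment variables, then storage.
function resolveParams(required, given, storage = {}, env = process.env) {
  const resolved = {};
  const missing = [];
  for (const name of required) {
    // Assumption: an env var may be named exactly like the param or in upper case.
    const candidates = [given[name], env[name], env[name.toUpperCase()], storage[name]];
    const value = candidates.find((candidate) => candidate !== undefined);
    if (value === undefined) {
      missing.push(name);
    } else {
      resolved[name] = value;
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing params: ${missing.join(', ')}`);
  }
  return resolved;
}

// "username" comes from the command, "password" from the bot's stored data.
console.log(resolveParams(['username', 'password'], { username: 'foo' }, { password: 'bar' }, {}));
// { username: 'foo', password: 'bar' }
```

Reporting all missing names at once, rather than stopping at the first, lets the user fix the command in a single pass.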

Options:

--bot: String ID to identify each unique localbot (default `mybot1`)
--token: API key to access a CloudBot's remote storage. Get yours at [aicombinator.app](https://aicombinator.app)
--data-dir: Path to the directory where data for localbots is stored (default `.aicombinator`)

In your custom script (see sample.js):

  const bot = await aicombinator.LocalBot.init({bot_id: 'mybot1', data_dir: null});
  await aicombinator.site.task(bot, {param1_name: 'param1_value', param2_name: 'param2_value'})
  
  // for example:
  await aicombinator.google_sms.pair_device(bot, {});
  await aicombinator.twitter.post(bot, {username:"foo", password:"bar", message: 'Hello world from aicombinator'});
  // To make the bot pause for 5 seconds so the user can intervene manually (e.g. to solve a captcha):
  await bot.wait(5);

  let stories = await aicombinator.hackernews.fetch_top_stories(bot, {number_of_stories: 1});
  console.log({stories});

Supported sites and apps

Recipes can be either AI-generated and then human-reviewed, or human-coded.

Puppeteer-based recipes:

The following recipes are imported from ActivePieces actions. They use the providers' APIs rather than the browser, and therefore require auth tokens:

  • airtable
  • asana
  • bannerbear
  • binance
  • blackbaud
  • clickup
  • discord
  • drip
  • dropbox
  • figma
  • gmail
  • googleCalendar
  • googleContacts
  • googleDrive
  • googleSheets
  • googleTasks
  • hackernews
  • hubspot
  • intercom
  • mailchimp
  • openai
  • pipedrive
  • posthog
  • sendgrid
  • slack
  • square
  • stripe
  • telegramBot
  • todoist
  • trello
  • twilio
  • typeform
  • wordpress
  • youtube
  • zoom

The recipes are all open source, so feel free to raise PRs for your favorite sites.