toolbox-macos is a minimal package that enables OpenAI GPTs to interact with macOS apps like iMessage, email, or calendar through Shortcuts actions.
- Simple Integration: Easy setup with a local server and GPT API schema.
- Privacy-Focused: Runs locally to keep your data secure.
- Versatile: Gives access to 128 APIs from Apple Shortcuts.
For a demo see: https://x.com/LinzhiQ/status/1729555314217734240?s=20
On a macOS machine with Node.js installed, run:
git clone https://github.com/iter-ai/toolbox-macos.git
cd toolbox-macos
npm install
npm run dev
The command will start a Cloudflare Tunnel to allow GPTs to connect to your machine.
toolbox-macos is designed with custom GPTs in mind. While custom GPTs provide a flexible interface, they come with constraints such as a single-agent design and a character limit for schema descriptions.
Our custom GPT is designed to perform the following five steps:
- listTools (/list): providing a list of available action names to the model.
- selectTools (/schema): providing the schema details for the input actions.
- submitPlan (/plan): this endpoint receives a plan from the model in plain text and always returns success. The goal of this endpoint is simply to hide the plan from the user.
- submitCritique (/critique): similarly, this endpoint receives a critique of the plan and always returns success. Again, this dummy endpoint hides the critique from the user.
- runTool (/run): this endpoint executes an action that the GPT decides to take with the given parameters.
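As a rough illustration of the five steps above, the endpoints can be sketched as a single dispatch function. This is a minimal sketch, not the project's actual server code: the `handle` function, the `catalog` entries, and the payload field names (`names`, `name`, `parameters`) are all assumptions made for the example, and the `/run` branch only echoes its input instead of invoking a Shortcuts action.

```typescript
// Hypothetical sketch of the five endpoints; not the project's real implementation.
type ToolSchema = {
  name: string;
  description: string;
  parameters: Record<string, string>;
};

// Made-up catalog entries; the real action names come from Apple Shortcuts.
const catalog: ToolSchema[] = [
  {
    name: "send_message",
    description: "Send a message",
    parameters: { recipient: "string", text: "string" },
  },
  {
    name: "create_event",
    description: "Create a calendar event",
    parameters: { title: "string", start: "string" },
  },
];

type Reply = { status: number; body: unknown };

function handle(path: string, payload: Record<string, unknown> = {}): Reply {
  switch (path) {
    case "/list":
      // listTools: return action names only.
      return { status: 200, body: catalog.map((t) => t.name) };
    case "/schema": {
      // selectTools: return full schemas for the requested names (assumed field: `names`).
      const names = (payload.names as string[] | undefined) ?? [];
      return { status: 200, body: catalog.filter((t) => names.includes(t.name)) };
    }
    case "/plan":
    case "/critique":
      // submitPlan / submitCritique: accept any text and always report success.
      return { status: 200, body: { success: true } };
    case "/run": {
      // runTool: look up the action; a real server would execute it here.
      const tool = catalog.find((t) => t.name === payload.name);
      if (!tool) {
        return { status: 404, body: { error: `unknown tool: ${String(payload.name)}` } };
      }
      return { status: 200, body: { ran: tool.name, parameters: payload.parameters ?? {} } };
    }
    default:
      return { status: 404, body: { error: "not found" } };
  }
}
```

Keeping the dispatch logic in a plain function like this makes it easy to exercise each step without starting a server or a tunnel.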
The hierarchical design of /list and /schema enables toolbox-macos to expose more than a hundred actions to a single GPT. The model can dynamically query the available actions and decide which ones to use.
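To see why the two-level lookup helps, the sketch below compares the size of a names-only listing plus a few selected schemas against sending every schema up front. The 128 placeholder actions, their descriptions, and the helper names (`listTools`, `selectTools`, `size`) are invented for illustration, and the character counts depend entirely on this made-up data.

```typescript
// Illustrative only: action names and descriptions are placeholders.
interface Action {
  name: string;
  description: string;
}

const actions: Action[] = Array.from({ length: 128 }, (_, i) => ({
  name: `action_${i}`,
  description: `Placeholder description for action ${i}, with details about its parameters.`,
}));

// Level 1 (like /list): names only, so the whole catalog stays compact.
function listTools(all: Action[]): string[] {
  return all.map((a) => a.name);
}

// Level 2 (like /schema): full details for just the names the model picked.
function selectTools(all: Action[], names: string[]): Action[] {
  const wanted = new Set(names);
  return all.filter((a) => wanted.has(a.name));
}

// Serialized length as a rough proxy for how much text the model must read.
const size = (value: unknown): number => JSON.stringify(value).length;

const namesOnly = size(listTools(actions));
const everything = size(actions);
const twoStep = namesOnly + size(selectTools(actions, ["action_3", "action_42"]));

console.log({ namesOnly, everything, twoStep });
```

With this placeholder data, the two-step total is much smaller than the full catalog, because the model only pays for the schemas it actually asks for.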
/plan and /critique abstract away the Chain of Thought and Self Critique steps from the user. The user can simply focus on the conversation with the model.
You can check the system prompt (in cli/src/index.tsx) for more details on how we instruct the agent to leverage these endpoints.
There are several considerations when designing the agent architecture:
- Providing user information, including time zones and names
- Explaining specific quirks about Apple Shortcuts, such as timezone formats and how to find certain identifiers
- Instructing the model to follow the above five steps
- Instructing the model on some interaction patterns, such as when to ask for clarification and confirmation
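The considerations above could be assembled into a system prompt along the following lines. This is only a sketch under stated assumptions: `buildSystemPrompt`, `PromptContext`, the example user, and the exact wording of each instruction are hypothetical, and the quirk text is left as a placeholder. The actual prompt lives in cli/src/index.tsx.

```typescript
// Hypothetical prompt builder; the real prompt text is in cli/src/index.tsx.
interface PromptContext {
  userName: string;
  timeZone: string; // for example an IANA name such as "America/Los_Angeles"
  quirks: string[]; // notes about Apple Shortcuts supplied by the caller
}

// The five steps named in this README, in order.
const STEPS = ["listTools", "selectTools", "submitPlan", "submitCritique", "runTool"];

function buildSystemPrompt(ctx: PromptContext): string {
  return [
    // User information: name and time zone.
    `The user's name is ${ctx.userName}. Their time zone is ${ctx.timeZone}.`,
    // Shortcuts-specific quirks, one per line.
    "Notes about Apple Shortcuts:",
    ...ctx.quirks.map((q) => `- ${q}`),
    // The required sequence of actions.
    "For every request, call these actions in order:",
    ...STEPS.map((step, i) => `${i + 1}. ${step}`),
    // Interaction pattern; this wording is an assumption, not the project's text.
    "Ask for clarification when a request is ambiguous, and ask for confirmation before running an action.",
  ].join("\n");
}

const prompt = buildSystemPrompt({
  userName: "Alex", // made-up user
  timeZone: "America/Los_Angeles",
  quirks: ["<placeholder: describe the expected timezone format here>"],
});

console.log(prompt);
```

Building the prompt from structured inputs keeps the user-specific parts (name, time zone) separate from the fixed instructions, which makes each consideration easy to adjust on its own.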