Self-hosted GPT-4V API. Questions and suggestions are welcome, and pull requests to improve the code are appreciated!
⚠️ Important Note: As GPT-4V(ision) has not yet been made publicly available, this project requires an active ChatGPT Plus subscription for multimodal prompting access. The tactics this project uses to tap into an unofficial GPT-4V API may contravene the following clause of the ChatGPT Terms of Service:
2. (c) Restrictions: You may not ... (iv) except as permitted through the API, use any automated or programmatic method to extract data or output from the Services, including scraping, web harvesting, or web data extraction;
(This warning is taken from https://github.com/ddupont808/GPT-4V-Act)
Install dependencies
npm install .
Start the server (port 3000 by default)
node server.js
Note:
- The API works by opening a browser and interacting with ChatGPT with vision through the web page, just as a human operator would.
- Headless mode is off (headless = false) by default. On the first run, log in to your ChatGPT account manually in the browser that puppeteer opens, and make sure the page ends up on https://chat.openai.com. After this first login, your browser user data is stored in ./user_data, and you can change headless to "new" (https://developer.chrome.com/articles/new-headless/) to run in headless mode.
- See test.py and the function comments for how to use the API.
- To run on Ubuntu or other Linux systems:
You may need to install xvfb:
sudo apt-get install xvfb x11-apps x11-xkb-utils libx11-6 libx11-xcb1
Remember to launch it each time you use headless=false:
Xvfb -ac :99 -screen 0 1280x1024x16 & export DISPLAY=:99
Login code is not supported yet. To avoid logging in, you may copy the user_data directory from your own system to the Linux system.
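The headless switch described in the notes above could look roughly like the sketch below. It is not taken from server.js: buildLaunchOptions, openChatGPT, and the firstRun flag are hypothetical names used only for illustration. It assumes a puppeteer version that accepts headless: "new" and relies on the standard userDataDir launch option to reuse the saved login.

```javascript
// Sketch only: buildLaunchOptions and openChatGPT are hypothetical helpers,
// not functions taken from server.js.
const path = require('path');

// First run: show the browser window so you can log in by hand.
// Later runs: reuse the saved profile and switch to the "new" headless mode.
function buildLaunchOptions(firstRun) {
  return {
    headless: firstRun ? false : 'new',
    // Profile directory named in the notes above.
    userDataDir: path.resolve('./user_data'),
  };
}

async function openChatGPT(firstRun) {
  // Loaded lazily so the options helper works without puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(buildLaunchOptions(firstRun));
  const page = await browser.newPage();
  await page.goto('https://chat.openai.com');
  return { browser, page };
}
```

Calling openChatGPT(true) once lets you log in manually. After that, openChatGPT(false) should pick up the session from ./user_data. On Linux without a display, the first (non-headless) run still needs Xvfb running and DISPLAY set as described above.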