A powerful deep search agent that uses BAML functions to perform intelligent web searches and generate comprehensive answers to questions.
For more details about our project, please visit our blog post.
We benchmarked II-Researcher on Google's Frames dataset using the DeepSeek-R1-0528 model, achieving an accuracy of 84.12%.
- Intelligent web search using Tavily and SerpAPI search providers
- Web scraping and content extraction with multiple providers (Firecrawl, Browser, BS4, Tavily)
- Multi-step reasoning and reflection
- Configurable LLM models for different tasks
- Asynchronous operation for better performance
- Comprehensive answer generation with references
- Support for customizable pipelines and reasoning methods for deep search
Demo videos: II-Researcher Reasoning Demo and II-Researcher MCP Demo.
- Python 3.7+ (required for local development)
- Docker and Docker Compose (required for containerized deployment)
- Node.js and npm (required for local frontend development)
# Install from PyPI
pip install ii-researcher

# Or install from source
git clone https://github.com/Intelligent-Internet/ii-researcher.git
cd ii-researcher
pip install -e .
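To verify the install, you can check that the package imports (the module name ii_researcher matches the CLI path used later in this README):
python -c "import ii_researcher; print(ii_researcher.__file__)"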
# API Keys
export OPENAI_API_KEY="your-openai-api-key"
export TAVILY_API_KEY="your-tavily-api-key" # required when SEARCH_PROVIDER is set to tavily
export SERPAPI_API_KEY="your-serpapi-api-key" # required when SEARCH_PROVIDER is set to serpapi
export FIRECRAWL_API_KEY="your-firecrawl-api-key" # required when SCRAPER_PROVIDER is set to firecrawl
# API Endpoints
export OPENAI_BASE_URL="http://localhost:4000"
# Compress Configuration
export COMPRESS_EMBEDDING_MODEL="text-embedding-3-large"
export COMPRESS_SIMILARITY_THRESHOLD="0.3"
export COMPRESS_MAX_OUTPUT_WORDS="4096"
export COMPRESS_MAX_INPUT_WORDS="32000"
# Search and Scraping Configuration
export SEARCH_PROVIDER="serpapi" # Options: 'serpapi' | 'tavily'
export SCRAPER_PROVIDER="firecrawl" # Options: 'firecrawl' | 'bs' | 'browser' | 'tavily_extract'
# Timeouts and Performance Settings
export SEARCH_PROCESS_TIMEOUT="300" # in seconds
export SEARCH_QUERY_TIMEOUT="20" # in seconds
export SCRAPE_URL_TIMEOUT="30" # in seconds
export STEP_SLEEP="100" # in milliseconds
Environment variables when using LLM-based compression (optional, for better compression quality):
export USE_LLM_COMPRESSOR="TRUE"
export FAST_LLM="gemini-lite" # The model used for context compression
Environment variables when running in Pipeline mode:
# Model Configuration
export STRATEGIC_LLM="gpt-4o" # The model used for choosing the next action
export SMART_LLM="gpt-4o" # The model used for other tasks in the pipeline
Environment variables when running in Reasoning mode:
export R_MODEL=r1 # The model used for reasoning
export R_TEMPERATURE=0.2 # Temperature for the reasoning model
export R_REPORT_MODEL=gpt-4o # The model used for writing the report
export R_PRESENCE_PENALTY=0 # presence_penalty for the reasoning model
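Putting it together, one workable reasoning-mode configuration that routes both models through the LiteLLM proxy described below (all values are placeholders, and the provider choices are just one valid combination) is:
export OPENAI_API_KEY="your-openai-api-key"
export TAVILY_API_KEY="your-tavily-api-key"
export SEARCH_PROVIDER="tavily"
export SCRAPER_PROVIDER="tavily_extract"
export OPENAI_BASE_URL="http://localhost:4000"
export R_MODEL="r1"
export R_REPORT_MODEL="gpt-4o"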
# Install LiteLLM
pip install litellm
# Create litellm_config.yaml file
cat > litellm_config.yaml << EOL
model_list:
  - model_name: text-embedding-3-large
    litellm_params:
      model: text-embedding-3-large
      api_key: ${OPENAI_API_KEY}
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: ${OPENAI_API_KEY}
  - model_name: o1-mini
    litellm_params:
      model: o1-mini
      api_key: ${OPENAI_API_KEY}
  - model_name: r1
    litellm_params:
      model: deepseek-reasoner
      api_base: https://api.deepseek.com/beta
      api_key: ${DEEPSEEK_API_KEY}

litellm_settings:
  drop_params: true
EOL
# Start LiteLLM server
litellm --config litellm_config.yaml
The LiteLLM server will run on http://localhost:4000 by default.
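You can sanity-check the proxy with a standard OpenAI-compatible request (the model name must match an alias defined in your config; add an Authorization header if you configured a master key):
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}'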
Alternatively, to route models through OpenRouter, create a config like the following:
cat > litellm_config.yaml << EOL
model_list:
  - model_name: text-embedding-3-large
    litellm_params:
      model: text-embedding-3-large
      api_key: ${OPENAI_API_KEY}
  - model_name: "gpt-4o"
    litellm_params:
      model: "openai/chatgpt-4o-latest"
      api_base: "https://openrouter.ai/api/v1"
      api_key: "your_openrouter_api_key_here"
  - model_name: "r1"
    litellm_params:
      model: "deepseek/deepseek-r1"
      api_base: "https://openrouter.ai/api/v1"
      api_key: "your_openrouter_api_key_here"
  - model_name: "gemini-lite"
    litellm_params:
      model: "gemini/gemini-2.5-pro-preview-03-25"
      api_base: "https://openrouter.ai/api/v1"
      api_key: "your_openrouter_api_key_here"

litellm_settings:
  drop_params: true
EOL
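Rather than hardcoding keys in the file, LiteLLM configs can also reference environment variables using the os.environ/ prefix (export OPENROUTER_API_KEY before starting the proxy), for example:
      api_key: os.environ/OPENROUTER_API_KEY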
Run the deep search agent with your question:
python ii_researcher/cli.py --question "your question here" --stream
Note: the legacy pipeline mode is still available in the legacy/ii_researcher_pipeline branch but is no longer recommended for use.
- Set up your environment variables
- Copy the .env.example file to create a new file named .env
cp .env.example .env
- Edit the .env file and add your API keys and configure other settings:
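For example, a minimal .env might look like this (values are placeholders; include only the keys your chosen providers need):
OPENAI_API_KEY=your-openai-api-key
OPENAI_BASE_URL=http://localhost:4000
TAVILY_API_KEY=your-tavily-api-key
SEARCH_PROVIDER=tavily
SCRAPER_PROVIDER=tavily_extract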
- Integrating with Claude: you can integrate your MCP server with Claude via Claude Desktop Integration.
- Install the MCP server into Claude:
mcp install mcp/server.py -f .env
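If the mcp command is not found, it is provided by the MCP Python SDK's CLI extra (an assumption about your environment; skip this if the command already works):
pip install "mcp[cli]"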
- Restart your Claude App
- Install and run the backend API (needed to serve the frontend):
# Start the API server
python api.py
The API server will run on http://localhost:8000
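Once it is running, you can verify the backend is reachable (assuming the default FastAPI docs route is enabled):
curl -I http://localhost:8000/docs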
- Set up the frontend environment: create a .env file in the frontend directory with the following content:
NEXT_PUBLIC_API_URL=http://localhost:8000
- Install and Run Frontend:
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Start the development server
npm run dev
The frontend will be available at http://localhost:3000
- Important: Make sure you have set up all environment variables from step 3 before proceeding.
- Start the services using Docker Compose:
# Build and start all services
docker compose up --build -d
The following services will be started:
- frontend: Next.js frontend application
- api: FastAPI backend service
- litellm: LiteLLM proxy server
The services will be available at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- LiteLLM Server: http://localhost:4000
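To confirm that all three containers are running, check their status:
docker compose ps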
- View logs:
# View all logs
docker compose logs -f
# View specific service logs
docker compose logs -f frontend
docker compose logs -f api
docker compose logs -f litellm
- Stop the services:
docker compose down
To run the Qwen/QwQ-32B model using SGLang, use the following command:
python3 -m sglang.launch_server --model-path Qwen/QwQ-32B --host 0.0.0.0 --port 30000 --tp 8 --context-length 131072
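To route requests to this local server through the LiteLLM proxy, you can add an entry like the one below under model_list in litellm_config.yaml. The alias qwq-32b is illustrative; since SGLang exposes an OpenAI-compatible API under /v1, pointing an openai/-prefixed model at the local api_base should work, but adjust it to your setup:
  - model_name: "qwq-32b"
    litellm_params:
      model: "openai/Qwen/QwQ-32B"           # treat the SGLang server as an OpenAI-compatible backend
      api_base: "http://localhost:30000/v1"  # the SGLang server started above
      api_key: "dummy-key"                   # local server; no real key required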
II-Researcher is inspired by and built with the support of the open-source community:
- LiteLLM – Used for efficient AI model integration.
- node-DeepResearch – Prompt inspiration
- gpt-researcher – Prompt inspiration, web scraper tool
- baml – Structured outputs