NanoAgent is a 135M-parameter, open-source language model with an 8k context length, designed for agentic tasks such as tool calling, instruction following, and lightweight reasoning.
It's small enough (~135 MB in 8-bit) to run on edge devices like personal laptops, low-memory CPUs, and even wearables, yet smart enough to make tool calls, parse web information, and give structured answers.
Quick inference resource: here
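For a quick start, here is a minimal inference sketch using the Hugging Face `transformers` library. The repo id `quwsarohi/NanoAgent-135M` is an assumption for illustration; substitute the actual weights path.

```python
# Minimal chat inference sketch (assumptions: repo id and chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "quwsarohi/NanoAgent-135M"  # hypothetical repo id; replace with the real one
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I bake a cake?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```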
- 🕹️ Runs on edge devices: laptops, smartwatches, browsers, or CPU-only environments.
- 🌐 Parses and answers from the web, with tool calling to fetch real-time information.
- 🔍 Answers recent questions with live web search tools.
- 💬 Continues conversations, making it ideal for assistant or agent frameworks.
- ⚙️ Tool calling support enables chaining multiple tools and parsing their results to produce final answers (see the sketch below).
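To make the chaining flow concrete, below is a hedged sketch of one tool-call round trip, reusing `model` and `tokenizer` from the sketch above. It assumes the chat template accepts a `tools` list (as recent `transformers` versions do) and that the model emits its call as a JSON object; NanoAgent's exact tool-call format may differ.

```python
# One tool-call round trip (sketch; the JSON call format is an assumption).
import json

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny, 22 degrees Celsius in {city}"  # stub for a real web tool

messages = [{"role": "user", "content": "Weather in Toronto?"}]
inputs = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, return_tensors="pt"
)
out = model.generate(inputs, max_new_tokens=128)
raw = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Assumed shape: {"name": "get_weather", "arguments": {"city": "Toronto"}}
call = json.loads(raw)
result = get_weather(**call["arguments"])

# Feed the tool result back so the model can compose the final answer.
messages += [
    {"role": "assistant", "tool_calls": [{"type": "function", "function": call}]},
    {"role": "tool", "name": call["name"], "content": result},
]
```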
| Capability | Description |
|---|---|
| 💬 Basic conversation | Casual small talk |
| 🌐 Information retrieval | e.g., "How to bake a cake?" or "Weather in Toronto" via web search; extracts answers from information returned by tools (scraping/search) |
| 🧰 Tool calling | Single and multi-tool calls with structured explanations |
| 🧠 Question decomposition | Breaks complex questions into steps |
| 🧭 Question classification | Identifies the type of user query (e.g., fact, reasoning, instruction) |
| 📋 Following system prompts | Responds properly to system-level instructions |
| ✉️ Writing emails and tasks | Writes emails and structured messages |
- Base model: SmolLM2-135M-Instruct
- Fine-tuning method: Dynamic Fine-Tuning (DFT) and Supervised Fine-Tuning (SFT)
- Platform: Apple Mac M1 (16 GB) using the MLX framework
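Since training ran on MLX, inference on Apple Silicon can go through the `mlx-lm` package (`pip install mlx-lm`); a minimal sketch, again with a hypothetical repo id:

```python
# MLX inference sketch for Apple Silicon (repo id is an assumption).
from mlx_lm import load, generate

model, tokenizer = load("quwsarohi/NanoAgent-135M")  # hypothetical repo id
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Write a short email declining a meeting."}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```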
This model was trained using a combination of datasets under different open licenses.
Each dataset retains its original license, and use of those datasets is subject to their respective terms.
| Dataset | Purpose | License |
|---|---|---|
| microsoft/orca-agentinstruct-1M-v1 | RAG, MCQ answering, JSON parsing, text classification, instruction following | Community Data License Agreement - Permissive, Version 2.0 |
| microsoft/orca-math-word-problems-200k | Lightweight reasoning, word-level reasoning | MIT |
| allenai/tulu-3-sft-personas-instruction-following | Instruction following with persona | Open Data Commons License Attribution family |
| xingyaoww/code-act | ReAct style reasoning and acting | Apache-2.0 |
| m-a-p/Code-Feedback | Feedback alignment | Apache-2.0 |
| HuggingFaceTB/smoltalk | General conversation, system prompt handling | Apache-2.0 |
| HuggingFaceTB/smoltalk/apigen | Tool calling stabilization | Creative Commons Attribution 4.0 (sourced from 1, 2) |
| weijie210/gsm8k_decomposed | Question decomposition | - |
| Locutusque/function-calling-chatml | Tool call response formatting | Apache-2.0 |
| Salesforce/xlam-function-calling-60k | Stronger function calling coverage | Creative Commons Attribution 4.0 |
| HuggingFaceTB/smoltalk2/SFT/smolagents_toolcalling_traces_think | Web search, scraping, real-time reasoning | Apache-2.0 |
| NousResearch/hermes-function-calling-v1 | Tool calling support with thinking | Apache-2.0 |
| HuggingFaceTB/smoltalk/smol-magpie-ultra | Python code writing | Apache-2.0 |
- ✂️ Dataset deduplication significantly improved performance by removing noisy or duplicate Q/A pairs (a minimal sketch follows this list).
- ✂️ Shortening responses (for casual replies) and using shorter Python code in training improved performance and reduced repeated token generation.
- 🧮 Word-level reasoning from orca-math enhanced the model's ability to handle stepwise logic.
- 🧰 Designing tool-calling prompts using six open-source tool-calling datasets resulted in stronger structured output generation.
- 🌐 Tool calling integration enabled the model to extract answers from parsed web data, supporting up-to-date queries.
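As an illustration of the deduplication step, an exact-match pass over normalized Q/A pairs might look like the sketch below. The actual training recipe is not documented here, so treat this as an assumption about the general approach, not the procedure used.

```python
# Exact-duplicate removal over Q/A pairs via hashes of normalized text.
# Illustrative sketch only; not the exact procedure used in training.
import hashlib

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial variants hash identically.
    return " ".join(text.lower().split())

def dedupe(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    seen, kept = set(), []
    for question, answer in pairs:
        key = hashlib.sha256(normalize(question + " " + answer).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append((question, answer))
    return kept

data = [
    ("How to bake a cake?", "Mix, bake, cool."),
    ("How to bake a  cake?", "Mix, bake, cool."),  # whitespace-only variant
]
print(dedupe(data))  # keeps only the first pair
```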
| Metric / Task | SmolLM2-135M-Instruct | NanoAgent |
|---|---|---|
| 🧮 Parameters | 135M | 135M |
| 📏 Context Length | 8k | 8k |
| 📊 IFEval Score (Overall) | --- | --- |
| 🧰 Tool Call Tasks | ❌ Not Supported | ✅ Supported |
| 🧭 Instruction Following | 🟡 Moderate | 🟢 Improved |
| 🧠 Reasoning (Light) | 🟡 Moderate | 🟡 Moderate |
| 📈 Training Method | Baseline (SFT) | SFT + agentic fine-tuning |
| 🧪 Strength | Instruction following | Tool-call ability + structured outputs |
| ⚠️ Limitation | No tool calling | Occasional tool errors (still beta) |
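Given the occasional tool errors noted above, callers may want to validate a tool call before executing it. One defensive pattern (an assumption on the caller side, not part of the model), reusing `get_weather` from the earlier sketch:

```python
# Defensive parsing of a model-emitted tool call: validate the JSON and the
# tool name before executing, and fall back to plain text on any failure.
import json

TOOLS = {"get_weather": get_weather}  # registry of allowed tools

def run_tool_call(raw: str):
    try:
        call = json.loads(raw)
        fn = TOOLS[call["name"]]        # KeyError if the tool is unknown
        return fn(**call["arguments"])  # TypeError if arguments are malformed
    except (json.JSONDecodeError, KeyError, TypeError):
        return None  # treat the output as a normal assistant message instead
```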
- 📊 Benchmark more agentic tasks
- 🧠 Explore GRPO for tool calling improvement
- 🔀 Experiment with weight merging
- 🧪 Evaluate multi-turn tool chaining
- 🧹 Further refine datasets for stability
This project (code, model weights, and training recipes) is licensed under the Apache License 2.0.
- Model & code are © quwsarohi, licensed under Apache 2.0.
- Portions of the training data were sourced from third-party datasets under CDLA-P 2.0, MIT, CC-BY 4.0, ODC-BY, and Apache 2.0.
- The licensors of these datasets do not endorse this project or its outputs.
- If you redistribute or fine-tune this model, ensure your use complies with all applicable dataset licenses.