
An agent that can run everywhere, even on your watch!


# 🧠 NanoAgent: A 135M-Parameter Agentic LLM

NanoAgent is a 135M-parameter, open-source language model with an 8k context length, designed for agentic tasks such as tool calling, instruction following, and lightweight reasoning.
It is small enough (~135 MB in 8-bit) to run on edge devices like personal laptops, low-memory CPUs, and even wearables, yet capable enough to make tool calls, parse web information, and give structured answers.
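The ~135 MB figure follows from simple arithmetic: 8-bit quantization stores roughly one byte per parameter (ignoring small overheads such as quantization scales or layers kept in higher precision):

```python
# Back-of-envelope check of the ~135 MB figure for 8-bit weights.
params = 135_000_000          # 135M parameters
bytes_per_param = 1           # 8 bits = 1 byte per weight
size_mb = params * bytes_per_param / 1_000_000
print(f"~{size_mb:.0f} MB")   # → ~135 MB
```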

Quick inference resource: here

๐ŸŒ Real-World Use Cases

- 🕹️ Runs on edge devices: laptops, smartwatches, browsers, or CPU-only environments.
- 🌐 Parses and answers from the web: supports tool calling to fetch real-time information.
- 🔎 Answers recent questions with live web search tools.
- 💬 Continues conversations, ideal for assistant or agent frameworks.
- ⚙️ Chains multiple tool calls and parses their results to produce final answers.
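
The tool-calling loop described above can be sketched as follows. Everything here is illustrative: the JSON tool-call format, the `web_search` tool, and the `run_model` stub are assumptions standing in for NanoAgent's actual chat template and inference code.

```python
import json

def web_search(query: str) -> str:
    """Stub tool; a real agent would call a search API here."""
    return "Toronto: 3 degrees C, light snow"

TOOLS = {"web_search": web_search}

def run_model(messages):
    """Stub standing in for NanoAgent inference. The JSON tool-call
    format below is an assumed convention, not the model's real one."""
    if messages[-1]["role"] == "user":
        return '{"tool": "web_search", "arguments": {"query": "Weather in Toronto"}}'
    return "It is about 3 degrees C with light snow in Toronto."

def agent_loop(question: str, max_steps: int = 4) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        output = run_model(messages)
        try:
            call = json.loads(output)       # structured output = tool call
        except json.JSONDecodeError:
            return output                    # plain text = final answer
        result = TOOLS[call["tool"]](**call["arguments"])
        messages.append({"role": "tool", "content": result})
    return output

print(agent_loop("Weather in Toronto?"))
# → It is about 3 degrees C with light snow in Toronto.
```

The key design point is the dispatch rule: parseable JSON is treated as a tool call to execute and feed back, anything else ends the loop as the final answer.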

## ✨ What NanoAgent Supports

| Capability | Description |
| --- | --- |
| 💬 Basic conversation | Casual small talk |
| 🌐 Information retrieval | e.g., "How to bake a cake?" or "Weather in Toronto" via web search; extracts answers from information returned by tools (scraping/search) |
| 🧰 Tool calling | Single and multi-tool calls with structured explanations |
| 🧠 Question decomposition | Breaks complex questions into steps |
| 🧭 Question classification | Identifies the type of user query (e.g., fact, reasoning, instruction) |
| 📝 Following system prompts | Responds properly to system-level instructions |
| ✍️ Writing emails and tasks | Writes emails and structured messages |

## 🧪 Training Overview

### 📚 Datasets Used

This model was trained using a combination of datasets under different open licenses.
Each dataset retains its original license, and use of those datasets is subject to their respective terms.

| Dataset | Purpose | License |
| --- | --- | --- |
| microsoft/orca-agentinstruct-1M-v1 | RAG, MCQ answering, JSON parsing, text classification, instruction following | Community Data License Agreement – Permissive, Version 2.0 |
| microsoft/orca-math-word-problems-200k | Lightweight reasoning, word-level reasoning | MIT |
| allenai/tulu-3-sft-personas-instruction-following | Instruction following with personas | Open Data Commons Attribution family |
| xingyaoww/code-act | ReAct-style reasoning and acting | Apache-2.0 |
| m-a-p/Code-Feedback | Feedback alignment | Apache-2.0 |
| HuggingFaceTB/smoltalk | General conversation, system prompt handling | Apache-2.0 |
| HuggingFaceTB/smoltalk/apigen | Tool calling stabilization | Creative Commons Attribution 4.0 (was sourced from 1, 2) |
| weijie210/gsm8k_decomposed | Question decomposition | - |
| Locutusque/function-calling-chatml | Tool call response formatting | Apache-2.0 |
| Salesforce/xlam-function-calling-60k | Stronger function calling coverage | Creative Commons Attribution 4.0 |
| HuggingFaceTB/smoltalk2/SFT/smolagents_toolcalling_traces_think | Web search, scraping, real-time reasoning | Apache-2.0 |
| NousResearch/hermes-function-calling-v1 | Tool calling support with thinking | Apache-2.0 |
| HuggingFaceTB/smoltalk/smol-magpie-ultra | Python code writing | Apache-2.0 |

## 🧭 Key Explorations & Findings

- ✂️ Dataset deduplication significantly improved performance by removing noisy or duplicate Q/A pairs.
- ✂️ Shortening responses (for casual conversation) and using shorter Python code in training improved performance and reduced repeated token generation.
- 🧮 Word-level reasoning from orca-math enhanced the model's ability to handle stepwise logic.
- 🧰 Designing tool-calling prompts from six open-source tool-calling datasets produced stronger structured output generation.
- 🌐 Tool-calling integration enabled the model to extract answers from parsed web data, supporting up-to-date queries.

## ⚡ Benchmark

| Metric / Task | SmolLM2-135M-Instruct | NanoAgent |
| --- | --- | --- |
| 🧮 Parameters | 135M | 135M |
| 📏 Context Length | 8k | 8k |
| 📊 IFEval Score (Overall) | --- | --- |
| 🧰 Tool Call Tasks | ❌ Not supported | ✅ Supported |
| 🧭 Instruction Following | 🟡 Moderate | 🟢 Improved |
| 🧠 Reasoning (Light) | 🟡 Moderate | 🟡 Moderate |
| 📝 Training Method | Baseline (SFT) | SFT + agentic fine-tuning |
| 🧪 Strength | Instruction following | Tool-call ability + structured outputs |
| ⚠️ Limitations | No tool calling | Occasional tool errors; still beta |

## 🧭 Roadmap

- 📊 Benchmark more agentic tasks
- 🧠 Explore GRPO for tool-calling improvement
- 🔀 Experiment with weight merging
- 🧪 Evaluate multi-turn tool chaining
- 🧹 Further refine datasets for stability
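
One common form of the weight merging mentioned above is linear interpolation between checkpoints. A minimal sketch over plain Python dicts (a real merge would operate on model state dicts of tensors):

```python
def merge_weights(a: dict, b: dict, alpha: float = 0.5) -> dict:
    """Linear interpolation of two checkpoints with matching parameter
    names: merged = alpha * a + (1 - alpha) * b."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {k: alpha * a[k] + (1 - alpha) * b[k] for k in a}

# Toy "checkpoints" with scalar weights for illustration.
base = {"w1": 0.0, "w2": 2.0}
tuned = {"w1": 1.0, "w2": 4.0}
print(merge_weights(base, tuned, alpha=0.25))  # → {'w1': 0.75, 'w2': 3.5}
```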

## 📄 License

This project (code, model weights, and training recipes) is licensed under the Apache License 2.0.

## 📢 Notice

- Model & code are © quwsarohi, licensed under Apache 2.0.
- Portions of the training data were sourced from third-party datasets under CDLA-P 2.0, MIT, CC-BY 4.0, ODC-BY, and Apache 2.0.
- The licensors of these datasets do not endorse this project or its outputs.
- If you redistribute or fine-tune this model, ensure your use complies with all applicable dataset licenses.