	---
	language:
	- en
	- zh
	license: apache-2.0
	base_model: google/functiongemma-270m-it
	tags:
	- function-calling
	- tool-use
	- crypto
	- blockchain
	- solana
	- ethereum
	- on-device
	- privacy
	- edge-ai
	- mobile
	- wallet
	- standard-protocol
	library_name: transformers
	pipeline_tag: text-generation
	---

	# DMind-3-nano: Privacy-First On-Device Crypto Intent Recognition

	> Inference stays on your device. Standardized function calling for wallets, DEXs, and agents. Built on `google/functiongemma-270m-it`.

	## Model Description

	DMind-3-nano is a small, edge-optimized language model fine-tuned for crypto wallet and DEX intent recognition using standardized function-calling protocols. It is designed to run entirely on-device, enabling privacy-preserving, low-latency intent parsing for Web3 wallets and local agents.

	This repository hosts the open-source training and evaluation pipeline as well as the released model artifacts.


	## Performance Snapshot
	<img src="figures/model_comparison_chart.png" width="720" />


	Figure 1. DMind-3-nano significantly outperforms both the untuned base model and a similarly sized general-purpose model (Qwen3-0.6B), especially in multi-turn success.

	## Highlights

	- 🔐 Privacy-first: 100% on-device intent recognition; no data leaves the device.
	- 📱 Edge-optimized: 270M params; runs on phones/tablets/edge CPUs.
	- 🔄 Standardized protocols: `SEARCH_TOKEN` / `EXECUTE_SWAP` with unified schemas.
	- 🌐 Multi-chain: Solana, Ethereum, BSC, Base.
	- 🌍 Multilingual: English + Chinese intents (Chinese samples kept in data/benchmarks).
	- 🤖 Agent-native: designed for local-first wallet/agent workflows where a growing share of trading decisions and execution happen on-device.
	- 📊 Training data: the final full fine-tune used 12,000+ samples in total; LLM-generated data is only a subset, and 60%+ of the data comes from real trading scenarios.
	- 🧾 (To our knowledge) first public vertical-domain FunctionGemma case study: an end-to-end example of fine-tuning `google/functiongemma-270m-it` for a real wallet/DEX intent domain, including the practical training/evaluation pipeline and reproducible scripts.

	## Why This Matters for Web3 (Standardization as a Step-Change)

	Web3 is composable at the protocol layer (tokens, RPCs), but still fragmented at the intent layer. Today every wallet, DEX, and agent framework invents its own “swap/search intent” schema and function-calling format. The result is high integration cost, brittle adapters, inconsistent safety guarantees, and poor ecosystem interoperability.

	This work targets a transformative goal: standardize wallet intents as a small, versionable protocol between natural language and transaction builders. Concretely, DMind-3-nano enforces a minimal set of typed tools (e.g. `SEARCH_TOKEN`, `EXECUTE_SWAP`) with strict schemas and a deterministic wrapper output format.

	What standardization unlocks:

	- Interoperability: one protocol works across wallets/DEXs/agents; integrations become plug-and-play.
	- Safety & auditability: tool calls are structured data—easy to validate, simulate, policy-check, and display for confirmation before signing.
	- Benchmarkability: shared datasets and comparable evaluations across models and releases.
	- Ecosystem scaling: new tools can be added via versioning without breaking existing clients.

	In short, DMind-3-nano is not only a model—it is a proposal for a standard protocol layer that can make wallet intelligence as interoperable as ERC-20 made tokens.

	### The next wave: local agents executing trades

	We expect a large share of future Web3 activity to be agent-driven: wallets will run local copilots that continuously parse user intent, monitor context, and propose/execute transactions. In that world, “cloud-only” intelligence becomes a bottleneck and a risk:

	- Privacy: trading intent, token preferences, and behavioral signals should not be streamed to third-party servers.
	- Latency & reliability: agents must work instantly and offline (mobile, hardware wallets, poor connectivity).
	- Security boundaries: local agents can keep a tighter loop between intent → policy checks → simulation → user confirmation → signing.

	This is why a small, high-accuracy on-device function-calling model is necessary infrastructure for the agent-native wallet era—and why standardizing the intent protocol matters even more when millions of agents need to speak the same language.

	Equally important, this repository serves as a public reference implementation for applying FunctionGemma to a concrete vertical domain. By openly sharing fine-tuning details (data format, training configs, evaluation, and benchmarks), it lowers the barrier for the community to replicate, extend, and standardize on a common intent protocol.

	## Model Overview

	| Property | Value |
	| --- | --- |
	| Model | DMind-3-nano |
	| Base | google/functiongemma-270m-it |
	| Params | 270M |
	| Context | 2048 |
	| Precision | BF16 (train) |
	| Best tokens | SOL, USDC, JUP, RAY, BONK, WIF, ETH, BTC, POPCAT, BOME, TRUMP |
	| Chains | solana, ethereum, bsc, base |

	Experimental notice: Highest accuracy on the token/chain set above; other assets may need further tuning. Validate outputs before transacting.
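
One way to act on this notice is to check requests against the token and chain set in the table before they reach a transaction builder. The snippet below is only an illustrative sketch: `BEST_TOKENS`, `SUPPORTED_CHAINS`, and `is_well_supported` are hypothetical names that do not come from this repository.

```python
from typing import Optional

# Hypothetical constants copied from the Model Overview table above.
BEST_TOKENS = {"SOL", "USDC", "JUP", "RAY", "BONK", "WIF", "ETH", "BTC", "POPCAT", "BOME", "TRUMP"}
SUPPORTED_CHAINS = {"solana", "ethereum", "bsc", "base"}


def is_well_supported(symbol: str, chain: Optional[str] = None) -> bool:
    """Return True if the symbol (and the chain, if given) is in the best-supported set."""
    if symbol.upper() not in BEST_TOKENS:
        return False
    return chain is None or chain.lower() in SUPPORTED_CHAINS


print(is_well_supported("sol", "solana"))   # True
print(is_well_supported("PEPE", "base"))    # False
print(is_well_supported("ETH", "polygon"))  # False
```

A host could use a check like this to show an extra warning for assets outside the set, rather than to block them outright.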

	## Repository Layout

	- `model/` experimental model weights. This is an exploratory release; we take no responsibility for financial losses incurred from using this model in production environments.
	- `src/` training/eval utilities
	  - `train.py` (LoRA or full fine-tune)
	  - `evaluate.py` (benchmark evaluation)
	  - `prepare_dataset.py` (SFT-ready formatting)
	  - `generate_benchmark.py` (100-case benchmark)
	  - `config.py` (tools, prompts, token maps)
	- `data/` sample data
	  - `training_data.json` (raw; open-sourced subset for reproducibility)
	  - `benchmark_dataset.json` (eval set; includes Chinese test prompts by design)
	- `results/evaluation_results.json` sample output
	- `run_training.sh`, `requirements.txt`

	## Quick Start (Training & Eval)

	Install:
	```bash
	pip install -r requirements.txt
	```

	Train (LoRA default):
	```bash
	python -m src.train \
	--model_path /path/to/functiongemma-270m-it \
	--dataset_path ./data/training_data.json \
	--output_dir ./runs \
	--bf16
	```
	To switch to a full fine-tune, add `--no-use-lora`. For low-memory setups, combine `--use_4bit` or `--use_8bit` with `--gradient_checkpointing`.

	Evaluate:
	```bash
	python -m src.evaluate \
	--model_path ./runs/<run>/final_model \
	--benchmark_path ./data/benchmark_dataset.json \
	--output_path ./results/eval_$(date +%Y%m%d_%H%M%S).json
	```

	Data utilities:
	```bash
	# Prepare SFT data
	python -m src.prepare_dataset --input ./data/training_data.json --output ./data/prepared_dataset.json
	# Regenerate benchmark
	python -m src.generate_benchmark --output ./data/benchmark_dataset.json
	```

	Note: `data/prepared_dataset.json` is a generated artifact (optional) and is intentionally not committed.

	## Tool Definitions & Schemas

	To ensure interoperability, DMind-3-nano uses strict JSON schemas for tool definitions. Below are the standard definitions used during training and inference.

	### 1. SEARCH_TOKEN

	Used to find token metadata or address on a specific chain.

	```json
	{
	"name": "SEARCH_TOKEN",
	"description": "Search for a cryptocurrency token on-chain to retrieve its metadata or address.",
	"parameters": {
	"type": "object",
	"properties": {
	"symbol": {
	"type": "string",
	"description": "The ticker symbol of the token (e.g., 'SOL', 'USDC')."
	},
	"address": {
	"type": "string",
	"description": "The specific contract address (CA) of the token, if known."
	},
	"chain": {
	"type": "string",
	"enum": ["solana", "ethereum", "bsc", "base"],
	"description": "The target blockchain network."
	},
	"keyword": {
	"type": "string",
	"description": "General search keywords (e.g., project name) if symbol/address are unclear."
	}
	},
	"required": []
	}
	}
	```
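
The schema leaves `required` empty, while the developer prompt later in this README states that at least one of symbol/address/keyword must be present. A host application may want to enforce both before acting on a call. The following is a minimal sketch under that assumption; `validate_search_token` and its error strings are hypothetical and not part of this repository.

```python
from typing import Any, Dict, List

SUPPORTED_CHAINS = ("solana", "ethereum", "bsc", "base")
SEARCH_FIELDS = ("symbol", "address", "chain", "keyword")


def validate_search_token(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    errors = []
    for key, value in args.items():
        if key not in SEARCH_FIELDS:
            errors.append(f"unknown field: {key}")
        elif not isinstance(value, str):
            errors.append(f"{key} must be a string")
    # Rule taken from the developer prompt: at least one lookup key is needed.
    if not any(args.get(k) for k in ("symbol", "address", "keyword")):
        errors.append("need at least one of symbol/address/keyword")
    if "chain" in args and args["chain"] not in SUPPORTED_CHAINS:
        errors.append(f"unsupported chain: {args['chain']}")
    return errors


print(validate_search_token({"symbol": "SOL", "chain": "solana"}))  # []
print(validate_search_token({"symbol": "ETH", "chain": "polygon"}))
```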

	### 2. EXECUTE_SWAP

	Used to construct a swap transaction intent between two assets.
	```json
	{
	"name": "EXECUTE_SWAP",
	"description": "Propose a token swap transaction.",
	"parameters": {
	"type": "object",
	"properties": {
	"inputTokenSymbol": {
	"type": "string",
	"description": "Symbol of the token being sold (e.g., 'SOL')."
	},
	"inputTokenCA": {
	"type": "string",
	"description": "Contract address of the token being sold."
	},
	"outputTokenCA": {
	"type": "string",
	"description": "Contract address of the token being bought."
	},
	"inputTokenAmount": {
	"type": "number",
	"description": "Absolute amount of input token to swap."
	},
	"inputTokenPercentage": {
	"type": "number",
	"description": "Percentage of balance to swap (0.0 to 1.0), used if exact amount is not specified."
	},
	"outputTokenAmount": {
	"type": "number",
	"description": "Minimum amount of output token expected (optional/slippage related)."
	}
	},
	"required": ["inputTokenSymbol"]
	}
	}
	```
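
Because a swap proposal can lead to a signed transaction, it may be worth validating the arguments on the host side as well. The sketch below checks the constraints stated in this README (required `inputTokenSymbol`, numeric fields, a percentage within 0.0 to 1.0, and the amount/percentage exclusivity rule from the developer prompt). `validate_execute_swap` is a hypothetical helper, not part of this repository.

```python
from typing import Any, Dict, List

NUMERIC_FIELDS = ("inputTokenAmount", "inputTokenPercentage", "outputTokenAmount")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_execute_swap(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    errors = []
    symbol = args.get("inputTokenSymbol")
    if not isinstance(symbol, str) or not symbol:
        errors.append("inputTokenSymbol is required")
    for key in NUMERIC_FIELDS:
        if key in args and not _is_number(args[key]):
            errors.append(f"{key} must be a number")
    if "inputTokenAmount" in args and "inputTokenPercentage" in args:
        errors.append("inputTokenAmount and inputTokenPercentage are mutually exclusive")
    pct = args.get("inputTokenPercentage")
    if _is_number(pct) and not 0.0 <= pct <= 1.0:
        errors.append("inputTokenPercentage must be in [0, 1]")
    return errors


print(validate_execute_swap({"inputTokenSymbol": "SOL", "inputTokenPercentage": 0.3}))  # []
print(validate_execute_swap({"inputTokenSymbol": "SOL", "inputTokenPercentage": 30}))
```

Returning a list of problems rather than raising on the first one lets the host show the user everything that needs fixing in a single clarification question.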

	### Output Format

	The model outputs the function call wrapped in special tokens (standard FunctionGemma format):
	```plaintext
	<start_function_call>call:FUNCTION_NAME{key1:val1, key2:val2}<end_function_call>
	```

	Example:

	User: "Search for SOL on Solana"

	Model:
	```plaintext
	<start_function_call>call:SEARCH_TOKEN{symbol:"SOL", chain:"solana"}<end_function_call>
	```
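
To turn this wrapper into structured data, a host needs a small parser. The sketch below is one possible approach, not the repository's own: `parse_function_call` is a hypothetical name. It accepts both closing-tag spellings that appear in this README (`<end_function_call>` and `</end_function_call>`) and both bare and quoted keys. It assumes string values are double-quoted, as in the example above; other value delimiters are not handled and will raise a `json.JSONDecodeError`.

```python
import json
import re
from typing import Any, Dict, Optional, Tuple

CALL_RE = re.compile(
    r"<start_function_call>call:(?P<name>\w+)\{(?P<body>.*?)\}</?end_function_call>",
    re.DOTALL,
)
# Either a complete double-quoted string (left untouched) or a bare key before a colon.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|(?P<key>[A-Za-z_]\w*)(?=\s*:)')


def _quote_bare_keys(body: str) -> str:
    def repl(match):
        key = match.group("key")
        return f'"{key}"' if key is not None else match.group(0)

    return TOKEN_RE.sub(repl, body)


def parse_function_call(text: str) -> Optional[Tuple[str, Dict[str, Any]]]:
    """Return (tool_name, arguments), or None if the text contains no function call."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    body = "{" + _quote_bare_keys(match.group("body")) + "}"
    return match.group("name"), json.loads(body)


sample = '<start_function_call>call:SEARCH_TOKEN{symbol:"SOL", chain:"solana"}<end_function_call>'
print(parse_function_call(sample))  # ('SEARCH_TOKEN', {'symbol': 'SOL', 'chain': 'solana'})
```

Plain chat replies contain no wrapper, so the function returns `None` for them and the host can display the text directly.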

	## Developer Prompt (System Message)

	For best results, initialize each session with the developer (system) prompt listed under "Developer Prompt Content" below.

	### Usage Principles (Important)

	Follow these rules for best results:

	1. Place it once, at the beginning: put the developer prompt only once, at the very start of the conversation session.
	2. Do not place it in user messages: never include the developer prompt content in user/assistant messages or tool schemas.
	3. Keep it for the whole session: in multi-turn conversations, keep the same developer prompt at the session start and do not repeat it.

	Correct usage pattern (the developer prompt appears once, at the start, and is not repeated in later turns):
	```json
	{
	"messages": [
	{"role": "developer", "content": "<developer prompt goes here>"},
	{"role": "user", "content": "first user query"},
	{"role": "assistant", "content": "assistant response"},
	{"role": "user", "content": "second user query"}
	]
	}
	```
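
The three rules can also be enforced in code so that a long-running session cannot pick up a second developer message by accident. This is a minimal sketch; `start_session`, `add_turn`, and `check_session` are hypothetical helpers, not part of this repository.

```python
from typing import Dict, List

Message = Dict[str, str]


def start_session(developer_prompt: str) -> List[Message]:
    """Create a message list whose first and only developer message is the prompt."""
    return [{"role": "developer", "content": developer_prompt}]


def add_turn(messages: List[Message], role: str, content: str) -> None:
    """Append a user/assistant turn; refuse to add another developer message."""
    if role == "developer":
        raise ValueError("developer prompt may only appear once, at session start")
    messages.append({"role": role, "content": content})


def check_session(messages: List[Message]) -> bool:
    """True if exactly one developer message exists and it is the first message."""
    roles = [m["role"] for m in messages]
    return bool(roles) and roles[0] == "developer" and roles.count("developer") == 1


session = start_session("<developer prompt goes here>")
add_turn(session, "user", "first user query")
add_turn(session, "assistant", "assistant response")
add_turn(session, "user", "second user query")
print(check_session(session))  # True
```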

	### Developer Prompt Content

	```json
	{
	"messages": [
	{"role": "developer", "content": "You are a model that can do function calling with the following functions.\nYou are an on-chain trading assistant.\nYou may use only two tools: SEARCH_TOKEN and EXECUTE_SWAP.\n\nCore policy:\n- Use a tool only when needed.\n- If required fields are missing or ambiguous, ask one concise clarification question first.\n- If the user is just chatting, reply naturally without calling tools.\n- Never fabricate addresses, amounts, balances, prices, or execution results.\n- Never resolve token symbols to contract addresses from memory or static snapshots.\n- Treat ticker symbols as potentially ambiguous and contract addresses as dynamic (can migrate/upgrade).\n- Supported chains are: solana, ethereum, bsc, base.\n If the user asks for an unsupported chain (for example polygon), explain the limitation and ask for a supported chain.\n\nTool-call format (must match exactly):\n<start_function_call>call:TOOL_NAME{\"key\":\"value\",\"amount\":1.23}</end_function_call>\nDo not output XML-style tags such as <function_calls>, <invoke>, or <parameter>.\n\nStrict schema:\n\nSEARCH_TOKEN params\n{\n \"symbol\": \"string, optional\",\n \"address\": \"string, optional\",\n \"keyword\": \"string, optional\",\n \"chain\": \"solana | ethereum | bsc | base, optional\"\n}\nRules:\n- At least one of symbol/address/keyword is required.\n- If the user gives only an address, do address-only lookup (do not guess chain).\n- If user explicitly gives chain, include chain.\n- For symbol/keyword based requests, call SEARCH_TOKEN first before producing a swap call.\n- If lookup may return multiple candidates (same ticker/name), ask the user to confirm the exact token (address or more context).\n\nEXECUTE_SWAP params\n{\n \"inputTokenSymbol\": \"string, required\",\n \"inputTokenCA\": \"string, optional\",\n \"outputTokenCA\": \"string, optional\",\n \"inputTokenAmount\": \"number, optional\",\n \"inputTokenPercentage\": \"number in [0,1], optional\",\n \"outputTokenAmount\": \"number, optional\"\n}\nRules:\n- inputTokenAmount and inputTokenPercentage are mutually exclusive.\n- Convert 30% to inputTokenPercentage=0.3.\n- If both amount and percentage are provided, ask the user to choose one.\n- If outputTokenCA is unknown, call SEARCH_TOKEN first and use the returned result.\n- If user already provides output token address explicitly, you may call EXECUTE_SWAP directly.\n- If lookup returns multiple candidates or low-confidence candidates, ask a clarification question; do not guess.\n\nLanguage:\n- Support both Chinese and English.\n- Reply in the same language as the user unless they ask otherwise."},
	{"role": "user", "content": "<user query goes here>"}
	]
	}
	```
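
The developer prompt restricts the model to two tools, so a host application only needs a two-entry dispatch table. The following is a rough sketch of that idea rather than this repository's implementation: `handle_search`, `handle_swap`, `HANDLERS`, and `dispatch` are hypothetical names, and both handlers are placeholders that return data instead of querying a chain or signing anything.

```python
from typing import Any, Callable, Dict

ToolArgs = Dict[str, Any]
ToolResult = Dict[str, Any]


def handle_search(args: ToolArgs) -> ToolResult:
    # Placeholder: a real host would query an indexer or RPC endpoint here.
    return {"status": "ok", "tool": "SEARCH_TOKEN", "query": args}


def handle_swap(args: ToolArgs) -> ToolResult:
    # Placeholder: never sign here; only return a proposal for user confirmation.
    return {"status": "needs_confirmation", "tool": "EXECUTE_SWAP", "proposal": args}


HANDLERS: Dict[str, Callable[[ToolArgs], ToolResult]] = {
    "SEARCH_TOKEN": handle_search,
    "EXECUTE_SWAP": handle_swap,
}


def dispatch(name: str, args: ToolArgs) -> ToolResult:
    """Route a parsed tool call to its handler; unknown tool names are rejected."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"status": "error", "reason": f"unknown tool: {name}"}
    return handler(args)


print(dispatch("EXECUTE_SWAP", {"inputTokenSymbol": "SOL", "inputTokenPercentage": 0.3})["status"])
# needs_confirmation
```

Keeping the swap handler as a proposal step mirrors the prompt's rule against fabricating execution results: nothing is reported as executed until the user has confirmed and the wallet has signed.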

	Usage Example (Python/Transformers):

	```python
	from transformers import AutoModelForCausalLM, AutoProcessor

	model_path = "DMindAI/DMind-3-nano"

	# Load model and processor (processor combines tokenizer and tool handling)
	model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
	processor = AutoProcessor.from_pretrained(model_path, device_map="auto")

	# Define tool schemas (must match training format)
	tools = [
	{
	"name": "SEARCH_TOKEN",
	"description": "Search for a cryptocurrency token on-chain to retrieve its metadata or address.",
	"parameters": {
	"type": "object",
	"properties": {
	"symbol": {"type": "string", "description": "The ticker symbol of the token (e.g., 'SOL', 'USDC')."},
	"address": {"type": "string", "description": "The specific contract address (CA) of the token, if known."},
	"chain": {"type": "string", "enum": ["solana", "ethereum", "bsc", "base"], "description": "The target blockchain network."},
	"keyword": {"type": "string", "description": "General search keywords (e.g., project name) if symbol/address are unclear."}
	},
	"required": []
	}
	},
	{
	"name": "EXECUTE_SWAP",
	"description": "Propose a token swap transaction.",
	"parameters": {
	"type": "object",
	"properties": {
	"inputTokenSymbol": {"type": "string", "description": "Symbol of the token being sold (e.g., 'SOL')."},
	"inputTokenCA": {"type": "string", "description": "Contract address of the token being sold."},
	"outputTokenCA": {"type": "string", "description": "Contract address of the token being bought."},
	"inputTokenAmount": {"type": "number", "description": "Absolute amount of input token to swap."},
	"inputTokenPercentage": {"type": "number", "description": "Percentage of balance to swap (0.0 to 1.0)."},
	"outputTokenAmount": {"type": "number", "description": "Minimum amount of output token expected."}
	},
	"required": ["inputTokenSymbol"]
	}
	}
	]

	# Prepare messages with developer prompt (CRITICAL: must be first message)
	developer_prompt = """You are a model that can do function calling with the following functions.
	You are an on-chain trading assistant.
	You may use only two tools: SEARCH_TOKEN and EXECUTE_SWAP.

	Core policy:
	- Use a tool only when needed.
	- If required fields are missing or ambiguous, ask one concise clarification question first.
	- If the user is just chatting, reply naturally without calling tools.
	- Never fabricate addresses, amounts, balances, prices, or execution results.
	- Never resolve token symbols to contract addresses from memory or static snapshots.
	- Treat ticker symbols as potentially ambiguous and contract addresses as dynamic (can migrate/upgrade).
	- Supported chains are: solana, ethereum, bsc, base.
	If the user asks for an unsupported chain (for example polygon), explain the limitation and ask for a supported chain.

	Tool-call format (must match exactly):
	<start_function_call>call:TOOL_NAME{\"key\":\"value\",\"amount\":1.23}</end_function_call>
	Do not output XML-style tags such as <function_calls>, <invoke>, or <parameter>.

	Strict schema:

	SEARCH_TOKEN params
	{
	\"symbol\": \"string, optional\",
	\"address\": \"string, optional\",
	\"keyword\": \"string, optional\",
	\"chain\": \"solana | ethereum | bsc | base, optional\"
	}
	Rules:
	- At least one of symbol/address/keyword is required.
	- If the user gives only an address, do address-only lookup (do not guess chain).
	- If user explicitly gives chain, include chain.
	- For symbol/keyword based requests, call SEARCH_TOKEN first before producing a swap call.
	- If lookup may return multiple candidates (same ticker/name), ask the user to confirm the exact token (address or more context).

	EXECUTE_SWAP params
	{
	\"inputTokenSymbol\": \"string, required\",
	\"inputTokenCA\": \"string, optional\",
	\"outputTokenCA\": \"string, optional\",
	\"inputTokenAmount\": \"number, optional\",
	\"inputTokenPercentage\": \"number in [0,1], optional\",
	\"outputTokenAmount\": \"number, optional\"
	}
	Rules:
	- inputTokenAmount and inputTokenPercentage are mutually exclusive.
	- Convert 30% to inputTokenPercentage=0.3.
	- If both amount and percentage are provided, ask the user to choose one.
	- If outputTokenCA is unknown, call SEARCH_TOKEN first and use the returned result.
	- If user already provides output token address explicitly, you may call EXECUTE_SWAP directly.
	- If lookup returns multiple candidates or low-confidence candidates, ask a clarification question; do not guess.

	Language:
	- Support both Chinese and English.
	- Reply in the same language as the user unless they ask otherwise."""

	messages = [
	{"role": "developer", "content": developer_prompt},
	# The user query below is Chinese for "Look up the BTC address on Base".
	{"role": "user", "content": "在base查BTC地址"}
	]

	# Generate with processor (handles tools automatically)
	inputs = processor.apply_chat_template(
	messages,
	tools=tools,
	add_generation_prompt=True,
	return_dict=True,
	return_tensors="pt"
	).to(model.device)

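	outputs = model.generate(**inputs, max_new_tokens=256)
	# outputs[0] holds the prompt followed by the completion; decode only the new tokens.
	prompt_len = inputs["input_ids"].shape[-1]
	response = processor.decode(outputs[0][prompt_len:], skip_special_tokens=True)
	print(response)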
	```

	## License & Governance

	- Code: MIT (`LICENSE`)
	- Model card intent: Apache-2.0 (as in metadata above)
	- Protocol specs (SEARCH_TOKEN / EXECUTE_SWAP): public domain for maximal adoption
	- Contributions are welcome via issues/PRs.