How to use saadxsalman/SS-350M-SQL-Strict with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for saadxsalman/SS-350M-SQL-Strict to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for saadxsalman/SS-350M-SQL-Strict to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for saadxsalman/SS-350M-SQL-Strict to start chatting
Load model with FastModel
# Install (shell)
pip install unsloth

# Load the model (Python)
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="saadxsalman/SS-350M-SQL-Strict",
    max_seq_length=2048,
)
Model Card: SS-350M-SQL-Strict
Model Summary
SS-350M-SQL-Strict is a specialized, lightweight LLM fine-tuned for the singular task of Text-to-SQL translation. Built upon the LiquidAI LFM2.5-350M architecture, this model has been engineered to follow a "Strict" output protocol: it generates only raw SQL code, eliminating the conversational filler, Markdown blocks, and explanations typically found in general-purpose models.
Fine-tuned with 4-bit QLoRA and Unsloth optimizations, the model is small enough for fast, low-latency SQL generation on edge devices and in other resource-constrained environments.
Model Details
- Developed by: Saad Salman
- Architecture: Liquid Foundation Model (LFM) 2.5
- Parameters: 350 Million
- Quantization: 4-bit (bitsandbytes)
- Fine-tuning Method: QLoRA
- Primary Task: Natural Language to SQL (Strict)
Training Logic & Parameters
The model was trained with a custom pipeline that enforces strict code generation. Its key differentiator is completion-only loss masking: prompt tokens (the system message, schema, and question) are excluded from the loss, so gradient updates come only from the SQL completion tokens rather than from reproducing the prompt structure.
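The masking idea can be illustrated with a minimal sketch. It is not the author's training pipeline: the helper name `mask_prompt_labels` and the toy token ids are hypothetical, and the only convention relied on is that labels set to `-100` are skipped by PyTorch's cross-entropy loss by default.

```python
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy ignores by default


def mask_prompt_labels(input_ids, prompt_len):
    """Return labels in which the first `prompt_len` tokens do not contribute to the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must be within the sequence length")
    # Prompt positions get IGNORE_INDEX; completion positions keep their token ids.
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# Toy token ids (hypothetical): 5 prompt tokens followed by 3 completion (SQL) tokens.
input_ids = [101, 7, 8, 9, 102, 41, 42, 43]
labels = mask_prompt_labels(input_ids, prompt_len=5)
print(labels)  # [-100, -100, -100, -100, -100, 41, 42, 43]
```

Only the last three positions carry a training signal here, which is the effect the paragraph above describes.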
Hyperparameters
| Parameter | Value | Description |
|---|---|---|
| Max Steps | 800 | Optimal convergence point for 350M params |
| Learning Rate | 2e-4 | High enough for rapid logic acquisition |
| Batch Size | 16 | Effective batch size: 4 per device × 4 gradient accumulation steps |
| Rank (r) | 32 | High rank to capture complex SQL logic |
| Alpha | 32 | Scaling factor for LoRA weights |
| Optimizer | AdamW 8-bit | Memory-efficient optimization |
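
As a quick check on the table, the arithmetic behind the effective batch size and the LoRA scaling factor can be worked through in a few lines. This is a sketch under assumptions: a single training device (implied by 4 × 4 = 16 but not stated), no sequence packing, and the standard LoRA scaling `alpha / r` rather than a rank-stabilized variant.

```python
per_device_batch = 4
grad_accum_steps = 4
num_devices = 1  # assumption: one device, consistent with 4 x 4 = 16
max_steps = 800
lora_rank = 32
lora_alpha = 32

# Sequences contributing to each optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices
# Upper bound on training sequences processed over the whole run (repeats included).
sequences_seen = effective_batch * max_steps
# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scale = lora_alpha / lora_rank

print(effective_batch)  # 16
print(sequences_seen)   # 12800
print(lora_scale)       # 1.0
```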
Training Curve Analysis
The training loss followed an "L-shaped" curve: it started at ~38.1, dropped sharply, and plateaued between 8.0 and 11.0. The plateau suggests that training had converged on the ChatML structure and the schema-to-SQL mapping by the end of the run.
Prompting Specification (ChatML)
To ensure the "Strict" behavior, you must use the following ChatML format. Failure to use this format may result in hallucinated text.
Template
<|im_start|>system
You are a SQL translation engine. Return ONLY raw SQL. Schema: {YOUR_SCHEMA}<|im_end|>
<|im_start|>user
{YOUR_QUESTION}<|im_end|>
<|im_start|>assistant
Example Input
<|im_start|>system
You are a SQL translation engine. Return ONLY raw SQL. Schema: Table 'orders' (id, price, status, created_at)<|im_end|>
<|im_start|>user
Find the average price of all 'completed' orders.<|im_end|>
<|im_start|>assistant
Example Output
SELECT AVG(price) FROM orders WHERE status = 'completed';
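The template can be filled in programmatically. The helper below is a minimal sketch: the function name `build_prompt` is hypothetical, and the trailing newline after the final `<|im_start|>assistant` line is an assumption based on common ChatML usage, since the template above does not show it.

```python
SYSTEM_TEMPLATE = (
    "You are a SQL translation engine. Return ONLY raw SQL. Schema: {schema}"
)


def build_prompt(schema: str, question: str) -> str:
    """Assemble a ChatML prompt in the format shown in the template above."""
    return (
        "<|im_start|>system\n"
        f"{SYSTEM_TEMPLATE.format(schema=schema)}<|im_end|>\n"
        "<|im_start|>user\n"
        f"{question}<|im_end|>\n"
        # Assumption: the assistant turn is opened with a trailing newline.
        "<|im_start|>assistant\n"
    )


prompt = build_prompt(
    schema="Table 'orders' (id, price, status, created_at)",
    question="Find the average price of all 'completed' orders.",
)
print(prompt)
```

Called with the schema and question from the example, this reproduces the Example Input line for line.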
Training Dataset
The model was trained on the Gretel Synthetic SQL dataset. This dataset is designed to cover:
- Complex joins and subqueries.
- Diverse industry domains (Finance, Retail, Tech).
- Correct handling of GROUP BY, ORDER BY, and HAVING clauses.
Technical Limitations
- Schema Size: Best suited for schemas with fewer than 20 tables.
- Dialect: Defaults to standard SQL.
- Reasoning: The model does not "explain" its code; it is a direct translation engine.
How to Use with Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model_path = "saadxsalman/SS-350M-SQL-Strict"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
# Ready for inference!
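One possible way to run a query once the model and tokenizer are loaded is sketched below. It assumes `prompt` is a ChatML string in the format from the Prompting Specification section; the helper names `clean_sql` and `generate_sql`, the 128-token limit, and greedy decoding are illustrative choices, not settings from the model card.

```python
def clean_sql(text: str) -> str:
    """Trim decoded model output down to a single SQL statement."""
    # Drop anything after a ChatML end marker, in case it survived decoding.
    text = text.split("<|im_end|>")[0].strip()
    # Keep everything up to and including the first semicolon.
    # Note: this naive cut would also split on a semicolon inside a string literal.
    head, sep, _ = text.partition(";")
    return (head + sep).strip()


def generate_sql(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate SQL for a ChatML prompt with an already loaded model and tokenizer."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumption: enough for short queries
        do_sample=False,                # greedy decoding for repeatable output
    )
    # Decode only the newly generated tokens, not the echoed prompt.
    prompt_len = inputs["input_ids"].shape[1]
    text = tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
    return clean_sql(text)


print(clean_sql("SELECT 1;<|im_end|>"))  # SELECT 1;
```

With the objects from the snippet above, the call would be `generate_sql(model, tokenizer, prompt)`.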