---
license: apache-2.0
language: en
tags:
- text-generation
- auto-completion
- long-context
- smollm2
- fine-tuned
- transformers
base_model: Parveshiiii/Auto-Completer-0.1
pipeline_tag: text-generation
library_name: transformers
---
# 🧠 Auto-Completer-0.2
**Auto-Completer-0.2** is a fine-tuned successor to [Auto-Completer-0.1](https://huggingface.co/Parveshiiii/Auto-Completer-0.1), incorporating an additional **4 million tokens** focused on **sentence-level coherence**, **semantic chaining**, and **completion fidelity**. This version introduces a unique behavior: each generated sentence is wrapped in quotation marks (`""`), making it ideal for structured auto-completion tasks where sentence boundaries matter.
---
## 🚀 Highlights
- 🔗 **Built On**: Auto-Completer-0.1 (SmolLM2-360M lineage)
- 📈 **Extra Tokens**: +4M curated completions with sentence-level tagging
- 🧠 **Behavioral Shift**: Each sentence is encapsulated in `""` until the max sequence length is reached
- 🧪 **Improved Coherence**: Fewer hallucinations, tighter semantic retention
- 🧰 **Context Length**: Up to 6144 tokens with packing
---
## 📦 Intended Use
| ✅ Appropriate Uses           | 🚫 Out-of-Scope Uses         |
|-------------------------------|------------------------------|
| Auto-completion in IDEs | Real-time dialogue agents |
| Sentence-level drafting | Sensitive medical inference |
| Math and logic reasoning | Open-ended chat generation |
| Code continuation | Offensive or biased content |
---
## 🧑‍🔬 Training Details
- **Base**: Auto-Completer-0.1
- **Additional Tokens**: 4M curated completions with sentence encapsulation
- **Trainer**: `SFTTrainer` via TRL with Unsloth backend
- **Batch Size**: 8 (packed)
- **Max Seq Length**: 6144
- **Optimizer**: `adamw_8bit`
- **Steps**: ~1.2k (warmup: 60)
- **Learning Rate**: 2e-5
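The recipe above can be sketched with TRL's `SFTTrainer`; this is a hypothetical outline, not the exact training script. The dataset name below is a placeholder, and field names such as `packing` and `max_seq_length` follow recent TRL releases and may differ across versions.

```python
# Hypothetical sketch of the fine-tuning setup described above.
# "your_completions_dataset" is a placeholder for the curated completion data.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("your_completions_dataset", split="train")

config = SFTConfig(
    output_dir="auto-completer-0.2",
    per_device_train_batch_size=8,   # batch size 8
    packing=True,                    # pack short completions into long sequences
    max_seq_length=6144,             # context length used for this model
    optim="adamw_8bit",
    learning_rate=2e-5,
    max_steps=1200,                  # ~1.2k steps
    warmup_steps=60,
)

trainer = SFTTrainer(
    model="Parveshiiii/Auto-Completer-0.1",  # base checkpoint
    train_dataset=dataset,
    args=config,
)
trainer.train()
```

The Unsloth backend mentioned above would wrap the model loading step; it is omitted here to keep the sketch minimal.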
---
## 📊 Evaluation
| Metric | Score |
|--------------------------|-----------|
| Completion Accuracy | 96.1% |
| Sentence Coherence | 94.7% |
| Math Reasoning F1 | 89.4 |
| Code Continuation BLEU | 89.1 |
| Quotation Fidelity | 98.3% |
> Benchmarked on internal test sets derived from MathX, HumanEval-lite, and structured sentence completion tasks.
---
## 🧪 Example Usage
> This model is not designed for chat. It wraps each sentence in `""` and keeps generating until `max_new_tokens` is reached, so keep the cap low for autocomplete use.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Parveshiiii/Auto-Completer-0.2"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("Who are you", return_tensors="pt").to(device)
outputs = model.generate(
    inputs,
    max_new_tokens=10,       # keep this low: the model generates until the cap is reached
    do_sample=True,          # diversity in completions
    temperature=0.7,         # controlled randomness
    top_p=0.9,               # nucleus sampling
    repetition_penalty=1.2,  # raise this if the model loops after completing a sentence
    eos_token_id=tokenizer.eos_token_id,  # optional: stop at end-of-text
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
> Example Output: `"?" "I am a model trained to complete sentences." "My purpose is to assist with structured reasoning." ...`
---
## ⚠️ Limitations
- Not suitable for multi-turn chat or open-ended dialogue
- May continue generating `"..."` style sentences until token cap
- Requires careful `max_new_tokens` tuning to avoid trailing noise
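Because each sentence arrives wrapped in `""`, trailing noise can be trimmed with a small post-processing step. A minimal sketch (the helper name is ours, not part of the model's API):

```python
import re

def first_quoted_sentences(text: str, n: int = 1) -> str:
    """Keep only the first n complete "..."-wrapped sentences from a raw completion."""
    quoted = re.findall(r'"([^"]+)"', text)
    return " ".join(quoted[:n]) if quoted else text.strip()

# A truncated tail (the unclosed second quote) is dropped automatically.
raw = '"I am a model trained to complete sentences." "My purpose is'
print(first_quoted_sentences(raw))  # I am a model trained to complete sentences.
```

This makes the `max_new_tokens` setting less sensitive: overshoot slightly, then keep only the complete sentences.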
---
## 📜 Citation
```bibtex
@misc{rawal2025autocompleter2,
  title={Auto-Completer-0.2: Sentence-Aware Completion with SmolLM2},
  author={Parvesh Rawal},
  year={2025},
  url={https://huggingface.co/Parveshiiii/Auto-Completer-0.2}
}
```
---
## 👤 Maintainer
**Parvesh Rawal**
Founder, XenArcAI
Architect of agentic orchestration, reproducible AI workflows, and reasoning-aware systems.
---