🧠 Auto-Completer-0.2

Auto-Completer-0.2 is a fine-tuned successor to Auto-Completer-0.1, incorporating an additional 4 million tokens focused on sentence-level coherence, semantic chaining, and completion fidelity. This version introduces a unique behavior: each generated sentence is wrapped in quotation marks (""), making it ideal for structured auto-completion tasks where sentence boundaries matter.


πŸš€ Highlights

  • πŸ” Built On: Auto-Completer-0.1 (SmolLM2-360M lineage)
  • πŸ“ˆ Extra Tokens: +4M curated completions with sentence-level tagging
  • 🧠 Behavioral Shift: Each sentence is wrapped in "" until the maximum sequence length is reached
  • πŸ§ͺ Improved Coherence: Fewer hallucinations, tighter semantic retention
  • 🧰 Context Length: Up to 6144 tokens with packing

πŸ“¦ Intended Use

| βœ… Appropriate Uses | 🚫 Out-of-Scope Uses |
| --- | --- |
| Auto-completion in IDEs | Real-time dialogue agents |
| Sentence-level drafting | Sensitive medical inference |
| Math and logic reasoning | Open-ended chat generation |
| Code continuation | Offensive or biased content |

πŸ§‘β€πŸ”¬ Training Details

  • Base: Auto-Completer-0.1
  • Additional Tokens: 4M curated completions with sentence encapsulation
  • Trainer: SFTTrainer via TRL with Unsloth backend (see the configuration sketch below)
  • Batch Size: 8 (packed)
  • Max Seq Length: 6144
  • Optimizer: adamw_8bit
  • Steps: ~1.2k (warmup: 60)
  • Learning Rate: 2e-5
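
The hyperparameters above map onto a short TRL + Unsloth script. The sketch below is a minimal illustration, assuming a JSONL file of ""-wrapped completions with a "text" field; the dataset path and output directory are placeholders, not the exact training setup.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Start from the 0.1 checkpoint and allow 6144-token sequences.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Parveshiiii/Auto-Completer-0.1",
    max_seq_length=6144,
)

# Hypothetical JSONL of ""-wrapped completions with a "text" field.
dataset = load_dataset("json", data_files="completions.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        max_seq_length=6144,
        packing=True,                     # pack short samples into full sequences
        per_device_train_batch_size=8,
        optim="adamw_8bit",
        learning_rate=2e-5,
        max_steps=1200,                   # ~1.2k steps
        warmup_steps=60,
        output_dir="auto-completer-0.2",  # placeholder output directory
    ),
)
trainer.train()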

πŸ“Š Evaluation

| Metric | Score |
| --- | --- |
| Completion Accuracy | 96.1% |
| Sentence Coherence | 94.7% |
| Math Reasoning F1 | 89.4 |
| Code Continuation BLEU | 89.1 |
| Quotation Fidelity | 98.3% |

Benchmarked on internal test sets derived from MathX, HumanEval-lite, and structured sentence completion tasks.


πŸ§ͺ Example Usage

This model is not designed for chat. It wraps each generated sentence in "" and keeps generating until max_new_tokens is reached, so use a small max_new_tokens cap for autocomplete.

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Parveshiiii/Auto-Completer-0.2"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("Who are you", return_tensors="pt").to(device)

outputs = model.generate(
    inputs,
    max_new_tokens=10,                      # keep this cap low for autocomplete; the model generates until it hits the limit
    do_sample=True,                         # Diversity in completions
    temperature=0.7,                        # Controlled randomness
    top_p=0.9,                              # Nucleus sampling
    repetition_penalty=1.2,                 # increase if the model gets stuck in loops after completing a sentence
    eos_token_id=tokenizer.eos_token_id     # Optional: stop at end-of-text
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Example Output: "?" "I am a model trained to complete sentences." "My purpose is to assist with structured reasoning." ...


⚠️ Limitations

  • Not suitable for multi-turn chat or open-ended dialogue
  • May continue generating ""-wrapped sentences until the token cap is reached
  • Requires careful max_new_tokens tuning to avoid trailing noise (see the post-processing sketch below)
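
Because decoding only stops at the token cap, a simple post-processing step can keep just the ""-wrapped sentences and drop trailing noise. The helper below is an illustrative sketch; the function name and sample string are assumptions, not part of the model's API.

import re

def quoted_sentences(text):
    # Collect every ""-wrapped span emitted by the model, in order.
    return re.findall(r'"([^"]*)"', text)

raw = '"?" "I am a model trained to complete sentences." "My purpose is to assist with structured reasoning."'
print(quoted_sentences(raw))
# ['?', 'I am a model trained to complete sentences.', 'My purpose is to assist with structured reasoning.']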

πŸ“š Citation

@misc{rawal2025autocompleter2,
  title={Auto-Completer-0.2: Sentence-Aware Completion with SmolLM2},
  author={Parvesh Rawal},
  year={2025},
  url={https://huggingface.co/Parveshiiii/Auto-Completer-0.2}
}

πŸ›  Maintainer

Parvesh Rawal
Founder, XenArcAI
Architect of agentic orchestration, reproducible AI workflows, and reasoning-aware systems.

