Ivan PRO

aufklarer

https://blog.ivan.digital

AI & ML interests

GenAI

Recent Activity

updated a dataset 23 minutes ago

aufklarer/central-bank-communications

updated a collection about 12 hours ago

CoreML Speech Models

updated a model 1 day ago

aufklarer/Nemotron-Speech-Streaming-0.6B-CoreML-INT8

Organizations

Posts 10

Post

After running extensive benchmarks across ASR, TTS, and VAD on Apple Silicon, we found some results that weren't documented anywhere.

The most counterintuitive result: INT8 runs 3.3x faster than INT4 on the Neural Engine. Another surprise: a 332 MB CoreML model allocates 1,677 MB at runtime, about five times its size. And the right architecture uses both MLX and CoreML simultaneously, not one or the other.

MLX talks to the GPU — programmable, fast for large transformer inference. CoreML talks to the Neural Engine — fixed-function silicon, 135x real-time for small feedforward models like VAD, near-zero power draw.
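
To put the figures above in concrete terms, here is a small back-of-the-envelope sketch in Python. It only does arithmetic on the numbers quoted in this post (135x real-time, 332 MB, 1,677 MB); the function names are illustrative and are not part of speech-swift, MLX, or CoreML.

```python
def processing_time_ms(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock time (ms) to process `audio_seconds` of audio
    when the model runs at `realtime_factor` times real time."""
    return audio_seconds / realtime_factor * 1000.0


def runtime_overhead(model_mb: float, runtime_mb: float) -> float:
    """Ratio of memory allocated at runtime to the model's own size."""
    return runtime_mb / model_mb


# Figures quoted in the post; nothing here is measured by this script.
vad_ms = processing_time_ms(audio_seconds=1.0, realtime_factor=135.0)
overhead = runtime_overhead(model_mb=332.0, runtime_mb=1677.0)

print(f"1 s of audio at 135x real-time takes about {vad_ms:.1f} ms")
print(f"Runtime allocation is about {overhead:.2f}x the model size")
```

At 135x real-time, one second of audio costs only a few milliseconds of compute, which is why a small VAD model can run continuously on the Neural Engine. The memory ratio is a reminder to budget for runtime allocation rather than model file size.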

All benchmarks are from speech-swift, our open-source Swift library for on-device speech AI: ASR, TTS, VAD, diarization, speech-to-speech — everything running locally on Apple Silicon with no API, no cloud, no data leaving the device.

Models on HF: aufklarer/Qwen3-ASR-0.6B-MLX-4bit · aufklarer/parakeet-tdt-0.6b-coreml-int8 · aufklarer/PersonaPlex-7B-MLX-4bit

Full article: https://blog.ivan.digital
Library: https://github.com/soniqo/speech-swift

Articles 1

Article

AI Trends 2026: Test-Time Reasoning and the Rise of Reflective Agents

Collections 3

models 57

datasets 1

aufklarer/central-bank-communications

Viewer • Updated 23 minutes ago • 255k • 417 • 3

Collections 3

aufklarer/Qwen3-ASR-0.6B-MLX-4bit

aufklarer/WeSpeaker-ResNet34-LM-MLX

aufklarer/Qwen3-ForcedAligner-0.6B-4bit

aufklarer/Qwen3-ASR-1.7B-MLX-8bit

aufklarer/Omnilingual-ASR-CTC-300M-CoreML-INT8-10s

aufklarer/Parakeet-TDT-v3-CoreML-INT8

aufklarer/Kokoro-82M-CoreML

aufklarer/WeSpeaker-ResNet34-LM-CoreML

models 57

aufklarer/Nemotron-Speech-Streaming-0.6B-CoreML-INT8

aufklarer/DeepFilterNet3-CoreML

aufklarer/Kokoro-82M-CoreML-INT8

aufklarer/Kokoro-82M-CoreML

aufklarer/Omnilingual-ASR-CTC-300M-CoreML-INT8-10s

aufklarer/Qwen3.5-0.8B-Chat-CoreML

aufklarer/Omnilingual-ASR-CTC-300M-CoreML-INT8

aufklarer/Qwen3-0.6B-Chat-CoreML

aufklarer/KWS-Zipformer-3M-CoreML-INT8

aufklarer/FireRedVAD-CoreML
