Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

github.com

98 points by noahkay13 14 hours ago


noahkay13 - 14 hours ago

I built a C++ inference engine for NVIDIA's Parakeet speech recognition models using Axiom (https://github.com/Frikallo/axiom), my tensor library.

What it does:

- Runs 7 model families: offline transcription (CTC, RNNT, TDT, TDT-CTC), streaming (EOU, Nemotron), and speaker diarization (Sortformer)
- Word-level timestamps
- Streaming transcription from microphone input
- Speaker diarization detecting up to 4 speakers

pzo - 6 hours ago

You're probably still better off running inference on the ANE (Apple Neural Engine) via CoreML rather than Metal: speed will be similar or even faster on non-Pro MacBooks and iPhones, and power consumption is significantly better. Metal, or even the MLX format, isn't necessarily the fastest, and the only way to access the ANE is via CoreML.

You can use this library:

https://github.com/FluidInference/FluidAudio

ghostpepper - 13 hours ago

Off topic, but if anyone is looking for a nice web GUI frontend for a locally hosted transcription engine, Scriberr is nice:

https://github.com/rishikanthc/Scriberr

d4rkp4ttern - 5 hours ago

On macOS, I haven't seen any STT app with faster transcription than Hex (with Parakeet V3), which leverages Apple silicon + FluidAudio:

https://github.com/kitlangton/Hex

This is now my standard way to speak to coding agents.

I used to use Handy, but Hex is even faster. Last I checked, Handy had stuttering issues, but Hex doesn't.

antirez - 9 hours ago

Related:

https://github.com/antirez/qwen-asr

https://github.com/antirez/voxtral.c

Qwen-asr can easily transcribe live radio (see README) on any random laptop. It looks like we are going to see really cool things in local inference, now that automatic programming makes it a lot simpler to create solid pipelines for new models in C, C++, Rust, ..., in a matter of hours.

nullandvoid - 10 hours ago

I've been using Handy with Parakeet on both Windows and Mac, and have been very impressed.

How does this compare?

rowanG077 - 7 hours ago

Is there anything truly low-latency (sub-100 ms)? Speech recognition is so cool, but I want it to be low-latency.
