Show HN: qqqa – A fast, stateless LLM-powered assistant for your shell

github.com

162 points by iagooar 5 days ago


I built qqqa as an open-source project because I was tired of bouncing between the shell and ChatGPT in the browser for rather simple commands. It comes with two binaries: qq and qa.

qq means "quick question" - it is read-only, perfect for the commands I always forget.

qa means "quick agent" - it is qq's sibling that can run things, but only after showing its plan and getting approval from the user.

It is built entirely around the Unix philosophy of small, focused tools, and it is stateless by default - pretty much the opposite of what most coding agents focus on.
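Roughly, usage looks like this (illustrative invocations - see the README for the exact syntax):

    qq how do I find the largest files in this directory
    qa rename all *.jpeg files in this folder to *.jpg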

Personally I've had the best experience using Groq + gpt-oss-20b, as it feels almost instant (up to 1k tokens/s according to Groq) - but any OpenAI-compatible API will do.

Curious if the HN crowd finds it useful - and of course, AMA.

baalimago - 5 days ago

For inspiration (and, ofc, PR, since I'm salty that this gets attention while my pet project doesn't), you can check out clai[0], which works very similarly but has a year or so's worth of development behind it.

So feature suggestions:

* Pipe data into qq ("cat /tmp/stacktrace | qq What is wrong with this: ")

* Profiles (qq -profile legal-analysis Please checkout document X and give feedback)

* Conversations (this is simply appending a new message to a previous query)

[0]: https://github.com/baalimago/clai/blob/main/EXAMPLES.md

pmarreck - 5 days ago

Just about everyone has already written one of these. Mine are called "ask" and "please". My "ask" has a memory though, since I often needed to ask follow-up questions:

https://github.com/pmarreck/dotfiles/blob/master/bin/ask

I have a local version of ask that works with ollama: https://github.com/pmarreck/dotfiles/blob/master/bin/ask_loc...

And here is "please" as in "please rename blahblahblah in this directory to blahblah": https://github.com/pmarreck/dotfiles/blob/master/bin/please

etaioinshrdlu - 4 days ago

If you want a zero-setup backend to try qqqa, our service ch.at (previously discussed here: https://news.ycombinator.com/item?id=44849129) might be a useful option. We built ch.at as a single-binary, OpenAI-compatible chat service with no accounts, no logs, and no tracking. You can point qqqa at our API endpoint and it should “just work”:

OpenAI-compatible endpoint: https://ch.at/v1/chat/completions (supports streamed responses)

Also accessible via HTTP/SSH/DNS for quick tests: curl ch.at/?q=… or ssh ch.at. Privacy note: we don't log anything, but upstream LLM providers might...
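A minimal request sketch (standard OpenAI chat-completions shape; the model name below is illustrative, since ch.at needs no account or API key):

    # "default" is a placeholder model name, not necessarily what ch.at expects
    curl https://ch.at/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "default", "messages": [{"role": "user", "content": "hello"}]}'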

buster - 5 days ago

I'm using https://github.com/kagisearch/ask

It's a simple shell script of 204 lines.

d4rkp4ttern - 5 days ago

I built a similar tool called “lmsh” (LM shell) that uses Claude Code's non-interactive mode (hence no API keys needed, since it uses your CC subscription): it presents the shell command on a REPL-like line that you can edit first, then hit enter to run it. I used Rust to make it a bit snappier:

https://github.com/pchalasani/claude-code-tools?tab=readme-o...

It's pretty basic and could be improved a lot, e.g. make it use Haiku, or codex-CLI with low thinking, etc. Another improvement would be having it bypass reading CLAUDE.md or AGENTS.md. (PRs, anyone? ;)

stevedsimkins - 5 days ago

Feels like this has already been done (and then some) by aichat (which I give the alias `ai` on my machines)

https://github.com/sigoden/aichat

Nevertheless, it's good to see more tools following the Unix philosophy!

sheepscreek - 5 days ago

On the stateless part - I increasingly believe that state keeping is an absolute necessity. Not necessarily across requests, but in local storage. Handoffs are proving invaluable in overcoming context limitations, and I would like more tools to support a higher level of coordination and orchestration across sessions and with sub-agents.

I believe the best “worker” agents of the future are going to be great at following instructions, with fantastic intuition but not so much knowledge. They'll be very fast, but will need to retain their learnings so they can build on them, rather than relearning everything in every request - which is slow and a complete waste of resources. Much like what Claude is trying to achieve with skills.

I'm not suggesting that every tool reinvent this paradigm in its own unique way. Perhaps we need a single system that can do all the necessary state keeping, so each tool can focus on doing its job really well.

Unfortunately, this is more art than science - for example, getting each model to carry out a handoff in the expected way will be a challenge, especially on current-gen small models. But many people are using frontier models, which are slowly converging in their intuition and ability to comprehend instructions. So it might still be worth the effort.

iagooar - 4 days ago

What a phenomenal launch it has been! Thanks a lot to everyone for the many ideas and feedback. It has really made me push harder to make qqqa even cooler.

Since I launched it yesterday, I have added a few new features - check out the latest version on GitHub!

Here is what we have now:

* added support for OpenRouter

* added support for local LLMs (Ollama)

* qqqa can be installed via Homebrew, to avoid signing issues on macOS

* qq/qa can ingest piped input from stdin (see the example below)

* qa now preserves ANSI colors and TTY behavior

* hardened the agent sandbox - execute_command can't escape the working directory anymore

* history is disabled by default - can be enabled at --init, via config or flag

* qq --init refuses to overwrite an existing .qq/config.json
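For example, the stdin piping means things like this now work (illustrative):

    cat error.log | qq why is this failing
    git diff | qq summarize these changes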

foobarqux - 5 days ago

llm-cmd-comp is better:

* it puts the command in the shell's editor line so you can edit it (for example, to specify filenames using the line editor after the fact and make use of shell tools like glob expansion)

* it goes into the history

* it can be invoked via a key binding, so you can start writing something without remembering to prefix it with a command, and invoke the completion at any point in the line editor

* it also allows you to refine the command interactively

I haven't seen any of the myriad other tools do these very obvious things.

https://github.com/CGamesPlay/llm-cmd-comp

Jotalea - 4 days ago

Apparently everyone has made their own, some better, others worse. But here's my implementation (not as full-featured as this one, but it does the job): https://github.com/Jotalea/FRIDAY

It's inspired by F.R.I.D.A.Y. from the Marvel Cinematic Universe, a digital assistant with access to all of the (fictional) hardware.

kissgyorgy - 5 days ago

There is also the llm tool written by Simon Willison: https://github.com/simonw/llm

I personally use "claude -p" for this
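E.g. something like this (the prompt is illustrative; -p is Claude Code's non-interactive print mode):

    claude -p "give me a one-liner to find files over 1GB in the current directory"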

krzkaczor - 5 days ago

This is nice. Reminds me of how in the Warp terminal you can (could?) just type `# question` and it would call some LLM under the hood. Good UX.

Zetaphor - 4 days ago

I personally prefer aichat, as it gives me the option to copy the command it's proposing to the clipboard, iterate further on the prompt, or have it describe its choice.

https://github.com/sigoden/aichat

iagooar - 5 days ago

And of course, if you find any bugs or have feature requests, report them via issues on GitHub.

NSPG911 - 5 days ago

Very cool - it can be useful for simple commands, but I find the GitHub CLI's Copilot extension useful for this. I just do `ghcs <question>` and it gives me a command; I can ask it how it works, make it better, copy it, or run it.

RamtinJ95 - 5 days ago

This looks really cool and I love the idea, but I will stick with `opencode run "query"`. For specific agents with specific models, I can configure that in an agent.md and then use `opencode run "query" -agent quick`.

CGamesPlay - 5 days ago

Looks interesting! Does it support multiple tool calls in a chain, or only terminating with a single tool use?

Why is there a flag to not upload my terminal history and why is that the default?

insane_dreamer - 4 days ago

Nice! Do you have plans to make it work with a CC subscription? Great idea, but I'm not really interested in paying for another API key.

swah - 5 days ago

I usually do this in Raycast but the Groq tip is good...

armcat - 5 days ago

One mistake in your README - Groq's throughput is actually 1000 tokens per "second" (not "minute") for gpt-oss-20b.

ripped_britches - 4 days ago

I've used sgpt and really liked it as prior art.

flashu - 5 days ago

Good one, but I don't see a release for macOS :(

psychoslave - 5 days ago

Can it run a local LLM with quick parameters?

hmokiguess - 4 days ago

Nice, the `qq` part reminds me of this project: https://github.com/tldr-pages/tldr

That said, I'd rather use Claude in headless mode: https://code.claude.com/docs/en/headless

jcmontx - 5 days ago

Why use this and not Claude Code?
