What happened after 2k people tried to hack my AI assistant

fernandoi.cl

311 points by cuchoi 16 hours ago


lelanthran - 13 hours ago

This conclusion:

> I am less worried about prompt injection now. Before running this experiment, I expected prompt injection to be much easier than it turned out to be.

Is unwarranted. Sure, the agent never output the secret, but did it output anything else? IOW, was it usable?

An agent that considers every prompt an attack (and responds accordingly) "passes" this test, while being useless anyway.