Backprompting: Leveraging synthetic production data for health advice guardrails

arxiv.org

27 points by PaulHoule 2 days ago


mentalgear - a day ago

> We test our technique in one of the most difficult and nuanced guardrails: the identification of health advice in LLM output, and demonstrate improvement versus other solutions. Our detector is able to outperform GPT-4o by up to 3.73%, despite having 400x less parameters.