Backprompting: Leveraging synthetic production data for health advice guardrails
arxiv.org
27 points by PaulHoule 2 days ago
> We test our technique in one of the most difficult and nuanced guardrails: the identification of health advice in LLM output, and demonstrate improvement versus other solutions. Our detector is able to outperform GPT-4o by up to 3.73%, despite having 400x less parameters.