Show HN: Respectify – A comment moderator that teaches people to argue better
respectify.org30 points by vintagedave 8 hours ago
30 points by vintagedave 8 hours ago
My partner, Nick Hodges, and I, David Millington, have been on the Internet for a very long time -- since the Usenet days. We’ve seen it all, and have long been frustrated by bad comments, horrible people, and discouraging discussions. We've also been around places where the discussion is wonderful and productive. How to get more of the latter and less of the former?
Current moderation tools just seem to focus on deletion and banning. Wouldn’t it be helpful to encourage productive discussion and teach people how to discuss and argue (in the debate sense) better?
A year ago we started building Respectify to help foster healthy communication. Instead of just deleting bad-faith comments, we suggest better, good-faith ways to say what folks are trying to say. We help people avoid: * Logical fallacies (false dichotomy, strawmen, etc.) * Tone issues (how others will read the comment) * Relevance to the actual page/post topic * Low-effort posts * Dog whistles and coded language
The commenter gets an explanation of what's wrong and a chance to edit and resubmit. It's moderation + education in one step. We want, too, to automate the entire process so the site owner can focus on content and not worry about moderation at all. And over time, comment by comment, quietly coach better thinking.
Our main website has an interactive demo: https://respectify.ai. As the demo shows, the system is completely tunable and adjustable, from "most anything goes" to "You need to be college debate level to get by me".
We hope the result is better discussions and a better Internet. Not too much to ask, eh?
We love the kind of feedback this group is famous for and hope you will supply some!
I think it did a decent job. The key might be how customizable the censorship is. Article Context: Fun: Die Hard; Is It a Christmas Movie? Your(my) Comment:
The erotic version of Die Hard does involve Santa Claus getting naughty with the terrorists on Christmas Eve. Banned topics found: sexual content, adult themes This comment touches on adult themes and sexual content, which are not suitable for discussion in this context about a classic action film.
Results:
Revision Requested. This comment would be sent back for revision with feedback. Revise
Low Effort Comment appears to be low effort Objectionable Phrases: "Santa Claus getting naughty with the terrorists" This phrase can be seen as sexualizing a character traditionally viewed as innocent and family-friendly, which is inappropriate. Such language can make discussions feel uncomfortable or offensive to some audiences. Relevance Check
On-topic: No (confidence: 90%) This is off-topic - the comment about an erotic version of Die Hard strays into inappropriate content that doesn't relate to the film's actual story or its production details. Banned topics found: sexual content, adult themes This comment touches on adult themes and sexual content, which are not suitable for discussion in this context about a classic action film. Hehe -- excellent. Thanks. We want that kind of comment to be "tunable" -- I.e., the blogger who's post one is commenting on could tune for this, and allow more/less sexual innuendo as desired. I tried it as well with a contrarian view on UBI. I think the UBI one is a great test case. If you’re against the idea you will likely argue that it is idealistic and that in the real world it would create bad incentives. So basically you end up arguing for a darker, more pessimistic world view, and that tends to get flagged very quickly by the tool right now. I think you should fix that. It’s a mistake in modern discussions to be overly positive; HN feels real because people can leave pretty harsh critiques. It just has to be well argued. Don’t raise the bar for well-argued too high though, because nobody’s perfect. Anyway, I love the idea and really hope you’ll succeed. Hope my feedback has been somewhat helpful. Yes, thanks very much! I appreciate your support very much. You make a good point -- and that is exactly the kind of thing we are trying to do, i.e. enable a good-faith, but strongly disagreeing, discussion on something like UBI. This thing seems to be more about enforcing a political PoV than about avoiding logical fallacies. All my attempts to comment on the UBI article (and not supporting UBI) said my comment was a dogwhistle, and/or had an overly negative tone. This topic, of all things, is absolutely worthy to challenge and debate. Using this would have the effect of creating an echo chamber, where people who stay never benefit from having their ideas challenged. If that is happening, that is a huge problem. We'll look at that right away. We specifically don't want that to be the case. We want to encourage healthy, productive debate. We may have the "dog-whistle" stuff over tuned. the dog whistle tuning is absolutely over the top in its default setting. Can you give some examples of comments you made which you feel were reasonable but got flagged? I wrote "Obama sucks" and got Dogwhistle, Low Score, Low Effort, Objectionable Phrases, and Negative Tone. I wrote "Trump sucks" and got Low Score, Low Effort, Negative Tone. Definitely a double standard baked in Double standard, or legitimate difference? Maybe Trump empirically sucks more? (This is the sort of debate I really don't think tooling can fix.) Ignoring what is hopefully sarcasm on the empirical part, it's a double standard because it assumes that saying Obama sucks must be a dogwhistle and tied to undertones of racism. "Dogwhistle The phrase "Obama sucks" can be interpreted as more than just a simple critique of a political figure; it has been used to express racist sentiments by implying that a Black president is less capable or worthy of respect. This reinforces harmful stereotypes and can contribute to a broader culture of disrespect and division." Yep, I agree -- it is a double standard... but...... Very sensitive topic. We'll think hard on how to handle things like that. I would think/hope that both of those comments would be flagged with even a small amount of moderation set. Avoiding that kind of comment is exactly what we are trying to do, actually. Yes I agree, but the problem I'm pointing out is that in a phrase as simple as "X person sucks" your system flagged one as implicitly racist because the person being criticized was black. Nothing in "Obama sucks" implied any kind of racism. If it's so baked in that with a simple phrase like that it reaches for dogwhistles, how can anyone trust the objectivity of this? I totally agree -- just saying "Obama sucks" shouldn't have racism become part of the equation. Excellent point that we'll stew on and try to make better. > Ignoring what is hopefully sarcasm on the empirical part… I mean, in my opinion, Trump empirically sucks. Opinion polling backs me up! Should the model consider that more people consider one or the other to suck? Or should it ignore factual information to spare feelings? Which approach is more respectful to fellow commenters and the website owner? (See also: X considering "cisgender" a slur. There's no shared reality on a lot of these things; trying to construct one gets deeply difficult.) In other opinion polls they back up that he doesn't suck. Either way who cares? That's not what the app is supposed to be about if it's teaching/correcting you how to argue/debate better. You completely ignored the whole point of what I said, which is that even in a simple statement like "This person sucks" it added its own implicit connotations, namely that disliking someone who happens to be black is implicit racism. Imagine trying to learn how to really argue with that kind of teacher. I'm really expanding on your point - that two humans can't even agree here. The AI probably has even less chance of resolving the multi-factorial scenario we're in. Thankyou — I’d love to hear what you wrote, if you wouldn’t mind sharing? We’ve tried to aim it not to enforce any specific view — that’s a design goal — but focus on how it will feel to the other person. Also things like logical fallacies or other non-emotional flaws in comments (there’s a toxicity metric for example, or dogwhistles). An echo chamber is the exact opposite of what we want. There are too many already. What we hope for is guided communication so different views _can_ be expressed. I was hoping 'respectify' could mean respect for the users. This is a very important problem space. Maybe the most important today - we desprately need a digital third place that isn't awful. But I think these attempts are misled. The core issue seems to be that we want our communities to be infinite. Why? Well, because there is currently no way to solve the community discoverability problem without being the massive thing. But that is the issue to solve. We need a lot of Dunbar's number sized communities. Those communities allow for 'skin in the game' where reputation matters. And maybe a fractal sort of way for those communities to share between them. The problem is in the discoverability and in a gate keeping that is porous enough to give people a chance. Solve that, and you solve the the third place problem we have currently. I don't have a solution but I wish I did. Infinite communities are fundamentally what causes the tribalism (ironically), the loneliness, and the promotion of rage. No one wants to be forced to argue correctly. Forcing people into a way to think via software is fundamentally authoritarian and sad. Thoughtful comment, thanks. I appreciate it. The notion of "Limit the community to the Dunbar number" is a fascinating idea. I guess "infinite" isn't going to quite work. Keen observation. We tried very hard to not "force" anyone to argue correctly. We are shooting more for "nudge in the right direction" and "educate". Many people don't know that they are arguing in bad faith, I think. The perfect outcome here is that a community/blogger can, with minimal effort, have engaging, interesting conversations without much effort and without having to worry about things getting hijacked by unpleasant commenters. Seems like you need this when you don't have agency to go find your preferred online group(s) which might be tied to larger personal challenges in healthy communication and productive conflict. I don't know how tech solves that problem. The broad use case here would just create a new "respectified" category where members (assuming they have the attention span to be guided on comments) try to conform. I suppose that could be helpful in hyper-local or team-level contexts where there is a shared interest to conform around. Our "target market" right now is a blogger that would like to turn on comments, but has turned them off because they get toxic really quickly. Love the effort here, been thinking about what this kind of tool might look like for a while. Something like this coupled with better prosocial affordances in the medium will do a lot to improve discourse online. I wrote up one a while back [1] but things like that are only a small part of a much bigger picture. The overall problem needs to be tackled from all angles - poster pre-post self-awareness (like respecify but shown to users before posting), reader affordances to reflect back to poster their behavior (and determine if things may be appropriate in context vs just a universal 'dont say mean words'), after-post poster tools to catch mistakes (like above), platform capabilities like respectify that define rules of play and foster a enjoyable social environment that let us play infinite games, and a broader social context that determine the values that drive all of these. I'm grateful for the thoughtful feedback, thanks. Your blog post will be read. ;-) This passes your checks, but a human moderator would flag it: > My favorite movie is die hard. I think it's a Christmas movie. But, honestly, we shouldn't have to wait until Christmas to watch you die hard. We should be able to watch that any day of the week :) Seems to catch various other cases though. Cool tool. What I've seen, the difference between spam detected or not is https://www before the domain name. Here is an example of successful passing of all checks: > Published
This comment passes all checks and would be published. Score: 5/5 | Not spam | On-topic: Yes | No dogwhistles detected (confidence: 100%) Can confirm. We hit this exact issue running tirreno www.tirreno.com (open-source fraud detection) on Windows ARM — libraries were auto-selecting AVX2 through emulation and batch scoring was measurably slower than just forcing SSE2. The 256-bit ops get split under the emulation layer and the overhead adds up fast in tight loops.
Pinned SSE2 for those builds. Counterintuitive but throughput went up. Hey, Nick Hodges here, one of the builders of Respectify -- Thanks so much for trying it out and giving us feedback. I'm grateful. You're welcome, Nick! On a separate note, if this is a real product, you might need to pay particular attention to data processing agreements etc., as the current T&Cs and Privacy Policy are actually missing how you process the input data, what you use, how long/where you store it, etc. Thank you! This is very important, and I'm thankful (and a little surprised!) that you read it! ;-) Perhaps this is my professional deformation, but when I visit a website, I start with the Privacy page. I get that -- good idea, actually. Would that we all did that. For the record, we store zero comments from anyone. If you are using Respectify, we'll know the URL of your site and that is it. All comments are processed and completely forgotten. I'll get the TOS and the Privacy Policy improved/updated. > All comments are processed and completely forgotten. This is secure in terms of privacy but not safe in terms of operations, because if it gets even a little scale, your demo will soon enough be used to fine-tune spam comments for free. Fascinating that www makes a difference. We taught it a variety of samples of different spam approaches. This is something we can look at! I am super glad to see that comment passes — as it should. I would rate that one well too. Thankyou! The sample prompt I was given was "Is Die Hard a Christmas movie?" "Of course it is!" got an 80% certainty "off-topic" mark. When I elaborated that it occurs at a Christmas party, it said this: "Dogwhistles detected (confidence 80%): This comment seems innocuous, but the phrasing 'Christmas party' may be an underhanded reference to Christian themes, especially among discussions that might dismiss or attack secular or diverse holiday celebrations. This kind of language can subtly imply exclusion or preference for Christian traditions over others, which can marginalize those who celebrate different traditions." Not a great first experience. I've seen the trend on Facebook/Instagram to say "unalived" instead of "killed" or "cupcakes" instead of "vaccines" and suspect humans are long gonna be cleverer than these sorts of content filtering attempts, with language getting deeply weird as a side-effect. edit: I would also note that it says "Referring to others as 'horrible people' is disrespectful and diminishes the possibility of a respectful discussion. It positions certain individuals as entirely negative, which can alienate others and shut down dialogue.", if I feed it your post, too. Hey, Nick Hodges here, one of the builders of this. First, Thanks so much for trying this out and giving us feedback. Have you tried adjusting the settings on the left side? For instance, reducing or eliminating dog whistle checks? The whole point of using AI in this situation is context. So if the initial conversation is about a "Christmas movie" and someone uses the phrase "Christmas party" in a reply and gets flagged for Christian dogswhistle propaganda, that's a sign the system isn't working - even with the dogswhistle setting turned up. > For instance, reducing or eliminating dog whistle checks? I'm sure that'll help, but I'd imagine it's not an option available to me as a commenter on a real website using your tool? No, but it would help us know the defaults better...... Thanks again for trying it. Really grateful. ...but yeah, it 100% shouldn't flag "Christmas Movie" unless specifically told to. Same for the phrase "Horrible people" -- that isn't necessarily in and of itself a bad thing to say. AI enhanced language monitor, what a double plus good improvement for society! I get this. There’s a line on our doc page: > Respectify is not an engine for monoculture of thought, but in fact intends to assist in the opposite while encouraging in healthy interaction along the way. We don’t want to monitor or enforce saying specific things. We want people to be able to speak, but understand how others will hear them. All those times people talk past each other. Or are rude but don’t realise it. Or are rude but don’t care (and should because it’s a human on the other end.) Or the worse people who intentionally say something awful and… just maybe can learn a bit about what they’re saying. I get your fear. I think I’ve seen AI used for bad quite a bit. I hope, given the tech isn’t going away, we can use it to make things a bit better. That’s the goal. Nick Hodges here -- one of the developers. I get that objection, and we are certainly very uninterested in that becoming the norm. The idea, of course, is to try to prevent comments that we want prevented and that aren't helpful. Different bloggers and different communities are going to define that differently. That is why we are making a good-faith effort at allowing sites/people/groups to tweak this as desired. Thank for your feedback. Revision Requested
This comment would be sent back for revision with feedback. Apparently discussing that Die Hard depicts murder and violence is a banned topic and thus the comment is flagged as off topic. Uh oh -- that's shoudldn't happen. Or rather, we don't want that to happen. DId you try tweaking the settings? We'd be most grateful for feedback on tweaked settings. For instance, can I ask you to turn down toxicity and see if it accepts it? Everything is a dogwhistle. "This comment appears to dismiss the complexity of discussions about dogwhistles by claiming that 'everything is a dogwhistle.' This type of blanket statement can undermine the seriousness of genuinely harmful coded language, and can trivialize valid concerns about discrimination and manipulation in discourse." We've dialed "dog whistles" way back -- thanks for the feedback. Just remember every time you tweak the defaults, the 90% of your site owners using those defaults suddenly have a significant shift in their moderation policy that they are themselves unaware of. (I moderated a vBulletin forum in the 1990s. This shit gets really, really, really hard, and no one is ever really happy with it.) Sorry -- should have been more clear: We are shifting the defaults on the demo site, not on respectify itself. Thanks for a great point, though. Finding the best defaults will be very important, and we can't tweak it like that very often if at all. Definitely needed, especially in the Fediverse.
Holy crap the edgelords there or on Facebook.
You comment something neutral, skeptical, response is either straight insults or completely disagreement and then insults, ad hominem or strawman/gaslighting. Yesterday I dared to write I like X now, it's clean of all the edgelords who went to Bluesky or the Fediverse. Cancel culture on Twitter was over the top.
Reaponse, Cancel Culture doesn't exist.
My response, it absolutely does.
His response, No it doesn't you Nazi something something or other.
Err, what? X has the most up to date information for tech circles. People on BS mostly repost and rage about posts on X.
Fediverse are the different kind of refugees.
Mastodon has critical design flaws.
It's not a future proof system. And Cancel culture is absurd.
BTW 5 people reported me for saying that Cancel culture absolutely exists, all from the same instance.
Lol. The hypocrisy is unreal. In any case, I think people forgot or never learned how to respectfully disagree and have a conversation with people who don't agree with them. Something like this is direly needed. Hey, thanks so much for the feedback. We agree. ;-) One of our goals is to just make the edgelords and trolls go away -- if they want to comment, they have to be nice. If they can't be nice, they can't comment (A gross over-simplification, but you get the idea.....) One feature we are going to add is a "Here's your feedback, but press here to post anyway" as an option for users to have. At teh very least, make someone stop and think about what they are saying. "The comment mentions 'Cancel Culture' and uses terms like 'edgelords' and 'Nazi' in a context that dismisses and trivializes serious issues. This reflects a trend in discussions that equates legitimate critiques of harmful behaviors with extreme labels, undermining constructive dialogue and signaling acceptance of toxic rhetoric." "Using phrases like 'Holy crap the edgelords' can come off as dismissive and disrespectful towards a group of people. It’s better to express concerns about behaviors or actions instead of labeling individuals harshly." "Describing cancel culture as 'over the top' expresses a strong negative opinion without offering specific reasoning. It’s more effective to explain what aspects seem excessive to help others understand your perspective." "Using phrases like 'the hypocrisy is unreal' can come across as dismissive and sarcastic, which may alienate others from the discussion. It’s beneficial to explain what seems hypocritical instead of making broad statements." (I picked the "why it's hard to escape an echo chamber" context option, for full disclosure.) Thanks so much. This is like gold to us. The defaults we have set are clearly too high. That comment should be exactly what we should approve. Thanks for trying it. So this is a good illustration of the problem. If it were my site, "I like X now" would be a red flag. I don't think you're gonna AI your way out of this part of things for some time, and it really is the core challenge to content moderation; it's heavily opinion and circumstance based, in a way current models really struggle with. I appreciate the comments, thanks. Well, we are going to give it a try! Thanks again...
axus - 16 minutes ago
NickHodges0702 - 9 minutes ago
earthnail - 7 minutes ago
NickHodges0702 - 5 minutes ago
badc0ffee - an hour ago
NickHodges0702 - 37 minutes ago
john_strinlai - 26 minutes ago
esperent - an hour ago
coleworld45 - 32 minutes ago
ceejayoz - 27 minutes ago
coleworld45 - 26 minutes ago
NickHodges0702 - 10 minutes ago
NickHodges0702 - 20 minutes ago
coleworld45 - 15 minutes ago
NickHodges0702 - 7 minutes ago
ceejayoz - 24 minutes ago
coleworld45 - 19 minutes ago
ceejayoz - 17 minutes ago
vintagedave - 37 minutes ago
throwaway13337 - 30 minutes ago
NickHodges0702 - 22 minutes ago
thelock85 - 23 minutes ago
NickHodges0702 - 18 minutes ago
npunt - 39 minutes ago
NickHodges0702 - 16 minutes ago
someotherperson - 33 minutes ago
reconnecting - an hour ago
NickHodges0702 - an hour ago
reconnecting - 44 minutes ago
NickHodges0702 - 43 minutes ago
reconnecting - 36 minutes ago
NickHodges0702 - 27 minutes ago
reconnecting - 18 minutes ago
vintagedave - an hour ago
ceejayoz - an hour ago
NickHodges0702 - an hour ago
esperent - 42 minutes ago
ceejayoz - an hour ago
NickHodges0702 - an hour ago
NickHodges0702 - an hour ago
netsharc - an hour ago
vintagedave - 42 minutes ago
NickHodges0702 - 44 minutes ago
tclancy - 39 minutes ago
nkrisc - 36 minutes ago
NickHodges0702 - 33 minutes ago
raffraffraff - 22 minutes ago
ceejayoz - 20 minutes ago
NickHodges0702 - 14 minutes ago
ceejayoz - 10 minutes ago
NickHodges0702 - 2 minutes ago
darqis - 42 minutes ago
NickHodges0702 - 38 minutes ago
ceejayoz - 37 minutes ago
NickHodges0702 - 30 minutes ago
ceejayoz - 29 minutes ago
NickHodges0702 - 13 minutes ago