Spell Checking a Year's Worth of Hacker News
fi-le.net
16 points by fi-le 2 days ago
"English being my second language, I curse it everyday and wish it could be more like, say, Hungarian, in which such a thing as a spelling bee would be unthinkable."
I love this about English! We are the most prolific word thieves of all time. We even stole an entire grammatically complete sentence from French ("Je ne sais quoi").
If you want English to be more like Hungarian, start inserting Hungarian words into sentences otherwise written in English and I guarantee people will adopt them as loanwords in short order. Never define them, we'll figure it out from context and vibes, and we'll never pronounce them correctly, which might make it grating to listen to them spoken back to you. But you can absolutely just incept words into English. We'll take them. We're hoarders. We all love that shit.
My favorite thing about it is the register system that developed from all this theft. There are at least three: German, French, and Latin. German is less formal, and French and Latin are often equal but differ in that French is less bureaucratic than Latin. The start, commencement, and initiation of something are different. And an initiation is different from an inauguration. You ask your friend, question a witness, and interrogate a suspect. Greek is more abstract than Latin. A moral question is nearer to the heart than an ethical question. You diagnose a disease, you judge a person. You have compassion, you merely feel sympathy.
Though, I would hate to learn it as a second language for the exact same reasons.
> If you want English to be more like Hungarian, start inserting Hungarian words into sentences otherwise written in English
I've been doing something like this with Finnish (which is in the same language family as Hungarian) - I use Finnish colloquialisms, but directly translated into English. Things like "going ass first up a tree" (meaning doing something in a sub-optimal way) or "better on the ground than in the devil's mouth" (when you spill something). I find it amusing.
The author is right though, the English language is dreadful. In Finnish, words are pronounced the way they are written. Try that with some names of cities or towns in England.
Going ass first up a tree is funny enough to catch on if you keep using it. It fills a real semantic gap in the idea space of taking great pains to do something the wrong way. I'll never forget reading it just now.
Come on now though, dreadful. There's something beautiful about a language that's a fusion reactor for all other languages on Earth.
> I love this about English! We are the most prolific word thieves of all time.
It's impressive. English language: ~500,000 words. German language: ~135,000 words.
Seems like something that would be well received if one and exactly one guy was doing it, and annoying as hell when 35 additional jabronis started to run the same type of script, like what you see with the oh-so-helpful AI PRs on GitHub that make random ass changes nobody asked for.
Especially given we already have pretty good spell checkers, and have had for way over a decade.
Yes, and some of them can't even be turned off or made to ignore your variable names and acronyms...
Gonna get the Harvard email server onto Gmail's naughty list going like this!
I dliberately put speelling mistakes to confirm I am not an AI.
Is it otherwise that easy to confuse your writing with an LLM's? What if slop pipelines start deliberately including spelling mistakes, as the post points out?
Met a chap who wrote exactly like an AI all over Reddit. Had 10k+ posts going back a decade, all in the same helpful style with em-dashes. Telephoned him to check he was a real person.
Turns out that certain really helpful Reddit posters respond in exactly the same way AI companies wish their models would respond, and the RLHF process really reinforces their mannerisms despite being 0.00001% of the total training data.
I feel sorry for them - their recent post history is full of mods banning them for being a bot.
Ouch. Yeah, I imagine AI models get fed proportionally more of their content, hence the output looks and reads like their style.
Nice idea…but this will just end up in the same bucket as “I really like your website {domain} and especially your post {link to blog post} about {topic} would you like to include a link to {our service}” SEO link building spam.
> The recipients of the mail campaign aren't passive in this story; on the contrary, some played a reverse card, informing of issues...
Yeah, this gets me. Grammar is grating to me and I used to call out when someone would write "Me and my friend..." only to get attacked in response as if grammar matters to no one.
It doesn’t though. If you think about it, improper English and slang strongly effects cultural and social bonding. I too would feel the opposite party is pretentious if someone is correcting me for a casual conversation. If it’s a professional relationship, that’s different.
Is it kind to email people telling them they wrote 'its' and not 'it's'? This used to be called being a grammar Nazi and is just a different kind of asshole.
It's one thing when you're correcting someone you're in conversation with and a different thing entirely when you're correcting long-form material with at least some pretense for polish.
I don’t agree. Sending AI generated spam with spelling errors isn’t kind. Traditionally, humans ask to be edited and they deserve that agency.
The person I'm responding to didn't say anything about AI. I don't agree that this kind of thing should be automated like in the OP either. However, as far as I'm concerned, notifying post authors about spelling and grammatical errors is normal and generally appreciated.
Do you understand the difference between these two words? I am asking genuinely, no offense