The entire New Yorker archive is now digitized
newyorker.com | 378 points by thm 6 days ago
With every passing year the New Yorker stands out even more. High quality long-form journalism and short fiction with minimal advertising (in the print issue it’s just a few at the front and one at the back) is very hard to find. I love getting my issue in the mail every week and I’ve never once thought that reading it was a waste of my time.
I’d highly encourage anyone who loves great writing to subscribe.
I’m a longtime New Yorker lover myself. I think there is some truth to this though: https://open.substack.com/pub/persuasion1/p/how-the-new-york...
I subscribe, but stare right through ads, unnoticing. Do they really not have that margin ad for berets anymore?
Did this change? I stopped reading the print version for lack of time a few years back, and there was definitely some full-page and margin advertising throughout the paper. I recall some of it being clearly directed at much wealthier customers than I was.
The placements and counts tend to vary from issue to issue, but in general the volume is much lower than in many publications. But agreed, the ads do tend to be almost comically high end (for me).
I’ve long thought about trying to map how the locations of music and maybe theater events listed in the magazine have changed over time.
There are performances of some kind in pretty much every corner of NYC but it’s interesting to see which neighborhoods have had events deemed relevant to The New Yorker readership in different eras.
It also speaks to what we lose when we lose magazine listings of events (New Yorker effectively gutted this section within the past decade), movie showtime listings via newspaper, etc
We have a very strong archive going back a century until about 2015, but now wading through linkrot circa 2017 is miserable
And the current era of less-than-major-venue music listings in many places is exclusively on Instagram and Facebook pages of venues and bands.
in addition to making a map, it would also be a fascinating timeline: you could show venues (as they appear/disappear through time) and artists, and filter/search those
imagine seeing listings for John Coltrane or Miles Davis or Benny Goodman...
let me know if I can help - it's a beautiful & great project idea!
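For what it's worth, here is a rough sketch of how the map/timeline data could be organized once listings have been extracted from the archive. Everything here is an assumption for illustration: the `Listing` fields, the function names, and the sample rows are made-up placeholders, not real listings or anything the archive actually exposes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One event listing (hypothetical schema, not an archive format)."""
    year: int           # year of the issue the listing appeared in
    venue: str
    neighborhood: str
    artist: str


def venue_spans(listings):
    """Map each venue to the (first, last) year it shows up in the listings."""
    spans = {}
    for item in listings:
        first, last = spans.get(item.venue, (item.year, item.year))
        spans[item.venue] = (min(first, item.year), max(last, item.year))
    return spans


def neighborhoods_by_decade(listings):
    """Count listings per neighborhood, grouped by decade."""
    counts = defaultdict(lambda: defaultdict(int))
    for item in listings:
        decade = item.year // 10 * 10
        counts[decade][item.neighborhood] += 1
    return {decade: dict(hoods) for decade, hoods in counts.items()}


def search(listings, artist=None, venue=None):
    """Filter by case-insensitive substring match on artist and/or venue."""
    results = []
    for item in listings:
        if artist is not None and artist.lower() not in item.artist.lower():
            continue
        if venue is not None and venue.lower() not in item.venue.lower():
            continue
        results.append(item)
    return results


# Placeholder rows only; these are not real listings.
SAMPLE = [
    Listing(1958, "Venue X", "Neighborhood A", "Artist One"),
    Listing(1961, "Venue X", "Neighborhood A", "Artist Two"),
    Listing(1963, "Venue Y", "Neighborhood B", "Artist One"),
]
```

The venue spans would drive the appear/disappear timeline, the per-decade counts would feed the map, and `search` covers the filter-by-artist case. The hard part is still getting clean listings out of the scans in the first place.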
That’s an incredible idea and I hope you do this! If you do, you should consider adding restaurants too.
That's a very neat idea! If you ever have the time to do it you should try it out. In fact, you've given me the idea of trying the same for my city, Bucharest; I just need to find some relevant data sources.
I hope this gets incorporated into the existing website. I'm not an active subscriber but I used to be, and I always thought there was a very fertile "other articles you might like" ground that the New Yorker never took advantage of, given its reputation and legacy.
I’ve happily lost hours to following links at the bottom of one story to the next. The new archive still feels a little clunky (search needs a fair bit of work and the OCR clearly struggled in places), but it’s fun to chase down old classics and they’ve done a great job of highlighting greatest hits from the past 100 years.
Plus the (really high-quality) crossword puzzles often have an Easter egg where the big revealer is linked to an essay from the past.
I think that a better link (even though it lacks the context) is this new archive (which is mostly good as it lets you quickly see all cover pages) - https://www.newyorker.com/archive
But yeah, without a subscription, this still mostly just leads to walled off pages.
Accessing the actual archived version of every issue at https://archives.newyorker.com/ is truly wonderful as they are fully digitized back to back.
hopefully a lot of local libraries will have access. i could spend hours sifting through this.
Unfortunately, it's not likely. The full text of articles back to 1925 (with no images) has been available on ProQuest for a while, and many libraries subscribe to that, which is OK, but it lacks all the great photos, cartoons, ephemera, etc.
Many libraries also subscribe to Libby/Overdrive which does include the full images of all the pages, but Libby only provides coverage for the past year. Unfortunately publishers of newspapers and magazines often offer great archival content of this sort on their websites, but don't allow libraries to license it for their patrons.
Slightly different question, but does anyone have any info about Google’s digitisation of Mainichi Shimbun’s pre-war articles? The work was announced 3 years ago, but it’s been radio silence since: https://mainichi.jp/english/articles/20221110/p2a/00m/0bu/00...
About 10 years ago, when I was at The New Yorker, I worked on launching the redesign, paywall, and the move to WordPress. We actually had most of the archive technically ready to go. The data wasn’t the hard part.
The real blocker was permissions and rights. Contracts going back a century obviously never contemplated digital publication, domains, or the internet at all. Untangling who owned what, and securing the right to republish everything online, was a massive legal and logistical undertaking.
That’s what held us back then, not so much the technology. Really glad to see that chapter finally closed.
Any idea what changed, if anything? Court decisions made in the meantime simplifying things?
Hopefully the content fits in a few buckets (cartoons, fiction, non-fiction) as far as different terms for rights might go. And then from there, you can lop off anything that's past its copyright term (?). Then maybe the next step is grouping works by the agent/publisher, if any? Or maybe all the contracts with the New Yorker are signed by individuals, with the New Yorker as a publisher. I don't know.
I assume it was a matter of time - ten years of digging into contracts or chasing people/agencies down (speculative on my part)? Bear in mind, if you are unsure if you have rights to a piece then you cannot use it until you know for sure - I am sure that was part of it too.
Fun (unrelated) fact:
My favorite product that I got to build there was “Cartoons at Random”. You’ll never guess what it did/was!
I miss it terribly, just swiping images off a stack to reveal a new random cartoon underneath.
The developer (Justin?) built an amazing interaction in the iOS app (seamless, no jank), and the web version was decent too.
They broke it when they migrated from WordPress to their own Condé Nast CMS.
https://www.newyorker.com/cartoons/random/share/1544311
Such delight. Sigh.
I'm bummed that we never made that link keep working - it was a fun start page.
Here’s a place to start, a list of 250 “best” articles from the New Yorker. I guess this is from previously available articles.
My personal favorite is Louis Menand’s piece on how bad Microsoft Word is.
https://www.newyorker.com/magazine/2003/10/06/the-end-matter
Nice, reminded me of this classic https://www.newyorker.com/magazine/2018/04/23/the-maraschino...
Possibly friendlier link:
https://old.reddit.com/r/longform/comments/1e8m5s1/the_250_b...
(old.reddit.com takes you to the old UI)
I am a subscriber but still would love a tarball of PDFs of each issue.
Honestly this got me to subscribe. The back catalog is pretty stellar with pretty much every major writer of the twentieth century making a contribution. Zooming in on PDFs just wasn't how you wanted to read them.
How soon can we chat with it via RAG?
Haha, I can't read long articles anymore because I want to reply, a habit I picked up chatting with LLMs.
I saw no way to pull down a PDF. That's unfortunate as I prefer to browse offline.
Nice! 100 years worth.
cynical me thinks they did this to sell to AI companies
Could have sworn they did this years ago. I even have the first 80 years or whatever on DVD in the closet.
Normally when laymen say "digitized" they mean one of two things: scanned images in a PDF, or fully transcribed (and possibly formatted) text extracted from the scan. The Complete New Yorker you're thinking of was mostly the former, with a bit of indexing (a table of contents pointing to the PDFs, if I remember correctly).
This latest digitization project does the latter, transcribing the text into their existing content management system and as far as I can tell, preserving much of the formatting. This comes with full text search, allows cross linking between articles, and all that good stuff.
I suspect that since they include an LLM summary and started this digitization project in early 2024, this was enabled by LLMs.
If I’m reading this correctly, they now have all their historic articles loaded into their CMS. I think they previously just had a system where you could page (and maybe search?) through scans of old issues, which is also cool but not as versatile.
When a lot of content was being put out on CD/DVD, a number of publications did so, but those discs are not straightforwardly accessible these days because they usually target an old version of Windows. (Yes, if you want to make a project of it, you can probably get into them, but it has never been worth it for me.)
Usually Windows/Wine is the much better case than the old Mac apps (32-bit, PPC, etc.) in the age of Apple Silicon
https://old.reddit.com/r/thenewyorker/comments/1jlhrve/instr...
Breaking the DjVu DRM would be the perfect solution though
It has been broken. I actually have the set on my desk ready to rip, I just couldn't find my USB DVD drive.
Here's a link to the guy that broke it:
Surprisingly, this has been a project I’ve been tinkering with for years. There is an easy way to get the raw PNG/JPEG files out, but it does require a Windows box. I’m planning on working on it more over the long holiday.
I think the disc release GP is talking about had files in DjVu format.
Encrypted DjVu, and the viewer doesn’t run on modern Windows.
It runs great on Windows 11. The install took a long time, but I didn’t have to do anything special to make it work.
I have the MAD archives on CDs, bought in the 90s, but can't use them.
The issues on the Absolutely MAD DVD (1952-2005) are just plain PDF files, no DRM, they work perfectly
The CDs I have seem to be proprietary for Windows from the late 90s. But I also have PDFs through 2005 on my computer which I must have "acquired" at some point.
The browser app might be some outdated Windows application, that's the case with the MAD DVD too, but you can find the actual issue files in some folders
I have MAD archives somewhere. I thought they were in some standard format but maybe not.
A lot of the gen 1 or so CD content isn't easily accessible although a more industrious person could probably get to it in some manner.