The entire New Yorker archive is now digitized

378 points by thm 6 days ago

With every passing year the New Yorker stands out even more. High quality long-form journalism and short fiction with minimal advertising (in the print issue it’s just a few at the front and one at the back) is very hard to find. I love getting my issue in the mail every week and I’ve never once thought that reading it was a waste of my time.

I’d highly encourage anyone who loves great writing to subscribe.

whistle650 - 6 hours ago

I’m a longtime New Yorker lover myself. I think there is some truth to this though: https://open.substack.com/pub/persuasion1/p/how-the-new-york...
jbaber - an hour ago

I subscribe, but stare right through ads, unnoticing. Do they really not have that margin ad for berets anymore?
yujzgzc - 7 hours ago

Did this change? I stopped reading the print version for lack of time a few years back, and there was definitely some full-page and margin advertising throughout the paper. I recall some of it being clearly directed at much wealthier customers than I was.
- waldothedog - 7 hours ago
  
  The placement and count tend to vary from issue to issue, but in general the volume is much lower than in many publications. But agreed, the ads do tend to be almost comically high end (for me).

smelendez - 14 hours ago

I’ve long thought about trying to map how the locations of music and maybe theater events listed in the magazine have changed over time.

There are performances of some kind in pretty much every corner of NYC but it’s interesting to see which neighborhoods have had events deemed relevant to The New Yorker readership in different eras.

bufordsharkley - 10 hours ago

It also speaks to what we lose when we lose magazine listings of events (The New Yorker effectively gutted this section within the past decade), movie showtime listings via newspaper, etc.
We have a very strong archive going back a century until about 2015, but now wading through linkrot circa 2017 is miserable.
- smelendez - 9 hours ago
  
  And in the current era, less-than-major-venue music listings in many places exist exclusively on the Instagram and Facebook pages of venues and bands.
gregsadetsky - 8 hours ago

in addition to making a map, it would also be a fascinating timeline: you could show venues (as they appear/disappear through time) and artists, and filter/search those
imagine seeing listings for John Coltrane or Miles Davis or Benny Goodman...
let me know if I can help - it's a beautiful & great project idea!
Q6T46nT668w6i3m - 5 hours ago

That’s an incredible idea and I hope you do this! If you do, you should consider adding restaurants too.
paganel - 12 hours ago

That's a very neat idea! If you ever have the time, you should try it out. In fact, you've given me the idea of trying to do the same for my city, Bucharest; I just need to find some relevant data sources.
- smelendez - 9 hours ago
  
  Travel guides are interesting too although obviously not quite the same.

krelian - 13 hours ago

I hope this gets incorporated into the existing website. I'm not an active subscriber but I used to be, and I always thought there was a very fertile "other articles you might like" ground that the New Yorker never took advantage of, given its reputation and legacy.

tclancy - 11 hours ago

I’ve happily lost hours to following links at the bottom of one story to the next. The new archive still feels a little clunky (search needs a fair bit of work and the OCR clearly struggled in places), but it’s fun to chase down old classics and they’ve done a great job of highlighting greatest hits from the past 100 years.
Plus the (really high-quality) crossword puzzles often have an Easter egg where the big revealer is linked to an essay from the past.

gregsadetsky - 11 hours ago

I think that a better link (even though it lacks the context) is this new archive (which is mostly good as it lets you quickly see all cover pages) - https://www.newyorker.com/archive

But yeah, without a subscription, this still mostly just leads to walled off pages.

Accessing the actual archived version of every issue at https://archives.newyorker.com/ is truly wonderful as they are fully digitized cover to cover.

toofy - 9 hours ago

hopefully a lot of local libraries will have access. i could spend hours sifting through this.
- jjaaammmmy - 8 hours ago
  
  Unfortunately, it's not likely. The full text back to 1925 (of articles, with no images) has been available on ProQuest for a while, and many libraries subscribe to that, which is OK, but it lacks all the great photos, cartoons, ephemera, etc.
  Many libraries also subscribe to Libby/Overdrive which does include the full images of all the pages, but Libby only provides coverage for the past year. Unfortunately publishers of newspapers and magazines often offer great archival content of this sort on their websites, but don't allow libraries to license it for their patrons.
- qingcharles - 2 hours ago
  
  I saw them all on the High Seas recently, but each year is ~20GB of PDFs.

robin_reala - 13 hours ago

Slightly different question, but does anyone have any info about Google’s digitisation of Mainichi Shimbun’s pre-war articles? The work was announced 3 years ago, but it’s been radio silence since: https://mainichi.jp/english/articles/20221110/p2a/00m/0bu/00...

donohoe - 9 hours ago

About 10 years ago, when I was at The New Yorker, I worked on launching the redesign, paywall, and the move to WordPress. We actually had most of the archive technically ready to go. The data wasn’t the hard part.

The real blocker was permissions and rights. Contracts going back a century obviously never contemplated digital publication, domains, or the internet at all. Untangling who owned what, and securing the right to republish everything online, was a massive legal and logistical undertaking.

That’s what held us back then, not so much the technology. Really glad to see that chapter finally closed.

rconti - 8 hours ago

Any idea what changed, if anything? Court decisions made in the meantime simplifying things?
Hopefully the content fits in a few buckets (cartoons, fiction, non-fiction) as far as different terms for rights might go. And then from there, you can lop off anything that's past its copyright term (?). Then maybe the next step is grouping works by the agent/publisher, if any? Or maybe all the contracts with the New Yorker are signed by individuals, with the New Yorker as a publisher. I don't know.
- donohoe - 5 hours ago
  
  I assume it was a matter of time - ten years of digging into contracts or chasing people/agencies down (speculative on my part)? Bear in mind, if you are unsure if you have rights to a piece then you cannot use it until you know for sure - I am sure that was part of it too.
donohoe - 8 hours ago

Fun (unrelated) fact:
My favorite product that I got to build there was “Cartoons at Random”. You’ll never guess what it did/was!
I miss it terribly, just swiping images off a stack to reveal a new random cartoon underneath.
The developer (Justin?) built an amazing interaction in the iOS app (seamless, no jank), and the web version was decent too.
They broke it when they migrated from WordPress to their own Condé Nast CMS.
https://www.newyorker.com/cartoons/random/share/1544311
Such delight. Sigh.
- taveras - 27 minutes ago
  
  I'm bummed that we never kept that link working - it was a fun start page.

subpixel - 14 hours ago

Here’s a place to start, a list of 250 “best” articles from the New Yorker. I guess this is from previously available articles.

https://www.reddit.com/r/longform/s/zRJgAEdagi

detourdog - 8 hours ago

My personal favorite is Louis Menand’s piece on how bad Microsoft Word is.
https://www.newyorker.com/magazine/2003/10/06/the-end-matter
tclancy - 11 hours ago

Nice, reminded me of this classic https://www.newyorker.com/magazine/2018/04/23/the-maraschino...
msla - 14 hours ago

Possibly friendlier link:
https://old.reddit.com/r/longform/comments/1e8m5s1/the_250_b...
(old.reddit.com takes you to the old UI)

TrevorFSmith - 11 hours ago

I am a subscriber but still would love a tarball of PDFs of each issue.

boh - 12 hours ago

Honestly this got me to subscribe. The back catalog is pretty stellar with pretty much every major writer of the twentieth century making a contribution. Zooming in on PDFs just wasn't how you wanted to read them.

bookofjoe - 15 hours ago

https://news.ycombinator.com/item?id=46327909

gavmor - 13 hours ago

How soon can we chat with it via RAG?

visarga - 12 hours ago

Haha, I can't read long articles anymore because I want to reply, a habit I picked up chatting with LLMs.

JKCalhoun - 13 hours ago

I saw no way to pull down a PDF. That's unfortunate as I prefer to browse offline.

ez_mmk - 13 hours ago

I think you can download the entire issue from the archive

xnx - 16 hours ago

Nice! 100 years worth.

fnord77 - 5 hours ago

cynical me thinks they did this to sell to AI companies

- 12 hours ago

[deleted]

NoMoreNicksLeft - 15 hours ago

Could have sworn they did this years ago. I even have the first 80 years or whatever on DVD in the closet.

throwup238 - 11 hours ago

Normally when laymen say "digitized" they mean one of two things: scanned images in a PDF or fully transcribed (and possibly formatted) text extracted from the scan. The Complete New Yorker you're thinking of was mostly the former, with a bit of indexing (table of contents pointing to the PDFs if I remember correctly).
This latest digitization project does the latter, transcribing the text into their existing content management system and as far as I can tell, preserving much of the formatting. This comes with full text search, allows cross linking between articles, and all that good stuff.
I suspect that since they include an LLM summary and started this digitization project in early 2024, this was enabled by LLMs.
smelendez - 14 hours ago

If I’m reading this correctly, they now have all their historic articles loaded into their CMS. I think they previously just had a system where you could page (and maybe search?) through scans of old issues, which is also cool but not as versatile.
ghaff - 15 hours ago

When a lot of content was being put out on CD/DVD, a number of publications did so, but the discs are not straightforwardly accessible these days because they usually require an old version of Windows. (Yes, if you want to make a project of it, you can probably get into them, but it has never been worth it for me.)
- haunter - 14 hours ago
  
  Usually Windows/Wine is the much better case than the old Mac apps (32-bit, PPC, etc.) in the age of Apple Silicon
  https://old.reddit.com/r/thenewyorker/comments/1jlhrve/instr...
  Breaking the DJVU DRM would be the perfect solution though
  - qingcharles - 12 hours ago
    
    It has been broken. I actually have the set on my desk ready to rip, I just couldn't find my USB DVD drive.
    Here's a link to the guy that broke it:
    https://github.com/reconSuave/PlayboyPDF/
- mekael - 12 hours ago
  
  Surprisingly, this has been a project I’ve been tinkering with for years. There is an easy way to get the raw PNG/JPEG files out, but it does require a Windows box. I’m planning on working on it more over the long holiday.
- zorked - 14 hours ago
  
  I think the disc release GP is talking about had files in DjVu format.
  - Tomte - 12 hours ago
    
    Encrypted DjVu, and the viewer doesn’t run on modern Windows.
    
    medler - 6 hours ago
    
    It runs great on Windows 11. The install took a long time but I didn’t have to do anything special to make it work.
    
    Tomte - an hour ago
    
    Maybe we have different editions? I never got mine to work.
- fsckboy - 14 hours ago
  
  doesn't wine have old versions of mswindows pretty much nailed?
- kopirgan - 14 hours ago
  
  I have the MAD archives, bought in the 90s on CDs, but can't use them.
  - haunter - 14 hours ago
    
    The issues on the Absolutely MAD DVD (1952-2005) are just plain PDF files, no DRM, they work perfectly
    https://files.catbox.moe/x4np6u.png
    
    ghaff - 13 hours ago
    
    The CDs I have seem to be proprietary for Windows from the late 90s. But I also have PDFs through 2005 on my computer which I must have "acquired" at some point.
    
    haunter - 12 hours ago
    
    The browser app might be some outdated Windows application, that's the case with the MAD DVD too, but you can find the actual issue files in some folders
  - ghaff - 14 hours ago
    
    I have MAD archives somewhere. I thought they were in some standard format but maybe not.
    A lot of the gen 1 or so CD content isn't easily accessible although a more industrious person could probably get to it in some manner.

unit149 - 14 hours ago

[dead]