How Claude Code works in large codebases

claude.com

146 points by shenli3514 3 hours ago


prodigycorp - 2 hours ago

This is really a zero information blog post. I want to know how they use the LSP to improve their understanding of the code base. Would be great if it was open source for us to review.

A post like this should be providing people with some reassurance about Claude's ability to understand code at a large scale. It's mostly fluff.

Edit: I did some googling for thoughts on LSP performance and integration. The author of Bun has a tweet saying that LSPs are a big drag on performance for no real gain, and virtually all of the replies agree. Does anyone else have experience or thoughts?

https://xcancel.com/jarredsumner/status/2017704989540684176

whh - a minute ago

A lot of words for not much. The harness taxonomy is fine, but anyone using Claude Code already knows CLAUDE.md exists.

hansmayer - an hour ago

A lot of words about nothing.

Meanwhile we are still waiting for these statements to come true:

https://eu.36kr.com/en/p/3648851352018565

https://www.businessinsider.com/anthropic-ceo-ai-90-percent-...

https://www.reddit.com/r/Anthropic/comments/1nemhxb/futurism...

https://medium.com/@coders.stop/dario-amodei-said-90-of-code...

https://www.youtube.com/shorts/0j1HqEEDThc

Accountability, anyone?

jwilliams - 2 hours ago

> Claude Code navigates a codebase the way a software engineer would: it traverses the file system, reads files, uses grep to find exactly what it needs, and follows references across the codebase. It operates locally on the developer’s machine and doesn’t require a codebase index to be built, maintained, or uploaded to a server....

> Agentic search avoids those failure modes. There's no embedding pipeline or centralized index to maintain as thousands of engineers commit new code. Each developer's instance works from the live codebase.

The frame of "the way a software engineer would" and the conclusion seem at odds. I'd love to be schooled otherwise?

I use autocomplete/LSPs all the time and they're useful. That's an index? Why wouldn't Claude be able to use one? Also a "software engineer" remembers the codebase - that's definitely a RAG. I have a lot of muscle memory to find the file I need through an auto-completed CMD+P.

It doesn't need to particularly be real-time across thousands of engineers -- just the branch I'm on.

It's rare that I'd be navigating a codebase from first-principles traversal. It would usually be a new codebase and in those cases it's definitely not what I'd call an optimal experience.
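To make the trade-off in this comment concrete, here is a minimal sketch contrasting the two approaches: a grep-style walk that reads the live tree on every query, and a small in-memory symbol index built once for the current checkout. The function names (`grep_walk`, `build_symbol_index`) are hypothetical and the index only recognizes Python `def` lines; this is an illustration of the general idea, not a description of how Claude Code or any LSP is implemented.

```python
import os
import re
import tempfile


def _iter_lines(root):
    """Yield (relative_path, line_number, line) for every readable file under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, 1):
                        yield os.path.relpath(path, root), lineno, line
            except OSError:
                continue  # unreadable file: skip it


def grep_walk(root, pattern):
    """Live search: scan the tree on every call, so results are never stale."""
    rx = re.compile(pattern)
    return sorted((rel, n) for rel, n, line in _iter_lines(root) if rx.search(line))


def build_symbol_index(root):
    """Index search: one pass records where each `def name` lives (hypothetical, Python-only)."""
    rx = re.compile(r"^\s*def\s+(\w+)")
    index = {}
    for rel, n, line in _iter_lines(root):
        match = rx.match(line)
        if match:
            index.setdefault(match.group(1), []).append((rel, n))
    return index


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "a.py"), "w", encoding="utf-8") as fh:
        fh.write("def alpha():\n    return 1\n")
    index = build_symbol_index(root)
    print(index.get("alpha"))            # lookup is a dict access, no file I/O
    print(grep_walk(root, r"def alpha"))  # lookup re-reads the tree
```

The index answers lookups without touching the disk but has to be rebuilt or updated when files change; the walk is always current but pays the full scan cost per query. For a single branch on one machine, as the comment suggests, the rebuild cost is the only freshness problem to solve.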

jameson - an hour ago

Why can't Claude Code generate an effective harness for us by inspecting the codebase?

I tried defining CLAUDE.md (or AGENTS.md), skills, and plugins, but I'm not getting the effectiveness others claim. With the LSP plugin, for example, CC doesn't use the LSP's symbol renaming and instead edits files one by one, slowly. It also does not invoke a skill even when I explicitly ask it to remember to invoke that skill whenever the prompt contains a specific clue.

Am I using it wrong? Is there a robust example harness I can copy?

sinsudo - 44 minutes ago

Just an anecdote: I was designing a project for LLM onboarding and orchestration. Claude chose to read only the first 40 lines of each file. Later, in another session, while looking for the causes of low-quality results, Claude detected the fault and changed the code to perform an AST analysis, so now the analyzer takes documentation lines and function signatures (input/output) as input.

Claude's initial approach was really poor. One has to wonder how many times Claude's code has to be modified/reviewed for improvement, or whether it is possible at all to make good code with it.

Edited: Generalization: Claude can fix a localized, identifiable poor decision (e.g., "only reading first 40 lines") because the fault is discrete and traceable to one piece of code.

But real software quality problems often arise from many small, individually reasonable decisions that collectively produce bad outcomes. No single one is obviously "the fault." In that scenario, a tool that generates low-quality building blocks piecemeal may never converge on good code, because each piece seems fine in isolation.
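As a rough illustration of the fix described in this anecdote, the sketch below compares the two strategies on Python source: truncating to the first 40 lines versus walking the AST to collect function signatures and the first docstring line. The names `head_lines` and `summarize_source` are hypothetical, and the commenter's actual analyzer is not shown in the thread, so treat this as one possible shape of the idea. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast


def head_lines(source, limit=40):
    """Naive strategy: keep only the first `limit` lines of the file."""
    return "\n".join(source.splitlines()[:limit])


def summarize_source(source):
    """AST strategy: return (signature, first docstring line) for each function."""
    tree = ast.parse(source)
    summary = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ast.unparse(node.args)
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            doc = ast.get_docstring(node) or ""
            first_line = doc.splitlines()[0] if doc else ""
            summary.append((f"def {node.name}({args}){returns}", first_line))
    return summary


if __name__ == "__main__":
    # A function defined after line 40 is invisible to the truncation strategy.
    source = "# filler\n" * 45 + (
        "def late(a: int, b: int) -> int:\n"
        '    """Add two numbers."""\n'
        "    return a + b\n"
    )
    print("late" in head_lines(source))
    print(summarize_source(source))
```

The summary is typically much shorter than the file while still covering every function, which is the property the truncation approach lacks.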

prymitive - 17 minutes ago

What I’m curious about is how well LLMs do when they create something from scratch, because so far my experience has been with letting it fix issues or add features to an existing codebase where I had already shaped the general architecture and put in a lot of guardrails. But what if the architecture is unclear and there is nothing to let the agent know whether a change breaks something? My only experience with a tiny codebase where it did a lot of the scaffolding was poor: it did what I asked for, not what I needed. If I had done more of the thinking myself, I would have realised that it’s code that works but doesn’t solve the problem I’m after.

Plywood1 - an hour ago

Claude clearly wrote this. A lot of fluff, not much substance.

thinkindie - 3 hours ago

I don’t agree with the statement about indexing the codebase: it works pretty well for IDEs like PhpStorm or other JetBrains IDEs.

belZaah - 3 hours ago

How very interesting. In an industry where things shift around in months if not weeks, there’s been not only enough time for clear patterns to emerge, but these patterns have also proven successful on large codebases. What are the success criteria? Didn’t delete the production database? Team velocity has increased? Codebase TTL has increased? Operations guys are happier?

cdnsteve - 36 minutes ago

Small plug for what I built:

You need a code dependency graph: https://github.com/roboticforce/remembrallmcp Ask "what breaks if I change this?"

Saves 98% of token usage. Saves 95% of tool calls.

Runs as an MCP server, works for 8 languages.

It just works, you need to try it.
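For readers unfamiliar with the idea, a "what breaks if I change this?" query can be sketched as a traversal of a reversed dependency graph. The code below is a generic illustration with hypothetical names (`reverse_graph`, `impacted_by`) and a hand-written module map; it does not use or describe the API or internals of the tool linked above.

```python
from collections import defaultdict, deque


def reverse_graph(imports):
    """Turn {module: [dependencies]} into {dependency: {modules that import it}}."""
    rev = defaultdict(set)
    for module, deps in imports.items():
        for dep in deps:
            rev[dep].add(module)
    return rev


def impacted_by(imports, changed):
    """Return every module that depends on `changed`, directly or transitively."""
    rev = reverse_graph(imports)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in rev.get(current, ()):
            # Skip modules already visited, and the start node itself in a cycle.
            if dependent not in seen and dependent != changed:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical module map: each key imports the modules in its list.
    imports = {
        "api": ["service"],
        "cli": ["service"],
        "service": ["db"],
        "db": [],
        "docs": [],
    }
    print(impacted_by(imports, "db"))
```

Answering the question from a prebuilt graph means the agent can read one short list instead of grepping and opening each candidate file, which is where savings in tokens and tool calls would come from.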

ufish235 - 2 hours ago

How important are CLAUDE.md files when they don’t even describe, in concrete terms, what should go into each one?

martypitt - an hour ago

I don't have any LSPs hooked up to CC yet (going to fix that today), or particularly sophisticated CLAUDE.md files.

So, if I've read this post correctly, that means that CC is navigating my codebase today by sending lots of it up to a model, and building an understanding. Is that correct? Did I misunderstand it?

I kinda suspected there was more local inference going on somehow -- partly because the iteration times are fairly fast.

tex0 - 2 hours ago

If the developer can have a local copy of the monorepo it's not a "large" codebase.

nilirl - an hour ago

So ... the better you explain the codebase to the LLM the better it explains it to you?

hbarka - 2 hours ago

Interesting that MCP was mentioned over CLI. For production or controlled environments, I would not make MCP the deployment path. I would let MCP help generate or choose commands, but have the actual deployment go through CLI scripts, Git commits, and CI/CD approval.

Tsarp - 3 hours ago

Wondering if enterprises have a modified version of CC that doesn't have to optimize to stop bleeding on fixed-cost subscription plans.

The article really does not align with the current sentiment. Everyone with a choice has mostly moved on to Codex (of course, in this world all it takes is a model update or harness update to turn things around).

CC is great at a lot of things, but it repeatedly misses reading crucial parts of the codebase, hallucinates about the work that was done, and has a bunch of other issues.

wood_spirit - 3 hours ago

I’m super interested to know what the back and forth between models and tools really looks like in practice.

Are there any more detailed walkthroughs of how it works, how it decides which tools and which grep patterns to use, and what the conversations actually look like?

In the UI you see just enough to know it’s doing something but you don’t really see the jumps it’s making offscreen.

ares623 - 2 hours ago

Lots of concepts. Release the harness that made it possible to port Bun to Rust in 9 days. That's what everyone really wants. Then everyone can go "do that but for this other goal".
