Ruby on Rails Audit Complete

214 points by todsacerdoti 6 months ago

Hey there everyone!

Some friends just notified us that this is trending.

Let me know if you have any questions/comments/feedback about the work! It was a great project, I wish we had more budget so that we could spend some time with ruzzy and the C extensions as there's probably some things to be unearthed there with ASAN and UBSAN.

As for the comments on Spring/Elixir/Django/Phoenix, they're on our wish list every year but we're always limited by funding. It's about what the communities, foundations, and agencies can and will support. We are always working toward getting larger grants where we can work fully independently on whatever we want, but so far that hasn't materialized.

We'll keep trying!

inopinatus - 6 months ago

Some good recommendations. Feedback on this one:

> "X41 recommends to disallow the creation of un-escaped SqlLiteral objects with user input in favor of a complete model of SQL"

Rails already has a sufficient model of SQL in its Arel layer. Complete? Not exactly, because SQL is never implemented by the standard, but certainly sufficient, and very composable and extensible. Sadly the core team killed off the public documentation of Arel a few majors ago. Nevertheless I still use Arel whenever Active Record doesn't expose enough of the model, such as expressing left-open intervals as inequalities¹. Sometimes incorrectly called a "private API", but it's not anyone's private possession, Arel is just undocumented in recent releases.

The recommendation's language is also just a touch naive, because it's nigh-impossible to outright disallow developers doing practically whatever they want to in Ruby, there's no isolated sandbox. The question is what incentives are in place to stop them wanting to.

Active Record already has excellent support for bound values via the predicate builder, and it's only egregiously bad code that concatenates raw user-supplied values directly into query strings. Nevertheless for those few remaining places in the API where this could happen inadvertently such as #calculate, a variant recommendation - similar in spirit, but not identical - might be that where it doesn't already, Active Record treats raw strings supplied as requiring escape unless explictly wrapped with Arel.sql(), or just accept an Arel node/AST (many already do on the QT). That is, force the developer to confess their sin in writing, or do it relationally.

But IMO the wizards shouldn't keep Arel locked up in the tower, either.

[1] https://gist.github.com/inopinatus/c84c78483b30fb2d5588db9be...

josephg - 6 months ago

> The recommendation is also a touch naive as framed, because it's nigh-impossible to outright disallow developers doing practically whatever they want to in Ruby
Sure but defaults matter.
Nothing is truly private in most languages. In C/C++, you can poke raw memory. In rust you can transmute. In Java you can override the classloader and edit classes before they’re passed to the JVM and so on. But most environments have a golden path for getting things done which most developers will walk. This is exactly what removing documentation does - it signals to developers that some API isn’t part of the expected golden path.
Every sql library needs ways to pass raw sql strings to the database. (Just as every browser framework needs a way to embed raw html strings). But it should require you to explicitly acknowledge you’re doing something unsafe. Rust’s unsafe keyword. React’s unsafelySetInnerHTML, and so on. It’s not about denying it. It’s about making the obvious thing safe and the unsafe thing not obvious.
- rrr_oh_man - 6 months ago
  
  > This is exactly what removing documentation does - it signals to developers that some API isn’t part of the expected golden path.
  What an insane statement. "Knowledge is bad for you, dummy"?
  - josephg - 6 months ago
    
    I have no idea how you got that weird statement from my comment.

Levitating - 6 months ago

> Rails application server (e.g., Puma, Unicorn)

I think it's more appropriate to call those Rack application servers, Rack being the Ruby CGI Rails implements.

It's a minor nitpick.

hhthrowaway1230 - 6 months ago

Thats good news! I'm a huge fan or Rails but a little surprised of such little vulnerabilities tbh. Would have expected more for such large codebase. But happy to hear it aint!

maxdamantus - 6 months ago

I don't think it's meant to be a complete audit of the codebase, and in fact this is alluded to in the final report [0] (though the wording is strange—perhaps they forgot to update it from an earlier report):
> Due to the size of Ruby on Rails, it will not be possible to cover all the tests and tasks in the following test plan. The tests performed will be covered in the final report along with suggestions on what to focus on in future audits.
I feel like these sorts of audits are usually performed on individual applications rather than "mature" already widely used frameworks. I've got the sense that they are meant to give confidence that the developers knew what they were doing (since they focus on typical vulnerabilities that good developers should know about), rather than proving anything about the code base. Still better than nothing.
[0] Section 3.7, https://ostif.org/wp-content/uploads/2025/06/X41-Rails-Audit...
- hhthrowaway1230 - 6 months ago
  
  Thank you for clearing this up!
yurishimo - 6 months ago

Wordpress is similar. The core project is quite battle tested, but the plugin ecosystem opens admins up to problems. With any system that allows modular code additions, the weakest attack area gets exponentially larger.

BilalBudhani - 6 months ago

I'm glad to see efforts towards improving Ruby on Rails security, it honestly keeps the framework still the most viable choice.

kelseydh - 6 months ago

It's funny that Rails generates params for parameters passed to it for GET, HEAD and DELETE requests, even though it shouldn't. I think I've noticed this before when debugging but never thought much of it. In a poorly coded application (e.g. globally detecting params on in a `before_action`) it definitely could be become an issue.

ukprogrammer - 6 months ago

The productivity of Rails for B2B 'CRUD' software is unmatched. Surprised to not see more newer startups make use of it!

kybishop - 6 months ago

After programming with elixir and phoenix for a few years (with many prior years of rails experience) I have a hard time seeing why one would choose rails.
Elixir is more performant, has compiler safety guarantees that are only getting better as types are introduced, is actually designed from the ground up for web dev (being based on the Erlang VM), and... it's just way more fun (subjective I know). Elixir is what I always wished Ruby was, and I couldn't be more excited about the coming type inference.
Programming with Elixir makes me feel like Ruby is a previous generation language, much like Ruby made me feel that way about Cobol or Fortran, it really is that stark.
- demosthanos - 6 months ago
  
  > is actually designed from the ground up for web dev (being based on the Erlang VM)
  Nit: this makes it sounds like the BEAM was designed for web dev, which it was not. Erlang came out of Ericsson and was built for telecoms (OTP stands for Open Telecom Platform), which is where its unique set of trade-offs comes from. Many of those trade-offs make a ton of sense for web but that's not because it was designed for web, it's because there's overlap between the fields.
  One way to see the difference between telecoms and web is to ask yourself when was the last time that you were working on a project with an availability SLA of 9 nines (1/3 of a second of downtime per year) or even 6 nines (32s per year). Some web dev has that, but most doesn't come close, and if that's not you then Erlang wasn't built for you (though you may still find you like it for other reasons!).
  - kybishop - 6 months ago
    
    Very true it is actually designed for telecoms, but like you mentioned the distinction is so small it's not really even a stretch to say it is purpose built with at least the general architecture of web in mind.
    
    demosthanos - 6 months ago
    
    In the grand scheme of things, if we're considering everything from web to bridge building, yeah, the distinction is small. But within the world of software engineering specifically it's not all that small and it's worth being precise when we're talking about it.
    Whatsapp and telecoms have a lot in common, so no one questions that they benefited a ton from the BEAM.
    Airbnb, though? The main similarity is that they both send large quantities of signal over wires.
    Again, none of this is to stop you from liking the BEAM, but when we're talking about professional software engineering it pays to be explicit about what the design constraints were for the products that you're using so that you can make sure that your own design constraints are not in conflict with theirs.
    
    throwawaymaths - 6 months ago
    
    no. in the modern web world you often have persistent client server connections, which make it a distributed system out the gate. the most inefficient way to deal with this is to go stateless, and without smart architecture to deal with unreliable connection, it's really your best choice (and, it's fine).
    since BEAM gives you smart disconnection handling, web stuff built in elixir gives you the ability to build on client-server distributed without too much headache and with good defaults.
    but look, if you want a concrete example of why this sucks. how much do you hate it when you push changes to your PR on github and the CI checks on browser tab are still not updated with the new CI that has been triggered? you've got to refresh first.
    if they had built github in elixir instead of ruby would almost certainly have this sync isdur solved. in maybe two or three lines of code.
    
    demosthanos - 6 months ago
    
    And if you need that kind of persistent immediately reactive connection and are willing to pay the price, go for it! If that's truly a requirement for you then you're in the subset of web that overlaps substantially with telecoms.
    I'm not cautioning against making the calculated decision that realtime is a core requirement and choosing the BEAM accordingly. I'm cautioning against positioning the BEAM as being designed for web use cases in general, which it's not.
    Many projects, including GitHub, do not need that kind of immediate reactivity and would not have benefited enough from the BEAM to be worth the trade-offs involved. A single example of a UX flow that could be made slightly better by rearchitecting for realtime isn't sufficient reason to justify an entirely different architecture. Engineering is about trade-offs, and too often in our field we fall for "when all you have is a hammer". Realtime architectures are one tool in a toolbox, and they aren't even the most frequently needed tool.
    
    throwawaymaths - 6 months ago
    
    "willing to pay the price"
    what price? learning a new language that is designed to be learned from the one you already know with fewer footguns? ok fine.
    but you make it seem like going to elixir is some kind of heavy lift or requires a devops team or something. the lift is low: for example i run a bespoke elixir app in my home on my local network for co2 monitoring.
    and for that purpose (maybe 300 lines of code? yes, i do want reactivity. wrangling longpoll for that does not sound fun to me.
    
    demosthanos - 6 months ago
    
    To name just a few costs that aren't worth it for many businesses:
    * A much smaller ecosystem of libraries to draw from.
    * Much weaker editor tooling than with more established languages.
    * An entirely different paradigm for deployments, monitoring, and everything else that falls under "operations" that may be incompatible with the existing infrastructure in the organization.
    * When something does go wrong, using a weird stack means you have less institutional knowledge to lean on and fewer resources from people who've been doing the same thing as you.
    * A whole new set of foot guns to dodge and UX problems to solve related to what happens when someone's connection is poor. This has come up repeatedly in discussions of Phoenix LiveView—what you get in reactivity comes at the expense of having to work harder to engineer for spotty connections than you would with a request/response model.
    * More difficulty hiring people, and an increased tendency when hiring for selecting people who are really just obsessed with a particular tool and unwilling to see when the situation calls for something else.
    There are many more, these are just the ones I can think of without having a concrete application with concrete requirements to analyze. In the end for most apps reactivity is so much a "nice to have" that it's hardly worth sacrificing the stability and predictability of the established option for moderately better support for that one aspect of UX, especially given that you can always add reactivity later if you need to at a slightly higher cost than it would have come at with Erlang.
    If reactivity is a core requirement, that's a different story. If it's polish, don't choose your architecture around it.