SQL Injection as a Feature

100 points by foxfired 5 days ago

What people often don't realize is that in a big business system a user may have no permission to raw data of some table, but may have permission to report which includes aggregated data of the same table, so report permissions cannot be deducted from base CRUD permissions.

If such SIAAS

    - Checks that query is SELECT query (can be tricky with CTE, requires proper SQL parser)
    - Allows editing said query by superuser only
    - Can be parametrized, including implicit $current_user_id$ parameter
    - Has it's own permissions and users can run the query if they have permissions

It's safe enough. I've seen and applied such "Edit raw SQL in HTML form" many times. It's super flexible, especially combined with some CSV-to-HTML, CSV-to-PDF, or CSV-to-XLS rendering engine.

indigo945 - 3 days ago

> - Checks that query is SELECT query (can be tricky with CTE, requires proper SQL parser)
Not only is this difficult parsing-wise, there's also no reason to assume that a select query is read-only even when no CTE or subqueries are involved. Function calls in the select clause can also write data.
> - Has it's own permissions and users can run the query if they have permissions
This is the important one. If the role the query runs as doesn't have write permissions on any table, then the user can't write data, period.
Note that this is often not as easy to implement as it seems. For example, in PostgreSQL, neither set role nor set session authorization actually prevent the user from doing malicious things, because the user can just reset role or reset session authorization in the query. For PostgreSQL to properly respect a role's permissions, the SIAAS needs to actually connect to the database as that role.
Common GUC-based row level security approaches are also incompatible with this idea.
- kogus - 3 days ago
  
  I'm not sure what database platform they used, but in SQL Server, functions cannot have side-effects.
  https://learn.microsoft.com/en-us/sql/relational-databases/u...
  - chasil - 3 days ago
    
    Dear SQL Server user, welcome to the world of SQL/PSM.
    https://en.wikipedia.org/wiki/SQL/PSM
    Within this ADA-esque world, packages, procedures, and functions may initiate DML.
    Assuming these objects are in the default "definer rights" context, the DML runs with the full privilege of the owner of the code (this can be adjusted to "invoker rights" with a pragma).
    Perhaps this is why Microsoft ignores it (as Sybase did before it).
    
    kogus - 2 days ago
    
    I am not an expert on SQL/PSM, but I have worked in an Oracle shop before, and used PL/SQL extensively. In SQL Server, the equivalent is T-SQL. T-SQL procedures can do pretty much anything (assuming it is executed by a user with sufficient privileges), including creating and altering tables, creating and executing other procedures, running dynamic sql, as well as ordinary CRUD style operations. The "no side effect" limitation applies specifically to SQL functions.
    
    indigo945 - 2 days ago
    
    In Postgres, the main difference between functions and procedures is that procedures can do transaction management (begin/rollback/commit), whereas functions can't. Other than that, functions are not limited in any way, and can be written in SQL, Postgres's PL/SQL knockoff PL/PgSQL, any plugin language such as Java, or even be called via the C ABI. Obviously, Postgres can do nothing about side effects here, and doesn't attempt to.
    (PS: Postgres does have a concept of "safe" languages. Safe languages are expected to run programs only with the database permissions of the calling context, and prevent all other IO. However, Postgres does nothing to ensure that they do, that's the language plugin's job. Also, those functions can still perform DML and DDL as long as the calling context has the permissions to do so.)
    By the way, you can do the same in SQL Server via what Microsoft ambiguously calls an Assembly. Via Assemblies, SQL Server can load procedures, aggregates and, yes, functions, from a .NET DLL file.
    
    chasil - 2 days ago
    
    Is appears that there are efforts to bring full PSM to Postgres, as I see with PL/pgPSM.
    https://postgres.cz/wiki/SQL/PSM_Manual
    
    chasil - 2 days ago
    
    Transact-SQL, as Sybase originally named it, is not standardized.
    As far as I know, the only implementation outside of Sybase and Microsoft is Amazon's Babelfish.
    https://aws.amazon.com/rds/aurora/babelfish/
    It goes without saying that SQL/PSM is far more pervasive on many more platforms, precisely because it is an ANSI standard.
  - Tostino - 3 days ago
    
    In Postgres they absolutely can. They are all just happening inside the same transaction scope unlike stored procedures.
- Tostino - 3 days ago
  
  Hell, your user can have no write access at all, but the function or procedure can be using SECURITY DEFINER and the code inside it will run with the permissions of the function owner rather than the calling user allowing writes to happen.
  Trusting a select to be read only is naive.
leprechaun1066 - 3 days ago
Most applications backed kdb+ do just this. It comes with its own parser and you can query tables using something like an ast.
For example the user might ask for data with the constraint
```
  where TradingDesk=`Eq, AvgPx>500.0
```
which kdb+ parses into
```
  ((=;`TradingDesk;(),`Eq);(>;`AvgPx;500.0))
```
As a dev on the system I can then have a function which takes in this constraint and a list of clients that I want to restrict the result to. That list of clients could come from another function related to the entitlements of the user who made the request:
```
  applyClientRestriction:{[constraint;clients] constraint,enlist(in;`Client;enlist clients)}
```
Which results in an extension of the constraint like this for two clients A and B:
```
  q)applyClientRestriction[((=;`TradingDesk;(),`Eq);(>;`AvgPx;500.0));`ClientA`ClientB]
  ((=;`TradingDesk;enlist`Eq);(>;`AvgPx;500.0);(in;`Client;enlist`ClientA`ClientB))
```
Then that gets passed to the function which executes the query on the table (kdb+ supports querying tables in a functional manner as well as with a structured query language) and the result has the restrictions applied.
It's really nice because, once parsed, it's list processing like in a lisp and not string processing which is a pain.
ako - 2 days ago

So you build a view on top of the tables, allows users access to the aggregating views, but not to the underlying tables?

EvanAnderson - 3 days ago

I am reminded of a Customer's grown up "tribal knowledge" around a long-unsupported custom application.

Typically and sadly, the developer used string concatenation to build queries. Some users figured out (or, perhaps, saw the developer do it) they could abuse "filter" fields in the Win32 thick-client application to get filtering functionality the client otherwise didn't offer.

Contrived example: AR aging report doesn't allow for filtering by Customer city. It does allow filtering by Customer name. In the Customer name filter box enter "' OR [City] = 'Boston" to get an AR aging report for only Boston customers.

The users had built-up a Word document with various magical incantations to get data they couldn't otherwise get.

I was impressed by the ingenuity even though it was all horrifying. (The application ran as the database server's superuser-equivalent with the password hard-coded in the client, so nothing was stop somebody from completely trashing the server. >sigh<)

ep103 - 3 days ago

SWIM worked as a PM at a company that decided to redo their UI. They ran into an issue on internal roll out, where they discovered their support team for years had been doing sql injection on a specific form in the UI, in order to run reports on the company's database. They had to stop the roll out, and productionize the support team's (very valid) use cases in order to remove the sql injection form.
wvbdmp - 3 days ago

>(The application ran as the database server's superuser-equivalent with the password hard-coded in the client
I was shocked to find that this seems to be the norm, at least in B2G. Development shops “recommend” it and IT depts are relieved if they can get the invariably batshit arcane on-prem deployments to work at all.
draw_down - 3 days ago

Hey, if you can know that your users are not adversarial and are capable of writing queries, I say let 'em! The problem is really in that last bit.

ivanjermakov - 3 days ago

> In the code, someone added a string search for INSERT, UPDATE, and CREATE. If any of these strings were detected, the page returned an unauthorized response.

STRING SEARCH? If only there was a way to limit connection privileges to read-only DML.

sgarland - 3 days ago

> read-only DML
Technically DQL, if you want to be pedantic.

dml2135 - 3 days ago

Ha, this reminds me of a story from my old job, before I became a software engineer.

I worked for an artist and we had an inventory system that a freelance web developer had built for us. I think it was some sort of php, laravel, mysql stack. There was a search bar that you could use to bring up records of artworks.

Everyone at the studio used this system but I was the main person that used it. Over time, I picked up some little tricks here and there. These were useful because engaging this freelance guy for new features involved a lot of back and forth, so if I could find my own improvement to a workflow that was always the easier option.

We didn't have a clear way to pull up works for a single decade. Until I somehow discovered that you could use `%` as a wildcard character in some cases, so I could pull up paintings from the 1970s but searching for `197%`, for example.

I remember proudly telling this to the freelance dev at one point and his eyes widened and he almost looked panicked. In retrospect I recognize that he was thinking about whether he left the system open to an injection attack.

By the end of my job there I had learned enough about this to realize it was a risk but didn't get curious enough to see if a `; DROP TABLE` query would work.

lenkite - 3 days ago

Gave analysts a workspace of their own with a SQLite snapshot copy of the main db. They could do whatever they liked with that copy with the tool of their choice.

zihotki - 3 days ago

When I worked on a similar functionality, I used sql parser and AST walker to verify that no mutations are done and only a subset of sql was whitelisted. That worked as a charm and it wasn't hard to implement. Benefits - no sql injections, cheap syntax check.

supriyo-biswas - 3 days ago

A similar approach that I used for an application is to take a simplified query language as input such as `name*~"john" emp_id>3000` and use a hand-crafted parser to turn it into a SQL query.
- tootubular - 3 days ago
  
  I just shipped a feature exactly like this... Jira has the same thing with JQL, which is what inspired my work. Safe from SQL injection and can be used directly by power users or managed through form inputs for basic search/filtering. We use Elasticsearch for other data atm, but I'm hopeful this new PostgreSQL only approach wins out as it makes authz so much simpler since it all composes into one query.
- emmelaich - 3 days ago
  
  That exists as @apenwarr's afterquery: https://apenwarr.ca/log/20121218
  As it is the URL string, you can share it easily.

louis_saglio - 3 days ago

This is actually what SQL was intended for.

lupire - 3 days ago

And if you want to give users an SQL client app, the database vendor usually provides one along with the server.

heckintime - 3 days ago

At this point, it'd be easier to give your users access to DBeaver or Bigquery directly. Also limiting their access to certain views with prepared data to avoid expensive queries.

For a UI based SQL tool I've used Looker few years ago and thought it was ok. I wonder how much AI tooling Google added since then to help with LookML generation. There are also other open source tools like mathesar (https://github.com/mathesar-foundation/mathesar) which look promising.

crazygringo - 3 days ago

It's easy in most modern multiuser databases to restrict a database user to SELECT privileges only, and to set a query timeout of something like 5 seconds (or 30 or whatever is appropriate), right?

Then the queries can be run only using that database user, and there's no risk of INSERT/UPDATE/DELETE. Injection isn't possible. And you can limit access to only the relevant tables as well, so you can't accidentally leak PII.

drysart - 3 days ago

Most ancient databases can do it too. It's been a standard feature of databases for decades.
It's a little horrifying seeing the hoops people are jumping through to solve a problem that's more quickly, more easily, and more fully solved with your database of choice's equivalent of a CREATE USER statement; and maybe creating some views on tables if you have a need to hide individual columns from users.
chasil - 3 days ago

No, a user that is only granted SELECT may write to the database.
The trivial example is sorting.
A more nuanced example, in the world of Oracle, is block cleanout. A block may be encountered that has been modified by a past transaction where the commit status is not immediately known; the query session is responsible for either updating the (confirmed) commit status, or rolling back the block from the UNDO/rollback disk image.
So readers always write, likely in every database to some degree.
- crazygringo - 3 days ago
  
  I don't know what you're talking about.
  Sorting doesn't write to a database. You can check yourself by looking at last-modified-by timestamps on database files. They don't change. Gigantic sorts may need to write to a temp file but the temp file isn't part of the database, it's a temp file.
  And I don't know how Oracle works under the hood, but the entire point of transactional multiuser databases is that one user can read while another writes a transaction. A read neither commits nor rolls back a separate transaction currently in progress. It reads from the pre-transaction state.
  - chasil - 3 days ago
    
    In Oracle, a user has a default tablespace, and a temporary tablespace.
    When a sort exceeds SORT AREA SIZE, then it writes into a sort segment in their registered temporary tablespace. Any or all of the datafiles behind this tablespace will show updated mtime. All users share a single sort segment within a tablespace.
    And users running SELECT will write to the database if a delayed block cleanout needs to occur.
    https://asktom.oracle.com/ords/asktom.search?tag=delayed-blo...
    
    crazygringo - 3 days ago
    
    All of that sounds like implementation details to do with files.
    No logical database writing occurs. The relational data in the database cannot be changed by a SELECT command, which is the point of user permissions. Even if temp disk space is used or stale data gets cleaned up in the process, that is irrelevant in the context of preventing inserts/updates/deletes. So not sure why you're bringing it up?
    
    chasil - 3 days ago
    
    Logical?
    If the transaction committed, then the block metadata must change, but the row content remains the same.
    If the transaction is not active and did not commit, then the uncommitted data must be discarded, and the saved rollback must be restored.
    That does not fit within what I assume to be your distinction between a physical and a logical write. It does preserve acid.

LikesPwsh - 3 days ago

Giving analysts direct SQL access can be great, but you should tell them to use an IDE rather than trying to build your own IDE in a text box.

dsego - 3 days ago

There are neat online tools like Redash, you connect it to your db and have user logins and you can build reports, parametric dashboards or just do plain queries.

subarctic - 3 days ago

The "machine fired me" story linked at the end is kind of spooky, especially these days when everyone's trying to use ai to automate as much as possible so they don't get left behind

codingdave - 3 days ago

I've been in scenarios when such UIs existed. But they always were protected so that only system admins had access to it, as a way to let them make quick queries in-app instead of having to pull up other tools. There was no additional access granted, it was just a question of UX, and we expected that anything beyond a simple ad-hoc query would be done with real tools, not in the app.

Also, the underlying databases were secured. Just because you can send a query to a database does not mean you are exposing additional data - database-level security exists and works well.

Rygian - 3 days ago

If I had to greenlight such a UI, here's my list of non-negotiables:
- Each human user has to use their own dedicated account.
- Every query leaves a trail that can't be tampered with.
- If the database contains sensitive data (personal info, payment data, ...) then the database provides a snapshot guarantee, so that we can inspect whose personal data/payment data were leaked by query X executed at instant T by a bad actor.
- List of humans who can access the feature is vetted regularly.
- Any access that can modify the data in database requires at least two separate humans to agree on the query before it can run.
- Any query that can hamper application throughput is either forbidden, happens on a replica database, or requires at least two separate humans to agree before it can run.
- - 3 days ago
  
  [deleted]
SoftTalker - 3 days ago

Back in the first dotcom era I worked at a place that had a "SQL page" in the website. Just a textarea where you could enter any query and run it. It was wide open, protected only by the fact that it wasn't linked anywhere (there was no way to get to it other than entering the URL directly into the browser). It was there for the reasons you list, a quick way to verify that the database connections were working and to run ad-hoc queries for support/troubleshooting.
It was thought to be safe enough, because "nobody could guess" the URL of that page.

somat - 3 days ago

Not quite the same, but for a toy application I was building I had one of those terrible/clever ideas "hey the database has this elaborate comprehensive permission system, Why not have that be my applications permission system as well"

So every user is a database user and if the database sez they can view the data then they can view the data. And for my use case it worked surprisingly well.

I will note note that I think this is generally a bad idea, I have heard that the postgres role data structures scale poorly. and I personally would be terrified to expose a database to a potentially hostile environment.

See also: the schemaverse a game where you directly interface with the db server. https://github.com/Abstrct/Schemaverse

theknarf - 2 days ago

I think databases _should_ be built so that one can do it this way.

fifticon - 3 days ago

Why did the story end with a reference to "the machine fired me" essay (which I assume to be fiction, though admittedly funny and soon to be AI reality).

Is he implying, that tampering with the devil's query engine would get him escorted out of the building, or just trying to hook me up on the next story?

zerocrates - 3 days ago

That essay doesn't read as fiction to me.
foxfired - 3 days ago

Not fiction at all: https://www.bbc.com/news/technology-44561838

zX41ZdbW - 3 days ago

I use the approach of public apps built on top of free-form SQL queries. Examples:

https://adsb.exposed/ - type any query and get a visualization.

https://play.clickhouse.com/ - explore the database as you want without logging in.

EGreg - 3 days ago

Some of my clients actually wanted to be able to make arbitrary queries in the database.

And this isn't just about SQL queries. Often, people want to be able to make privileged changes to json config files, etc. etc.

So, I added a feature to the Qbix framework, to do just that. Essentially you have some privileged payloads, signed off by an Auditor role with their private key. You might even require M of N keys.

Then, for each Target (e.g. a database) you can use an ACL or Roles+Permissions for who can actually execute the payload against it.

This can be used on:

  * Smart Contracts

  * SQL Queries on Databases

  * Upgrade Scripts on Nix machines

  * Updates on package managers (versions that were vetted)

and much more.

Usually, when you have a bunch of "n00bs" who shouldn't be trusted with X, you want to set up a class of Auditors who sign off on the latest changes, and then a class of Admins who manage the roles, including Auditors and Users.

You don't need sys admins or dev ops. Just have standard container images / AMIs etc. and support only those. Don't even allow ssh into the instance, if you can help it. Let the instance be a pristine environment (Trusted Execution Environment) and discover arbitrary updates at well-known URLs, such as on github. You can have mirrors and fallbacks, but the key is that M of N Auditors from the organization's whitelist have signed off on the arbitrary code of any upgrade, whether delivered via a package manager or a git post-commit hook.

And all updates become automatic, and available to the n00bs to use. Role-based access is always enforced per-target when they attempt to use it. Inside the black box you can use simple HMAC to verify authorizations it issued earlier.

It took me a while to understand this, but the only way to solve security is to either have "pristine environments" that can run arbitrary code but always check M of N signatures for updates, with "byzantine fault tolerant consensus" on top. In the first one, you have to trust cloud providers. For the second one, you sacrifice some privacy. (e.g. with the new FreeNet's smart contracts).

PS: the M of N signatures might have to be hardened against quantum computers, so rather than using elliptic curve cryptography you might want to use lattice-based or my favorite quantum-resistant approach: SPHINCS+ based on Lamport signatures.

theknarf - 2 days ago

At that point just create a database user with read only permissions, delete the entire tool, and give them a copy of "mysql workbench" or "phpMyAdmin" or something similar.

- 20 hours ago

[deleted]

nogotnu - 20 hours ago

You actually reopened BI-tools (Tableau and others)