Ecto's fragment allowing SQL injection

When Ecto queries get more complex and require clauses like CASE...WHEN...ELSE...END, we tend to depend on Ecto's fragment to solve it.

e.g. query = from t in <Model>, select: fragment("SUM(CASE WHEN status = ? THEN 1 ELSE 0 END)", 2)

In fact the most popular Stack Overflow post about this topic suggests to create a macro like this:

defmacro case_when(condition, do: then_expr, else: else_expr) do
  quote do
    fragment(
      "CASE WHEN ? THEN ? ELSE ? END",
      unquote(condition),
      unquote(then_expr),
      unquote(else_expr)
    )
  end
end

so you can use it this way in your Ecto queries:

query = from t in <Model>,
  select: case_when t.status == 2
    do 1
    else 0
  end

at the same time, in another post, I found this:

(Ecto.Query.CompileError) to prevent SQL injection attacks, fragment(...) does not allow strings to be interpolated as the first argument via the `^` operator, got: `"exists (\n        SELECT 1\n        FROM #{other_table} o\n        WHERE o.column_name = ?)"

Well, it seems Ecto's team figured out people are using fragment to solve complex queries, but they don't realize it can lead to SQL injection, so they don't allow string interpolation there as a way to protect developers.

Then comes another guy who says "don't worry, use macros."

I'm not an elixir expert, but that seems like a workaround to DO USE string interpolation, escaping the fragment protection.

Is there a way to use fragment and be sure the query was parameterized?

SQL injection, here, would result of string interpolation usage with an external data. Imagine where: fragment("column = '#{value}'") (instead of the correct where: fragment("column = ?", value)), if value comes from your params (usual name of the second argument of a Phoenix action which is the parameters extracted from the HTTP request), yes, this could result in a SQL injection.

But, the problem with prepared statement, is that you can't substitute a paremeter (the ? in fragment/1 string) by some dynamic SQL part (for example, a thing as simple as an operator) so, you don't really have the choice. Let's say you would like to write fragment("column #{operator} ?", value) because operator would be dynamic and depends on conditions, as long as operator didn't come from the user (harcoded somewhere in your code), it would be safe.

I don't know if you are familiar with PHP (PDO in the following examples), but this is exactly the same with $bdd->query("... WHERE column = '{$_POST['value']}'") (inject a value by string interpolation) in opposite to $stmt = $bdd->prepare('... WHERE column = ?') then $stmt->execute([$_POST['value']]); (a correct prepared statement). But, if we come back to my previous story of dynamic operator, as stated earlier, you can't dynamically bind some random SQL fragment, the DBMS would interpret "WHERE column ? ?" with > as operator and 'foo' as value like (for the idea) WHERE column '>' 'foo' which is not syntactically correct. So, the easiest way to turn this operator dynamic is to write "WHERE column {$operator} ?" (inject it, but only it, by string interpolation or concatenation). If this variable $operator is defined by your own code (eg: $operator = some_condition ? '>' : '=';), it's fine but, in the opposite, if it involves some superglobal variable which comes from the client like $_POST or $_GET, this creates a security hole (SQL injection).

TL;DR

Then comes another guy who says "don't worry, use macros."

The answer of Aleksei Matiushkin, in the mentionned post, is just a workaround to the disabled/forbidden string interpolation by fragment/1 to dynamically inject a known operator. If you reuse this trick (and can't really do otherwise), as long as you don't blindly "inject" any random value coming from the user, you'll be fine.

UPDATE:

It seems, after all, that fragment/1 (which I didn't inspect the source) doesn't imply a prepared statement (the ? are not placeholder of a true prepared statement). I tried some simple and stupid enough query like the following:

from(
  Customer,
  where: fragment("lastname ? ?", "LIKE", "%")
)
|> Repo.all()

At least with PostgreSQL/postgrex, the generated query in console appears to be in fact:

SELECT ... FROM "customers" AS c0 WHERE (lastname 'LIKE' '%') []

Note the [] (empty list) at the end for the parameters (and absence of $1 in the query) so it seems to act like the emulation of prepared statement in PHP/PDO meaning Ecto (or postgrex?) realizes proper escaping and injection of values directly in the query but, still, as said above LIKE became a string (see the ' surrounding it), not an operator so the query fails with a syntax error.

Recommended topics

Hot tags