I've got an Ecto migration where I'd like to modify some columns but also migrate some data. For example:
import Ecto.Query
defmodule MyApp.Repo.Migrations.AddStatus do
alter table(:foo) do
add(:status, :text)
end
foos = from(f in MyApp.Foo, where: ...)
|> MyApp.Repo.all
Enum.each(foos, fn(foo) ->
# There's then some complex logic here to work
# out how to set the status based on other attributes of `foo`
end)
end
Now, the problem here is that by calling MyApp.Repo.all
the migration essentially uses a separate database connection to the one which the alter table...
statement used (EDIT: This assumption is false, see the accepted answer). Consequently, no status
column so the whole migration blows up! Note, we're using a postgres
database so DDL statements are transactional.
I could do this as two separate migrations or a mix
task to set the data, leaving the migration only for the schema change, but would prefer not to in order to ensure data consistency.
Any thoughts on how to use the same database connection for MyApp.Repo
queries in this manner?
EDIT: Note, I'm working on a small set of data and downtime is acceptable in my use case. See José's response below for some good advice when this is not the case.
flush()
afteralter
work? – Bayreuthflush()
works! Why does that work? Am I wrong about there being two different connections here and it's simply not run the steps yet? – Calumniationflush()
should work. The whole migration is wrapped in a transaction, which means any DDL statement as well as Repo expressions work on the same connection/transaction. The issue is that we group the ddl statements together until the end of the transaction or until a flush. @Bayreuth should submit an answer. :) I will edit my answer to remove the wrong advice. – Otiliaflush()
commit the transaction? What if there's some exception raised during the rest of that migration? Will the changes before theflush()
be rollbacked too? – Andalusiteflush()
does not commit the transaction, it just executes the statements that have been queued up before that point. Think of each statement in the migration as getting pushed into a queue, then all run once everything's been queued up.flush()
essentially lets you execute what's already been queued before adding subsequent statements. The entire migration is wrapped in a transaction, so in the event of an exception everything (regardless offlush()
) that had been executed would get rolled back. – CalumniationEcto.Migration
DSL functions likealter()
orcreate()
. The reason I needed aflush()
is that in the example above I'm doing other stuff that assumes everything above had been run. So the migration was failing because I'd put stuff into the queue but not actually run it on the database when I tried to subsequently use it in theEnum.each()
. – Calumniationflush()
. Now you have cleared that up for me. 👍🏻 – Andalusite