The best way to export rails database data
Asked Answered
S

2

9

RAILS 5.1

I have a RAILS application thats using PostgreSQL as database. I want to export/dump the RAILS database data from the RAILS perspective. So I'm independend from the database. Later then I want to use this export/dump file to load/import/seed the data back into the database.

I have tried the following GEMs:

  • seed_dump
    It works, but it can't handle HABTM model relations.

  • yaml_db , It works, but the yaml format is not the format understood by a rails db:seed

Sihon answered 22/10, 2017 at 17:4 Comment(7)
Just use the pg backups tools. The "from the RAILS perspective" thing sounds like you are storing a bunch of other junk in the same db - don't.Weismann
No, I use this db only for this RAILS application. There is no other bunch of junk saved in this database. The pg backup tools stores the database data in a postgreSQL specific sql dump format, can't be used to load into a MySQL database. This is the big advantage when saving the database data "from the RAILS perspective".Sihon
Don't use mysql for test/dev if you are deploying to postgres. Your development environment should mirror your production as close as possible. Problem solved.Weismann
"For instance, avoid using different services between development and production, even when adapters theoretically abstract away any differences in services. Using SQLite locally and PostgreSQL in production; or local process memory for caching in development and Memcached in production, may seem innocuous at first but can be the cause of a variety of production issues.". This is from the Heroku docs but is generally applicable.Weismann
There are plenty of real world differences between MySQL and Postgres that can cause big issues. MySQL is notoriusly lax and will allow many ambigeous queries that Postgres will not.Weismann
Using MySQL beside PostgreSQL is not really my intention. I want only emphasise the advantage to export/dump database data in perspective of RAILS. I would like to use seed_dump, but seed_dump struggles with HABTM relation. What I really want to do, is to export/dump with a GEM like seed_dump. Now I can edit this file, and load/import it back with rails db:seed.Sihon
Sounds like a solution looking for a problem.Weismann
A
7

Here's a practical example of exporting to JSON. I use rake tasks to do this sort of thing. In this example I'm dumping a users table.

namespace :dataexport do
  desc 'export users who have logged in since 2017-06-30'
  task :recent_users => :environment do
    puts "Export users who have logged in since 2017-06-30"

    # Get a file ready, the 'data' directory has already been added in Rails.root
    filepath = File.join(Rails.root, 'data', 'recent_users.json')
    puts "- exporting users into #{filepath}"

    # The key here is to use 'as_json', otherwise you get an ActiveRecord_Relation object, which extends
    # array, and works like in an array, but not for exporting
    users = User.where('last_login > ?', '2017-06-30').as_json

    # The pretty is nice so I can diff exports easily, if that's not important, JSON(users) will do
    File.open(filepath, 'w') do |f|
      f.write(JSON.pretty_generate(users))
    end

    puts "- dumped #{users.size} users"
  end
end

And then import

namespace :dataimport do
  desc 'import users from recent users dump'
  task :recent_users => :environment do
    puts "Importing current users"

    filepath = File.join(Rails.root, 'data', 'recent_users.json')
    abort "Input file not found: #{filepath}" unless File.exist?(filepath)

    current_users = JSON.parse(File.read(filepath))

    current_users.each do |cu|
      User.create(cu)
    end

    puts "- imported #{current_users.size} users"
  end
end

Sometimes as part of the import process I'll want a clean table to import into, in which case I'd start the taske with:

ActiveRecord::Base.connection.execute("TRUNCATE users")

This would not be the best way to handle Really Big tables, greater than, oh, 50,000 rows, and/or with lots of text fields. In which case the db native dump/import tools would be more appropriate.

Here's a HABTM example for the sake of completeness. There's still a linking table, but it has no model, so the only way to do something with it is raw SQL. Let's imagine our users have many roles, and vice versa (users M:M roles), for example:

class User < ApplicationRecord
  has_and_belongs_to_many :roles
end

class Role < ApplicationRecord
  has_and_belongs_to_many :users
end

There would necessarily be a joining table called users_roles which would have two columns, user_id and role_id. See the Rails Guide on HABTM

To export, we have to execute SQL directly:

users_roles = ActiveRecord::Base.connection.execute("SELECT * from users_roles").as_json
# and write the file as before

And execute SQL to import

# read the file, same as before
user_roles.each do |ur|
  ActiveRecord::Base.connection.execute("insert into users_roles (user_id, role_id) values ('#{ur[0]}', '#{ur[1]}')")
end

See this answer for more on inserting with raw SQL

Azure answered 23/10, 2017 at 16:2 Comment(5)
Ohhh - it seems to be not so complicated as I thought after the explanation of Ryan. I understand. And how I can save HABTM and Has_many_through relations ?Sihon
In the case of relations the M:M relations, HABTM or has_many/through, you would export all three tables. The "relation" is the foreign keys in the joining table. During the import, the record will get created with the ID (primary key) that was saved in the export, provided it doesn't conflict with an existing ID. And as long as the IDs are the same, the keys in joining table will continue to work. Note that prior to Rails 4, you would need to use User.create(cu, :without_protection => true) to ensure the ID got imported intactAzure
Could you please give a practical example how to export HABTM, Has_many/through relations ? The HABTM has no model that could be used like User.where().as_json.Sihon
As for HABTM, I don't like it, and the one time I tried it, I ripped it out 30 minutes later. It has no particular advantage over has_many/through while having some obvious disadvantages, like, of course, the join table doesn't have a model. To do anything with it you're looking at raw SQL, which is fine, but I like letting ActiveRecord do the work. Nevertheless, I've updated the answer.Azure
Thank you very much to complete this json based dump/load approach. For my RAILS application I'm feeling good with HABTM, it works very well. When I want to dump my complete RAILS database data from RAILS perspective I'm forced to go this raw SQL way, to include also the HABTM hidden join tables. But then I have the advantage to edit the json based backup files and load the complete RAILS app back via a rails task.Sihon
F
1

I would agree with the people saying use the built in database tools to do it. Or figure out if there's a way to tell the database to export to CSV, and then import that way.

However, if you really want a database agnostic way, here's another way: use your own API.

By that I mean that in 2017 your Rails app really should not just output HTML, but also output JSON. Maybe you want to write a SPA type app in the future, or a mobile app. Making sure there's a Javascript representation of the object, in addition to the HTML version, is a great idea.

So, if you have /projects in your app, write a script that requests /projects as JSON. Save each object as its own file, then in your remote system post everything back in.

If there's anything not in the JSON representation (ie you don't list all the users on the project) make sure to hit those endpoints too and save everything to a separate JSON file.

Then write a player script that POSTS all those files to your destination service.

That's one way to do it. There other way is to write it entirely in Ruby in ActiveRecord - this might be useful as some kind of user data export functionality, so this could be an awesome way to do it too, but in my mind, "can we build a Javascript frontend or mobile app for this?" is usually asked way before, "can the user get their data out" ;)

Fallow answered 22/10, 2017 at 21:3 Comment(1)
Hi Ryan, thanks for your reply. At the moment my RAILS app presents only as HTML. Representation for mobile devices will coming later, next year. So I have no desire to write additional code, only to dump/export my database date. The GEM seed_dump fits my demand at best, but struggles with HABTM relations. I'm looking for a GEM something like seed_dump.Sihon

© 2022 - 2024 — McMap. All rights reserved.