how do i import data into mongodb from sql server?
Asked Answered
M

3

15

how do i import data into mongodb from sql server?

i have these tables in sql database with following columns

States, Cities, CityAreas

States
Id Name

Cities
Id Name StatesId

CitiArea

Id Name CityId

and I want data in mongoDb Like.


{
      State:"Orissa",
      Cities:{
                CitiName:"Phulbani",
                CitYArea:{
                              "Phulbani","Phulbani2","Pokali","Madira"
                         }
             }
}

is there any tools or do i need to write code for this transformation of data?

Margoriemargot answered 5/8, 2014 at 12:49 Comment(6)
For now i guess you will have to export the data to text files from the SQL Server and then write code to import it in MongoDB.Froe
that's not legal JSON - unless you mean for Cities to only contain one entry per city,state. If you mean to embed cities in state record, then it must be an array.Fridlund
btw, which SQL database are you using, and what version of MongoDB?Fridlund
I think you want arrays ([ ]) not objects ({ }) for Cities and CityArea.Gamophyllous
it seems unlikely to want to embed all of the states with all of the cities and city areas that are in them. we're talking over 1000 areas in the larger states easily.Fridlund
I'm assuming that areas are arrays inside city, but city is top level rather than being embedded in state since that is likely to be more manageable for querying, updates, index strategy, etc.Fridlund
F
23

There several possible ways to approach this from writing code in your favorite language of choice using appropriate APIs to select data, transform it and then insert it into MongoDB.

You can also do it using SQL, MongoDB query language and the shell. One straightforward way is to select flat data via SQL, dump it into CSV file, import it into MongoDB and use aggregation framework to transform it into the format you want.

If you are lucky enough to use a database that supports arrays or other ways of grouping rows into single list types, then you can make a single select and turn it into JSON or MongoDB insert statement.

For these examples, I'm going to assume you want the format equivalent to a document for each city:

{
      State:"Orissa",
      City:{
           Name:"Phulbani",
           Area:[
                 "Phulbani","Phulbani2","Pokali","Madira"
                ]
           }
}

Sample data in RDBMS:

asya=# select * from states;
 id |     name      
----+---------------
  1 | California
  2 | New York
  3 | Massachusetts
(3 rows)

asya=# select * from cities;
 id |     name      | states_id 
----+---------------+-----------
  1 | Los Angeles   |         1
  2 | San Francisco |         1
  3 | San Diego     |         1
  4 | New York      |         2
  5 | Brooklyn      |         2
  6 | Buffalo       |         2
  7 | Boston        |         3
(7 rows)

asya=# select * from cityarea;
 id |        name        | city_id 
----+--------------------+---------
  1 | Beacon Hill        |       7
  2 | Backbay            |       7
  3 | Brookline          |       7
  4 | Park Slope         |       5
  5 | Little Italy       |       4
  6 | SOHO               |       4
  7 | Harlem             |       4
  8 | West Village       |       4
  9 | SoMa               |       2
 10 | South Beach        |       2
 11 | Haight Ashbury     |       2
 12 | Cole Valley        |       2
 13 | Bunker Hill        |       1
 14 | Skid Row           |       1
 15 | Fashion District   |       1
 16 | Financial District |       1
(16 rows)

The easy way with arrays:

SELECT 'db.cities.insert({ state:"' || states.name || '", city: { name: "' || cities.name || '", areas : [ ' || array_to_string(array_agg('"' || cityarea.name || '"'),',') || ']}});'
FROM states JOIN cities ON (states.id=cities.states_id) LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id) GROUP BY states.name, cities.name;

gives you output that can go straight into MongoDB shell:

 db.cities.insert({ state:"California", city: { name: "Los Angeles", areas : [ "Financial District","Fashion District","Skid Row","Bunker Hill"]}});
 db.cities.insert({ state:"California", city: { name: "San Diego", areas : [ ]}});
 db.cities.insert({ state:"California", city: { name: "San Francisco", areas : [ "Haight Ashbury","South Beach","SoMa","Cole Valley"]}});
 db.cities.insert({ state:"Massachusetts", city: { name: "Boston", areas : [ "Beacon Hill","Brookline","Backbay"]}});
 db.cities.insert({ state:"New York", city: { name: "Brooklyn", areas : [ "Park Slope"]}});
 db.cities.insert({ state:"New York", city: { name: "Buffalo", areas : [ ]}});
 db.cities.insert({ state:"New York", city: { name: "New York", areas : [ "Little Italy","West Village","Harlem","SOHO"]}});

The longer way if you don't have support for array or list types is to select joined data:

asya=# SELECT states.name as state, cities.name as city, cityarea.name as area 
FROM states JOIN cities ON (states.id=cities.states_id) 
LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id);
     state     |     city      |        area        
---------------+---------------+--------------------
 California    | Los Angeles   | Financial District
 California    | Los Angeles   | Fashion District
 California    | Los Angeles   | Skid Row
 California    | Los Angeles   | Bunker Hill
 California    | San Francisco | Cole Valley
 California    | San Francisco | Haight Ashbury
 California    | San Francisco | South Beach
 California    | San Francisco | SoMa
 California    | San Diego     | 
 New York      | New York      | West Village
 New York      | New York      | Harlem
 New York      | New York      | SOHO
 New York      | New York      | Little Italy
 New York      | Brooklyn      | Park Slope
 New York      | Buffalo       | 
 Massachusetts | Boston        | Brookline
 Massachusetts | Boston        | Backbay
 Massachusetts | Boston        | Beacon Hill
(18 rows)

I used a left outer join on cityarea because in my sample data I had a city without any areas listed but I wanted to get all state, city pairs even if there was not an area listed for it.

You can dump this out interactively or via a command line (use appropriate syntax for your RDBMS). I'll do it interactively:

asya=# \a
Output format is unaligned.
asya=# \f
Field separator is "|".
asya=# \f ,
Field separator is ",".
asya=# \t
Showing only tuples.
asya=# \o dump.txt                                                                                                                              
asya=# SELECT states.name as state, cities.name as city, cityarea.name as area 
FROM states JOIN cities ON (states.id=cities.states_id) 
LEFT OUTER JOIN cityarea ON (cities.id=cityarea.city_id);
asya=# \q

I now have a comma separated file with state, city and area as the three fields. I can load it into MongoDB via mongoimport utility:

asya$ mongoimport -d sample -c tmpcities --type csv --fields state,city,area < dump.txt 
connected to: 127.0.0.1
2014-08-05T07:41:36.744-0700 check 9 18
2014-08-05T07:41:36.744-0700 imported 18 objects

Now to transform to format I want, I use aggregation:

mongo sample
MongoDB shell version: 2.6.4
connecting to: sample1
> db.tmpcities.aggregate(
{$group:{_id:"$city", state:{$first:"$state"}, areas:{$push:"$area"}}},
{$project:{state:1,_id:0,city:{name:"$_id", areas:"$areas"}}},
{$out:'cities'})
> db.cities.find({},{_id:0})
{ "_id" : "Boston", "state" : "Massachusetts", "areas" : [ "Brookline", "Backbay", "Beacon Hill" ] }
{ "_id" : "New York", "state" : "New York", "areas" : [ "West Village", "Harlem", "SOHO", "Little Italy" ] }
{ "_id" : "Buffalo", "state" : "New York", "areas" : [ "" ] }
{ "_id" : "Brooklyn", "state" : "New York", "areas" : [ "Park Slope" ] }
{ "_id" : "San Diego", "state" : "California", "areas" : [ "" ] }
{ "_id" : "San Francisco", "state" : "California", "areas" : [ "Cole Valley", "Haight Ashbury", "South Beach", "SoMa" ] }
{ "_id" : "Los Angeles", "state" : "California", "areas" : [ "Financial District", "Fashion District", "Skid Row", "Bunker Hill" ] }
Fridlund answered 5/8, 2014 at 15:38 Comment(4)
Thanks you so much Mr. Asya Kamsky for this solution.Margoriemargot
It's "Ms" =) Google her name and you'll see more her blogs and talks.Chiclayo
may i know the disadvantage of using the above process, as in my process i need to include id also. Is it ok to map 'ID' instead of using '_id'Nymphomania
this should be marked as the answer to the questionChelsiechelsy
B
3

Try Mongify. It takes care of all the foreign key and referential integrity constraints which exist in SQL while migrating the data in MongoDB.
As per its documentation:

Mongify helps you move your data without worrying about the IDs or foreign IDs. It allows you to embed data into documents, including polymorphic associations.

Hope it helps.

Botvinnik answered 17/6, 2015 at 15:38 Comment(0)
A
2

New answer to a very old question: You can use MongoDB Relational Migrator for this. Relational Migrator allows you to move data from relational databases, while also transforming from a relational to a document schema. The tool can recommend a MongoDB schema, or you can create your own.

Antoninus answered 26/6, 2023 at 2:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.