How to use mongoimport to import specific fields from a TSV file?
I have a .tsv file as follows,

Name   City   Mobile   Country
A      Hyd    877777   IN
B      Ban    78899    IN

Now, I don't want all the fields to be stored; I need only some of them. I want to import only the Name, City and Mobile fields into MongoDB using mongoimport. I used the command below, but it's not working:

mongoimport --db test --collection persons  --type tsv --file persons.tsv --fields Name,City,Mobile

Final document stored in Mongo DB as follows:

{
    "_id" : ObjectId("55accf948c59222984066646"),
    "Name" : "A",
    "City" : "Hyd",
    "Mobile" : "87777"
}

Could you please help me to solve this?

Execrate answered 20/7, 2015 at 10:55 Comment(0)
T
11

This is not possible with mongoimport alone: it imports every column of the file and has no option to import only selected columns.

To import your tsv file into the database as given above you can use:

mongoimport --db test --collection persons  --type tsv --file persons.tsv --headerline

Explanation

--headerline

If using --type csv or --type tsv, uses the first line as field names. Otherwise, mongoimport will import the first line as a distinct document.

If you attempt to include --headerline when importing JSON data, mongoimport will return an error. --headerline is only for csv or tsv imports.

If your tsv file contains only the data to be imported, without the field names in a header line, you can use the --fields option of mongoimport.

Example: mongoimport --db test --collection persons --type tsv --file persons.tsv --fields Name,City,Mobile,Country

Explanation:

--fields <field1[,field2]>, -f <field1[,field2]>

Specify a comma separated list of field names when importing csv or tsv files that do not have field names in the first (i.e. header) line of the file.

If you attempt to include --fields when importing JSON data, mongoimport will return an error. --fields is only for csv or tsv imports.
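If the file does have a header line, one way to still use a three-name --fields list is to trim the file before mongoimport sees it. The following is only a sketch and not part of the original answer: body_first_three is a hypothetical helper name, it assumes the wanted columns are the first three, and the import line is left commented out because it needs a reachable mongod.

```shell
# Sample data matching the question (tab-separated, with a header row).
printf 'Name\tCity\tMobile\tCountry\nA\tHyd\t877777\tIN\nB\tBan\t78899\tIN\n' > persons.tsv

# body_first_three FILE (hypothetical helper):
# drop the header row (line 1) and keep only the first three columns.
body_first_three() {
  tail -n +2 "$1" | cut -f1-3
}

# Show what would be handed to mongoimport.
body_first_three persons.tsv

# Import step (assumes a local mongod; mongoimport reads standard input
# when no --file is given):
# body_first_three persons.tsv | \
#   mongoimport --db test --collection persons --type tsv --fields Name,City,Mobile
```

Because the header row is removed, the names passed to --fields are the only source of field names, so they must be listed in the same order as the remaining columns.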

Trillion answered 20/7, 2015 at 11:2 Comment(4)
I am not using --headerline, I am using --fields only. It is still not working. - Execrate
I am saying you have to use --headerline, otherwise it will treat the first line as a distinct document and not as a field header. - Trillion
If I use --headerline, it will store all the fields in MongoDB. I don't want all the fields to be stored; I need some specific fields. If possible, could you please share the command? - Execrate
I'm sorry that I misunderstood the question. As I understand it, what you are asking for is not possible, as there are four fields in the tsv file and you only want the data of three of them. Updating the answer for convenience. - Trillion
S
0

The mongoimport utility has no feature of its own for filtering the input in the way you want. This is "by design", as there are other tools that can handle this for you.

Notably, there is the "pipe" | operator, which is supported in Unix shells and the Windows command prompt, to name a few. mongoimport can read from "standard input" instead of a given --file, so it can take "piped" input from another process that does the filtering.

A simple "perl" example (the same approach works in the scripting language of your choice):

perl -pe 'chomp($_); @p = split(/\t/,$_); pop(@p); $_ = join("\t",@p) . "\n";' < persons.tsv

That will "strip" the last field from your source persons.tsv so the output is:

Name    City    Mobile
A       Hyd     877777
B       Ban     78899

Then simply "combine" the statement with the "pipe" | in order to pass that "input" into mongoimport:

perl -pe 'chomp($_); @p = split(/\t/,$_); pop(@p); $_ = join("\t",@p) . "\n";' < persons.tsv | \
mongoimport --db test --collection persons --type tsv  --headerline --ignoreBlanks

Which happily creates the data:

2015-07-21T09:53:40.726+1000    connected to: localhost
2015-07-21T09:53:40.741+1000    imported 2 documents
$ mongo
MongoDB shell version: 3.0.3
connecting to: test
> db.persons.find()
{ "_id" : ObjectId("55ad8a04ee3124750e1600e7"), "Name" : "A", "City" : "Hyd", "Mobile" : 877777 }
{ "_id" : ObjectId("55ad8a04ee3124750e1600e8"), "Name" : "B", "City" : "Ban", "Mobile" : 78899 }
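
The perl one-liner above only drops the last column. When the wanted columns are not a leading run, the same pipe approach can select columns by header name instead. This is a sketch under assumptions: keep_columns is a hypothetical helper name, every requested name is assumed to exist in the header row, and the import line is commented out because it needs a reachable mongod.

```shell
# Sample data matching the question (tab-separated, with a header row).
printf 'Name\tCity\tMobile\tCountry\nA\tHyd\t877777\tIN\nB\tBan\t78899\tIN\n' > persons.tsv

# keep_columns FILE NAME[,NAME...] (hypothetical helper):
# print only the named columns, in the order given, header row included.
keep_columns() {
  awk -F'\t' -v OFS='\t' -v want="$2" '
    NR == 1 {
      # Map each header name to its column position.
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) pos[$i] = i
      for (j = 1; j <= n; j++) idx[j] = pos[names[j]]
    }
    {
      # Rebuild every line (header too) from the selected positions.
      line = ""
      for (j = 1; j <= n; j++) line = line (j > 1 ? OFS : "") $(idx[j])
      print line
    }
  ' "$1"
}

# Show what would be handed to mongoimport.
keep_columns persons.tsv Name,Mobile

# Import step (assumes a local mongod):
# keep_columns persons.tsv Name,City,Mobile | \
#   mongoimport --db test --collection persons --type tsv --headerline --ignoreBlanks
```

Since the header row is kept in the output, --headerline still supplies the field names, exactly as in the perl pipeline.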
Steamy answered 20/7, 2015 at 23:59 Comment(0)
F
0

Mongoimport full example:

mongoimport --port 7812 -u "turkeyUserAdmin" -p "Turkey@DB^&*" --authenticationDatabase "admin" --db "USA" --collection B2B --type tsv --fields Database_Individual_ID.string(),Name.string(),Company.string() --columnsHaveTypes --file F:/JAYBk/Project/B2B/Data/USA_B2B_DATA.tsv
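
Applied to the persons.tsv file from the question, the typed --fields form above could be built as follows. This is a sketch under assumptions: typed_fields is a hypothetical helper name, every column is treated as string() (which would keep Mobile as a string rather than a number), and --columnsHaveTypes is, to my knowledge, only available in mongoimport 3.4 and later. The import line is commented out because it needs a reachable mongod.

```shell
# typed_fields NAME[,NAME...] (hypothetical helper):
# append ".string()" to every name in a comma-separated list.
typed_fields() {
  printf '%s\n' "$1" | sed 's/,/.string(),/g; s/$/.string()/'
}

# Show the generated --fields value.
typed_fields Name,City,Mobile

# Import step (assumes a local mongod). The header row is dropped and the
# Country column is cut off first, because the type specification comes
# from --fields rather than from the file:
# tail -n +2 persons.tsv | cut -f1-3 | \
#   mongoimport --db test --collection persons --type tsv \
#     --fields "$(typed_fields Name,City,Mobile)" --columnsHaveTypes
```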

Fillian answered 21/2, 2018 at 10:7 Comment(0)
