I was trying to import the Freebase RDF dump into Google Refine but got an error. How can I now extract topic names with their notable types from the 18 GB RDF file into CSV or a similar format? Is there a GUI tool for this?
Getting error while importing RDF [closed]
What error are you getting? Why does it have to be a GUI tool? If all you want is notable type & name, I'd have thought a simple one line grep command would do it for you. –
Ange
It won't import into Google Refine (the *.gz file is 18 GB; uncompressed it is 146 GB). But what command should I type, and where? I'm not a Linux user. –
Voigt
A one-line grep command? –
Voigt
146 GB is too big for OpenRefine (ex-Google Refine) to handle. If there is a GUI tool that will do this out of the box, I'm not familiar with it, but since this is a programming Q & A site, I'll give a shell programming solution. You don't need to know anything about Linux, but you do need to know how to use Unix shell commands (you could use Cygwin on Windows).
curl -L http://download.freebaseapps.com | gunzip | egrep 'notable_for|notable_type|rdfs:label'
will give you all the raw data that you need to assemble the solution. The lines with the key information look like this, but if you just want labels/names, you'll need to substitute them for the subject/object IDs in the first and last columns.
ns:m.01nsxs2 ns:common.topic.notable_types ns:m.0kpv17.
I ran the command you provided, but how do I get clear text with topic name and notable type (e.g. Gmail: Software) in CSV? Currently it gives:
ns:g.1254yxnny ns:common.notable_for.display_name "Zeneszám"@hu.
ns:g.1254yxnny ns:common.notable_for.display_name "Utwór muzyczny"@pl.
ns:g.1254yxnny ns:common.notable_for.display_name "Nummer (muziek)"@nl.
ns:g.1254yxnny ns:common.notable_for.display_name "संगीत ट्रैक"@hi.
–
Voigt