I downloaded a bunch of data from Nature in the form of N-Quads, .nq files. These contain RDF graphs, right? How do I access this data, and how can I translate RDF graphs into a more usable format (preferably something like Boost or igraph for R/C++/Python)?
N-Quad graphs - how do I use them?
The typical workflow is something like this:
- Import the N-Quads dump into a SPARQL-capable triple store like OpenLink Virtuoso or Apache Jena Fuseki
- Write SPARQL queries that extract the data you need
- Transform the SPARQL results, which you can get in a simple XML or JSON (or CSV, depending on the store) format, into whatever format you need
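The last step can be sketched roughly as follows, assuming the store returns the standard SPARQL JSON results format and that the query selected variables named `s`, `p` and `o`. Those variable names, the example IRIs and the helper names `bindings_to_edges` and `edges_to_csv` are made up for illustration. Rows whose object is a literal are dropped, since they describe node attributes rather than links.

```python
import csv
import io
import json

# Sample response in the SPARQL JSON results format; the IRIs are invented.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "p": {"type": "uri", "value": "http://example.org/cites"},
     "o": {"type": "uri", "value": "http://example.org/b"}},
    {"s": {"type": "uri", "value": "http://example.org/a"},
     "p": {"type": "uri", "value": "http://example.org/title"},
     "o": {"type": "literal", "value": "Some title"}}
  ]}
}
"""


def bindings_to_edges(results, source="s", label="p", target="o"):
    """Return (source, label, target) tuples for rows whose target is a node."""
    edges = []
    for row in results["results"]["bindings"]:
        # A variable can be unbound in a row; skip incomplete rows.
        if not all(name in row for name in (source, label, target)):
            continue
        # Keep only IRIs and blank nodes as link targets, not literals.
        if row[target]["type"] not in ("uri", "bnode"):
            continue
        edges.append((row[source]["value"], row[label]["value"], row[target]["value"]))
    return edges


def edges_to_csv(edges):
    """Serialise the edge list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "label", "target"])
    writer.writerows(edges)
    return buffer.getvalue()


edges = bindings_to_edges(json.loads(SAMPLE))
print(edges_to_csv(edges))
```

An edge list like this is a common input format for graph libraries, so it should be a reasonable starting point for getting the data into something like igraph.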
Alternatively, send the SPARQL queries directly from your application and process the results there. There are SPARQL client libraries for most languages, but even if you don't have one, it's a fairly simple matter of percent-encoding the query and constructing a query URL.
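As a minimal sketch of the no-library route, using only the Python standard library: the endpoint URL below is hypothetical (it depends entirely on how your store is set up), and `build_query_url` and `run_query` are illustrative names, not part of any SPARQL client.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint; replace with the query URL of your own store.
ENDPOINT = "http://localhost:3030/dataset/sparql"

QUERY = "SELECT * WHERE { ?s ?p ?o }"


def build_query_url(endpoint, query):
    """Percent-encode the query and append it as the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query}, quote_via=quote)


def run_query(endpoint, query):
    """Send the query over HTTP GET and return the parsed JSON results."""
    request = Request(
        build_query_url(endpoint, query),
        # Ask the store for the SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        return json.load(response)


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

`run_query` is only defined here, not called, because it needs a running endpoint. Very long queries may exceed URL length limits, in which case sending the query in a POST body is the usual alternative.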
OK, thanks. Is there a tutorial you know of that I could try? Also, is the general (abstract) strategy to run through the N-Quads text document and add links to the output graph based on the subject-predicate-object syntax? –
Toiletry
In general, yes. But N-Quads has a fourth field for each “triple”, called “graph”. Different N-Quads files will use this for different purposes, often to name a context or source for the particular triple. It's probably a good idea to first figure out what the fourth field is used for in the given file, either by inspecting the file or looking for documentation from the publisher. Depending on what you want to use the data for, the fourth field may or may not be important. –
Overnice
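To illustrate the two comments above, here is a rough sketch that walks an N-Quads document line by line, counts how the fourth field is used, and collects subject-predicate-object links. The regular expression is a simplification that assumes well-formed, one-statement-per-line input, and the sample data and function names are invented. For real data, a full parser such as the one in rdflib is the safer choice.

```python
import re
from collections import Counter

# One RDF term: an IRI, a blank node, or a literal with optional datatype/language tag.
TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
# subject predicate object [graph] .
LINE = re.compile(
    r"^\s*" + TERM + r"\s+" + TERM + r"\s+" + TERM + r"(?:\s+" + TERM + r")?\s*\.\s*$"
)

# Invented sample data; the last statement has no graph field.
SAMPLE = '''
<http://example.org/a> <http://example.org/cites> <http://example.org/b> <http://example.org/graph/1> .
<http://example.org/a> <http://example.org/title> "Some title"@en <http://example.org/graph/1> .
<http://example.org/b> <http://example.org/cites> <http://example.org/c> <http://example.org/graph/2> .
<http://example.org/c> <http://example.org/cites> <http://example.org/a> .
'''


def parse_nquads(text):
    """Yield (subject, predicate, object, graph) tuples; graph is None if absent."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = LINE.match(stripped)
        if match is None:
            raise ValueError("Unrecognised N-Quads line: " + line)
        yield match.groups()


quads = list(parse_nquads(SAMPLE))

# First look at what the fourth field contains before deciding whether it matters.
graph_counts = Counter(graph for _, _, _, graph in quads)

# Then keep only statements that link two nodes (the object is not a literal).
edges = [(s, p, o) for s, p, o, _ in quads if not o.startswith('"')]

print(graph_counts)
print(len(edges), "edges")
```

Terms are kept in their raw N-Quads form, including the angle brackets around IRIs, so strip those if the target graph library needs plain identifiers.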