How to convert Text file/document into RDF triples?

I want to create RDF triples from a text file or document as input. In other words, I am looking for a tool, something like Protégé-OWL, that will convert a text file into RDF triples.

What are the steps to do this, and which tools are required?

Any assistance is appreciated.

Thanks, Babu

Outrush answered 9/1, 2014 at 5:35 Comment(0)

You should give an example of your ontology as it appears in the text. I assume your input is not already in RDF/XML format. If your data is saved in a custom text file format, you probably will not find a tool that can do this conversion. This is entirely expected: no tool can guess the structure of a document unless the document follows a known format. Therefore, I think you will have to write a custom converter.

You could write the converter in any language you like, since the output can simply be an RDF/XML document describing the ontology. An RDF/XML document is an ordinary XML file, so all you need is an XML library to build and write it. You could then import the RDF/XML document into Protégé and do whatever you want with it. Since your programming language is Java, you could use JAXP or any other XML library.
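As a rough illustration of the JAXP route, here is a minimal sketch that builds a one-triple RDF/XML document with the DOM API and serializes it to a string. The class name `RdfXmlSketch`, the method `toRdfXml`, and the `http://example.org/resume#` namespace are assumptions made up for this example; only the `rdf:` namespace URI is standard.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RdfXmlSketch {
    // Standard RDF namespace.
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    // Hypothetical vocabulary namespace, used only for illustration.
    static final String EX_NS = "http://example.org/resume#";
    // Namespace reserved for xmlns declarations in a namespace-aware DOM.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Builds an RDF/XML document holding one triple: subject, ex:property, literal. */
    static String toRdfXml(String subjectUri, String propertyLocalName, String literal) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // <rdf:RDF xmlns:rdf="..." xmlns:ex="...">
            Element root = doc.createElementNS(RDF_NS, "rdf:RDF");
            root.setAttributeNS(XMLNS_NS, "xmlns:rdf", RDF_NS);
            root.setAttributeNS(XMLNS_NS, "xmlns:ex", EX_NS);
            doc.appendChild(root);

            // <rdf:Description rdf:about="subjectUri">
            Element description = doc.createElementNS(RDF_NS, "rdf:Description");
            description.setAttributeNS(RDF_NS, "rdf:about", subjectUri);
            root.appendChild(description);

            // <ex:property>literal</ex:property>
            Element property = doc.createElementNS(EX_NS, "ex:" + propertyLocalName);
            property.setTextContent(literal);
            description.appendChild(property);

            // Serialize the DOM tree with the identity transformer.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build RDF/XML", e);
        }
    }
}
```

Calling `RdfXmlSketch.toRdfXml("http://example.org/resume#babu", "name", "Babu")` returns the document as a string, which you could write to a file and then load into Protégé. A real converter would add one `rdf:Description` per subject and one child element per property found in the text.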

An alternative would be to use Apache Jena, a Java API for handling ontologies (including RDF models), which also lets you process the ontology model once it is created. I believe Jena is the better way to do it, if you are familiar with it.

Either way, I don't believe there is a tool that will do this for you. You have to parse the source text the hard way. No tool can identify which part of the source text is meant to declare an RDF class or a property in a custom text format. Your job might be easier with a text parsing library like FFP, but you would still have to write the parsing logic yourself.
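To make "doing the parsing yourself" more concrete, here is a minimal sketch that assumes the input is a simple `Key: value` text format, one field per line, and emits one N-Triples line per field. The format itself, the class name `TextToTriplesSketch`, and the `http://example.org/resume#` namespace are all assumptions for illustration; your real input will need its own parsing rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TextToTriplesSketch {
    // Hypothetical vocabulary namespace, used only for illustration.
    static final String EX_NS = "http://example.org/resume#";

    /** Escapes backslashes and double quotes so the value is a valid N-Triples literal. */
    static String escapeLiteral(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /**
     * Turns every "Key: value" line of the text into one N-Triples statement
     * about the given subject. Lines without a key and a value are skipped.
     */
    static List<String> toNTriples(String subjectUri, String text) {
        List<String> triples = new ArrayList<>();
        for (String line : text.split("\\R")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a "Key: value" line
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (key.isEmpty() || value.isEmpty()) {
                continue;
            }
            // Derive a property name from the key: lower-case, other characters become "_".
            String localName = key.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "_");
            triples.add("<" + subjectUri + "> <" + EX_NS + localName + "> \""
                    + escapeLiteral(value) + "\" .");
        }
        return triples;
    }
}
```

For example, `TextToTriplesSketch.toNTriples("http://example.org/resume#babu", "Name: Babu\nSkill: Java")` yields two statements, one for `name` and one for `skill`. The decision about which piece of text becomes a subject, a property, or a value is exactly the part no generic tool can make for you.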

Hope I helped!

Mellen answered 9/1, 2014 at 6:34 Comment(5)
Thanks Pantelis. I am using Java to create RDF triples, and I know a little XML as well. My requirement is this: the input is some resume, wiki page, or doc file, and the result should be RDF triples. You mentioned we can do it with JAXP or any other XML library; please let me know which tool or editor (Protégé or any other) I can use for this, and please provide some sample code. Thanks, Babu - Outrush
Check my edited answer. If you found my answer helpful, please upvote it. This helps other readers who come to your question identify a helpful answer. - Mellen
Pantelis' answer is correct: Jena is a good library to produce RDF with, but parsing the input requires a better specification than the one user2664196 has been provided with. "A generic file containing text" does not provide enough detail for any tool to reliably parse the file. I suspect that part will require some bespoke code. - Evieevil
Have there been any updates to the ecosystem of RDF production tools where raw text files can be digested to produce RDF triples? - Romain
@Vass: By "raw text files", do you mean free-text, unstructured data? If so, your question is too difficult to answer in general. If you mean tabular data stored as simple CSV files, then things are a lot simpler. - Mellen
