How to define person's names in text (Java)
S

10

5

I have some input text that contains one or more person names. I do not have a dictionary of these names. Which Java library can help me define names from my input text? I looked through OpenNLP, but did not find any example, guide, or even a description of how it can be applied in my code. (I saw the Javadoc, but it is pretty poor documentation for such a project.)

I want to find names in arbitrary text. If the input text is "My friend Joe Smith went to the store.", then I want to get "Joe Smith". I think there should be some large enough dictionaries or smart engines, based on smaller dictionaries, that can understand human names.

Snippet answered 9/12, 2009 at 18:14 Comment(3)
Are you trying to identify, for example, a name which appears in a sentence? i.e., given "My friend Joe went to the store." you want "Joe"? - Pu
Please clarify your question! You can't "define" names, they were created historically. "Julius", for example, is a Roman name. What does the text look like, and what kind of processing do you want to do with it? - Hypersonic
I think the author wants to extract the names of people from unstructured text using a Java library, possibly OpenNLP, but can't find an example or good documentation on how to achieve this. - Springclean
K
4

I'd look into LingPipe. Check out this demo. By the way, what you are trying to do is called "named entity recognition". It's a difficult CS problem to get right.

Klinger answered 9/12, 2009 at 18:20 Comment(0)
M
3

OpenNLP has named entity recognition. Check the section "English Name Finding" in the docs. In my experience, though, it identifies entities but does not reliably tag them. (To be precise, I found the tags to be ambiguously assigned.) So, if you have the sentence "My friend Joe Smith went to the Walmart store", OpenNLP identifies two named entities, "Joe Smith" and "Walmart", but I couldn't get it to tag "Joe Smith" as Person and "Walmart" as Organization.

As suggested by Matt, you can try LingPipe, though it's a commercial tool. Some of the open source alternatives are MorphAdorner and Stanford NER.

Mercuric answered 11/12, 2009 at 3:26 Comment(0)
H
2

While we're waiting for details on what you're doing, here are a couple of links to lists of common first names, at least in the USA demographic:

I think you'll need these (and/or more) to check against, as your task doesn't sound like something an NLP tool can do for you without reference information.
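
A minimal sketch of what "checking against" such a list could look like, using only the Java standard library. The class name `GazetteerNameFinder`, the method `findNames`, and the four-entry first-name set are all hypothetical and for illustration only; a real list would be loaded from one of the name files. The heuristic assumed here is: a token found in the first-name list starts a name, and any capitalized tokens that follow in the same sentence are treated as the rest of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GazetteerNameFinder {
    // Hypothetical, tiny first-name list; a real one would be read from a file.
    private static final Set<String> FIRST_NAMES =
            new HashSet<>(Arrays.asList("Joe", "John", "Mary", "Anna"));

    static List<String> findNames(String text) {
        String[] raw = text.trim().split("\\s+");
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < raw.length) {
            String word = stripPunctuation(raw[i]);
            if (!FIRST_NAMES.contains(word)) {
                i++;
                continue;
            }
            StringBuilder name = new StringBuilder(word);
            // Trailing punctuation (e.g. "Joe.") ends the name at this token.
            boolean boundary = endsWithPunctuation(raw[i]);
            int j = i + 1;
            while (!boundary && j < raw.length) {
                String next = stripPunctuation(raw[j]);
                if (!isCapitalized(next)) {
                    break;
                }
                name.append(' ').append(next);
                boundary = endsWithPunctuation(raw[j]);
                j++;
            }
            names.add(name.toString());
            i = j; // continue after the tokens consumed by this name
        }
        return names;
    }

    // Removes leading and trailing non-letter characters from a token.
    private static String stripPunctuation(String token) {
        return token.replaceAll("^[^\\p{L}]+|[^\\p{L}]+$", "");
    }

    private static boolean endsWithPunctuation(String token) {
        return !token.isEmpty() && !Character.isLetter(token.charAt(token.length() - 1));
    }

    private static boolean isCapitalized(String token) {
        return !token.isEmpty() && Character.isUpperCase(token.charAt(0));
    }

    public static void main(String[] args) {
        // prints [Joe Smith]
        System.out.println(findNames("My friend Joe Smith went to the store."));
    }
}
```

This only finds names whose first token is in the list, and it will be confused by capitalized words that happen to follow a first name, so it is a rough baseline rather than a replacement for a trained NER model.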

Hypersonic answered 9/12, 2009 at 18:21 Comment(0)
B
1

You can check Person extraction from free text here http://code.google.com/p/graph-expression/wiki/Examples

Brindle answered 20/5, 2011 at 5:24 Comment(0)
R
1

OpenNLP has a person type in its NER models. Download the project from the OpenNLP web site, and get the models from the models website (there is a link on the OpenNLP page). Then go to http://www.asksunny.com/drupal/?q=node/4; it is a good example of how to load the models and perform NER. NER is never perfect, so don't be disappointed.

Rothko answered 14/9, 2011 at 2:27 Comment(0)
G
1

I would suggest using the Stanford Named Entity Recognizer (NER). Stanford NER provides many classifiers. One of them can identify names, locations, and organizations in a given text.

You can find an online demo of Stanford NER at http://nlp.stanford.edu:8080/ner/

Gemology answered 29/1, 2014 at 14:21 Comment(0)
F
0

You can also look at the OpenCyc and WordNet projects, which are more interesting from a semantic point of view.

Floc answered 10/12, 2009 at 17:15 Comment(0)
E
0

This problem is known as named entity recognition in natural language processing, and at the moment it is considered a fairly hard problem. However, there are many tools you can use for it. I have used Stanford NER for this, and it is good software.

Exclusion answered 20/7, 2012 at 10:52 Comment(0)
M
0

OpenCalais service may be useful. Try their submission tool at: http://www.opencalais.com/documentation/calais-submission-tool

This tool recognizes much more than just person names.

Maximo answered 8/2, 2014 at 0:53 Comment(0)
U
0

Try Stanford NER, a text processing library

http://nlp.stanford.edu:8080/ner/

Ultrasonic answered 18/7, 2014 at 5:7 Comment(0)
