A late response, but for future visitors like me who land here after a long search:
Use an OpenNLP model. It was the best option in my case, and it worked with all the text samples here, including the crucial one mentioned by @nbz in the comments:
"My friend, Mr. Jones, has a new dog. This is a test. This is a T.L.A. test. Now with a Dr. in it."
The output, one sentence per line:
My friend, Mr. Jones, has a new dog.
This is a test.
This is a T.L.A. test.
Now with a Dr. in it.
You need to import the .jar libraries into your project, as well as the trained model en-sent.bin.
This tutorial gets you to a working setup quickly:
https://www.tutorialkart.com/opennlp/sentence-detection-example-in-opennlp/
And one for setting it up in Eclipse:
https://www.tutorialkart.com/opennlp/how-to-setup-opennlp-java-project/
This is what the code looks like:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.util.InvalidFormatException;

/**
 * Sentence Detection Example in openNLP using Java
 * @author tutorialkart
 */
public class SentenceDetectExample {

    public static void main(String[] args) {
        try {
            new SentenceDetectExample().sentenceDetect();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * This method is used to detect sentences in a paragraph/string
     * @throws InvalidFormatException
     * @throws IOException
     */
    public void sentenceDetect() throws InvalidFormatException, IOException {
        String paragraph = "This is a statement. This is another statement. Now is an abstract word for time, that is always flying.";

        // refer to the model file "en-sent.bin", available at http://opennlp.sourceforge.net/models-1.5/
        // try-with-resources closes the stream even if loading the model fails
        try (InputStream is = new FileInputStream("en-sent.bin")) {
            SentenceModel model = new SentenceModel(is);

            // feed the model to the SentenceDetectorME class
            SentenceDetectorME sdetector = new SentenceDetectorME(model);

            // detect sentences in the paragraph
            String[] sentences = sdetector.sentDetect(paragraph);

            // print the detected sentences to the console
            for (String sentence : sentences) {
                System.out.println(sentence);
            }
        }
    }
}
Since the libraries are bundled into your project, it also works offline, which is a big plus. As the accepted answer by @Julien Silland says, sentence splitting is not a straightforward process, and having a trained model do it for you is the best option.
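To illustrate why this is not straightforward, here is a minimal sketch of the obvious alternative: breaking after every period that is followed by whitespace. It uses only the JDK. The class and method names (NaiveSentenceSplit, naiveSplit) are made up for this illustration and are not part of OpenNLP.

```java
import java.util.Arrays;
import java.util.List;

public class NaiveSentenceSplit {

    // Hypothetical helper: split on whitespace that directly follows a period.
    // The lookbehind keeps the period attached to the preceding piece.
    static List<String> naiveSplit(String text) {
        return Arrays.asList(text.split("(?<=\\.)\\s+"));
    }

    public static void main(String[] args) {
        String text = "My friend, Mr. Jones, has a new dog. This is a test. "
                + "This is a T.L.A. test. Now with a Dr. in it.";

        // Print each piece on its own line to compare with the model's output
        for (String piece : naiveSplit(text)) {
            System.out.println(piece);
        }
    }
}
```

On the sample text above this yields seven pieces instead of four, because it also breaks after "Mr.", "T.L.A." and "Dr.". Those abbreviation cases are exactly what the trained model handles.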