text-segmentation Questions
7
I'd like to remove the first word from a string using PHP.
I tried searching but couldn't find an answer that I could make sense of.
E.g. "White Tank Top" should become "Tank Top".
...
Salesgirl asked 25/7, 2011 at 22:26
16
Solved
I'm trying to convert a string to a list of words using Python. I want to take something like the following:
string = 'This is a string, with words!'
Then convert it to something like this:
list ...
Marxist asked 31/5, 2011 at 0:9
10
Solved
How do I split a sentence and store each word in a list? e.g.
"these are words" ⟶ ["these", "are", "words"]
To split on other delimiters, see Split a strin...
Heterochromous asked 13/4, 2009 at 12:48
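For the whitespace-delimited case in the question above, a minimal Python sketch may help. It assumes words are separated only by whitespace; the variable names are illustrative.

```python
sentence = "these are words"

# str.split() with no arguments splits on runs of whitespace
# and ignores leading and trailing whitespace.
words = sentence.split()

print(words)  # ['these', 'are', 'words']
```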
17
Solved
I want to extract the first word of a variable from a string. For example, take this input:
<?php $myvalue = 'Test me more'; ?>
The resultant output should be Test, which is the first word...
Capablanca asked 19/3, 2010 at 11:26
2
I have a large repository of documents in PDF format. The documents come from different sources, and have no one single style. I use Tika to extract the text from the documents, and now I'd like to...
Discomposure asked 23/1, 2017 at 8:16
6
Solved
How can I break a document (e.g., a paragraph, a book, etc.) into sentences?
For example, "The dog ran. The cat jumped" into ["The dog ran", "The cat jumped"] with spaCy?
Kileykilgore asked 19/9, 2017 at 1:14
12
Solved
How can I replace a particular line of text in a file using PHP?
I don't know the line number. I want to replace a line containing a particular word.
Republican asked 9/6, 2010 at 8:9
5
How do you go about parsing an HTML page with free text, lists, tables, headings, etc., into sentences?
Take this wikipedia page for example. There is/are:
free text: http://en.wikipedia.org/wik...
Anecdotage asked 30/6, 2012 at 20:20
10
Solved
What's the best way to slice the last word from a block of text?
I can think of
Split it to a list (by spaces) and removing the last item, then reconcatenating the list.
Use a regular expressi...
Zachery asked 7/6, 2011 at 14:26
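The first approach mentioned in the question above (split on spaces, drop the last item, rejoin) could be sketched in Python as follows. The question does not name a language, so Python is an assumption here, and `drop_last_word` is a hypothetical helper name.

```python
def drop_last_word(text: str) -> str:
    """Return text without its last whitespace-delimited word."""
    # Split on runs of whitespace, drop the final item, then rejoin.
    words = text.split()
    return " ".join(words[:-1])


print(drop_last_word("White Tank Top"))  # White Tank
```

Note that rejoining with a single space collapses any repeated whitespace in the original text.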
12
Solved
I have been trying to get my EditText box to word wrap, but can't seem to do it.
I have dealt with much more complicated issues while developing Android applications, and this seems like it should...
Comfort asked 18/7, 2010 at 16:50
1
Solved
What is the difference between tokenization and segmentation in NLP? I searched for them but didn't really find any differences.
Infamous asked 20/11, 2021 at 17:32
7
Solved
I am trying to extract all the sentence containing a specified word from a text.
txt="I like to eat apple. Me too. Let's go buy some apples."
txt = "." + txt
re.findall(r"\."+".+"+"apple"+".+"+"\...
Sajovich asked 16/4, 2013 at 9:3
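One possible way to approach the question above is to split the text into sentences first and then filter them. This is only a sketch: it assumes a naive sentence boundary (".", "!" or "?" followed by whitespace) and a plain substring match, and `sentences_containing` is a hypothetical helper name.

```python
import re


def sentences_containing(text: str, word: str) -> list[str]:
    # Naive split: break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Substring match, so "apple" also matches "apples".
    return [s for s in sentences if word in s]


txt = "I like to eat apple. Me too. Let's go buy some apples."
print(sentences_containing(txt, "apple"))
# ['I like to eat apple.', "Let's go buy some apples."]
```

Abbreviations such as "Fr." or "e.g." would be split incorrectly by this rule; a dedicated sentence segmenter handles those cases better.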
1
Solved
Works:
#!/usr/bin/env python3
from uniseg.graphemecluster import grapheme_clusters
def albanian_digraph_dh(s, breakables):
for i, breakable in enumerate(breakables):
if s.endswith('d', 0, i) and...
Kiva asked 23/8, 2019 at 8:17
7
Solved
I have extracted a list of sentences from a document. I am pre-processing this list of sentences to make it more sensible. I am facing the following problem:
I have sentences such as "more re...
Crossness asked 30/10, 2013 at 6:14
7
Solved
I am trying to write a function to clean up user input.
I am not trying to make it perfect. I would rather have a few names and acronyms in lowercase than a full paragraph in uppercase.
I think t...
Mur asked 21/3, 2011 at 20:46
6
Solved
I have an array of strings, of different lengths and contents.
Now I'm looking for an easy way to extract the last word from each string, without knowing how long that word is or how long the stri...
Cotyledon asked 2/3, 2012 at 13:40
2
Solved
Is anyone aware of any JavaScript implementations of UAX #29, Unicode Text Segmentation? I'm specifically interested in Word Boundaries.
I was hopeful when I came across XRegExp, but it seem...
Finfoot asked 5/5, 2014 at 10:18
5
I am looking for a regex that matches first word in a sentence excluding punctuation and white space. For example: "This" in "This is a sentence." and "First" in "First, I would like to say \"Hello...
Lame asked 8/2, 2013 at 6:38
1
Solved
Given the paragraph from Wikipedia:
An ambitious campus expansion plan was proposed by Fr. Vernon F.
Gallagher in 1952. Assumption Hall, the first student dormitory, was
opened in 1954, and Rockwe...
Bleareyed asked 13/11, 2017 at 22:21
1
Solved
I'm trying to build a handwriting recognition system using Python and OpenCV.
Recognizing the characters is not the problem; the segmentation is.
I have successfully:
segmented a word int...
Foredo asked 18/9, 2017 at 15:9
5
Solved
I remember skimming the sentence segmentation section from the NLTK site a long time ago.
I use a crude text replacement of “period” “space” with “period” “manual line break” to achieve sentence ...
Collum asked 25/5, 2014 at 20:57
3
Please have a look at the following.
String[] sentenceHolder = titleAndBodyContainer.split("\n|\\.(?!\\d)|(?<!\\d)\\.");
This is how I tried to split a paragraph into sentences. But, there is ...
Large asked 29/1, 2014 at 11:57
6
I need to find a dynamic programming algorithm to solve this problem. I tried but couldn't figure it out. Here is the problem:
You are given a string of n characters s[1...n], which you believe to...
Dissolution asked 15/3, 2011 at 11:2
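The problem statement above is truncated, so the following is only a sketch under an assumption: that the task is to decide whether a string can be split into a sequence of valid dictionary words. The `dictionary` set and the name `can_segment` are hypothetical stand-ins for whatever lookup the original problem provides.

```python
def can_segment(s: str, dictionary: set[str]) -> bool:
    n = len(s)
    # reachable[i] is True when the prefix s[:i] can be split
    # into a sequence of dictionary words.
    reachable = [False] * (n + 1)
    reachable[0] = True  # the empty prefix is trivially segmentable
    for i in range(1, n + 1):
        for j in range(i):
            # s[:i] is segmentable if some shorter prefix s[:j] is
            # and the remaining piece s[j:i] is a word.
            if reachable[j] and s[j:i] in dictionary:
                reachable[i] = True
                break
    return reachable[n]


words = {"it", "was", "the", "best", "of", "times"}
print(can_segment("itwasthebestoftimes", words))  # True
print(can_segment("itwasthebestoftimez", words))  # False
```

The table has n + 1 entries and each is filled by scanning all shorter prefixes, so the sketch performs a quadratic number of dictionary lookups in the length of the string.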
2
So first off I'm very new to Python so if I'm doing something awful I'm prefacing this post with a sorry. I've been assigned this problem:
We want to devise a dynamic programming solution to the f...
Lovmilla asked 5/3, 2014 at 9:55
2
Solved
I am new to the NLP domain, but my current research needs some text parsing (also called keyword extraction) of URL addresses, e.g. a fake URL,
http://ads.goole.com/appid/heads
Two constraints are...
Winograd asked 20/12, 2013 at 3:31