text-parsing Questions

2

Solved

I want to parse strings similar to the following into separate variables using regular expressions from within Bash: Category: entity;scheme="http://schemas.ogf.org/occi/core#";class="kind";title=...
Elias asked 3/1, 2012 at 21:26

3

Solved

I have this simple input: I have {red;green;orange} fruit and cup of {tea;coffee;juice}. I use Perl to identify patterns between two external brace delimiters { and }, and randomize the fields ins...
Wack asked 24/12, 2015 at 13:2

1

Solved

I have this simple example of chunking in nltk. My data: data = 'The little yellow dog will then walk to the Starbucks, where he will introduce them to Michael.' ...pre-processing ... data_tok...
Brietta asked 18/4, 2016 at 15:50

6

Solved

How do I get the first column of every line in an input CSV file and output it to a new file? I am thinking of using awk but am not sure how.
Rudelson asked 26/7, 2012 at 11:47
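
For readers who are not tied to awk, here is a minimal Python sketch of the same task. It swaps awk for the standard `csv` module so that quoted fields containing commas stay intact. The function names and the sample data are made up for illustration and are not part of the question.

```python
import csv
import io


def first_column(rows):
    """Yield the first field of each non-empty CSV row."""
    for row in rows:
        if row:  # csv.reader returns [] for blank lines; skip them
            yield row[0]


def write_first_column(src, dst):
    """Copy column one of the CSV stream `src` to `dst`, one value per line."""
    for value in first_column(csv.reader(src)):
        dst.write(value + "\n")


# Hypothetical sample data; a real script would open the input and output files.
sample = io.StringIO('id,name\n1,"Smith, J"\n2,Lee\n')
out = io.StringIO()
write_first_column(sample, out)
print(out.getvalue(), end="")
```

With real files, pass objects opened with `open(path, newline="")` in place of the `io.StringIO` streams.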

4

Solved

Need to read the txt file in https://raw.githubusercontent.com/fonnesbeck/Bios6301/master/datasets/addr.txt and convert them into a data frame R with column number as: LastName, FirstName, street...
Prescind asked 28/10, 2015 at 5:6

5

Solved

I have a file.txt: fhadja ksjfskdasd adasda sada s adasaaa. I need to extract only the words that are 6 characters long. Example of what I need to obtain as a result: fhadja adasda ...
Baculiform asked 30/9, 2015 at 19:49
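
One possible approach, sketched in Python, assuming the words are separated by whitespace (spaces or newlines). The excerpt does not name a tool, so the language and the helper name `words_of_length` are assumptions.

```python
def words_of_length(text, n=6):
    """Return the whitespace-separated words in `text` that are exactly n characters long."""
    return [word for word in text.split() if len(word) == n]


# The sample words from the question.
sample = "fhadja ksjfskdasd adasda sada s adasaaa"
print(" ".join(words_of_length(sample)))
```

To run it on the file itself, read the text with `open("file.txt").read()` and pass it to the function.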

3

Solved

Given an input sentence that has BIO chunk tags: [('What', 'B-NP'), ('is', 'B-VP'), ('the', 'B-NP'), ('airspeed', 'I-NP'), ('of', 'B-PP'), ('an', 'B-NP'), ('unladen', 'I-NP'), ('swallow', 'I-N...
Leifeste asked 1/9, 2015 at 13:45

2

Solved

I wanted to reimplement some of my ASCII parsers in Haskell since I thought I could gain some speed. However, even a simple "grep and count" is much slower than a sloppy Python implementation. Can...
Curiel asked 16/7, 2015 at 7:52

5

Solved

The value also contains some letters. I have searched through many questions but couldn't find an answer. I have a string like this: Ab2cds value=284t810 shfn4wksn value=39h1047 hs2krj8dne value=3700p13...
Antonio asked 8/7, 2015 at 9:52

8

Solved

I'm trying to normalize/expand/hydrate/translate a string of numbers as well as hyphen-separated numbers (as range expressions) so that it becomes an array of integer values. Sample input: $a...
Michaella asked 2/7, 2015 at 9:9
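
The sample input is cut off in the excerpt, so the sketch below assumes a comma-separated format such as `1,3-5,8`. That format, the Python language, and the name `expand_ranges` are all assumptions. It only illustrates the general idea of expanding hyphenated ranges into integers.

```python
def expand_ranges(spec, sep=","):
    """Expand a string like '1,3-5,8' into a list of integers.

    Assumes non-negative integers; a leading minus sign is not supported
    because '-' is treated as the range separator.
    """
    values = []
    for part in spec.split(sep):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing separator
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            step = 1 if end >= start else -1  # allow descending ranges like 5-3
            values.extend(range(start, end + step, step))
        else:
            values.append(int(part))
    return values


print(expand_ranges("1,3-5,8"))
```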

2

Solved

Using C# regex to match and return data parsed from a string is returning unreliable results. The pattern I am using is as follows : Regex r=new Regex( @"(.*?)S?(\d{1,2})E?(\d{1,2})(.*)|(.*?)S?...
Librium asked 23/5, 2015 at 9:40

4

Has anybody attempted to extract text from a PDF using an OCR library and Java? What did you find to be the most reliable library for text extraction? Most of the approaches I've seen (tesser...
Balls asked 22/4, 2009 at 16:38

7

I'm trying to break up a paragraph into sentences. Here is my code so far: import java.util.*; public class StringSplit { public static void main(String args[]) throws Exception{ String testStr...
Impress asked 7/12, 2010 at 5:13

13

Solved

I need to be able to parse both CSV and TSV files. I can't rely on the users to know the difference, so I would like to avoid asking the user to select the type. Is there a simple way to detect whi...
Osculation asked 17/4, 2009 at 19:52
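
A rough heuristic for this, sketched in Python: count each candidate delimiter on the first few lines and prefer the one that appears the same number of times on every line. The function name and the scoring rule are illustrative assumptions. The counting is naive and ignores quoting, so a comma inside a quoted TSV field can mislead it. Python's standard library also offers `csv.Sniffer` for the same purpose.

```python
def guess_delimiter(text, candidates=(",", "\t"), max_lines=20):
    """Guess which candidate delimiter a delimited text sample uses.

    Returns None when no candidate appears on every inspected line.
    """
    lines = [line for line in text.splitlines() if line.strip()][:max_lines]
    best = None
    best_score = (False, 0)
    for delimiter in candidates:
        counts = [line.count(delimiter) for line in lines]
        if not counts or min(counts) == 0:
            continue  # the delimiter is missing from at least one line
        # Prefer a consistent per-line count, then the higher count.
        score = (len(set(counts)) == 1, min(counts))
        if score > best_score:
            best, best_score = delimiter, score
    return best


print(repr(guess_delimiter("a\tb\tc\n1\t2\t3\n")))
print(repr(guess_delimiter("a,b\n1,2\n")))
```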

9

Solved

I have a .txt file that has the following details: ID^NAME^DESCRIPTION^IMAGES 123^test^Some text goes here^image_1.jpg,image_2.jpg 133^hello^some other test^image_3456.jpg,image_89.jpg What I'd ...
Nutwood asked 14/3, 2011 at 13:51

9

Solved

What would be the best way in Python to parse out chunks of text contained in matching brackets? "{ { a } { b } { { { c } } } }" should initially return: [ "{ a } { b } { { { c } } }" ] putti...
Reluct asked 30/10, 2009 at 18:18
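
A minimal sketch of one way to do this without regular expressions: scan the string once and track the nesting depth. The function name `top_level_groups` is hypothetical, and the behavior on later passes is inferred from the first expected result quoted in the question.

```python
def top_level_groups(text, open_ch="{", close_ch="}"):
    """Return the contents of each outermost bracket pair in `text`."""
    groups = []
    depth = 0
    start = None
    for index, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                start = index + 1  # content begins just after the opening bracket
            depth += 1
        elif ch == close_ch:
            if depth == 0:
                raise ValueError("unbalanced closing bracket at index %d" % index)
            depth -= 1
            if depth == 0:
                groups.append(text[start:index].strip())
    if depth != 0:
        raise ValueError("unbalanced opening bracket")
    return groups


outer = top_level_groups("{ { a } { b } { { { c } } } }")
print(outer)
print(top_level_groups(outer[0]))
```

Calling the function again on each returned chunk peels off one more level of nesting.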

3

Solved

I have written a program to parse a text file which contains a sample C program with if, else and while condition. I have 2 ArrayLists and my program will parse through the file. I'm using M...
Primitive asked 2/5, 2014 at 6:38

4

I have a file with pipe-separated fields. I want to print a subset of field 1 and all of field 2: cat tmpfile.txt # 10 chars.|variable length num|text ABCDEFGHIJ|99|U|HOMEWORK JIDVESDFXW|8|C|CHOR...
Faucal asked 1/4, 2014 at 17:26

2

Solved

I am very curious to hear input from others on a problem I've been contemplating for some time now. Essentially I would like to present a user with a text document and allow him/her to make select...
Minervamines asked 1/7, 2011 at 18:29

6

I am making an application that deals with txt file data. The idea is that txt files may come in different formats, and it should be read into C++. One example might be 3I2, 3X, I3, which should...
Acidify asked 23/7, 2013 at 8:53

1

I found that NLTK in Python does it via the *raw_parse* function, but I need to use Java. I found that cleartk has a MaltParser wrapper, but there is no documentation about it. I'm looking for a function or a...
Folkrock asked 30/6, 2013 at 17:6

3

Solved

I'm trying to read a tab-separated text file line by line. The lines are separated by a carriage return plus line feed ("\r\n"), and a line feed ("\n") is allowed within tab-separated text fields. Since I wa...
Unemployable asked 23/5, 2013 at 10:55

1

Solved

I want a function that looks something like this readFunc :: String -> (Float -> Float) which operates something like this >(readFunc "sin") (pi/2) >1.0 >(readFunc "(+2)") 3.0 ...
Subjacent asked 21/5, 2013 at 20:39

2

Solved

I need to extract certain bits of a byte and convert the extracted bits back to a hex value. Example (the value of the byte is 0xD2) : 76543210 bit position 11010010 is 0xD2 Bit 0-3 defines the c...
Crumley asked 24/11, 2012 at 16:39
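
The general technique is a shift followed by a mask. Here is a short Python sketch using the 0xD2 example from the question, with bit 0 as the least significant bit as in the question's diagram. The language and the helper name `extract_bits` are assumptions, since the excerpt does not say which language is wanted.

```python
def extract_bits(value, low, high):
    """Return bits low..high (inclusive, bit 0 = least significant) of `value`."""
    width = high - low + 1
    mask = (1 << width) - 1  # e.g. width 4 gives 0b1111
    return (value >> low) & mask


byte = 0xD2  # 11010010
print(hex(extract_bits(byte, 0, 3)))  # low nibble
print(hex(extract_bits(byte, 4, 7)))  # high nibble
```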

6

Solved

What's the easiest way to parse a string and extract a number and a letter? I have a string that can be in one of the following formats (number|letter or letter|number), e.g. "10A", "B5", "C10", "1G", etc. ...
Confirmand asked 9/4, 2009 at 16:23
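
One way to handle both orders is a single regular expression with two alternatives. The Python sketch below assumes exactly one letter and one run of digits per string, as in the examples given. The language and the function name are assumptions.

```python
import re

# Either digits followed by one letter, or one letter followed by digits.
_PATTERN = re.compile(r"(\d+)([A-Za-z])|([A-Za-z])(\d+)")


def split_number_letter(text):
    """Return (number, letter) for strings like '10A' or 'B5'."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError("not in number|letter or letter|number form: %r" % text)
    number = match.group(1) or match.group(4)
    letter = match.group(2) or match.group(3)
    return int(number), letter


for item in ["10A", "B5", "C10", "1G"]:
    print(item, split_number_letter(item))
```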
