term-document-matrix Questions
7
I have been using the tm package to run some text analysis.
My problem is with creating a list with words and their frequencies associated with the same
library(tm)
library(RWeka)
txt <- read....
Elboa asked 7/8, 2013 at 10:30
2
Following the many guides to creating biGrams using the 'tm' and 'RWeka' packages, I was getting frustrated that only 1-Grams were being returned in the tdm. Through much trial and error I discover...
Unicuspid asked 13/3, 2017 at 5:33
4
Solved
My file has over 4M rows and I need a more efficient way of converting my data to a corpus and document-term matrix so that I can pass it to a Bayesian classifier.
Consider the following code:
...
Whitmer asked 15/8, 2014 at 16:57
3
Solved
I have a question about queries in Solr. When I perform a query with multiple search terms that are all logically linked by OR (e.g. q=content:(foo OR bar OR foobar)), then Solr returns a list of do...
Lignify asked 30/7, 2014 at 13:27
1
I gather text documents (in Node.js), where each document i is represented as a list of words.
What is an efficient way to compute the similarity between these documents, taking into account that new...
Superdreadnought asked 21/12, 2012 at 8:17
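As a rough illustration of the kind of comparison this question describes, here is a minimal sketch of cosine similarity between two documents given as word lists. It is written in Python rather than Node.js, the function name `cosine_similarity` is hypothetical, and it does not address the truncated part of the question.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine similarity between two documents given as lists of words."""
    a, b = Counter(words_a), Counter(words_b)
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

doc1 = ["term", "document", "matrix"]
doc2 = ["term", "document", "frequency"]
similarity = cosine_similarity(doc1, doc2)
```

Because each document is reduced to its own word counts, a newly arrived document can be compared against existing ones without rebuilding a shared matrix, which may or may not be what the asker needs.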
3
Solved
I am creating a word cloud based on tweets from various sports teams. This code executes successfully about 1 in 10 times:
handle <- 'arsenal'
txt <- searchTwitter(handle,n=1000,la...
Sydney asked 6/9, 2014 at 10:31
3
Solved
I am trying to create a term document matrix with NLTK and pandas.
I wrote the following function:
def fnDTM_Corpus(xCorpus):
import pandas as pd
'''to create a Term Document Matrix from a NLTK...
Mcchesney asked 9/4, 2013 at 10:46
3
Solved
I have been working through numerous online examples of the {tm} package in R, attempting to create a TermDocumentMatrix. Creating and cleaning a corpus has been pretty straightforward, but I consi...
Unwary asked 28/8, 2014 at 14:36
4
Solved
I tried using the tm_map. It gave the following error. How can I get around this?
require(tm)
byword<-tm_map(byword, tolower)
Error in UseMethod("tm_map", x) :
no applicable method for 'tm...
Ozan asked 30/11, 2012 at 6:35
2
Solved
I have two sets of data:
a set of tags (single words like php, html, etc)
a set of texts
I now wish to build a Term-Document-Matrix representing the number of occurrences of each tag in th...
Brookner asked 31/10, 2013 at 11:56
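To make the setup in this question concrete, the following is a minimal sketch of a tag-by-text count matrix, assuming the tags are single words and that a simple regular-expression tokenizer is acceptable. It uses Python rather than the asker's tooling, and the function name `tag_document_matrix` and the sample data are hypothetical.

```python
import re
from collections import Counter

def tag_document_matrix(tags, texts):
    """Return {tag: [count in text 0, count in text 1, ...]}."""
    tag_set = {t.lower() for t in tags}
    # One row per tag, one column per text, initialised to zero.
    matrix = {t: [0] * len(texts) for t in sorted(tag_set)}
    for j, text in enumerate(texts):
        # Crude tokenizer (an assumption): lowercase runs of letters, digits, '+' and '#'.
        tokens = re.findall(r"[a-z0-9+#]+", text.lower())
        counts = Counter(tok for tok in tokens if tok in tag_set)
        for tag, n in counts.items():
            matrix[tag][j] = n
    return matrix

tags = ["php", "html"]
texts = ["PHP and HTML, then more PHP", "plain html only"]
tdm = tag_document_matrix(tags, texts)
```

Restricting the vocabulary to the tag set up front keeps the matrix small, since words that are not tags are never stored.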
3
Solved
Based on the question "More efficient means of creating a corpus and DTM", I've prepared my own method for building a Term Document Matrix from a large corpus which (I hope) does not require Terms x Doc...
Peroxidase asked 5/4, 2015 at 23:37
1
Solved
I have been breaking my head over this one over the last few days. I searched all the SO archives and tried the suggested solutions but just can't seem to get this to work. I have sets of txt docum...
Mclain asked 9/11, 2014 at 23:30
1
Solved
In R I used the tm package for building a term-document matrix from a corpus of documents.
My goal is to extract word-associations from all bigrams in the term document matrix and return for...
Calendula asked 30/5, 2013 at 12:21
1
Solved
I have a corpus of 39 text files named by the year - 1945.txt, 1978.txt.... 2013.txt.
I've imported them into R and created a Document Term Matrix using the tm package.
I'm trying to investigate how w...
Spaceless asked 22/5, 2013 at 15:31
1
Solved
I have a TermDocumentMatrix created using the tm package in R.
I'm trying to create a matrix/dataframe that has the 50 most frequently occurring terms.
When I try to convert to a matrix I get thi...
Elke asked 16/7, 2012 at 16:42
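One way to picture the "most frequent terms" step without building a dense matrix is to sum counts directly from sparse (term, document, count) entries. The sketch below is in Python rather than R and is not the tm API; the triplet layout, the function name `top_terms` and the sample data are assumptions for illustration.

```python
from collections import Counter

def top_terms(triplets, n=50):
    """Sum counts per term across documents and return the n largest totals."""
    totals = Counter()
    for term, _doc, count in triplets:
        totals[term] += count
    return totals.most_common(n)

# Hypothetical sparse entries: (term, document index, count)
triplets = [("cat", 0, 2), ("dog", 0, 2), ("cat", 1, 3), ("fish", 1, 1)]
top_two = top_terms(triplets, n=2)
```

Only the non-zero entries are visited, so memory use grows with the number of distinct terms rather than with terms times documents.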