The removeCommonTerms function for the tm package is found here, and is defined as:
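    removeCommonTerms <- function(x, pct) {
      stopifnot(inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")),
                is.numeric(pct), pct > 0, pct < 1)
      # Work in term-by-document orientation so that m$i indexes terms
      m <- if (inherits(x, "DocumentTermMatrix")) t(x) else x
      # TRUE for terms that occur in fewer than pct * (number of documents) documents
      t <- table(m$i) < m$ncol * pct
      termIndex <- as.numeric(names(t[t]))
      # Keep only those terms, preserving the orientation of the input
      if (inherits(x, "DocumentTermMatrix")) x[, termIndex] else x[termIndex, ]
    }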
Now I would like to remove overly common terms with the quanteda package. I could do this either before creating the document-feature matrix or on the document-feature matrix itself.
How can I remove overly common terms with the quanteda package in R?
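
For illustration, a minimal sketch of the dfm-side approach might look like the following. It assumes quanteda's dfm_trim() with its max_docfreq and docfreq_type = "prop" arguments; the toy texts and the names txts, dfmat and dfmat_trimmed are made up for this example. It keeps features whose document proportion is at most the threshold, whereas removeCommonTerms uses a strict less-than, so the two may differ at the boundary.

```r
library(quanteda)

# Toy corpus (hypothetical data, for illustration only)
txts <- c(d1 = "apple banana cherry",
          d2 = "apple banana date",
          d3 = "apple elder fig",
          d4 = "apple grape banana")

# Build the document-feature matrix from tokens
dfmat <- dfm(tokens(txts))

# Keep only features that occur in at most 50% of the documents;
# docfreq_type = "prop" makes max_docfreq a proportion rather than a count
dfmat_trimmed <- dfm_trim(dfmat, max_docfreq = 0.5, docfreq_type = "prop")

# Inspect which features survived the trimming
featnames(dfmat_trimmed)
```

Trimming on the dfm seems the more natural of the two options, because the document frequencies are only known once the matrix has been built. Removing the terms beforehand would presumably mean computing those frequencies first and then passing the resulting feature names to something like tokens_remove().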