I'm running a for loop over 13K pdf files; each iteration reads a file, pre-processes the text, finds similarities and writes the result to a txt file. However, when I run the loop it gives an error:
Error in poppler_pdf_text(loadfile(pdf), opw, upw) : Not enough space
What can be the reason?
- I tried to increase memory.limit(), but that is not the issue.
- I tried to delete hidden files in the folder, like Thumbs.db, but the same issue appears again.
- I remove the pdf object at every iteration.
library(textreadr)  # read_pdf() is assumed to come from textreadr
folder_path <- "C: ...."
## get vector with all pdf names
pdf_folder <- list.files(folder_path)
## for loop over all pdf documents
for (s in seq_along(pdf_folder)) {
  ## choose one pdf document from vector of strings
  pdf_document_name <- pdf_folder[s]
  ## read pdf_document pdf into data.frame
  pdf <- read_pdf(paste0(folder_path, "/", pdf_document_name))
  print(s)
  rm(pdf)
} ## end of for loop
# Error:
Error in poppler_pdf_text(loadfile(pdf), opw, upw) : Not enough space
The expected outcome is to read all pdf documents in the original path.
Change print(s) like so: cat("counter: ", s). Then you'll be able to see where the loop fails and investigate that pdf file. Even if this does turn out to be a memory issue, you can find out how many files your computer can handle and split the loop into a few chunks, so that you don't run out of memory running the entire thing at once. – Hamlett
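A minimal sketch of the suggestion above, not a confirmed fix: it prints a counter and the file name before each read, wraps the read in tryCatch() so that one bad file is logged instead of stopping the loop, and works through the files in chunks. The helper name process_in_chunks, its reader argument and the chunk size of 500 are made up for illustration; calling gc() between chunks is an assumption about what might help and may not change anything.

```r
# Hypothetical helper: read files chunk by chunk and collect the ones that fail.
# `reader` is any function that takes a path; the question appears to use
# textreadr::read_pdf(), which is an assumption here.
process_in_chunks <- function(paths, reader, chunk_size = 500) {
  failed <- character(0)
  # split the file indices into consecutive chunks of at most chunk_size
  chunks <- split(seq_along(paths), ceiling(seq_along(paths) / chunk_size))
  for (idx in chunks) {
    for (s in idx) {
      cat("counter:", s, "file:", basename(paths[s]), "\n")
      ok <- tryCatch({
        pdf <- reader(paths[s])
        # ... pre-process text, find similarities, write txt here ...
        TRUE
      }, error = function(e) {
        # log the failing file and carry on with the next one
        message("failed on ", paths[s], ": ", conditionMessage(e))
        FALSE
      })
      if (!ok) failed <- c(failed, paths[s])
    }
    gc()  # ask R to release unused memory between chunks (may not help)
  }
  failed
}

# Possible usage (folder_path as in the question):
# pdf_paths <- list.files(folder_path, pattern = "\\.pdf$",
#                         full.names = TRUE, ignore.case = TRUE)
# failed <- process_in_chunks(pdf_paths, textreadr::read_pdf)
```

The pattern argument in the usage lines restricts the listing to files ending in .pdf, so stray files such as Thumbs.db are never passed to the reader. The returned vector holds the paths that raised an error, which can then be opened one by one to see whether a specific file (for example a very large or corrupt one) is the cause.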