Unzip password protected zip files in R
The unzip function (utils) does not accept a password. The other function I am aware of, getZip (Hmisc), only works for zip files containing a single compressed file.

I would like to do something like this to unzip all the files in foo.zip in Windows 8:

unzip("foo.zip", password = "mypass")
Eichhorn answered 6/6, 2016 at 19:30 Comment(3)
Maybe try system("7z x secure.7z")? See: https://mcmap.net/q/211273/-7-zip-command-to-create-and-extract-a-password-protected-zip-file-on-windows-closed - Farmelo
Thanks, I still haven't managed it, but I think your suggestion pointed me in the right direction (relying on 7z syntax). - Eichhorn
Update your post with attempts and problems, or if you managed to solve it, you can add your own answer below. - Farmelo
M
6

I found this question very useful but saw that no formal answers were posted, so here goes:

  1. First I installed 7z.
  2. Then I added "C:\Program Files\7-Zip" to my environment path.
  3. I tested that the 7z command was recognized from the command line.
  4. I opened R and typed in system("7z x secure.7z -pPASSWORD") with the appropriate PASSWORD.
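
Step 4 can also be written with system2(), which takes the arguments as a vector and returns the exit status. This is a minimal sketch rather than part of the original answer: the helper names build_extract_args and run_7z_extract are hypothetical, and it assumes 7z is already on the PATH.

```r
# Hypothetical helper: build the argument vector for `7z x` (extract with full paths).
# shQuote() protects file names and passwords that contain spaces.
build_extract_args <- function(archive, password) {
  c("x", shQuote(archive), shQuote(paste0("-p", password)))
}

# Hypothetical helper: run 7z and stop if it reports a non-zero exit status.
run_7z_extract <- function(archive, password) {
  exe <- unname(Sys.which("7z"))  # empty string if 7z is not on the PATH
  if (!nzchar(exe)) stop("7z not found on PATH")
  status <- system2(exe, build_extract_args(archive, password))
  if (status != 0) stop("7z exited with status ", status)
  invisible(status)
}

# Usage: run_7z_extract("secure.7z", "PASSWORD")
```

Checking the exit status makes a wrong password or a missing archive show up as an R error instead of passing silently.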

I have multiple zipped files, and I'd rather the password not appear in the source code or be stored in any text file, so I wrote the following script:

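# List every .7z archive in the working directory (pattern is a regular expression)
file_list <- list.files(path = ".", pattern = "\\.7z$", all.files = TRUE)
# Prompt for the password so it is never stored in the script
pw <- readline(prompt = "Enter the password: ")
for (file in file_list) {
  # Quote the file name and the -p switch so that spaces do not break the command
  sys_command <- paste("7z x", shQuote(file), shQuote(paste0("-p", pw)))
  system(sys_command)
}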

which when sourced will prompt me to enter the password, and the zip files will be decompressed in a loop.

Mesdemoiselles answered 2/6, 2018 at 21:49 Comment(1)
Works like a charm, thanks for following up on this! For completeness, you can add 7z to your environment vars with setx PATH "%PATH%;C:\Program Files\7-Zip\" - Eichhorn
H
3

@Kim's answer eventually worked for me, but not at first. I thought I'd add a few extra links and steps that helped me get there in the end.

Close and reopen R so that environment path is recognised

If R was already open while you did steps 1-3, you need to close and reopen it so that R picks up the environment path for 7z. @wush978's answer to the question "r system doesn't work when trying 7zip" was informative. I used Sys.getenv("PATH") to check that 7-Zip was included in the environment paths.
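
A minimal sketch of that check from inside R, assuming the default install folder name 7-Zip; the function name path_has_entry is hypothetical.

```r
# Hypothetical helper: TRUE if any PATH entry contains `pattern` (fixed string match).
path_has_entry <- function(pattern, path = Sys.getenv("PATH")) {
  entries <- strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]]
  any(grepl(pattern, entries, fixed = TRUE))
}

path_has_entry("7-Zip")  # TRUE once the 7-Zip folder is on the PATH that R sees
Sys.which("7z")          # full path to the executable, or "" if R cannot find it
```

If the first call returns FALSE in a session that was open before the PATH was changed, restarting R is the likely fix.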

Step 4. I opened R and typed in system("7z x secure.7z -pPASSWORD") with the appropriate PASSWORD.

I actually found this didn't work, so I modified it slightly following the instructions in this post, which also explains how to specify an output directory: https://mcmap.net/q/401858/-how-to-programmatically-extract-unzip-a-7z-7-zip-file-with-r

If you have already extracted the files the system command prompts you to choose whether you want to replace the existing file with the file from the archive and provides options (Y)es / (N)o / (A)lways / (S)kip all / A(u)to rename all / (Q)uit?

So the modified step 4 (the -y switch answers Yes to every prompt, so existing files are replaced):

system("7z e -ooutput_dir secure.zip -pPASSWORD -y")
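
As an alternative to answering the prompt, 7z has an overwrite-mode switch that covers the same choices. A hedged sketch: the function name overwrite_switch and its mode labels are hypothetical, while -aoa (overwrite all), -aos (skip existing) and -aou (auto rename the extracted file) are 7z's own switches.

```r
# Hypothetical helper: map a readable mode name to the matching 7z -ao switch.
overwrite_switch <- function(mode = c("overwrite", "skip", "rename")) {
  mode <- match.arg(mode)  # defaults to "overwrite" when no mode is given
  switch(mode, overwrite = "-aoa", skip = "-aos", rename = "-aou")
}

# Build the command; uncomment system() to actually run it.
cmd <- paste("7z e -ooutput_dir secure.zip -pPASSWORD", overwrite_switch("skip"))
# system(cmd)
```

Choosing the mode explicitly avoids overwriting files by accident when the script is rerun.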

Putting this all together as a modified set of instructions:

  1. Install 7z.
  2. Add "C:\Program Files\7-Zip\" to the environment path using the menu options (instructions here https://www.opentechguides.com/how-to/article/windows-10/113/windows-10-set-path.html)
  3. Close and reopen RStudio. Type Sys.getenv("PATH") to check that the path to 7-Zip is recognised in the environment (as per @wush978's answer to the question "r system doesn't work when trying 7zip")
  4. Type system('7z e "-oC:/My Documents/output_dir" secure.zip -pPASSWORD') in the console with the appropriate PASSWORD; the quotes around the -o switch are needed because the path contains a space (as per instructions here https://mcmap.net/q/401858/-how-to-programmatically-extract-unzip-a-7z-7-zip-file-with-r)

And here is a modified version of @Kim's neat script (including a specified output directory and a check for existing files):

My main script

output_dir <- "C:/My Documents/output_dir"
zippedfiles_dir <- "C:/My Documents/zippedfiles_dir"

# Full paths of the archives to extract
file_list <- list.files(path = zippedfiles_dir, pattern = "\\.zip$", all.files = TRUE, full.names = TRUE)

source("unzip7z.R")

Code inside source file unzip7z.R

pw <- readline(prompt = "Enter the password: ")
for (file in file_list) {
  # Name of the CSV expected in output_dir (same base name, .zip replaced by .csv)
  csvfile <- file.path(output_dir, sub("\\.zip$", ".csv", basename(file)))

  # If the CSV already exists, add -y so 7z replaces it without prompting
  overwrite <- if (file.exists(csvfile)) "-y" else NULL

  # Quote each argument so paths or passwords containing spaces survive
  sys_command <- paste(c("7z e", shQuote(paste0("-o", output_dir)), shQuote(file),
                         shQuote(paste0("-p", pw)), overwrite), collapse = " ")
  system(sys_command)
}
Heathenish answered 4/6, 2020 at 11:28 Comment(0)
D
0
password <- "your password"
# unzip: -o overwrites existing files without prompting, -P supplies the password
system(
  command = paste("unzip -o -P", shQuote(password), shQuote("yourfile.zip")),
  wait = TRUE
)
Dictatorial answered 23/8, 2018 at 7:21 Comment(0)
C
0
password <- "your password"
# unzip -p writes the named archive member to stdout; intern = TRUE captures it as lines
read.table(
  text = system(paste("unzip -p -P", shQuote(password), "yourfile.zip", "yourfile.csv"),
    intern = TRUE
  ), stringsAsFactors = FALSE, header = TRUE, sep = ","
)
Capacitate answered 1/10, 2019 at 7:22 Comment(0)
S
0

Here's a simplified version that works with JupyterLab on Linux. I suspect it would work cross-platform.

library(stringr)
library(readr)
library(IRdisplay)

pw <- readline("Enter password to decrypt data:")
clear_output() #prevents echoing password to Jupyter output

"iris.csv.7z" %>%
    str_c("7z x -p",pw," -so ",.) %>%
    pipe() %>%
    read_csv

You can try this yourself by compressing a dataset to .7z with a password and providing that password when prompted. Using the -so flag means the decrypted data never hits disk. You should test the 7z command from your current directory if the code above doesn't work. That should give you a useful error message. My example command line is 7z x -piris -so iris.csv.7z where iris is the password.
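
The same idea can be sketched in base R, without stringr or readr. This assumes 7z is on the PATH; the helper names stream_command and read_encrypted_csv are hypothetical.

```r
# Hypothetical helper: build a `7z x ... -so` command that writes the
# decrypted file to stdout instead of to disk.
stream_command <- function(archive, password) {
  paste("7z x", shQuote(paste0("-p", password)), "-so", shQuote(archive))
}

# Hypothetical helper: read the streamed CSV through a pipe connection.
read_encrypted_csv <- function(archive, password) {
  read.csv(pipe(stream_command(archive, password)))
}

# Usage: df <- read_encrypted_csv("iris.csv.7z", "iris")
```

Printing stream_command("iris.csv.7z", "iris") and running the result in a terminal is a quick way to see 7z's own error message if the read fails.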

Sampling answered 23/1 at 17:49 Comment(0)
