7z list only filenames
A

5

10

I'm using 7z version 18.05 and I would like to list only the filenames in an archive.

If I use the command 7z l myArchive.7z I get this output:

7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30

Scanning the drive for archives:
1 file, 146863932 bytes (141 MiB)

Listing archive: myArchive.7z

--
Path = myArchive.7z
Type = 7z
Physical Size = 146863932
Headers Size = 393
Method = LZMA:26
Solid = +
Blocks = 1

Date       Time     Attr          Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2017-12-06 08:55:47 D...A            0            0  myArchive
2017-12-06 08:55:42 D...A            0            0  myArchive\folder
2017-12-05 19:50:41 ....A     21816530    146863539  myArchive\folder\Test.dat
2017-12-06 08:55:42 ....A     21877463               myArchive\folder\Test2.dat
2017-12-05 19:51:05 ....A       153953               myArchive\folder\Test3.dat
2017-12-05 19:50:41 ....A         4193               myArchive\folder\Test4.dat
2017-12-06 08:55:47 ....A     24128956               myArchive\log.txt
2017-12-06 08:55:47 ....A        79980               myArchive\readme.txt
2017-12-05 19:51:05 ....A   3256759999               myArchive\folder\zTest.txt
------------------- ----- ------------ ------------  ------------------------
2017-12-06 08:55:47         3324821074    146863539  7 files, 2 folders

I don't know why 7z doesn't have a switch to list only filenames. How can I get only the "Name" column? Any suggestions using a DOS command?

Antimagnetic answered 26/3, 2019 at 10:58 Comment(0)
J
20

Found this answer in a different thread: https://superuser.com/a/1073272/542975

There is an undocumented switch -ba which removes all of the header and table formatting, and only lists the row entries.

From there, you could parse each line by splitting it on whitespace or tabs, or potentially use a regex.

Jaquesdalcroze answered 7/5, 2020 at 9:29 Comment(0)
P
4

I was searching for an answer to this exact problem and found it via the link provided by Nisse Knudsen. For me, though, the undocumented -ba switch did not do what I needed on its own, and neither the switch alone nor Nisse's answer appears to fully answer the OP's question: the OP wants just the Name of each file, and -ba alone will not always work when parsing the listing of a .7z file. I would have left a comment rather than a full answer, but I have not earned enough rep to comment, I believe this information is still relevant and accurate, and my "comment" would have been too long anyway.

As explained in the answer Nisse linked (https://superuser.com/a/1073272/542975), the -slt switch prints the output in a format that is much easier to loop over and parse. A simple For /f loop in a batch file can then pull out what is needed.

Let me list a few changes in output for you to see what each switch is doing.

THIS: 7z.exe l "C:\Some Directory\Some FileIZipped.zip"

7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19

Scanning the drive for archives:
1 file, 9986888 bytes (10 MiB)

Listing archive: C:\Some Directory\Some FileIZipped.zip

--
Path = C:\Some Directory\Some FileIZipped.zip
Type = zip
Physical Size = 9986888

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2017-07-18 12:19:04 ....A       240789       109401  A_RandomFile.doc
2017-07-05 13:32:42 ....A     19148800      9877487  Another_Random File with Spaces.mov
------------------- ----- ------------ ------------  ------------------------
2017-07-18 16:30:44           19389589      9986888  2 files

Becomes THIS: 7z.exe l -slt "C:\Some Directory\Some FileIZipped.zip"

7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19

Scanning the drive for archives:
1 file, 9986888 bytes (10 MiB)

Listing archive: C:\Some Directory\Some FileIZipped.zip

--
Path = C:\Some Directory\Some FileIZipped.zip
Type = zip
Physical Size = 9986888

----------
Path = A_RandomFile.doc
Folder = -
Size = 240789
Packed Size = 109401
    ----[10 lines of jargon removed for clarity]----

Path = Another_Random File with Spaces.mov
Folder = -
Size = 19148800
Packed Size = 9877487
    ----[10 lines of jargon removed for clarity]----

Adding the -ba switch simplifies the format a little further, removing the need to skip the header lines (I reference this in the comments of the for loop in the script sample at the end).

This further becomes: 7z.exe l -ba -slt "C:\Some Directory\Some FileIZipped.zip"

Path = A_RandomFile.doc
Folder = -
Size = 240789
Packed Size = 109401
    ----[10 lines of jargon removed for clarity]----

Path = Another_Random File with Spaces.mov
Folder = -
Size = 19148800
Packed Size = 9877487
    ----[10 lines of jargon removed for clarity]----
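
For readers on a POSIX shell (Linux, or Git Bash on Windows), here is a minimal sketch of the same parsing idea. It assumes sed and awk are available and that the output has the "Path = ..." layout shown above; the function names slt_paths and slt_paths_after_header are made up for illustration. The second function is meant for output produced without -ba, where the archive's own Path line appears before the "----------" separator.

```shell
# Print every entry name from `7z l -ba -slt` output read on stdin.
# Directory entries, if any, are printed too.
slt_paths() {
    sed -n 's/^Path = //p'
}

# Same idea for output produced WITHOUT -ba: ignore everything up to the
# "----------" separator, so the archive's own "Path = ..." line is skipped.
slt_paths_after_header() {
    awk '
        /^----------$/      { seen = 1; next }
        seen && /^Path = /  { print substr($0, 8) }   # text after "Path = "
    '
}

# Usage (archive name is a placeholder):
#   7z l -ba -slt archive.zip | slt_paths
#   7z l -slt archive.zip | slt_paths_after_header
```

Because the whole remainder of the line is kept, names containing spaces come through intact. If the output carries Windows line endings, strip the carriage returns first, for example with tr -d '\r'.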
  

I am using this as a method of file-comparing an archive (zip/7z/rar) against the actual directory to make a mirror copy where the directory is the master. To do this I am parsing a file containing the output of my 7z list command. I suppose I could iterate the for loop directly from the 7z command instead, but I have found this to be slower in some situations when there's a large amount of data within the archives.

I have had multiple instances where parsing the standard output fails. It happens when listing the contents of a .7z archive, as shown below, and it is not -EASILY- resolved using a for loop that splits on spaces. For most rows the Compressed column is blank, so what would be Token 5 (the Compressed size) ends up being the filename, which is Token 6 in .zip listings. That makes for a very messy situation which is a nightmare to plan for. This is also the exact problem visible in the example the OP provided.

Example similar to what OP Provided:

7-Zip [64] 15.12 : Copyright (c) 1999-2015 Igor Pavlov : 2015-11-19

Scanning the drive for archives:
1 file, 446600 bytes (437 KiB)

Listing archive: C:\Some Directory\Some _OTHER_ FileIZipped.7z

--
Path = C:\Some Directory\Some _OTHER_ FileIZipped.7z
Type = 7z
Physical Size = 446600
Headers Size = 283
Method = LZMA2:22
Solid = +
Blocks = 1

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2020-08-28 15:06:46 D....            0            0  SomeDirectoryInside
2020-08-28 15:06:46 D....            0            0  SomeDirectoryInside\OtherDir
2020-08-28 15:06:46 D....            0            0  SomeDirectoryInside\Zips
2020-08-28 15:13:14 .....      1064960       446317  SomeDirectoryInside\Zips\Some_File.Doc
2020-08-28 15:08:02 .....       313080               SomeDirectoryInside\Zips\Some_Other_File.Doc
2020-08-28 15:07:34 .....      1561728               SomeDirectoryInside\Zips\Foo.mov
2020-08-28 15:07:46 .....       262144               SomeDirectoryInside\Zips\Fancy.Doc
2020-08-28 15:07:26 .....       262144               SomeDirectoryInside\Zips\Fancy2.Doc
------------------- ----- ------------ ------------  ------------------------
2020-08-28 15:13:14            3464056       446317  5 files, 3 folders
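
To illustrate the token shift, here is a small sketch (assuming a POSIX shell with awk; count_fields is a hypothetical helper name) that counts the whitespace-separated fields in two rows copied from the listing above. The row with a Compressed value has 6 fields and the row without one has 5, so the name lands in a different token.

```shell
# Print the number of whitespace-separated fields on each input line.
count_fields() {
    awk '{ print NF }'
}

# First row has a Compressed value, second row does not.
count_fields <<'EOF'
2020-08-28 15:13:14 .....      1064960       446317  SomeDirectoryInside\Zips\Some_File.Doc
2020-08-28 15:08:02 .....       313080               SomeDirectoryInside\Zips\Some_Other_File.Doc
EOF
# prints 6, then 5
```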

Below is a batch script sample I wrote that puts the 7z.exe output into a file and then pulls just the data I need from that file. Forgive the multiple REM lines - I prefer this method of commenting over long single-line strings, so readers do not have to scroll the code block to the right in order to read.

Because of how For /f iterates through data, we need to ensure token %%c is not blank. I am using this method because sometimes our files have spaces in the names, and we are parsing the 7z output using Spaces as the Delimiter.

Token 3* will give you two separate tokens you can check -- Tokens %%b [ Token 3 ] and %%c [ Token * ] - if %%c is blank - we know %%b has no spaces and can safely be echoed to whichever file we need or set as a variable to use later, etc.

@Echo Off
  REM Sending the output of 7z into a file to use later
  7z.exe l -slt "SomeFileIZipped.zip" >"ZipListRAW.txt"
  
  REM Example of 7z.exe command with '-ba' switch
  REM 7z.exe l -ba -slt "SomeFileIZipped.zip"
  
  REM If you do not use '-ba' in the 7z command above, you can simply skip the first
  REM 11-12 lines of the file to get ONLY the filenames (this skips past the first line
  REM containing "Path", which holds the original archive filename).
  
  For /f "Usebackq Skip=11 Tokens=1,3* Delims= " %%a in ("ZipListRAW.txt") do (
    REM Checking if %%a equals word "Path"
    If "%%a"=="Path" (
      If [%%c]==[] (
        Echo %%b
      ) ELSE (
        Echo %%b %%c
      )
    )
  )
Perloff answered 12/5, 2021 at 16:21 Comment(3)
Great to see you digging deeper into it, as well as, on Windows! - Jaquesdalcroze
I'm trying to parse mhtml files with 7-zip and eDecoder plugin and found that 7z l listing can truncate filenames, with -ba or without. Opening same files with 7zFM shows full names. On the other hand, -slt switch does not include names or folders at all, showing "Position" number, "Content type" and such specifics instead, which is probably the fault of eDecoder plugin, not 7-zip itself. All I want is a list of untruncated file names or paths. - Reitz
You might fare better to open that as its own question - parsing mhtml with 7-zip is possibly going to have its own niche use-case. Besides - that way you can share code snips, 7-zip output snips, etc. while getting the full help you need. - Perloff
I
3

Updated Answer

Based on @Philippe's comment, it seems my original answer did not properly handle file paths that contain spaces. I took a look at the 7z source code and confirmed, as I suspected, that the data columns are of fixed width. The file path always starts after the 53rd character.

Therefore, you can use the following command to list all of the files (including those with spaces) in a given archive named archive.7z:

7z l -ba archive.7z | grep -vF 'D....' | grep -oP '(?<=^.{53}).*'

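The first grep command removes directory entries from the list and the second grep command skips the first 53 characters and prints the remainder of each line, which will be the full file path including spaces. Note that the fixed string 'D....' only matches directories that carry no other attribute flags; a row marked D...A, as in the question's listing, would not be filtered out.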

If you want to print all of the directory paths in addition to the file paths, then simply remove the first grep command:

7z l -ba archive.7z | grep -oP '(?<=^.{53}).*'
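
Where grep -P is unavailable (the stock grep on macOS does not support it), the same fixed-width idea can be sketched with cut or awk. This assumes, as stated above, that the name starts at column 54, and additionally that the attribute field starts at column 21, which matches the sample listings in this thread. The function names list_all and list_files are illustrative only.

```shell
# Names of all entries: drop the first 53 characters of each line.
list_all() {
    cut -c54-
}

# Names of files only: skip rows whose attribute field (columns 21-25)
# starts with "D". This also covers attribute strings such as "D...A".
list_files() {
    awk 'length($0) > 53 && substr($0, 21, 1) != "D" { print substr($0, 54) }'
}

# Usage (archive name is a placeholder):
#   7z l -ba archive.7z | list_all
#   7z l -ba archive.7z | list_files
```

Checking the attribute column by position, rather than searching the whole line for a fixed string, also avoids dropping files whose names happen to contain "D....".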

Original Answer

I was able to do this by using the -ba flag to get output with a single line for each item in the archive, then using grep to parse each line to get only the filenames.

Consider this example, listing the contents of an archive that contains a single file baz nested within two levels of directories foo and bar.

user@host:~$  7z l -ba 'archive.7z'

which resulted in the following output:

2021-06-21 14:37:09 D....            0            0  foo
2021-06-21 14:37:41 D....            0            0  foo/bar
2021-06-21 14:37:41 ....A          881          524  foo/bar/baz

Then, using grep to only get the path at the end of each line:

user@host:~$ 7z l -ba 'archive.7z' | grep -oP '\S+$'

giving the output:

foo
foo/bar
foo/bar/baz

If you wanted to list all items including directories, then you're finished with just the above. In my case, I was actually trying to get a preview of the items that would be extracted using 7z e, which extracts only files without the directory structure, so I added:

user@host:~$ 7z l -ba 'archive.7z' | grep -vF 'D....' | grep -oP '\S+$' | xargs -n 1 basename

which gives my desired output; for this example:

baz
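
A variant of this preview that does not split names on spaces can be sketched under the fixed-width assumption from the updated answer (name at column 54) plus the assumption that the attribute field starts at column 21, as in the sample listings. The function name flat_names is illustrative only.

```shell
# Print the base name of every non-directory row of `7z l -ba` output.
flat_names() {
    awk 'length($0) > 53 && substr($0, 21, 1) != "D" {
        name = substr($0, 54)
        sub(/.*[\/\\]/, "", name)   # drop everything up to the last / or \
        print name
    }'
}

# Usage (archive name is a placeholder):
#   7z l -ba archive.7z | flat_names
```

Doing the base-name step inside awk keeps each name on one line, so there is no word splitting by xargs.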
Insectarium answered 21/6, 2021 at 15:8 Comment(2)
This doesn't work if the filenames contain spaces - Showalter
@Showalter I updated my answer with a solution that properly handles file paths containing spaces. Thanks. - Insectarium
B
0

If you don't like the sound of using an undocumented command line switch, you can do the following to parse out the filename from the full output. This awk script determines the start index of the Name column header, and uses that to extract the column from the table.

awk_script='{
    if (ix == 0) {
        ix = index($0, "Name");
    }
    p = (body == 1);
    if (ix > 0) {
        # The table body is delimited by dashed lines, after the "Name" column header has been seen
        body = (body + ($0 ~ /^ *-[ -]+/)) % 2;
    }
    if (p == 1 && body == 1) {
        # Only print if "body" was 1 before and after the previous block; otherwise, we are in
        # the table body delimiter line (or outside the table completely)
        print substr($0, ix);
    }
}'

7z l your-file.7z | awk "$awk_script"

PowerShell equivalent:

$ix=-1;
$body=$false;

& 7z l your-file.7z | foreach { `
    if ($ix -eq -1) {`
        $ix = $_.IndexOf("Name");`
    }`
    $p = $body;`
    if ($ix -gt 0) {`
        # The table body is delimited by dashed lines, after the "Name" column header has been seen`
        $body = ($body -ne ($_ -match '^ *-[ -]+'))
    }`
    if ($p -and $body) {`
        # Only print if "body" was 1 before and after the previous block; otherwise, we are in`
        # the table body delimiter line (or outside the table completely)`
        write-output $_.Substring($ix)`
    }`
}
Bathy answered 17/2, 2021 at 10:6 Comment(3)
The linux script seems to only get the last file from the column. - Samp
I just tried it out, and it seems to work fine for me... 🤷‍♂️ (This is with 7z v16.02) - Bathy
Maybe it's cuz I'm using zsh, not bash - Samp
K
-1

If you can install a PowerShell module on your machine, listing the file names is easy enough. This can be done on any modern-day, supported Windows system.

https://www.powershellgallery.com/packages/7Zip4Powershell/1.9.0 describes how to install the module.

Here is a .bat file script showing its usage and output.

C:>TYPE zipfnlist.bat
@ECHO OFF
SET "ZIP_FILENAME=.\7zIntf20.zip"
powershell -NoLogo -NoProfile -Command (Get-7Zip -ArchiveFileName "%ZIP_FILENAME%").FileName

C:>CALL zipfnlist.bat
bin
Properties
Ole32.cs
Program.cs
SevenZipFormat.cs
SevenZipInterface.cs
SevenZip.csproj
SevenZip.sln
bin\Debug
bin\Release
bin\Release\SevenZip.exe
Properties\AssemblyInfo.cs
Kinna answered 28/3, 2019 at 0:24 Comment(0)
