Parsing FtpWebRequest ListDirectoryDetails line
I need some help with parsing the response from ListDirectoryDetails in C#.

I only need the following fields.

  • File Name/Directory Name
  • Date Created
  • File Size

Here's what some of the lines look like when I run ListDirectoryDetails:

d--x--x--x    2 ftp      ftp          4096 Mar 07  2002 bin
-rw-r--r--    1 ftp      ftp        659450 Jun 15 05:07 TEST.TXT
-rw-r--r--    1 ftp      ftp      101786380 Sep 08  2008 TEST03-05.TXT
drwxrwxr-x    2 ftp      ftp          4096 May 06 12:24 dropoff

Thanks in advance.

Nelrsa answered 18/6, 2009 at 15:48 Comment(0)
O
28

Not sure if you still need this, but this is the solution I came up with:

Regex regex = new Regex ( @"^([d-])([rwxt-]{3}){3}\s+\d{1,}\s+.*?(\d{1,})\s+(\w+\s+\d{1,2}\s+(?:\d{4})?)(\d{1,2}:\d{2})?\s+(.+?)\s?$",
    RegexOptions.Compiled | RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace );

Match Groups:

  1. object type:
    • d : directory
    • - : file
  2. Array[3] of permissions (rwx-)
  3. File Size
  4. Last Modified Date
  5. Last Modified Time
  6. File/Directory Name
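
As a rough sketch of how such a pattern could be applied to one line, assuming a variant with named groups where the year and the time are alternatives: the names ListingEntry, UnixListingLine, TryParse and currentYear are made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical holder for the fields the question asks about.
public sealed class ListingEntry
{
    public bool IsDirectory { get; set; }
    public long Size { get; set; }
    public DateTime Modified { get; set; }
    public string Name { get; set; }
}

public static class UnixListingLine
{
    // Variant of the pattern above: named groups, and the year and the time
    // are alternatives because a line carries one or the other.
    private static readonly Regex LineRegex = new Regex(
        @"^(?<type>[d-])(?<perms>(?:[rwxt-]{3}){3})\s+\d+\s+.*?(?<size>\d+)\s+" +
        @"(?<month>\w+)\s+(?<day>\d{1,2})\s+(?:(?<year>\d{4})|(?<time>\d{1,2}:\d{2}))\s+" +
        @"(?<name>.+?)\s?$",
        RegexOptions.IgnoreCase);

    // currentYear is passed in so the result does not depend on the clock.
    public static bool TryParse(string line, int currentYear, out ListingEntry entry)
    {
        entry = null;
        Match m = LineRegex.Match(line);
        if (!m.Success) return false;

        long size;
        if (!long.TryParse(m.Groups["size"].Value, NumberStyles.None,
                CultureInfo.InvariantCulture, out size))
            return false;

        // Fill in whichever of year/time the line left out.
        string year = m.Groups["year"].Success
            ? m.Groups["year"].Value
            : currentYear.ToString("D4", CultureInfo.InvariantCulture);
        string time = m.Groups["time"].Success
            ? m.Groups["time"].Value.PadLeft(5, '0')
            : "00:00";
        string stamp = m.Groups["month"].Value + " " +
            m.Groups["day"].Value.PadLeft(2, '0') + " " + year + " " + time;

        DateTime modified;
        if (!DateTime.TryParseExact(stamp, "MMM dd yyyy HH:mm",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out modified))
            return false;

        entry = new ListingEntry
        {
            IsDirectory = string.Equals(m.Groups["type"].Value, "d",
                StringComparison.OrdinalIgnoreCase),
            Size = size,
            Modified = modified,
            Name = m.Groups["name"].Value
        };
        return true;
    }
}
```

Note that a line showing a time but no year is simply assumed to belong to currentYear here, which can be off by one for listings read shortly after New Year.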
Outsell answered 25/8, 2009 at 15:33 Comment(5)
Great regex. I added names for all the capturing groups to make it more understandable when parsing... How does the ftpd that uses this format show years in the modify date? - Edrick
If the year of the modified date is the current year, it shows only the MMM dd and hh:mm; if it's from a previous year, it shows the actual year but no time. - Outsell
With groups: ^(?<fileordir>[d-])(?<attribs>[rwxt-]{3}){3}\s+\d{1,}\s+.*?(?<filesize>\d{1,})\s+(?<date>\w+\s+\d{1,2}\s+(?:\d{4})?)(?<yearortime>\d{1,2}:\d{2})?\s+(?<filename>.+?)\s?$ If the year is the same, then it will show time, otherwise it will show year. That is by design. If you need an accurate timestamp, use WebRequestMethods.Ftp.GetDateTimestamp. - Ratcliffe
Very useful! I ended up modifying it a bit though, adding separate groups for all date fields so I could paste all parts together exactly as "MMM dd yyyy hh:mm" (substituting missing fields with the current year and "00:00" if needed) and parse it with ParseExact. I also changed the permissions group to capture all 9 permissions. - Hokum
Whoops, that should've been "HH:mm", of course. Anyway, my final adapted regex was @"^([d-])((?:[rwxt-]{3}){3})\s+\d{1,}\s+.*?(\d{1,})\s+(\w+)\s+(\d{1,2})\s+(\d{4})?(\d{1,2}:\d{2})?\s+(.+?)\s?$" - Hokum
P
13

For this specific listing, the following code will do:

var request = (FtpWebRequest)WebRequest.Create("ftp://ftp.example.com/");
request.Credentials = new NetworkCredential("user", "password");
request.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
var reader = new StreamReader(request.GetResponse().GetResponseStream());

string pattern =
    @"^([\w-]+)\s+(\d+)\s+(\w+)\s+(\w+)\s+(\d+)\s+" +
    @"(\w+\s+\d+\s+\d+|\w+\s+\d+\s+\d+:\d+)\s+(.+)$";
Regex regex = new Regex(pattern);
IFormatProvider culture = CultureInfo.GetCultureInfo("en-us");
string[] hourMinFormats =
    new[] { "MMM dd HH:mm", "MMM dd H:mm", "MMM d HH:mm", "MMM d H:mm" };
string[] yearFormats =
    new[] { "MMM dd yyyy", "MMM d yyyy" };

while (!reader.EndOfStream)
{
    string line = reader.ReadLine();
    Match match = regex.Match(line);
    if (!match.Success) continue; // skip lines that do not fit the expected format
    string permissions = match.Groups[1].Value;
    int inode = int.Parse(match.Groups[2].Value, culture);
    string owner = match.Groups[3].Value;
    string group = match.Groups[4].Value;
    long size = long.Parse(match.Groups[5].Value, culture);
    string s = Regex.Replace(match.Groups[6].Value, @"\s+", " ");
    
    string[] formats = (s.IndexOf(':') >= 0) ? hourMinFormats : yearFormats;
    var modified = DateTime.ParseExact(s, formats, culture, DateTimeStyles.None);
    string name = match.Groups[7].Value;

    Console.WriteLine(
        "{0,-16} permissions = {1}  size = {2, 9}  modified = {3}",
        name, permissions, size, modified.ToString("yyyy-MM-dd HH:mm"));
}

You will get (as of year 2016):

bin              permissions = d--x--x--x  size =      4096  modified = 2002-03-07 00:00
TEST.TXT         permissions = -rw-r--r--  size =    659450  modified = 2016-06-15 05:07
TEST03-05.TXT    permissions = -rw-r--r--  size = 101786380  modified = 2008-09-08 00:00
dropoff          permissions = drwxrwxr-x  size =      4096  modified = 2016-05-06 12:24

But actually, parsing the listing returned by ListDirectoryDetails is not the right way to go.

You want to use an FTP client that supports the modern MLSD command, which returns a directory listing in a machine-readable format specified in RFC 3659. Parsing the human-readable format returned by the ancient LIST command (used internally by FtpWebRequest for its ListDirectoryDetails method) should be a last resort, for talking to obsolete FTP servers that do not support the MLSD command (like the Microsoft IIS FTP server).
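
To illustrate why the MLSD format is easier to handle, here is a minimal sketch of parsing one MLSD entry, assuming the RFC 3659 layout of semicolon-terminated name=value facts, then a single space, then the file name. The class and method names (MlsdLine, ParseFacts, ParseTimeVal) and the sample line are made up for illustration, and FtpWebRequest itself cannot issue MLSD, so the line would have to come from another FTP client.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class MlsdLine
{
    // Splits "fact=value;fact=value; name" into a fact dictionary and the name.
    // Fact names are treated as case-insensitive.
    public static Dictionary<string, string> ParseFacts(string line, out string name)
    {
        int space = line.IndexOf(' ');
        if (space < 0)
            throw new FormatException("No space between facts and file name.");

        // Everything after the first space is the name (it may contain spaces).
        name = line.Substring(space + 1);

        var facts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] parts = line.Substring(0, space)
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            int eq = part.IndexOf('=');
            if (eq > 0)
                facts[part.Substring(0, eq)] = part.Substring(eq + 1);
        }
        return facts;
    }

    // Parses a time value such as 20160615050700 or 20160615050700.123 (UTC).
    // Fractional seconds, if present, are dropped in this sketch.
    public static DateTime ParseTimeVal(string value)
    {
        string whole = value.Length > 14 ? value.Substring(0, 14) : value;
        return DateTime.ParseExact(whole, "yyyyMMddHHmmss",
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }
}
```

For a line like "type=file;size=659450;modify=20160615050700; TEST.TXT", the size, modify and type facts can be read straight from the dictionary, with no guessing about years or column widths.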

Many servers use a different format for the LIST command response. IIS in particular can use a DOS-style format. See C# class to parse WebRequestMethods.Ftp.ListDirectoryDetails FTP response.
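
For comparison, a hedged sketch of parsing a DOS-style line. It assumes lines shaped like "06-25-09  02:41PM            144700153 image34.gif" and "08-10-11  12:02PM       <DIR>          Version2" (two-digit year, 12-hour clock); servers can be configured differently, so treat both the pattern and the names DosEntry and DosListingLine as illustrative assumptions.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical holder for one parsed DOS-style entry.
public sealed class DosEntry
{
    public bool IsDirectory { get; set; }
    public long Size { get; set; }
    public DateTime Modified { get; set; }
    public string Name { get; set; }
}

public static class DosListingLine
{
    // date, time, then either the <DIR> marker or a size, then the name.
    private static readonly Regex LineRegex = new Regex(
        @"^(?<date>\d{2}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}[AP]M)\s+" +
        @"(?:<DIR>|(?<size>\d+))\s+(?<name>.+)$");

    public static bool TryParse(string line, out DosEntry entry)
    {
        entry = null;
        Match m = LineRegex.Match(line);
        if (!m.Success) return false;

        DateTime modified;
        string stamp = m.Groups["date"].Value + " " + m.Groups["time"].Value;
        if (!DateTime.TryParseExact(stamp, "MM-dd-yy hh:mmtt",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out modified))
            return false;

        // Directories carry the <DIR> marker instead of a size.
        bool isDirectory = !m.Groups["size"].Success;
        long size = 0;
        if (!isDirectory && !long.TryParse(m.Groups["size"].Value, NumberStyles.None,
                CultureInfo.InvariantCulture, out size))
            return false;

        entry = new DosEntry
        {
            IsDirectory = isDirectory,
            Size = size,
            Modified = modified,
            Name = m.Groups["name"].Value
        };
        return true;
    }
}
```

A caller that has to cope with both styles could try the Unix pattern first and fall back to this one when it does not match.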


For example, with the WinSCP .NET assembly, you can use its Session.ListDirectory or Session.EnumerateRemoteFiles methods.

They internally use the MLSD command, but can fall back to the LIST command and support dozens of different human-readable listing formats.

The returned listing is presented as a collection of RemoteFileInfo instances with properties like:

  • Name
  • LastWriteTime (with correct timezone)
  • Length
  • FilePermissions (parsed into individual rights)
  • Group
  • Owner
  • IsDirectory
  • IsParentDirectory
  • IsThisDirectory

(I'm the author of WinSCP)


Most other 3rd party libraries will do the same. Using the FtpWebRequest class is not reliable for this purpose. Unfortunately, there's no other built-in FTP client in the .NET framework.

Pulpiteer answered 14/10, 2016 at 14:40 Comment(0)
H
3

Building on the regex idea of Ryan Conrad, this is my final reading code:

protected static Regex m_FtpListingRegex = new Regex(@"^([d-])((?:[rwxt-]{3}){3})\s+(\d{1,})\s+(\w+)?\s+(\w+)?\s+(\d{1,})\s+(\w+)\s+(\d{1,2})\s+(\d{4})?(\d{1,2}:\d{2})?\s+(.+?)\s?$",
            RegexOptions.Compiled | RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
protected static readonly String Timeformat = "MMM dd yyyy HH:mm";

/// <summary>
/// Handles file info given in the form of a string in standard unix ls output format.
/// </summary>
/// <param name="filesListing">The file listing string.</param>
/// <returns>A list of FtpFileInfo objects</returns>
public static List<FtpFileInfo> GetFilesListFromFtpListingUnix(String filesListing)
{
    List<FtpFileInfo> files = new List<FtpFileInfo>();
    MatchCollection matches = m_FtpListingRegex.Matches(filesListing);
    if (matches.Count == 0 && filesListing.Trim('\r','\n','\t',' ').Length != 0)
        return null; // parse error. Could throw some kind of exception here too.
    foreach (Match match in matches)
    {
        FtpFileInfo fileInfo = new FtpFileInfo();
        Char dirchar = match.Groups[1].Value.ToLowerInvariant()[0];
        fileInfo.IsDirectory = dirchar == 'd';
        fileInfo.Permissions = match.Groups[2].Value.ToCharArray();
        // The second column of a Unix-style listing is the hard link count, not an inode number.
        Int32 inodes;
        fileInfo.NrOfInodes = Int32.TryParse(match.Groups[3].Value, out inodes) ? inodes : 1;
        fileInfo.User = match.Groups[4].Success ? match.Groups[4].Value : null;
        fileInfo.Group = match.Groups[5].Success ? match.Groups[5].Value : null;
        Int64 fileSize;
        Int64.TryParse(match.Groups[6].Value, out fileSize);
        fileInfo.FileSize = fileSize;
        String month = match.Groups[7].Value;
        String day = match.Groups[8].Value.PadLeft(2, '0');
        String year = match.Groups[9].Success ? match.Groups[9].Value : DateTime.Now.Year.ToString(CultureInfo.InvariantCulture);
        String time = match.Groups[10].Success ? match.Groups[10].Value.PadLeft(5, '0') : "00:00";
        String timeString = month + " " + day + " " + year + " " + time;
        DateTime lastModifiedDate;
        if (!DateTime.TryParseExact(timeString, Timeformat, CultureInfo.InvariantCulture, DateTimeStyles.None, out lastModifiedDate))
            lastModifiedDate = DateTime.MinValue;
        fileInfo.LastModifiedDate = lastModifiedDate;
        fileInfo.FileName = match.Groups[11].Value;
        files.Add(fileInfo);
    }
    return files;
}

And the FtpFileInfo class that's filled:

public class FtpFileInfo
{
    public Boolean IsDirectory { get; set; }
    public Char[] Permissions { get; set; }
    public Int32 NrOfInodes { get; set; }
    public String User { get; set; }
    public String Group { get; set; }
    public Int64 FileSize { get; set; }
    public DateTime LastModifiedDate { get; set; }
    public String FileName { get; set; }
}
Hokum answered 6/9, 2017 at 8:10 Comment(0)
S
1

This is my algorithm to get the File/Dir name, Date Created, Attribute(File/Dir), Size. Hope this helps...

        FtpWebRequest _fwr = FtpWebRequest.Create(uri) as FtpWebRequest;
        _fwr.Credentials = cred;
        _fwr.UseBinary = true;
        _fwr.UsePassive = true;
        _fwr.KeepAlive = true;
        _fwr.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
        StreamReader _sr = new StreamReader(_fwr.GetResponse().GetResponseStream());

        List<object> _dirlist = new List<object>();
        List<object> _attlist = new List<object>();
        List<object> _datelist = new List<object>();
        List<long> _szlist = new List<long>();
        while (!_sr.EndOfStream)
        {
            string[] buf = _sr.ReadLine().Split(' ');
            //string Att, Dir;
            int numcnt = 0, offset = 4;
            long sz = 0;
            for (int i = 0; i < buf.Length; i++)
            {
                //Count the number value markers, first before the ftp markers and second
                //the file size.
                if (long.TryParse(buf[i], out sz)) numcnt++;
                if (numcnt == 2)
                {
                    //Get the attribute
                    string cbuf = "", dbuf = "", abuf = "";
                    if (buf[0][0] == 'd') abuf = "Dir"; else abuf = "File";
                    //Get the Date
                    if (!buf[i+3].Contains(':')) offset++;
                    for (int j = i + 1; j < i + offset; j++)
                    {
                        dbuf += buf[j];
                        if (j < buf.Length - 1) dbuf += " ";
                    }
                    //Get the File/Dir name
                    for (int j = i + offset; j < buf.Length; j++)
                    {
                        cbuf += buf[j];
                        if (j < buf.Length - 1) cbuf += " ";
                    }
                    //Store to a list.
                    _dirlist.Add(cbuf);
                    _attlist.Add(abuf);
                    _datelist.Add(dbuf);
                    _szlist.Add(sz);

                    offset = 0;
                    break;
                }
            }
        }
Sweep answered 13/8, 2013 at 6:53 Comment(0)
