Get the latest FTP folder name in Python

I am trying to write a Python script that gets the latest file from the latest subdirectory on an FTP server. My problem is that I cannot work out which subdirectory is the latest. Two pieces of information are available: each subdirectory has a ctime, and each directory name contains the date on which the directory was created. I do not know how to use either of them to get the name of the latest directory. So far I have the following, which works only if the server returns the listing sorted by ctime, newest first, so that the first entry is the latest directory:

import ftplib 
import os
import time

ftp = ftplib.FTP('test.rebex.net','demo', 'password')
ftp.cwd(str((ftp.nlst())[0])) #if directory is sorted in descending order by date.

But is there a way to find the exact directory by ctime or by the date in the directory name?

Thanks a lot guys.

Anh answered 8/8, 2018 at 22:47 Comment(0)

If your FTP server supports the MLSD command, the solution is easy:

  • If you want to base the decision on a modification timestamp:

    entries = list(ftp.mlsd())
    # Only interested in directories
    entries = [entry for entry in entries if entry[1]["type"] == "dir"]
    # Sort by timestamp
    entries.sort(key = lambda entry: entry[1]['modify'], reverse = True)
    # Pick the first one
    latest_name = entries[0][0]
    print(latest_name)
    
  • If you want to use a file name:

    # Sort by filename, reusing the filtered `entries` list from above
    entries.sort(key = lambda entry: entry[0], reverse = True)
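
The two MLSD variants above can be wrapped in one small helper, which is handy for trying the logic without a live server. This is only a sketch: the function name `latest_dir_from_mlsd`, its `by` parameter and the sample data are made up for illustration. It relies on the MLSD `modify` fact being a `YYYYMMDDHHMMSS` string, so plain string comparison orders the timestamps correctly.

```python
def latest_dir_from_mlsd(entries, by="modify"):
    """Return the name of the newest directory among MLSD-style entries.

    entries: iterable of (name, facts) pairs, as yielded by ftplib.FTP.mlsd().
    by: "modify" to compare the modify fact, "name" to compare the names.
    Returns None when there is no directory entry at all.
    """
    # Keep real subdirectories only; this also drops "cdir" and "pdir" entries
    dirs = [(name, facts) for name, facts in entries
            if facts.get("type") == "dir"]
    if not dirs:
        return None
    if by == "name":
        return max(dirs, key=lambda entry: entry[0])[0]
    # modify facts look like "20180618112100", so string order is time order
    return max(dirs, key=lambda entry: entry[1].get("modify", ""))[0]


# Sample data shaped like the output of ftp.mlsd(); the values are made up
sample = [
    (".", {"type": "cdir", "modify": "20180618112100"}),
    ("folder1-20180326", {"type": "dir", "modify": "20180326000000"}),
    ("folder2-20180618", {"type": "dir", "modify": "20180618112100"}),
    ("file-20180618.zip", {"type": "file", "modify": "20180618153100"}),
]
print(latest_dir_from_mlsd(sample))
```

With a live connection, the call would be `latest_dir_from_mlsd(ftp.mlsd())`.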
    

If you have to rely on the obsolete LIST command, you have to parse the proprietary listing it returns.

A common *nix listing is like:

drw-r--r-- 1 user group           4096 Mar 26  2018 folder1-20180326
drw-r--r-- 1 user group           4096 Jun 18 11:21 folder2-20180618
-rw-r--r-- 1 user group           4467 Mar 27  2018 file-20180327.zip
-rw-r--r-- 1 user group         124529 Jun 18 15:31 file-20180618.zip

With a listing like this, this code will do:

  • If you want to base the decision on a modification timestamp:

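    # `parser` is assumed to be the one from the third-party python-dateutil package
    from dateutil import parser

    lines = []
    ftp.dir("", lines.append)
    
    latest_time = None
    latest_name = None
    
    for line in lines:
        # maxsplit = 8 keeps a name that contains spaces in a single token
        tokens = line.split(maxsplit = 8)
        # Only interested in directories
        if tokens[0][0] == "d":
            time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
            # Named `timestamp` so it does not shadow the `time` module
            timestamp = parser.parse(time_str)
            if (latest_time is None) or (timestamp > latest_time):
                latest_name = tokens[8]
                latest_time = timestamp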
    
    print(latest_name)
    
  • If you want to use a file name:

    lines = []
    ftp.dir("", lines.append)
    
    latest_name = None
    
    for line in lines:
        # maxsplit = 8 keeps a name that contains spaces in a single token
        tokens = line.split(maxsplit = 8)
        # Only interested in directories
        if tokens[0][0] == "d":
            name = tokens[8]
            if (latest_name is None) or (name > latest_name):
                latest_name = name
    
    print(latest_name)
    

Some FTP servers may return . and .. entries in LIST results. You may need to filter those.
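
A minimal sketch of that filtering, applied to the file-name variant. The function name `latest_dir_by_name` and the sample lines are hypothetical, and the parsing assumes the nine-column *nix layout shown above. Other servers may format their listings differently.

```python
def latest_dir_by_name(lines):
    """Pick the greatest directory name from *nix-style LIST lines.

    Skips files, malformed lines and the "." and ".." entries.
    Returns None when no usable directory is listed.
    """
    latest = None
    for line in lines:
        # Eight splits leave the name, spaces included, in tokens[8]
        tokens = line.split(maxsplit=8)
        if len(tokens) < 9 or not tokens[0].startswith("d"):
            continue
        name = tokens[8]
        if name in (".", ".."):
            continue
        if latest is None or name > latest:
            latest = name
    return latest


# Made-up listing in the same layout as the example above
sample_lines = [
    "drw-r--r-- 1 user group           4096 Jun 18 11:21 .",
    "drw-r--r-- 1 user group           4096 Jun 18 11:21 ..",
    "drw-r--r-- 1 user group           4096 Mar 26  2018 folder1-20180326",
    "drw-r--r-- 1 user group           4096 Jun 18 11:21 folder2-20180618",
    "-rw-r--r-- 1 user group         124529 Jun 18 15:31 file-20180618.zip",
]
print(latest_dir_by_name(sample_lines))
```

The lines could be collected from a real server with `ftp.dir("", lines.append)`, as in the snippets above.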


Partially based on: Python FTP get the most recent file by date.


If the folder contains only subfolders and no files, there are other, easier options.

  • If you want to base the decision on a modification timestamp and the server supports the non-standard -t switch, you can use:

    lines = ftp.nlst("-t")
    # Assuming the server orders like `ls -t` (newest first), the latest entry comes first
    latest_name = lines[0]
    

    See How to get files in FTP folder sorted by modification time

  • If you want to use a file name:

    lines = ftp.nlst()
    latest_name = max(lines)
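
Note that `max(lines)` compares whole names, so it finds the latest folder only when the part before the date is the same for every folder or happens to sort the same way. If the prefixes differ, the date can be extracted and compared on its own. The sketch below assumes every relevant name ends in a `YYYYMMDD` date, as in the sample listing. The function name `latest_by_embedded_date` and the sample names are made up for illustration.

```python
import re
from datetime import datetime

# Assumption: the creation date is the last eight digits of the name
DATE_RE = re.compile(r"(\d{8})$")


def latest_by_embedded_date(names):
    """Return the name whose trailing YYYYMMDD date is the most recent.

    Names without a valid trailing date are ignored.
    Returns None when no name carries a date.
    """
    dated = []
    for name in names:
        match = DATE_RE.search(name)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d")
        except ValueError:
            # Eight digits that do not form a real date, e.g. 20181340
            continue
        dated.append((stamp, name))
    return max(dated)[1] if dated else None


# Made-up names: plain max() would pick "zeta-20180326" here
names = ["zeta-20180326", "alpha-20180618", "notes"]
print(latest_by_embedded_date(names))
```

With a live connection, the call would be `latest_by_embedded_date(ftp.nlst())`.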
    
Depot answered 9/8, 2018 at 11:10 Comment(1)
Thanks a lot my friend. You are a real gem :) I am a beginner (a one-month-old Python baby) and your solutions have really helped me to understand it. - Anh
