Argument is URL or path
S

3

17

What is the standard practice in Python when I have a command-line application that takes one argument, which is either

a URL to a web page

or

a path to an HTML file somewhere on disk

(only one of the two)?

Is the following code sufficient?

if "http://" in sys.argv[1]:
  print "URL"
else:
  print "path to file"
Southeastwards answered 21/10, 2011 at 13:8 Comment(0)
S
3

It depends on what the program must do. If it only reports whether it got a URL, sys.argv[1].startswith('http://') might do. If you actually need to open the URL, do:

import sys
from urllib2 import urlopen

try:
    f = urlopen(sys.argv[1])
except ValueError:  # not a valid URL, so treat the argument as a file path
    f = open(sys.argv[1])
Scythe answered 21/10, 2011 at 13:14 Comment(7)
open() throws an exception as well. - Salerno
Don't forget except IndexError: as the user might not specify an argument, which will throw an index error. Or am I wrong? - Myrt
@Griffin: I've considered that a separate problem for the purpose of this answer. - Scythe
@rplnt: yes, and the OP might or might not want to check for IOError. I'm just showing how urlopen and open may be combined, not how to tackle the larger problem. This snippet is enough for writing a generic open_url_or_file function that simply re-raises what it gets from open. - Scythe
@larsmans That may be, but from the looks of it the OP doesn't know how to use exception handlers. I don't see any reason not to include it since it won't work if an argument isn't specified. - Myrt
@FredFoo's implementation is the most correct exception handling. Only handle the exceptions you know how to handle; otherwise let the caller handle them. In this case, if there's a file open, read, or permissions error, let the caller know rather than catching and hiding the exception. - Claimant
Note that if the argument is a URL that returns a 404 error, the code slows down. - Behavior
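For readers on Python 3, the generic open_url_or_file function mentioned in the comments could be sketched roughly as below. The function name comes from the comment above; swapping urllib2.urlopen for urllib.request.urlopen and opening the file in binary mode are assumptions of this sketch, not part of the original answer. A Windows drive-letter path may not take the ValueError route, so treat this as an illustration only.

```python
from urllib.request import urlopen


def open_url_or_file(arg):
    """Open arg as a URL if it has a URL scheme, else as a local file."""
    try:
        # urlopen raises ValueError when arg has no URL scheme at all
        return urlopen(arg)
    except ValueError:
        # Binary mode so both branches yield bytes (an assumption of
        # this sketch). Errors raised by open() propagate to the caller.
        return open(arg, "rb")
```

Only ValueError is caught here, so a missing file or a permissions problem still surfaces to the caller, as the comments above recommend.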
C
24
import sys
import urlparse  # Python 2 module name

def is_url(url):
    # Any non-empty scheme (e.g. "http") is treated as a URL
    return urlparse.urlparse(url).scheme != ""

is_url(sys.argv[1])
Chigetai answered 30/3, 2013 at 1:43 Comment(3)
Python 3 version: import urllib.parse, then urllib.parse.urlparse(url).scheme != "" - Oliverolivera
This returns true for Windows file paths like c:\users\user\foo.txt. - Gibber
It is better to check urlparse(uri).scheme in ('http', 'https'), because Windows paths and URIs starting with file:// also have a non-empty scheme. - Mia
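Putting the comments on this answer together, a Python 3 variant that only accepts http and https schemes might look like the sketch below. The name is_web_url and the ALLOWED_SCHEMES tuple are hypothetical, chosen for illustration; only urllib.parse.urlparse comes from the standard library.

```python
from urllib.parse import urlparse

# Schemes treated as "a URL to a web page" (an assumption of this sketch)
ALLOWED_SCHEMES = ("http", "https")


def is_web_url(arg):
    """Return True only if arg parses with an http or https scheme."""
    return urlparse(arg).scheme in ALLOWED_SCHEMES
```

With this check, a Windows path such as c:\users\user\foo.txt (scheme "c") and a file:// URI both fall through to the "path to file" branch.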
M
1

Larsmans' answer might work, but it doesn't check whether the user actually specified an argument.

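import sys
import urllib2

try:
    arg = sys.argv[1]
except IndexError:
    print "Usage: " + sys.argv[0] + " file/URL"
    sys.exit(1)

try:
    # urllib2.urlopen (unlike urllib.urlopen) raises ValueError
    # when arg has no URL scheme
    f = urllib2.urlopen(arg)
except ValueError:
    # Not a URL, so treat the argument as a path to a local file
    f = open(arg)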
Myrt answered 21/10, 2011 at 14:0 Comment(0)
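The same argument check can be sketched for Python 3 as a pair of small functions, which makes the missing-argument case easy to exercise without running the script. The names get_target and main are hypothetical, and printing the usage message to stderr is an assumption of this sketch rather than something the answer does.

```python
import sys


def get_target(argv):
    """Return the single file/URL argument, or None if it is missing."""
    try:
        return argv[1]
    except IndexError:
        return None


def main(argv):
    """Return a process exit code: 1 on a usage error, 0 otherwise."""
    target = get_target(argv)
    if target is None:
        print("Usage: " + argv[0] + " file/URL", file=sys.stderr)
        return 1
    print("got argument: " + target)
    return 0
```

A script would typically end with sys.exit(main(sys.argv)) so that the usage error becomes a non-zero exit status.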
