Passing a list as a url value to urlopen

Motivation

Motivated by this problem: the OP was using urlopen() and accidentally passed the sys.argv list instead of a string as the URL. This error was raised:

AttributeError: 'list' object has no attribute 'timeout'

Because of the way urlopen is written, the error message and the traceback are not very informative and may be difficult to understand, especially for a Python newcomer:

Traceback (most recent call last):
  File "test.py", line 15, in <module>
    get_category_links(sys.argv)
  File "test.py", line 10, in get_category_links
    response = urlopen(url)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 420, in open
    req.timeout = timeout
AttributeError: 'list' object has no attribute 'timeout'

Problem

Here is the shortened code I'm working with:

try:
    from urllib.request import urlopen
except ImportError:
    from urllib2 import urlopen

import sys


def get_category_links(url):
    response = urlopen(url)
    # do smth with response
    print(response)


get_category_links(sys.argv)

I'm trying to work out whether this kind of error can be caught statically, whether by smart IDEs like PyCharm, by static code analysis tools like flake8 or pylint, or by language features like type annotations.

So far, I have failed to detect the problem:

  • it is probably too specific for flake8 and pylint to catch - they don't warn about the problem

  • PyCharm does not warn about sys.argv being passed into urlopen, even though, if you "jump to source" of sys.argv it is defined as:

     argv = [] # real value of type <class 'list'> skipped
    
  • if I annotate the function parameter as a string and pass sys.argv, there are no warnings either:

     def get_category_links(url: str) -> None:
         response = urlopen(url)
         # do smth with response
    
    
     get_category_links(sys.argv)
    

Question

Is it possible to catch this problem statically (without actually executing the code)?

Metcalf answered 19/6, 2017 at 15:54 Comment(3)
Mypy catches it right away: error: Argument 1 to "get_category_links" has incompatible type List[str]; expected "str". See if there's a mypy extension available for PyCharm; otherwise you can run mypy with your tests to catch such issues. When I passed a list to urlopen() I got: error: Argument 1 to "urlopen" has incompatible type List[str]; expected "Union[str, Request]" (github.com/python/typeshed/blob/master/stdlib/3/urllib/…). Adventist
@AshwiniChaudhary yup, awesome - confirmed mypy catches the problem! I wonder what prevents PyCharm from determining that sys.argv is a list and what helps mypy know that it is. Because if I hardcode a list instead of sys.argv, PyCharm finally warns about the type mismatch. Thanks so much - it deserves to be an actual answer. Metcalf
The reason mypy knows its type is its definition in typeshed: github.com/python/typeshed/blob/master/stdlib/3/sys.pyi#L18 Adventist
A
6

Instead of keeping the check editor-specific, you can use mypy to analyze your code. This way it runs in every dev environment, not just for those who use PyCharm.

from urllib.request import urlopen
import sys


def get_category_links(url: str) -> None:
    response = urlopen(url)
    # do smth with response


get_category_links(sys.argv)
response = urlopen(sys.argv)

The issues pointed out by mypy for the above code:

error: Argument 1 to "get_category_links" has incompatible type List[str]; expected "str"
error: Argument 1 to "urlopen" has incompatible type List[str]; expected "Union[str, Request]"

Mypy knows the type of sys.argv here because it is declared in the stub file for sys in typeshed. Right now some standard library modules are still missing from typeshed though, so you will have to either contribute them or ignore the related errors until they are added :-).
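For comparison, here is a minimal sketch of a call site that passes a single string rather than the whole list, which I would expect mypy to accept. The helper names first_argument and main are my own, not part of the original code, and the usage message is only illustrative.

```python
import sys
from typing import Optional, Sequence
from urllib.request import urlopen


def first_argument(argv: Sequence[str]) -> Optional[str]:
    """Return the first command-line argument, or None if there is none."""
    # argv[0] is the script name, so the URL (if given) is argv[1].
    return argv[1] if len(argv) > 1 else None


def get_category_links(url: str) -> None:
    response = urlopen(url)
    # do smth with response
    print(response)


def main(argv: Sequence[str]) -> int:
    """Hypothetical entry point: validate the arguments, then fetch the URL."""
    url = first_argument(argv)
    if url is None:
        print("usage: test.py URL", file=sys.stderr)
        return 2
    # After the None check, url is a plain str, matching the annotation.
    get_category_links(url)
    return 0
```

A script would call main(sys.argv). Passing sys.argv itself to get_category_links is still the case mypy reports above.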


When to run mypy?

  1. To catch such errors, you can run mypy on the annotated files alongside your tests in your CI tool. Running it on every file in the project may take some time; for a small project, it is your choice.

  2. Add a pre-commit hook that runs mypy on staged files and points out issues right away (this could be a little annoying for the developer if it takes a while).
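The second option could look roughly like the sketch below. It assumes that git is on the PATH and that mypy is installed for the interpreter running the hook. The function names are made up for illustration.

```python
import subprocess
import sys
from typing import List, Sequence


def select_python_files(paths: Sequence[str]) -> List[str]:
    """Keep only the paths that look like Python source files."""
    return [p for p in paths if p.endswith(".py")]


def staged_files() -> List[str]:
    """Ask git for files that are added, copied, or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def run_hook() -> int:
    """Return mypy's exit status for the staged Python files (0 if none)."""
    targets = select_python_files(staged_files())
    if not targets:
        return 0
    # Assumption: mypy is importable by the interpreter running this hook.
    return subprocess.run([sys.executable, "-m", "mypy", *targets]).returncode
```

An executable .git/hooks/pre-commit file containing this code would end with raise SystemExit(run_hook()), so that a non-zero status from mypy aborts the commit.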

Adventist answered 19/6, 2017 at 16:24 Comment(1)
I've heard a lot of awesome things about mypy, it's finally time to add it to the toolbelt! Thank you. Metcalf
D
0

First, check whether url is a string. If it is, call urlopen() and catch the ValueError that it raises when the string is not a valid URL.

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

import sys


def get_category_links(url):
    if not isinstance(url, str):  # Check that url is a string
        print("Please give string url")
        return
    try:
        response = urlopen(url)
        # do smth with response
        print(response)
    except ValueError:  # url is a string but not a valid URL
        print("Bad URL")


get_category_links(sys.argv)
Descendant answered 28/6, 2017 at 8:58 Comment(0)
