Using MultipartPostHandler to POST form-data with Python

Problem: When POSTing data with Python's urllib2, all data is URL encoded and sent as Content-Type: application/x-www-form-urlencoded. When uploading files, the Content-Type should instead be set to multipart/form-data and the contents be MIME-encoded.
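To make the difference concrete, here is a minimal sketch of what URL encoding does to a binary value. It is written for Python 3 (an assumption, since the code in this question is Python 2, where the equivalent function is urllib.urlencode), and the sample bytes are made up for illustration.

```python
from urllib.parse import urlencode

# Field names follow the question; the byte string is a made-up stand-in
# for the contents of an uploaded file.
fields = {"analysisType": "file", "executable": b"MZ\x90\x00"}

# urlencode percent-escapes every non-printable byte, which is what
# application/x-www-form-urlencoded expects but not what a file upload needs.
body = urlencode(fields)
print(body)  # analysisType=file&executable=MZ%90%00
```

A multipart/form-data body would instead carry the raw bytes of the file between boundary lines, with no percent-escaping.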

To get around this limitation some sharp coders created a library called MultipartPostHandler which creates an OpenerDirector you can use with urllib2 to mostly automatically POST with multipart/form-data. A copy of this library is here: MultipartPostHandler doesn't work for Unicode files

I am new to Python and am unable to get this library to work. I wrote out essentially the following code. When I capture it in a local HTTP proxy, I can see that the data is still URL encoded and not multi-part MIME-encoded. Please help me figure out what I am doing wrong or a better way to get this done. Thanks :-)

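import sys
import urllib2
import MultipartPostHandler

FROM_ADDR = '[email protected]'

# `file` holds the path of the file to upload
try:
    data = open(file, 'rb').read()
except IOError:
    print "Error: could not open file %s for reading" % file
    print "Check permissions on the file or folder it resides in"
    sys.exit(1)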

# Build the POST request
url = "http://somedomain.com/?action=analyze"       
post_data = {}
post_data['analysisType'] = 'file'
post_data['executable'] = data
post_data['notification'] = 'email'
post_data['email'] = FROM_ADDR

# MIME encode the POST payload
opener = urllib2.build_opener(MultipartPostHandler.MultipartPostHandler)
urllib2.install_opener(opener)
request = urllib2.Request(url, post_data)
request.set_proxy('127.0.0.1:8080', 'http') # For testing with Burp Proxy

# Make the request and capture the response
try:
    response = urllib2.urlopen(request)
    print response.geturl()
except urllib2.URLError, e:
    print "File upload failed..."

EDIT1: Thanks for your response. I'm aware of the ActiveState httplib solution to this (I linked to it above). I'd rather abstract away the problem and use a minimal amount of code to continue using urllib2 how I have been. Any idea why the opener isn't being installed and used?

Operation answered 25/3, 2009 at 5:14 Comment(0)
O
61

It seems that the easiest and most compatible way to get around this problem is to use the 'poster' module.

# test_client.py
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2

# Register the streaming http handlers with urllib2
register_openers()

# Start the multipart/form-data encoding of the file "DSC0001.jpg"
# "image1" is the name of the parameter, which is normally set
# via the "name" parameter of the HTML <input> tag.

# headers contains the necessary Content-Type and Content-Length
# datagen is a generator object that yields the encoded parameters
# Open the file in binary mode so the image bytes are read unchanged
datagen, headers = multipart_encode({"image1": open("DSC0001.jpg", "rb")})

# Create the Request object
request = urllib2.Request("http://localhost:5000/upload_image", datagen, headers)
# Actually do the request, and get the response
print urllib2.urlopen(request).read()

This worked perfectly and I didn't have to muck with httplib. The module is available here: http://atlee.ca/software/poster/index.html

Operation answered 27/3, 2009 at 1:31 Comment(8)
This is exactly what I needed! Kudos. - Cadmar
I know this is an old post, but I'm getting this from poster: AttributeError: multipart_yielder instance has no attribute '__len__'. Wondering if anyone else is having this problem. - Complaisant
@nalroff You hadn't called poster.streaminghttp.register_openers(). - Busman
I got an exception when using this as is (TypeError: must be str, not generator); I fixed it with ''.join(datagen). - Spoliate
Looks good, but how do I also submit other data from the form? For example, I have an auth token and a few parameters such as the file description, the date when the file should be deleted, and so on. - Bergman
Once the request is sent to the defined URL, how can we read the file data? When I use request.form['file'], I get a unicode string which states the content-type as application/octet-stream, but I couldn't find how to get the file from the request object. - Goings
I'm still getting the AttributeError in spite of calling register_openers. Any ideas? - Lindquist
For a Python 3 implementation, with either requests or urllib.request, have a look at: https://mcmap.net/q/354592/-how-do-i-replicate-a-curl-with-f-in-python-3 - Triumph
U
43

I found this recipe for posting multipart data using httplib directly (no external libraries involved):

import httplib
import mimetypes

def post_multipart(host, selector, fields, files):
    content_type, body = encode_multipart_formdata(fields, files)
    h = httplib.HTTP(host)
    h.putrequest('POST', selector)
    h.putheader('content-type', content_type)
    h.putheader('content-length', str(len(body)))
    h.endheaders()
    h.send(body)
    errcode, errmsg, headers = h.getreply()
    return h.file.read()

def encode_multipart_formdata(fields, files):
    LIMIT = '----------lImIt_of_THE_fIle_eW_$'
    CRLF = '\r\n'
    L = []
    for (key, value) in fields:
        L.append('--' + LIMIT)
        L.append('Content-Disposition: form-data; name="%s"' % key)
        L.append('')
        L.append(value)
    for (key, filename, value) in files:
        L.append('--' + LIMIT)
        L.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
        L.append('Content-Type: %s' % get_content_type(filename))
        L.append('')
        L.append(value)
    L.append('--' + LIMIT + '--')
    L.append('')
    body = CRLF.join(L)
    content_type = 'multipart/form-data; boundary=%s' % LIMIT
    return content_type, body

def get_content_type(filename):
    return mimetypes.guess_type(filename)[0] or 'application/octet-stream'
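
For readers asking how the recipe works, or whether the file value needs encoding, here is a rough Python 3 sketch of the same encoding step. It is a port made under assumptions, not the recipe author's code: the `boundary` parameter and the use of `uuid` are additions for illustration, text fields are assumed to be UTF-8, and file contents are passed as raw bytes with no further encoding.

```python
import mimetypes
import uuid


def encode_multipart_formdata(fields, files, boundary=None):
    """Return (content_type, body) for a multipart/form-data request.

    fields: iterable of (name, value) pairs, both text.
    files:  iterable of (name, filename, content) with content as bytes.
    boundary: optional fixed boundary (hypothetical parameter, handy for tests).
    """
    if boundary is None:
        boundary = uuid.uuid4().hex  # random string unlikely to occur in the data
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields:
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"))
        lines.append(b"")  # blank line separates part headers from part body
        lines.append(value.encode("utf-8"))
    for name, filename, content in files:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"; filename="%s"'
                      % (name, filename)).encode("utf-8"))
        lines.append(("Content-Type: %s" % mime).encode("ascii"))
        lines.append(b"")
        lines.append(content)  # raw bytes, written as-is
    lines.append(delimiter + b"--")  # closing boundary
    lines.append(b"")
    body = b"\r\n".join(lines)
    return "multipart/form-data; boundary=%s" % boundary, body


content_type, body = encode_multipart_formdata(
    [("notification", "email")],
    [("executable", "sample.bin", b"\x00\x01\xff")],
    boundary="demo-boundary",
)
```

Each part is a boundary line, a few headers, a blank line and then the value; the body ends with the boundary followed by two dashes. The returned `content_type` has to be sent as the Content-Type header so the server knows which boundary to split on.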
Undersurface answered 25/3, 2009 at 11:29 Comment(3)
I think this approach is more "standalone" because it requires no new modules and can be "compiled" with py2exe to run on Windows as an *.exe file. Adding the 'poster' module is OK too, but it did not work for me. This method was best for me. Hurray to the author! - Arsenault
Does the file "value" need to be encoded somehow or is it just the pure bytestream? - Croak
@Arsenault Can someone explain this solution please? - Radix
A
32

Just use python-requests; it will set the proper headers and do the upload for you:

import requests 
files = {"form_input_field_name": open("filename", "rb")}
requests.post("http://httpbin.org/post", files=files)
Aerostatic answered 8/8, 2013 at 8:4 Comment(1)
Specifying the name of the HTML input field is critical! Thanks - Humes
W
1

I ran into the same problem and I needed to do a multipart form post without using external libraries. I wrote a whole blogpost about the issues I ran into.

I ended up using a modified version of http://code.activestate.com/recipes/146306/. The code in that url actually just appends the content of the file as a string, which can cause problems with binary files. Here's my working code.

import httplib
import io
import json
import mimetools
import mimetypes
import os
import socket
import urlparse


# Wrapped in a function so that the `return` below is valid.
# `key` is the form field name and `filepath` the file to upload.
def upload(key, filepath):
    form = MultiPartForm()  # class defined below
    form.add_field("form_field", "my awesome data")

    # Add the file, opened in binary mode
    form.add_file(key, os.path.basename(filepath),
        fileHandle=open(filepath, "rb"))

    # Build the request
    url = "http://www.example.com/endpoint"
    scheme, netloc, path, params, query, fragments = urlparse.urlparse(url)

    try:
        form_buffer = form.get_binary().getvalue()
        conn = httplib.HTTPConnection(netloc)
        conn.connect()
        conn.putrequest("POST", path)
        conn.putheader('Content-type', form.get_content_type())
        conn.putheader('Content-length', str(len(form_buffer)))
        conn.endheaders()
        conn.send(form_buffer)
    except socket.error, e:
        raise SystemExit(1)

    r = conn.getresponse()
    if r.status == 200:
        return json.loads(r.read())
    else:
        print('Upload failed (%s): %s' % (r.status, r.reason))

class MultiPartForm(object):
    """Accumulate the data to be used when posting a form."""

    def __init__(self):
        self.form_fields = []
        self.files = []
        self.boundary = mimetools.choose_boundary()
        return

    def get_content_type(self):
        return 'multipart/form-data; boundary=%s' % self.boundary

    def add_field(self, name, value):
        """Add a simple field to the form data."""
        self.form_fields.append((name, value))
        return

    def add_file(self, fieldname, filename, fileHandle, mimetype=None):
        """Add a file to be uploaded."""
        body = fileHandle.read()
        if mimetype is None:
            mimetype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
        self.files.append((fieldname, filename, mimetype, body))
        return

    def get_binary(self):
        """Return a binary buffer containing the form data, including attached files."""
        part_boundary = '--' + self.boundary

        binary = io.BytesIO()
        needsCLRF = False
        # Add the form fields
        for name, value in self.form_fields:
            if needsCLRF:
                binary.write('\r\n')
            needsCLRF = True

            block = [part_boundary,
              'Content-Disposition: form-data; name="%s"' % name,
              '',
              value
            ]
            binary.write('\r\n'.join(block))

        # Add the files to upload
        for field_name, filename, content_type, body in self.files:
            if needsCLRF:
                binary.write('\r\n')
            needsCLRF = True

            block = [part_boundary,
              # multipart/form-data parts use the "form-data" disposition type
              str('Content-Disposition: form-data; name="%s"; filename="%s"' % \
              (field_name, filename)),
              'Content-Type: %s' % content_type,
              ''
              ]
            binary.write('\r\n'.join(block))
            binary.write('\r\n')
            binary.write(body)


        # add closing boundary marker,
        binary.write('\r\n--' + self.boundary + '--\r\n')
        return binary
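
The reason for writing into a binary buffer, as the class above does, can be seen in a small sketch. It is written for Python 3 (an assumption; the answer itself is Python 2, where the failure shows up differently), and the header text and payload bytes are made up for illustration.

```python
import io

header = 'Content-Disposition: form-data; name="upload"; filename="a.bin"'
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF])  # arbitrary binary content

# Joining text headers and binary content as one string does not work:
# Python 3 refuses to mix str and bytes in str.join.
try:
    "\r\n".join([header, "", payload])
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Writing everything as bytes into a BytesIO buffer keeps the payload intact.
buffer = io.BytesIO()
buffer.write(header.encode("ascii"))
buffer.write(b"\r\n\r\n")
buffer.write(payload)
body = buffer.getvalue()
```

Only the headers are encoded from text; the file content goes into the buffer unchanged.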
Warr answered 29/3, 2015 at 17:49 Comment(0)
S
0

What a coincidence: 2 years and 6 months ago I created the project

https://pypi.python.org/pypi/MultipartPostHandler2, which fixes MultipartPostHandler for UTF-8 systems. I have also made some minor improvements; you are welcome to test it :)

Seineetmarne answered 25/9, 2014 at 1:18 Comment(4)
Hey, bro! I think the package name you chose is putting me off checking it. - Diwan
Sorry, I didn't understand. I couldn't modify MultipartPostHandler, so I had to call it MultipartPostHandler2. - Punctuate
PyPI supports multiple versions per package. If the name is already taken, you should choose another good package name. PyPI is ours; we are all responsible for what we do with it. - Diwan
Yeah, I chose MultipartPostHandler2 because I didn't have access to MultipartPostHandler to fix it. MultipartPostHandler2 is a second version of MultipartPostHandler; the code is the same, just with some fixes. - Punctuate
W
-1

To answer the OP's question of why the original code didn't work, the handler passed in wasn't an instance of a class. The line

# MIME encode the POST payload
opener = urllib2.build_opener(MultipartPostHandler.MultipartPostHandler)

should read

opener = urllib2.build_opener(MultipartPostHandler.MultipartPostHandler())
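
Whether this change matters can be checked with a small sketch. It is written for Python 3, where urllib2 became urllib.request (an assumption; the question uses Python 2), and `MarkerHandler` is a hypothetical stand-in for MultipartPostHandler. As far as the documentation goes, build_opener accepts either a handler instance or a handler class whose constructor takes no arguments.

```python
import urllib.request


class MarkerHandler(urllib.request.BaseHandler):
    """Hypothetical stand-in for MultipartPostHandler; it changes nothing."""

    def http_request(self, request):
        return request


# Pass the class itself in one case and an instance in the other.
opener_from_class = urllib.request.build_opener(MarkerHandler)
opener_from_instance = urllib.request.build_opener(MarkerHandler())


def has_marker(opener):
    """Report whether a MarkerHandler instance was registered on the opener."""
    return any(isinstance(h, MarkerHandler) for h in opener.handlers)


print(has_marker(opener_from_class), has_marker(opener_from_instance))  # True True
```

If both forms register the handler, instantiating it may not be what fixes the upload. One thing worth checking instead (an assumption about the handler's internals, not something confirmed in this thread) is whether MultipartPostHandler only switches to multipart encoding when one of the dict values is an open file object rather than a string holding the file's contents.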
Wichita answered 27/11, 2016 at 13:27 Comment(1)
If you look at the original source code at pipe.rcc.fsu.edu/PostHandler/MultipartPostHandler.py, you'll see an example of how to use the library. - Chaddie