How do I handle file upload via PUT request in Django?

I'm implementing a REST-style interface and would like to be able to create (via upload) files via an HTTP PUT request. I would like to create either a TemporaryUploadedFile or an InMemoryUploadedFile which I can then pass to my existing FileField and .save() on the object that is part of the model, thereby storing the file.

I'm not quite sure how to handle the file upload part. Specifically, this being a PUT request, I do not have access to request.FILES, since it is not populated for a PUT request.

So, some questions:

  • Can I leverage existing functionality in the HttpRequest class, specifically the part that handles file uploads? I know a direct PUT is not a multipart MIME request, so I don't think so, but it is worth asking.
  • How can I deduce the mime type of what is being sent? If I've got it right, a PUT body is simply the file without prelude. Do I therefore require that the user specify the mime type in their headers?
  • How do I extend this to large amounts of data? I don't want to read it all into memory since that is highly inefficient. Ideally I'd do what TemporaryUploadedFile and related code does and write it a chunk at a time?

I've taken a look at this code sample which tricks Django into handling PUT as a POST request. If I've got it right though, it'll only handle form encoded data. This is REST, so the best solution would be to not assume form encoded data will exist. However, I'm happy to hear appropriate advice on using mime (not multipart) somehow (but the upload should only contain a single file).

Django 1.3 is acceptable. So I can either do something with request.raw_post_data or request.read() (or alternatively some other better method of access). Any ideas?

Gasteropod answered 20/4, 2011 at 14:28 Comment(0)

Django 1.3 is acceptable. So I can either do something with request.raw_post_data or request.read() (or alternatively some other better method of access). Any ideas?

You don't want to be touching request.raw_post_data: that implies reading the entire request body into memory, which for file uploads might be a very large amount. request.read() is the way to go. You can do this with Django <= 1.2 as well, but it means digging around in HttpRequest to figure out the right way to use the private interfaces, and it's a real drag to then ensure your code is also compatible with Django >= 1.3.

I'd suggest that what you want to do is to replicate the existing file upload behaviour parts of the MultiPartParser class:

  1. Retrieve the upload handlers from request.upload_handlers (which by default will be MemoryFileUploadHandler & TemporaryFileUploadHandler).
  2. Determine the request's content length (search for Content-Length in HttpRequest or MultiPartParser to see the right way to do this).
  3. Determine the uploaded file's filename, either by letting the client specify this using the last path part of the URL, or by letting the client specify it in the "filename=" part of the Content-Disposition header.
  4. For each handler, call handler.new_file with the relevant args (mocking up a field name).
  5. Read the request body in chunks using request.read(), calling handler.receive_data_chunk() for each chunk.
  6. For each handler, call handler.file_complete(); if it returns a value, that's the uploaded file.
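
Step 3 might look something like the sketch below. The helper name filename_from_request is made up for illustration, and it takes plain strings rather than a Django request; in a view you would presumably pass request.path and request.META.get("HTTP_CONTENT_DISPOSITION", ""). It only understands the simple filename=... form, not the extended filename*= encoding.

```python
import posixpath
import re

# Matches filename="report.pdf" or filename=report.pdf (simple form only).
_FILENAME_RE = re.compile(
    r'filename\s*=\s*(?:"(?P<quoted>[^"]*)"|(?P<bare>[^;\s]+))',
    re.IGNORECASE,
)


def filename_from_request(path, content_disposition=""):
    """Return the upload's file name, preferring Content-Disposition over the URL."""
    match = _FILENAME_RE.search(content_disposition or "")
    name = (match.group("quoted") or match.group("bare") or "") if match else ""
    if not name:
        # Fall back to the last segment of the URL path.
        name = path.rstrip("/").split("/")[-1]
    # Drop any directory components the client may have sent along.
    return posixpath.basename(name.replace("\\", "/"))
```

Stripping directory components is a defensive choice on my part: the name comes from the client, so it should not be trusted as a path.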

How can I deduce the mime type of what is being sent? If I've got it right, a PUT body is simply the file without prelude. Do I therefore require that the user specify the mime type in their headers?

Either let the client specify it in the Content-Type header, or use Python's mimetypes module to guess the media type from the file name.
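
A minimal sketch of combining the two, assuming you already have the file name and the raw Content-Type header value as strings. The function name guess_content_type is hypothetical, and treating application/octet-stream as "not really declared" is a judgment call rather than anything Django does for you.

```python
import mimetypes


def guess_content_type(file_name, declared=""):
    """Prefer the client's Content-Type; otherwise guess from the extension."""
    # Ignore parameters such as "; charset=utf-8".
    declared = declared.split(";")[0].strip().lower()
    if declared and declared != "application/octet-stream":
        return declared
    guessed, _encoding = mimetypes.guess_type(file_name)
    return guessed or "application/octet-stream"
```

Note that mimetypes only looks at the extension; it never inspects the uploaded bytes.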

I'd be interested to find out how you get on with this - it's something I've been meaning to look into myself, be great if you could comment to let me know how it goes!


Edit by Ninefingers: as requested, this is what I did. It is based entirely on the above and the Django source.

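# module-level imports
from django.core.files.uploadhandler import StopFutureHandlers
from django.http import HttpResponse

# inside the view function
upload_handlers = request.upload_handlers
content_type = str(request.META.get('CONTENT_TYPE', ""))
content_length = int(request.META.get('CONTENT_LENGTH') or 0)

if content_type == "":
    return HttpResponse(status=400)
if content_length == 0:
    # a missing or zero Content-Length is a broken request
    return HttpResponse(status=400)

# split before overwriting content_type, otherwise the charset
# parameter is always lost; note that charset holds the whole
# parameter here, e.g. "charset=binary"
parts = content_type.split(";")
content_type = parts[0].strip()
charset = parts[1].strip() if len(parts) > 1 else ""

# we can get the file name via the path, so we don't actually
# need a Content-Disposition header
file_name = path.split("/")[-1]
field_name = file_name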

Since I'm defining the API here, cross-browser support isn't a concern. As far as my protocol is concerned, not supplying the correct information is a broken request. I'm in two minds as to whether I want, say, image/jpeg; charset=binary or whether I'm going to allow non-existent charsets. In any case, I'm making it the client's responsibility to set Content-Type correctly.

Similarly, for my protocol, the file name is passed in. I'm not sure what the field_name parameter is for and the source didn't give many clues.

What happens below is actually much simpler than it looks. You ask each handler if it will handle the raw input. As the author of the above states, you've got MemoryFileUploadHandler & TemporaryFileUploadHandler by default. It turns out that MemoryFileUploadHandler, when asked to create a new_file, decides whether or not it will handle the file (based on various settings). If it decides to handle it, it raises StopFutureHandlers so that no later handler is asked; otherwise it doesn't create the file and lets another handler take over.

I'm not sure what the purpose of counters was, but I've kept it from the source. The rest should be straightforward.

counters = [0]*len(upload_handlers)

for handler in upload_handlers:
    result = handler.handle_raw_input("",request.META,content_length,"","")

for handler in upload_handlers:

    try:
        handler.new_file(field_name, file_name, 
                         content_type, content_length, charset)
    except StopFutureHandlers:
        break

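chunk_size = min(handler.chunk_size for handler in upload_handlers)

while True:
    # read the body once; the request stream cannot be rewound
    chunk = request.read(chunk_size)
    if not chunk:
        # end of the request body
        break
    # offer the chunk to each handler in turn; a handler that returns
    # None has consumed it, so later handlers are skipped
    for i, handler in enumerate(upload_handlers):
        chunk_length = len(chunk)
        chunk = handler.receive_data_chunk(chunk, counters[i])
        counters[i] += chunk_length
        if chunk is None:
            break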

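file_obj = None
for i, handler in enumerate(upload_handlers):
    file_obj = handler.file_complete(counters[i])
    if file_obj:
        # the first handler to return a file has stored the upload
        break

if not file_obj:
    # no handler produced a file
    return HttpResponse(status=500)

# handle file_obj here, e.g. pass it to the model's FileField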
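
A toy model, in plain Python, of how chunks travel down the handler chain and what counters tracks. The two classes are hypothetical stand-ins, not Django's handlers. They only mimic the convention that receive_data_chunk(raw_data, start) returns None once a handler has consumed a chunk (or hands the data back so the next handler sees it), and that file_complete() returns None when the handler stored nothing.

```python
import io


class PassThroughHandler:
    """Stand-in for a handler that declined the file: forwards every chunk."""

    def receive_data_chunk(self, raw_data, start):
        return raw_data  # not consumed, so the next handler gets it

    def file_complete(self, file_size):
        return None  # this handler stored nothing


class StoringHandler:
    """Stand-in for a handler that keeps the data it is given."""

    def __init__(self):
        self.buffer = io.BytesIO()
        self.offsets = []

    def receive_data_chunk(self, raw_data, start):
        self.offsets.append(start)  # 'start' is the byte offset of this chunk
        self.buffer.write(raw_data)
        return None  # consumed: later handlers are skipped

    def file_complete(self, file_size):
        return self.buffer.getvalue()


def feed(stream, handlers, chunk_size=4):
    """Read the stream once, offering each chunk to the handlers in order."""
    counters = [0] * len(handlers)  # bytes offered to each handler so far
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for i, handler in enumerate(handlers):
            chunk_length = len(chunk)
            chunk = handler.receive_data_chunk(chunk, counters[i])
            counters[i] += chunk_length
            if chunk is None:
                break
    # The first handler that returns something has the finished upload.
    for i, handler in enumerate(handlers):
        result = handler.file_complete(counters[i])
        if result is not None:
            return result
    return None


storing = StoringHandler()
stored = feed(io.BytesIO(b"0123456789"), [PassThroughHandler(), storing])
```

In this model the counters are simply running byte offsets, one per handler, which is what gets passed as start for the next chunk and as the final size to file_complete().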
Mcnelly answered 23/4, 2011 at 1:19 Comment(5)
+1 thanks, that sounds like it might do it. I'll give it a go when I'm back at work and I'll report back with the results. - Gasteropod
It worked like a charm, I'll edit it into your answer at the end of the day. - Gasteropod
As requested, I've edited my code into your answer. If you have any improvements, feel free to edit them in. I didn't want to answer my own question with work derived from yours, hence the edit rather than a post. - Gasteropod
@Ninefingers Thanks for your sample code. I'm trying to get to grips with uploading via PUT with Django. I don't really understand how the client provides the file name. Will the PUT be something like /upload/SOMEFILENAME.EXT and you'll get this file name with file_name = path.split("/")[-1:][0] ? You started a comment on this block in your code, but I don't think it's finished. I assume you're not passing it via the "filename=" part of the Content-Disposition header, right? Thanks so much for your help. - Pshaw
@Pshaw Ah yes, sorry about that. In my code I was using a REST-based API, such that the file name is part of the URL, e.g. I would PUT to /path/to/file.txt. As such, I don't need to specify a filename in HTTP headers. However, if you were doing a PUT to, say, /files/upload you would. I think you can fetch the content disposition with request.META.get("HTTP_CONTENT_DISPOSITION", None) and then search that for filename=(?P<name>.*) as a regexp; the result should be a named match. That's off the top of my head and might require some tweaking. Hope that helps. - Gasteropod

Newer Django versions make this a lot easier to handle; see https://gist.github.com/g00fy-/1161423

I modified the given solution like this:

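from django.http import QueryDict

if request.content_type.startswith('multipart'):
    # reuse Django's own multipart parser for the PUT body
    put, files = request.parse_file_upload(request.META, request)
    request.FILES.update(files)
    request.PUT = put.dict()
else:
    # form-encoded body
    request.PUT = QueryDict(request.body).dict()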

to be able to access files and other data like in POST. You can remove the calls to .dict() if you want your data to be read-only.

Manganin answered 28/2, 2017 at 15:50 Comment(1)
Simplest answer 👌 - Scarabaeoid

I hit this problem while working with Django 2.2, and was looking for something that just worked for uploading a file via PUT request.

from django.core.files.uploadhandler import StopFutureHandlers
from django.http import HttpResponse


class PutUploadMiddleware(object):
    def __init__(self, get_response):
        self.get_response = get_response

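    def __call__(self, request):
        method = request.META.get("REQUEST_METHOD", "").upper()
        if method == "PUT":
            # handle_PUT returns an error response for a malformed
            # upload and None on success
            error_response = self.handle_PUT(request)
            if error_response is not None:
                return error_response
        return self.get_response(request)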

    def handle_PUT(self, request):
        content_type = str(request.META.get("CONTENT_TYPE", ""))
        content_length = int(request.META.get("CONTENT_LENGTH") or 0)
        file_name = request.path.split("/")[-1]
        field_name = file_name
        content_type_extra = None

        if content_type == "":
            return HttpResponse(status=400)
        if content_length == 0:
            # a missing or zero Content-Length is a broken request
            return HttpResponse(status=400)

        # split before overwriting content_type, otherwise the
        # charset parameter is always lost
        parts = content_type.split(";")
        content_type = parts[0].strip()
        charset = parts[1].strip() if len(parts) > 1 else ""

        upload_handlers = request.upload_handlers

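        for handler in upload_handlers:
            # pass the request itself as the input stream; touching
            # request.body here would read the whole upload into memory
            handler.handle_raw_input(
                request,
                request.META,
                content_length,
                boundary=None,
                encoding=None,
            )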
        counters = [0] * len(upload_handlers)
        for handler in upload_handlers:
            try:
                handler.new_file(
                    field_name,
                    file_name,
                    content_type,
                    content_length,
                    charset,
                    content_type_extra,
                )
            except StopFutureHandlers:
                break

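        # read fixed-size chunks; iterating over the request directly
        # would split the (possibly binary) body on newlines
        for chunk in iter(lambda: request.read(64 * 1024), b""):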
            for i, handler in enumerate(upload_handlers):
                chunk_length = len(chunk)
                chunk = handler.receive_data_chunk(chunk, counters[i])
                counters[i] += chunk_length
                if chunk is None:
                    # Don't continue if the chunk received by
                    # the handler is None.
                    break

        for i, handler in enumerate(upload_handlers):
            file_obj = handler.file_complete(counters[i])
            if file_obj:
                # If it returns a file object, then set the files dict.
                request.FILES.appendlist(file_name, file_obj)
                break
        # tell every handler the upload is finished
        for handler in upload_handlers:
            handler.upload_complete()
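
To try out middleware like this, a client just sends the file's bytes as the PUT body, with Content-Type and Content-Length set and the file name as the last part of the URL. A minimal sketch using only the standard library; the helper name build_put_request and the localhost URL are placeholders, and the request is only constructed here, not sent (passing it to urllib.request.urlopen would send it).

```python
import mimetypes
import urllib.request


def build_put_request(url, file_name, data):
    """Build (but do not send) a PUT request whose body is the raw file."""
    content_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    return urllib.request.Request(
        url.rstrip("/") + "/" + file_name,  # file name is the last path segment
        data=data,
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Content-Length": str(len(data)),
        },
    )


put_request = build_put_request(
    "http://localhost:8000/files", "hello.txt", b"hello world"
)
```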
Simpson answered 16/5, 2020 at 15:29 Comment(0)
