How do I upload a file with metadata using a REST web service?
I

7

297

I have a REST web service that currently exposes this URL:

http://server/data/media

where users can POST the following JSON:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

to create a new Media metadata entry.

Now I need the ability to upload a file at the same time as the media metadata. What's the best way of going about this? I could introduce a new property called file and base64 encode the file, but I was wondering if there was a better way.

There's also multipart/form-data, which is what an HTML form would send, but I'm using a REST web service and I want to stick to JSON if at all possible.

Imbroglio answered 15/10, 2010 at 0:21 Comment(1)
Sticking to using only JSON is not really required to have a RESTful web service. REST is basically just anything that follows the main principles of the HTTP methods and some other (arguably non-standardised) rules.Aventurine
E
229

I agree with Greg that a two phase approach is a reasonable solution, however I would do it the other way around. I would do:

POST http://server/data/media
body:
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

To create the metadata entry and return a response like:

201 Created
Location: http://server/data/media/21323
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentUrl": "http://server/data/media/21323/content"
}

The client can then use this ContentUrl and do a PUT with the file data.
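A minimal in-memory sketch of this two-step flow, seen from the server's side. The MediaStore class, its method names and the ID counter are hypothetical; only the URL shape and the ContentUrl field come from the example above.

```python
import itertools


class MediaStore:
    """Hypothetical in-memory stand-in for the /data/media resource."""

    def __init__(self, base_url="http://server/data/media"):
        self.base_url = base_url
        self.metadata = {}   # media id -> metadata dict
        self.content = {}    # media id -> raw file bytes
        self._ids = itertools.count(1)

    def post_metadata(self, meta):
        """Step 1: create the metadata entry and reply 201 with a ContentUrl."""
        media_id = next(self._ids)
        location = f"{self.base_url}/{media_id}"
        body = dict(meta, ContentUrl=f"{location}/content")
        self.metadata[media_id] = body
        return 201, {"Location": location}, body

    def put_content(self, media_id, data):
        """Step 2: store the file bytes sent to the ContentUrl."""
        if media_id not in self.metadata:
            return 404
        created = media_id not in self.content
        self.content[media_id] = data
        # PUT is idempotent: 201 on first upload, 204 when replacing
        return 201 if created else 204


store = MediaStore()
status, headers, body = store.post_metadata(
    {"Name": "Test", "Latitude": 12.59817, "Longitude": 52.12873}
)
put_status = store.put_content(1, b"raw file bytes")
```

Because the PUT targets a URL the server chose, repeating it after a dropped connection simply overwrites the same content, which is one reason PUT fits the second step well.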

The nice thing about this approach is that when your server starts to get weighed down with immense volumes of data, the URL that you return can just point to some other server with more space/capacity. Or you could implement some kind of round-robin approach if bandwidth is an issue.

Eolanda answered 15/10, 2010 at 1:26 Comment(11)
One advantage to sending the content first is that by the time the metadata exists, the content is already present. Ultimately the right answer depends on the organisation of the data in the system.Spears
Thanks, I marked this as the correct answer because this is what I wanted to do. Unfortunately, due to a weird business rule, we have to allow the upload to occur in any order (metadata first or file first). I was wondering if there was a way to combine the two in order to save the headache of dealing with both situations.Imbroglio
@Daniel If you POST the data file first, then you can take the URL returned in Location and add it to the ContentUrl attribute in the metadata. That way, when the server receives the metadata, if a ContentUrl exists then it already knows where the file is. If there is no ContentUrl, then it knows that it should create one.Eolanda
If you were to do the file POST first, would you post to the same URL (/server/data/media), or would you create another entry point for file-first uploads?Pretext
@Matt No. I would return a link header with rel="metadata" and it would tell me where to put the metadata.Eolanda
Say we wanted to upload multiple files, would it work by posting multiple meta data objects, like so -> { <meta_data_obj_1>, <meta_data_obj2 }, and in return get objects + there IDS { <meta_data_obj_1 + ID>, <meta_data_obj2 + ID> }. We would then simply PUT the file data to http://server/upload/file/<meta_data_obj_ID> one at a time? Is this how facebook and other major players do it?Truancy
Out of curiosity, @DarrellMiller, why do you recommend PUT over POST for sending file data? We are using ASP.NET Core 2.0 and many examples use POST for sending the file data. Is this just so that we are more closely following the W3C standards? Or, is there a technical reason for it?Tommi
This post SEEMs like a very elegant solution. But, what about the case where you need to pass across some authentication credentials or auth token? In this case, you would either have to pass them in the querystring or as headers, which really takes us back to the original problem. The original problem being that we need to upload some details with the file data. So, I can't really see how breaking the call up in to two separate parts POST->PUT solves the problem.Tommi
IMO this is not a perfect solution in the sense that it breaks data integrity. Meta data and payload should be in one transaction. The question didn't say anything about this, but I suppose this could be important for many other similar use cases. Making it two separate requests simply breaks this. Erik Allik's answer is a (much)better wayIndophenol
@Indophenol What if the metadata included the number of "likes" of an image? Would you treat it as a single resource then? Or more obviously, are you suggesting that if I wanted to edit the description of an image, I would need to re-upload the image? There are many cases where multi-part forms are the right solution. It is just not always the case.Eolanda
@DarrelMiller hmm... ok, make sense. I guess I had a narrow view. Though in my case( and I believe some of other people's cases) multipart form is still a better way, this could be a good solution for the cases you mentioned. Thanks for the reply. Really appreciated. Down vote removed. (update: can't remove downvote... sorry)Indophenol
A
130

Just because you're not wrapping the entire request body in JSON doesn't mean it's not RESTful to use multipart/form-data to post both the JSON and the file(s) in a single request:

curl -F "metadata=<metadata.json" -F "file=@my-file.tar.gz" http://example.com/add-file

on the server side:

class AddFileResource(Resource):
    def render_POST(self, request):
        metadata = json.loads(request.args['metadata'][0])
        file_body = request.args['file'][0]
        ...

to upload multiple files, it's possible to either use separate "form fields" for each:

curl -F "metadata=<metadata.json" -F "file1=@some-file.tar.gz" -F "file2=@some-other-file.tar.gz" http://example.com/add-file

...in which case the server code will have request.args['file1'][0] and request.args['file2'][0]

or reuse the same one for many:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz" -F "files=@some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will simply be a list of length 2.

or pass multiple files through a single field:

curl -F "metadata=<metadata.json" -F "files=@some-file.tar.gz,some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will be a string containing all the files, which you'll have to parse yourself — not sure how to do it, but I'm sure it's not difficult, or better just use the previous approaches.

The difference between @ and < is that @ causes the file to get attached as a file upload, whereas < attaches the contents of the file as a text field.

P.S. Just because I'm using curl as a way to generate the POST requests doesn't mean the exact same HTTP requests couldn't be sent from a programming language such as Python or using any sufficiently capable tool.
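For instance, here is a rough standard-library sketch of building the same kind of request in Python. The encode_multipart helper, the placeholder filename and the fixed boundary are my own assumptions for illustration; a real client would more likely use an HTTP library that does this for you.

```python
import json
import uuid
import urllib.request


def encode_multipart(metadata, filename, file_bytes, boundary=None):
    """Build a multipart/form-data body: a 'metadata' text field plus a 'file' upload."""
    boundary = boundary or uuid.uuid4().hex
    delim = f"--{boundary}\r\n".encode("ascii")
    parts = [
        # Plain text field, like curl's "<" form
        delim,
        b'Content-Disposition: form-data; name="metadata"\r\n\r\n',
        json.dumps(metadata).encode("utf-8"),
        b"\r\n",
        # File upload with a filename, like curl's "@" form
        delim,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        b"\r\n",
        # Closing delimiter
        f"--{boundary}--\r\n".encode("ascii"),
    ]
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


content_type, body = encode_multipart(
    {"Name": "Test", "Latitude": 12.59817, "Longitude": 52.12873},
    "example.tar.gz",                  # placeholder filename
    b"\x1f\x8b stand-in archive bytes",
    boundary="example-boundary",       # fixed only to keep the output predictable
)

# Prepared but not sent; urllib.request.urlopen(request) would perform the POST.
request = urllib.request.Request(
    "http://example.com/add-file",
    data=body,
    headers={"Content-Type": content_type},
    method="POST",
)
```

In real use, leave the boundary argument out so a random one is generated; it must not occur anywhere inside the file bytes.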

Aventurine answered 25/10, 2012 at 20:15 Comment(6)
I had been wondering about this approach myself, and why I hadn't seen anyone else put it forth yet. I agree, seems perfectly RESTful to me.Fertilizer
YES! This is very practical approach, and it isn't any less RESTful than using "application/json" as a content type for the whole request.Penates
..but that's only possible if you have the data in a .json file and upload it, which is not the caseCredulous
@mjolnic your comment is irrelevant: the cURL examples are just, well, examples; the answer explicitly states that you can use anything to send off the request... also, what prevents you from just writing curl -F 'metadata={"foo": "bar"}'?Aventurine
I'm using this approach because the accepted answer wouldn't work for the application I'm developing (the file cannot exist before the data and it adds unnecessary complexity to handle the case where the data is uploaded first and the file never uploads).Cauthen
this is the one.Autonomy
S
42

One way to approach the problem is to make the upload a two phase process. First, you would upload the file itself using a POST, where the server returns some identifier back to the client (an identifier might be the SHA1 of the file contents). Then, a second request associates the metadata with the file data:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentID": "7a788f56fa49ae0ba5ebde780efe4d6a89b5db47"
}
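A small sketch of that content-first flow, assuming the SHA-1 of the file is used as the identifier. The two dictionaries and the function names are hypothetical stand-ins for whatever storage the server really uses, and the 404 is just one possible status for an unknown ContentID.

```python
import hashlib

content_by_id = {}    # ContentID -> raw file bytes
metadata_by_id = {}   # ContentID -> metadata dict


def upload_content(data):
    """Phase 1: store the file and return its SHA-1 hex digest as the identifier."""
    content_id = hashlib.sha1(data).hexdigest()
    content_by_id[content_id] = data
    return content_id


def attach_metadata(meta):
    """Phase 2: accept metadata only if it names content that was already uploaded."""
    content_id = meta.get("ContentID")
    if content_id not in content_by_id:
        return 404   # unknown or missing ContentID (illustrative status choice)
    metadata_by_id[content_id] = meta
    return 201


content_id = upload_content(b"stand-in file bytes")
status = attach_metadata(
    {"Name": "Test", "Latitude": 12.59817, "Longitude": 52.12873, "ContentID": content_id}
)
```

A content hash has the side effect that uploading the same bytes twice yields the same identifier, so duplicates collapse into one stored file.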

Including the file data base64 encoded into the JSON request itself will increase the size of the data transferred by 33%. This may or may not be important depending on the overall size of the file.
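The 33% figure follows from base64 emitting 4 output characters for every 3 input bytes. A quick check in Python, using made-up stand-in data and a hypothetical "file" property:

```python
import base64
import json

raw = bytes(range(256)) * 12          # 3072 bytes of stand-in file data
encoded = base64.b64encode(raw)       # 4 output characters per 3 input bytes

# Relative growth of the payload: 4/3 - 1, i.e. about one third
overhead = len(encoded) / len(raw) - 1

# What a base64 "file" property inside the JSON body would look like
payload = json.dumps({"Name": "Test", "file": encoded.decode("ascii")})
```

For inputs whose length is not a multiple of 3, padding adds up to two more characters, which is negligible for anything but tiny files.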

Another approach might be to use a POST of the raw file data, but include any metadata in the HTTP request header. However, this falls a bit outside basic REST operations and may be more awkward for some HTTP client libraries.
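As a sketch of the header variant: the X-Media-Metadata header name is invented for this example, and the request is only constructed, not sent.

```python
import json
import urllib.request

metadata = {"Name": "Test", "Latitude": 12.59817, "Longitude": 52.12873}
file_bytes = b"raw file bytes"

# Prepared but not sent; urllib.request.urlopen(request) would perform the POST.
request = urllib.request.Request(
    "http://server/data/media",
    data=file_bytes,                       # the body is the untouched file content
    headers={
        "Content-Type": "application/octet-stream",
        # Hypothetical custom header carrying the metadata as compact JSON
        "X-Media-Metadata": json.dumps(metadata, separators=(",", ":")),
    },
    method="POST",
)


def read_metadata(headers):
    """Server side: recover the metadata dict from the custom header."""
    return json.loads(headers["X-Media-Metadata"])
```

json.dumps escapes non-ASCII characters by default, which keeps the header value ASCII-only, but servers and proxies commonly cap header size, so this only suits small metadata.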

Spears answered 15/10, 2010 at 0:36 Comment(3)
You can use Ascii85 increasing just by 1/4.Clayson
Any reference on why base64 increases the size that much?Nonjuror
@jam01: Coincidentally, I just saw something yesterday which answers the space question well: What is the space overhead of Base64 encoding?Spears
N
23

I don't understand why, over the course of eight years, no one has posted the easy answer. Rather than encode the file as base64, encode the JSON as a string. Then just decode the JSON on the server side.

In Javascript:

let formData = new FormData();
formData.append("file", myfile);
formData.append("myjson", JSON.stringify(myJsonObject));

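POST it as multipart/form-data. (When you send a FormData object with fetch or XMLHttpRequest, let the browser set the Content-Type header itself so that it includes the boundary.)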

On the server side, retrieve the file normally, and retrieve the json as a string. Convert the string to an object, which is usually one line of code no matter what programming language you use.
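To make the server side concrete, here is a rough Python sketch that parses such a body with the standard library's email package. The parse_form helper and the sample body are assumptions for illustration; most web frameworks already hand you the fields and files directly.

```python
import json
from email.parser import BytesParser
from email.policy import HTTP


def parse_form(content_type, body):
    """Split a multipart/form-data body into {field name: (filename, bytes)}."""
    # Prepend the Content-Type header so the email parser knows the boundary.
    head = f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
    message = BytesParser(policy=HTTP).parsebytes(head + body)
    fields = {}
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        fields[name] = (part.get_filename(), part.get_payload(decode=True))
    return fields


# Sample body shaped like what the FormData code above would produce
boundary = "example-boundary"
body = (
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="file"; filename="notes.txt"\r\n'
    "Content-Type: text/plain\r\n\r\n"
    "file body\r\n"
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="myjson"\r\n\r\n'
    '{"Name": "Test", "Latitude": 12.59817}\r\n'
    f"--{boundary}--\r\n"
).encode("utf-8")

fields = parse_form(f"multipart/form-data; boundary={boundary}", body)
filename, file_bytes = fields["file"]
metadata = json.loads(fields["myjson"][1])   # the one-line decode step
```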

(Yes, it works great. Doing it in one of my apps.)

Noctilucent answered 17/5, 2019 at 22:59 Comment(3)
I'm way more surprised that no one expanded on Mike's answer, because that's exactly how multipart should be used: each part has its own MIME type, and DRF's multipart parser should dispatch accordingly. Perhaps it's hard to create this type of envelope on the client side. I really should investigate...Anastigmatic
This is the best solution so far. Im wondering if we can unravel the JSON string values such that they appear as $_POST keys in PHP an HTML form would that contains both a file and a series of inputs.Impervious
I don't see how this is different than Erik's answer. The OP was talking about base64 encoding the file, not the metadata. It would be weird to base64 a json object but that's not what the OP was asking for. He wanted a way to send the file via content-type: application/jsonHeathenism
M
11

I realize this is a very old question, but hopefully this will help someone else out as I came upon this post looking for the same thing. I had a similar issue, just that my metadata was a Guid and int. The solution is the same though. You can just make the needed metadata part of the URL.

POST accepting method in your "Controller" class:

public Task<HttpResponseMessage> PostFile(string name, float latitude, float longitude)
{
    //See https://mcmap.net/q/55700/-how-to-accept-a-file-post for how to accept a file
    return null;
}

Then, wherever you register your routes (WebApiConfig.Register(HttpConfiguration config) in my case):

config.Routes.MapHttpRoute(
    name: "FooController",
    routeTemplate: "api/{controller}/{name}/{latitude}/{longitude}",
    defaults: new { }
);
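
The same idea, sketched in Python for readers not using Web API: the metadata travels in the URL path and the request body stays free for the raw file. The regular expression and the parse_route helper are hypothetical and only mirror the route template above.

```python
import re
from urllib.parse import unquote

# Mirrors the route template api/{controller}/{name}/{latitude}/{longitude}
ROUTE = re.compile(
    r"^/api/(?P<controller>[^/]+)/(?P<name>[^/]+)/(?P<latitude>[^/]+)/(?P<longitude>[^/]+)$"
)


def parse_route(path):
    """Return the typed metadata carried in the URL path, or None if it does not match."""
    match = ROUTE.match(path)
    if match is None:
        return None
    try:
        return {
            "controller": match["controller"],
            "name": unquote(match["name"]),         # undo percent-encoding
            "latitude": float(match["latitude"]),
            "longitude": float(match["longitude"]),
        }
    except ValueError:   # latitude or longitude was not numeric
        return None


params = parse_route("/api/media/Test/12.59817/52.12873")
```

This works for a handful of short, simple values; anything containing slashes or long free text is better sent in the body.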
Maunsell answered 25/2, 2013 at 16:49 Comment(0)
H
10

If your file and its metadata together create one resource, it's perfectly fine to upload them both in one request. A sample request would be:

POST https://target.com/myresources/resourcename HTTP/1.1
Accept: application/json
Content-Type: multipart/form-data; boundary=-----------------------------28947758029299
Host: target.com

-------------------------------28947758029299
Content-Disposition: form-data; name="application/json"

{"markers": [
        {
            "point":new GLatLng(40.266044,-74.718479), 
            "homeTeam":"Lawrence Library",
            "awayTeam":"LUGip",
            "markerImage":"images/red.png",
            "information": "Linux users group meets second Wednesday of each month.",
            "fixture":"Wednesday 7pm",
            "capacity":"",
            "previousScore":""
        },
        {
            "point":new GLatLng(40.211600,-74.695702),
            "homeTeam":"Hamilton Library",
            "awayTeam":"LUGip HW SIG",
            "markerImage":"images/white.png",
            "information": "Linux users can meet the first Tuesday of the month to work out harward and configuration issues.",
            "fixture":"Tuesday 7pm",
            "capacity":"",
            "tv":""
        },
        {
            "point":new GLatLng(40.294535,-74.682012),
            "homeTeam":"Applebees",
            "awayTeam":"After LUPip Mtg Spot",
            "markerImage":"images/newcastle.png",
            "information": "Some of us go there after the main LUGip meeting, drink brews, and talk.",
            "fixture":"Wednesday whenever",
            "capacity":"2 to 4 pints",
            "tv":""
        },
] }

-------------------------------28947758029299
Content-Disposition: form-data; name="name"; filename="myfilename.pdf"
Content-Type: application/octet-stream

%PDF-1.4
%
2 0 obj
<</Length 57/Filter/FlateDecode>>stream
x+r
26S00SI2P0Qn
F
!i\
)%[email protected]
[
endstream
endobj
4 0 obj
<</Type/Page/MediaBox[0 0 595 842]/Resources<</Font<</F1 1 0 R>>>>/Contents 2 0 R/Parent 3 0 R>>
endobj
1 0 obj
<</Type/Font/Subtype/Type1/BaseFont/Helvetica/Encoding/WinAnsiEncoding>>
endobj
3 0 obj
<</Type/Pages/Count 1/Kids[4 0 R]>>
endobj
5 0 obj
<</Type/Catalog/Pages 3 0 R>>
endobj
6 0 obj
<</Producer(iTextSharp 5.5.11 2000-2017 iText Group NV \(AGPL-version\))/CreationDate(D:20170630120636+02'00')/ModDate(D:20170630120636+02'00')>>
endobj
xref
0 7
0000000000 65535 f 
0000000250 00000 n 
0000000015 00000 n 
0000000338 00000 n 
0000000138 00000 n 
0000000389 00000 n 
0000000434 00000 n 
trailer
<</Size 7/Root 5 0 R/Info 6 0 R/ID [<c7c34272c2e618698de73f4e1a65a1b5><c7c34272c2e618698de73f4e1a65a1b5>]>>
%iText-5.5.11
startxref
597
%%EOF

-------------------------------28947758029299--
Haily answered 30/6, 2017 at 10:25 Comment(0)
G
1

To build on ccleve's answer, if you are using superagent / express / multer, on the front end side build your multipart request doing something like this:

superagent
    .post(url)
    .accept('application/json')
    .field('myVeryRelevantJsonData', JSON.stringify({ peep: 'Peep Peep!!!' }))
    .attach('myFile', file);

cf https://visionmedia.github.io/superagent/#multipart-requests.

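On the Express side, whatever was passed as a field ends up in req.body. For a multipart request it is multer that populates it; express.json() only parses application/json bodies, so the following line matters only for plain JSON requests: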

app.use(express.json({ limit: '3MB' }));

Your route would include something like this:

const multerMemStorage = multer.memoryStorage();
const multerUploadToMem = multer({
  storage: multerMemStorage,
  // Also specify fileFilter, limits...
});

router.post('/myUploads',
  multerUploadToMem.single('myFile'),
  async (req, res, next) => {
    // Find back myVeryRelevantJsonData :
    logger.verbose(`Uploaded req.body=${JSON.stringify(req.body)}`);

    // If your file is text:
    const newFileText = req.file.buffer.toString();
    logger.verbose(`Uploaded text=${newFileText}`);
    return next();
  },
  ...

One thing to keep in mind though is this note from the multer doc, concerning disk storage:

Note that req.body might not have been fully populated yet. It depends on the order that the client transmits fields and files to the server.

I guess this means it would be unreliable to, say, compute the target directory or filename from JSON metadata passed along with the file, unless the client is known to send the metadata field before the file.

Ganesa answered 7/10, 2020 at 12:44 Comment(0)
