How to serialize requests Response object as HAR
Asked Answered
F

2

24

I want to serialize a requests Response object as json, preferably in HAR format.

import requests
resp = requests.get('http://httpbin.org/get')

har = to_har(resp)  # <--- magic

but couldnt find anything online with my google-fu powers.

it seems that all data exist on the Response object, i hope i dont need to implement the whole HAR spec and there exist some code/utility i can reuse.

a valid answer might give: existing library or refer to a starting point if nothing exists so far for python and/or requests.

currently my simpler 3min solution (not HAR format) serialization to Response object looks like this (might be good start point if nothing exists):

def resp2dict(resp, _root=True):
    d = {
        'text': resp.text,
        'headers': dict(resp.headers),
        'status_code': resp.status_code,
        'request': {
            'url': resp.request.url,
            'method': resp.request.method,
            'headers': dict(resp.request.headers),
        },
    }

    if _root:
        d['history'] = [resp2dict(h, False) for h in resp.history]
    return d

i post this as i think not only me struggle to serialize Response objects to json in general regardless of HAR format.

Foltz answered 4/6, 2019 at 11:16 Comment(2)
I think most people are happy with resp.json(), which may or may not comply with HAR, I don't know. You could also add fields to the dict returned by resp.json().Engeddi
resp.json() only works for json responses and only serialize the body of the response. you dont serialize the headers, url, request or redirect history. Its a different problem than im looking to solve. its actually deserializing the msg rather than serializingFoltz
R
2

currently my simpler 3min solution (not HAR format) serialization to Response object looks like this (might be good start point if nothing exists):

Looks like this is the best solution. I've checked every HAR-related library on PyPI and the only close solution I found (except har2requests) is marshmallow-har. Unfortunately, marshmallow_har.Response.__schema__ match the internal structure neither requests.Response nor urllib3.response.HTTPResponse. So, the solutions I see are:

  1. Use the ad-hoc solution as you already do. To make sure the result has the correct structure, marshmallow-har can be used.
  2. Make your own marshmallow schemas by providing attribute argument to the fields. I'd advise to fork and extend marshmallow-har but it uses factories and other strange magic and can't be easily extended. So, it's better to start from ground zero.

And consider open-sourcing your solution :)

Rectocele answered 4/8, 2020 at 9:24 Comment(0)
C
0

You can find a working solution written in python from this project https://github.com/scrapinghub/splash/blob/master/splash/har_builder.py

Following this working solution may provide you the clues you need

Cowfish answered 28/3, 2023 at 14:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.