FastAPI issues with MongoDB - TypeError: 'ObjectId' object is not iterable

I am having some issues inserting into MongoDB via FastAPI.

The code below works as expected. Note that the response variable is not passed to response_to_mongo(); a separate dictionary literal is passed instead.

The model is an sklearn ElasticNet model.

from typing import List

import pandas as pd
import pymongo
from fastapi import FastAPI

app = FastAPI()


def response_to_mongo(r: dict):
    client = pymongo.MongoClient("mongodb://mongo:27017")
    db = client["models"]
    model_collection = db["example-model"]
    model_collection.insert_one(r)


@app.post("/predict")
async def predict_model(features: List[float]):

    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )

    response = {"predictions": prediction.tolist()}
    response_to_mongo(
        {"predictions": prediction.tolist()},
    )
    return response

However, when I write predict_model() like this and pass the response variable to response_to_mongo():

@app.post("/predict")
async def predict_model(features: List[float]):

    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )

    response = {"predictions": prediction.tolist()}
    response_to_mongo(
        response,
    )
    return response

I get an error stating that:

TypeError: 'ObjectId' object is not iterable

From my reading, it seems that this is due to BSON/JSON issues between FastAPI and Mongo. However, why does it work in the first case when I do not use a variable? Is this due to the asynchronous nature of FastAPI?

Meliamelic answered 14/3, 2022 at 12:15 Comment(4)
While it seems like a stretch, does ObjectId get populated inside the response object when it is sent to insert_one? If that is the case, your first example ends up with it being inserted into a throwaway dict, while in the second example it gets inserted into a dict you're still referencing. - Amarillo
@Amarillo I wouldn't think that's the case, because the response object is not being changed in place. - Meliamelic
Sounds like that's exactly what's happening based on the answer below :-) - Amarillo
Happy to be proven wrong! Thanks a lot for your answer :D - Meliamelic

As per the documentation:

When a document is inserted a special key, "_id", is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult. For more information on "_id", see the documentation on _id.

Thus, in the second case of your example, when you pass the dictionary to insert_one(), PyMongo adds the unique identifier (i.e., an "_id" key holding an ObjectId) to that same dictionary, in place. The response you then return from the endpoint therefore contains an ObjectId, which cannot be serialized. As described in detail in this answer, FastAPI, by default, converts the return value into JSON-compatible data using the jsonable_encoder and then returns a JSONResponse, which uses the standard json library to serialise the data. The jsonable_encoder has no encoder registered for ObjectId, so it falls back to calling dict() on it, which raises TypeError: 'ObjectId' object is not iterable. In the first case, the "_id" is added to a throwaway dictionary literal, so response is left untouched. This has nothing to do with the asynchronous nature of FastAPI.
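
The difference between the two cases can be reproduced without a database. The sketch below is only an illustration: FakeObjectId and fake_insert_one are hypothetical stand-ins (not part of PyMongo) that mimic the documented behaviour of adding "_id" to the dictionary it receives.

```python
import json
import uuid


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId: an opaque, non-JSON value."""

    def __init__(self):
        self.value = uuid.uuid4().hex[:24]


def fake_insert_one(document: dict) -> None:
    """Hypothetical stand-in for insert_one(): adds "_id" in place if missing."""
    if "_id" not in document:
        document["_id"] = FakeObjectId()


# Case 1: a separate dict literal is inserted; `response` is never touched.
response = {"predictions": [1.5]}
fake_insert_one({"predictions": [1.5]})
case1_keys = sorted(response)
json.dumps(response)  # serializes without error

# Case 2: `response` itself is inserted, so it gains an "_id" key.
response = {"predictions": [1.5]}
fake_insert_one(response)
case2_keys = sorted(response)
try:
    json.dumps(response)
    case2_serializable = True
except TypeError:
    # the standard json library does not know how to encode FakeObjectId
    case2_serializable = False
```

Python passes the dictionary object itself to the function, not a copy, so any key the callee adds is visible to the caller afterwards.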

Solution 1

Use the approach demonstrated here: register str as the default encoder for ObjectId, so that you can return the response as usual inside your endpoint. Note that pydantic.json.ENCODERS_BY_TYPE is part of the Pydantic v1 API.

# place these at the top of your .py file
import pydantic
from bson import ObjectId

# every ObjectId is now converted with str() during encoding
pydantic.json.ENCODERS_BY_TYPE[ObjectId] = str

# inside your endpoint
return response  # as usual

Solution 2

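Dump the loaded BSON to a valid JSON string and then reload it as a dict, as described here and here. Note that json_util.dumps() produces MongoDB Extended JSON, so the ObjectId comes back as {"$oid": "..."} rather than as a plain string.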

from bson import json_util
import json

response = json.loads(json_util.dumps(response))
return response

Solution 3

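Define a custom JSONEncoder, as described here, to convert the ObjectId into str. Since encode() returns a JSON string, load it back into a dict before returning it; otherwise FastAPI would encode that string a second time:

import json
from bson import ObjectId

class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        # convert an ObjectId to its hex string form
        if isinstance(o, ObjectId):
            return str(o)
        # defer to the base class, which raises TypeError for unknown types
        return super().default(o)


response = json.loads(JSONEncoder().encode(response))
return response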
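
To see what such an encoder produces at each step, here is a minimal sketch that runs without PyMongo. FakeObjectId and IdEncoder are hypothetical names used only for illustration; with the real driver you would check for bson.ObjectId instead.

```python
import json


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId so the sketch runs without PyMongo."""

    def __init__(self, hex_value: str):
        self.hex_value = hex_value

    def __str__(self) -> str:
        return self.hex_value


class IdEncoder(json.JSONEncoder):
    def default(self, o):
        # called only for objects the json library cannot encode itself
        if isinstance(o, FakeObjectId):
            return str(o)
        return super().default(o)


document = {"_id": FakeObjectId("53ad61aa06998f07cee687c3"), "predictions": [1.5]}

encoded = IdEncoder().encode(document)  # a JSON string
decoded = json.loads(encoded)           # a plain dict with a string "_id"
```

The encoded value is a str, not a dict, which is worth keeping in mind when deciding what the endpoint should return.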

Solution 4

You can have a separate output model without the 'ObjectId' (_id) field, as described in the documentation. You can declare the model used for the response with the parameter response_model in the decorator of your endpoint. Example:

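from bson import ObjectId
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class ResponseBody(BaseModel):
    name: str
    age: int


@app.get('/', response_model=ResponseBody)
def main():
    # response sample; '_id' is not declared on ResponseBody, so it is filtered out
    response = {'_id': ObjectId('53ad61aa06998f07cee687c3'), 'name': 'John', 'age': 25}
    return response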

Solution 5

Remove the "_id" entry from the response dictionary before returning it (see here on how to remove a key from a dict):

response.pop('_id', None)
return response
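
A related option, in the spirit of the working first case from the question, is to hand insert_one() a copy so that the dictionary you return is never modified. The sketch below assumes a shallow copy is enough because only the top-level "_id" key is added; fake_insert_one is a hypothetical stand-in for the real call, which would look something like model_collection.insert_one(dict(r)).

```python
def fake_insert_one(document: dict) -> None:
    """Hypothetical stand-in for insert_one(): adds "_id" in place if missing."""
    document.setdefault("_id", object())


response = {"predictions": [1.5]}

# Hand the driver a shallow copy; the "_id" key lands on the copy only.
to_store = dict(response)
fake_insert_one(to_store)
```

After this, to_store carries the "_id" while response still holds only the JSON-friendly data.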
Kidding answered 14/3, 2022 at 13:59 Comment(0)

Solution 4, from Chris's excellent answer, can also be accomplished with a return type annotation on the endpoint function, which FastAPI (0.89.0 and later) uses as the response model:

from pydantic import BaseModel


class ResponseBody(BaseModel):
    name: str
    age: int


@app.get('/')
def example() -> ResponseBody:
    # you'd need to await this if you were using Motor (the Async MongoDB Driver)
    return db.my_collection.find_one(...)
Roanna answered 16/10, 2023 at 8:47 Comment(0)
