What happens when an async put results in a contention exception after the request has ended, on App Engine with NDB?
T

2

5

Using NDB, let's say I call put_async() on 40 entities in a handler decorated with @ndb.toplevel, write output to the user, and end the request, but one of those put_async() calls results in a contention exception. Would the response be 500 or 200? And if it is a task, would the task get re-executed?

One solution is to call get_result() on all 40 futures before the request ends and catch any exceptions that occur, but I'm not sure whether that will affect performance.
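A minimal sketch of that idea, assuming each put_async() call returns a future whose get_result() re-raises any stored exception. FakeFuture and collect_failures are hypothetical names used so the snippet runs without the App Engine SDK; in a real handler the futures would come from put_async(). Since @ndb.toplevel already keeps the handler from exiting until its async requests have finished, waiting explicitly should cost roughly the same time, though that is an assumption worth measuring.

```python
import logging


def collect_failures(futures):
    """Wait on every future and return (index, exception) pairs for failures."""
    failures = []
    for index, future in enumerate(futures):
        try:
            # Blocks until this future is done; re-raises its exception, if any.
            future.get_result()
        except Exception as exc:
            logging.warning("put %d failed: %r", index, exc)
            failures.append((index, exc))
    return failures


class FakeFuture(object):
    """Hypothetical stand-in for the future returned by put_async()."""

    def __init__(self, error=None):
        self._error = error

    def get_result(self):
        if self._error is not None:
            raise self._error
        return "ok"


# The second "put" simulates a contention error.
futures = [FakeFuture(), FakeFuture(RuntimeError("contention")), FakeFuture()]
failures = collect_failures(futures)
```

With the failures collected, the handler can decide for itself whether to return an error status or retry the affected puts.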

Trifocal answered 1/9, 2012 at 4:32 Comment(0)
F
3

That's odd. I use toplevel and expect the opposite behavior, and that's what I observe. Did something change since the first answer to this question? As the doc says:

This in turn lets you send off the request and not worry about the result.

You can try the following unittest (using testbed):

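from google.appengine.ext import ndb

@ndb.tasklet
def raiseSomething():
    # Do one async datastore call, then fail inside the tasklet.
    yield ndb.Key('foo', 'bar').get_async()
    raise Exception()

@ndb.toplevel
def callRaiseSomething():
    # Fire and forget: the future is never yielded or waited on.
    future = raiseSomething()
    return "hello"

# Inside a test method, with the testbed activated:
response = callRaiseSomething()
self.assertEqual(response, "hello")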

This test passes. NDB logs a warning: "suspended generator raiseSomething(tests.py:90) raised Exception()", but it does not re-raise the exception.

ndb.toplevel only waits for the RPCs; it does nothing with their actual results. If your decorated function is itself a tasklet, toplevel calls get_result() on it first, and at that point its exceptions are raised. Then it waits for the remaining 'orphaned' RPCs, and only logs something if one of them raises an exception.

So my response is: the request will succeed (return 200)
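The behavior described above can be modelled in plain Python. This is a toy model under stated assumptions, not NDB's implementation: toplevel_like, FakeFuture, and PENDING are made-up names, and the only point is to show that exceptions from orphaned futures are logged rather than re-raised, so the caller still gets its return value.

```python
import functools
import logging

PENDING = []  # toy registry of futures started during the current call


class FakeFuture(object):
    """Hypothetical stand-in for an async operation that may fail."""

    def __init__(self, error=None):
        self.error = error
        PENDING.append(self)

    def get_result(self):
        if self.error is not None:
            raise self.error
        return None


def toplevel_like(func):
    """Toy decorator: run func, then wait on orphaned futures and log failures."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        del PENDING[:]
        # Exceptions raised directly by func still propagate to the caller.
        result = func(*args, **kwargs)
        for future in list(PENDING):
            try:
                future.get_result()
            except Exception as exc:
                # Orphaned failure: logged only, never re-raised.
                logging.warning("orphaned future raised %r", exc)
        return result

    return wrapper


@toplevel_like
def handler():
    FakeFuture(error=RuntimeError("contention"))  # fire and forget
    return "hello"


response = handler()
```

To get a failing response instead, the handler would have to wait on the future itself, which gives up the fire-and-forget behavior.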

Frostwork answered 15/8, 2013 at 13:42 Comment(6)
I think you need to add yield to the call to raiseSomething() for toplevel to apply. - Tardif
If you yield, then you are just waiting for the result of "raiseSomething" before executing the rest of "callRaiseSomething", and so not using the "fire and forget" feature of toplevel. - Frostwork
Yes, that's correct, using yield wouldn't make sense in this instance. Having worked through your answer, I've updated my answer. Strangely enough, calling put_async() in the handler decorated with @ndb.toplevel will raise 500 intermittently. I've upvoted your answer. - Tardif
I noticed that some errors are indeed not properly handled when they occur in NDB internal code (the batcher). For instance, if you use ndb.Key('foo', 0) instead, a BadRequestError will be raised, but not from the tasklet, hence skipping the ndb.toplevel handling. - Frostwork
@RobCurtis I'm curious to know what kind of error seems to intermittently raise 500 in your tests. - Frostwork
Decorate the handler with ndb.toplevel. In the handler, call put_async() on a model. Disable datastore writes in the admin console (live). Call the handler from a browser. I sometimes get 500, and other times I get 200 with a response. The exception: CapabilityDisabledError: Datastore writes have been disabled by an application administrator. Writes can be re-enabled in the admin console. - Tardif
T
4

As far as I understand, using @ndb.toplevel causes the handler to wait for all async operations to finish before exiting. From the docs:

As a convenience, you can decorate the request handler with @ndb.toplevel. This tells the handler not to exit until its asynchronous requests have finished. This in turn lets you send off the request and not worry about the result. https://developers.google.com/appengine/docs/python/ndb/async#intro

So with @ndb.toplevel, the response doesn't actually get returned until the async methods have finished executing. Using @ndb.toplevel removes the need to call get_result() on every async call that was fired off. Based on this, the request would still return 500 if the async queries failed, because they all need to complete before the handler returns. (Update: see the edit at the end of this answer.)

If using a task (I assume you mean task queue) the task queue will retry the request if the request fails. So your handler could be something like:

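from google.appengine.ext import deferred

def get(self):
    # Queue the expensive puts as a task; the handler returns right away.
    deferred.defer(execute_stuff_in_background, param, param1)
    template.render(...)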

and execute_stuff_in_background would do all the expensive puts once the handler had returned. If there was a contention issue in the task, your original handler would still return 200.
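A rough sketch of what the background function could look like if the goal is for a failed write to fail the task, so that the task queue retries it. FakeFuture is a hypothetical stand-in for the future returned by put_async(), and the function takes ready-made futures only to keep the snippet runnable without the SDK; a real version would take its own parameters and start the puts itself.

```python
class FakeFuture(object):
    """Hypothetical stand-in for the future returned by put_async()."""

    def __init__(self, error=None):
        self._error = error

    def get_result(self):
        if self._error is not None:
            raise self._error


def execute_stuff_in_background(futures):
    # Wait for every write first, then fail the task if any write failed.
    errors = []
    for future in futures:
        try:
            future.get_result()
        except Exception as exc:
            errors.append(exc)
    if errors:
        # Letting an exception escape is what makes the task count as failed.
        raise errors[0]


# Simulate one task run in which the second write hits contention.
try:
    execute_stuff_in_background([FakeFuture(), FakeFuture(ValueError("contention"))])
    task_failed = False
except ValueError:
    task_failed = True
```

A retried task runs the whole function again, so the puts it performs should be safe to repeat.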

If you suspect there is going to be a contention issue, perhaps consider sharding or using a fork-join queue implementation to handle the writes (see implementation here: http://www.youtube.com/watch?v=zSDC_TU7rtc#t=41m35)

Edit (short answer): The request will fail (return 500) if the async requests fail, because @ndb.toplevel waits for all results to finish before exiting.

Update: Having looked at @alexis's answer, I re-ran my original test (datastore writes turned off, put_async() called in a handler decorated with @ndb.toplevel). The response is a 500 only intermittently (I assume this depends on execution time). Based on this and @alexis's answer, don't expect the result to be 500 if an async call throws an exception and the calling function is decorated with @ndb.toplevel.

Tardif answered 1/9, 2012 at 10:4 Comment(4)
Thanks a lot for the contention link, but other than that I'm afraid your answer doesn't address the question, which is "will the request fail?" - Trifocal
@Kaan, cool, I've updated the answer. I wasn't clear enough. I ran a few tests to see when the request handler returns. It appears that ALL async requests need to be finished before the handler returns the response. So yes, it will 500. - Tardif
To test: disable datastore writes and remove ndb.toplevel from your handler. The put_async() won't throw an exception and the handler will return 200. However, include ndb.toplevel on the handler (with datastore writes disabled) and the handler will return 500, because exceptions are only thrown when get_result() is called (which is essentially what @ndb.toplevel does). - Tardif
Just one additional bit of info: just because you call put_async() 40 times doesn't mean there are 40 pending RPCs. It is quite possible that they are all combined into a single RPC or, more likely, into a small number of them. Since an RPC fails or succeeds as a unit, all put_async() calls that were combined into the same RPC will fail together (all raising the same exception). Finally, remember that this is a distributed system: it's possible that you get an error from an RPC when in fact it was executed by the Datastore server. - Ellette
