I don't see why you need a cached connection here rather than simply reconnecting on every request and caching the user's credentials somewhere, but I'll try to outline a solution that might fit your requirements.
I'd suggest looking at a more generic task first: caching something between subsequent requests that your app needs and can't serialize into django's sessions.
In your particular case this shared value would be a database connection (or multiple connections).
Let's start with the simple task of sharing a counter variable between requests, just to understand what's actually happening under the hood.
Amazingly, neither answer has mentioned anything about the web server you might use!
There are actually multiple ways to handle concurrent connections in web apps:
- Having multiple processes: each request lands in one of them at random
- Having multiple threads: each request is handled by a random thread
- Items 1 and 2 combined
- Various async techniques: a single process plus an event loop handles requests, with the caveat that request handlers must not block for long
From my own experience, items 1-2 are fine for the majority of typical web apps.
Apache 1.x could only work with item 1; Apache 2.x can handle all of items 1-3.
Let's start with the following django app and run a single-process gunicorn web server.
I'm going to use gunicorn because it's fairly easy to configure, unlike apache (personal opinion :-)
views.py

```python
import time

from django.http import HttpResponse

c = 0

def main(request):
    global c
    c += 1
    return HttpResponse('val: {}\n'.format(c))

def heavy(request):
    time.sleep(10)
    return HttpResponse('heavy done')
```
urls.py

```python
from django.contrib import admin
from django.urls import path

from . import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.main, name='main'),
    path('heavy/', views.heavy, name='heavy'),
]
```
Running it in a single process mode:
```
gunicorn testpool.wsgi -w 1
```
Here's our process tree; there's only one worker, which will handle ALL requests:

```
pstree 77292
-+= 77292 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
 \--- 77295 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1
```
Trying to use our app:

```
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 3
```
As you can see, the counter is easily shared between subsequent requests.
The problem is that you can only serve a single request at a time: if you request /heavy/ in one tab, / won't work until /heavy/ is done.
Let's now use 2 worker processes:

```
gunicorn testpool.wsgi -w 2
```
This is how the process tree looks:

```
pstree 77285
-+= 77285 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
|--- 77288 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
\--- 77289 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 2
```
Testing our app:

```
curl 'http://127.0.0.1:8000'
val: 1
curl 'http://127.0.0.1:8000'
val: 2
curl 'http://127.0.0.1:8000'
val: 1
```
The first two requests were handled by the first worker process, and the third by the second worker process, which has its own memory space, so you see 1 instead of 3.
Note that your output may differ, because processes 1 and 2 are picked at random. Sooner or later you'll hit a different process.
That's not very helpful for us, because we need to handle multiple concurrent requests, and we'd need to somehow route each request to a specific process, which can't be done in the general case.
Most pooling techniques that come out of the box only cache connections within the scope of a single process; if your request gets served by a different process, a NEW connection needs to be made.
Let's move on to threads:

```
gunicorn testpool.wsgi -w 1 --threads 2
```
Again, only 1 process:

```
pstree 77310
-+= 77310 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
 \--- 77313 oleg /Users/oleg/.virtualenvs/test3.4/bin/python /Users/oleg/.virtualenvs/test3.4/bin/gunicorn testpool.wsgi -w 1 --threads 2
```
Now if you run /heavy/ in one tab, you'll still be able to query /, and your counter will be preserved between requests!
Even if the number of threads grows or shrinks depending on your workload, it should still work fine.
Problems: you'll need to synchronize access to shared variables like this using python's thread synchronization techniques (read more).
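The sharing (and the need for a lock) is easy to demonstrate outside of django:

```python
import threading

c = 0
lock = threading.Lock()

def increment():
    # all threads see the same module-level counter,
    # so access must be synchronized
    global c
    for _ in range(1000):
        with lock:
            c += 1

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(c)  # 4000: every increment from every thread is visible
```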
Another problem is that the same user may need to issue multiple queries in parallel, i.e. open multiple tabs.
To handle that, you can open multiple connections on the first request, when you have the db credentials available.
If a user needs more connections than that, your app can wait on a lock until a connection becomes available.
Back to your question
You can create a class along the following lines. This is a sketch: how a connection is actually created depends on your db driver, so it's passed in here as a hypothetical `connect` callable.

```python
import threading
from contextlib import contextmanager

class ConnectionPool(object):
    def __init__(self, connect, max_connections=4):
        # `connect` is a callable creating a new db connection
        # from the user's credentials (driver-specific)
        self._connect = connect
        self._max_connections = max_connections
        self._pool = dict()  # session_id -> list of idle connections
        self._cond = threading.Condition()

    def preconnect(self, session_id, user, password):
        # create multiple connections up front,
        # while the credentials are at hand
        with self._cond:
            self._pool[session_id] = [
                self._connect(user, password)
                for _ in range(self._max_connections)
            ]

    @contextmanager
    def get_connection(self, session_id):
        with self._cond:
            # if there's no available connection, wait until
            # another thread returns one to the pool
            while not self._pool.get(session_id):
                self._cond.wait()
            # mark the connection as allocated by taking it out of the pool
            connection = self._pool[session_id].pop()
        try:
            yield connection
        finally:
            # put it back into the pool and wake up a waiting thread
            with self._cond:
                self._pool[session_id].append(connection)
                self._cond.notify()

pool = ConnectionPool(connect=..., max_connections=4)

def some_view(request):
    session_id = ...
    with pool.get_connection(session_id) as conn:
        conn.query(...)
```
This is not a complete solution: you'll still need to somehow delete outdated connections that haven't been used for a long time.
If a user comes back after a long time and their connection has been closed, they'll need to provide their credentials again; hopefully that's ok from your app's perspective.
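One way to handle the cleanup is a periodic sweep over last-use timestamps. A hypothetical sketch: `ttl`, `touch`, and the `close()` call on a connection are all assumptions here, not part of any particular driver's API:

```python
import threading
import time

class ExpiringPool(object):
    # hypothetical sketch: evict a session's connections once it has
    # been idle for longer than `ttl` seconds
    def __init__(self, ttl=1800):
        self._ttl = ttl
        self._pool = {}       # session_id -> list of connections
        self._last_used = {}  # session_id -> timestamp of last use
        self._lock = threading.Lock()

    def touch(self, session_id, connections):
        # record the session's connections and refresh its timestamp
        with self._lock:
            self._pool[session_id] = connections
            self._last_used[session_id] = time.time()

    def evict_stale(self):
        # call this periodically, e.g. from a background thread
        now = time.time()
        with self._lock:
            for session_id, ts in list(self._last_used.items()):
                if now - ts > self._ttl:
                    for conn in self._pool.pop(session_id, []):
                        conn.close()  # assuming the driver exposes close()
                    del self._last_used[session_id]
```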
Also keep in mind that python threads have their own performance penalties; not sure if this is an issue for you.
I haven't checked this with apache2 (too much configuration burden; I haven't used it for ages and generally use uwsgi), but it should work there too. I'd be happy to hear back from you if you manage to run it )
And don't forget about item 4 (the async approach): you're unlikely to be able to use it with apache, but it's worth investigating. Keywords: django + gevent, django + asyncio. It has its own pros and cons and may greatly affect your app's implementation, so it's hard to suggest any solution without knowing your app's requirements in detail.
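To give a taste of item 4, here's a minimal asyncio sketch (no django involved): a single thread interleaves one slow handler with several fast ones, as long as the slow one awaits instead of blocking.

```python
import asyncio

c = 0

async def main_view():
    global c
    c += 1
    return 'val: {}'.format(c)

async def heavy_view():
    # non-blocking sleep: the event loop keeps serving other "requests"
    await asyncio.sleep(0.1)
    return 'heavy done'

async def demo():
    # one heavy request and three quick ones, all on a single thread;
    # gather returns results in the order the coroutines were passed
    return await asyncio.gather(
        heavy_view(), main_view(), main_view(), main_view())

results = asyncio.run(demo())
print(results)  # ['heavy done', 'val: 1', 'val: 2', 'val: 3']
```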