BUG: Severe performance degradation when using Django 5.1 connection pool
The new Django 5.1 release adds connection pooling for PostgreSQL.
I noticed a very interesting, and terrifying, problem: the number of requests/second dramatically decreases when using django_prometheus together with the new connection pooling, to the point where Django eventually explodes.
I tested with these database settings:
DATABASES = {
    'default': {
        'ENGINE': 'django_prometheus.db.backends.postgresql',
        # 'ENGINE': 'django.db.backends.postgresql',
        'NAME': env('DB_NAME'),
        'USER': env('DB_USER'),
        'PASSWORD': env('DB_PASSWORD'),
        'HOST': env('DB_HOST'),
        'PORT': env('DB_PORT'),
        'DISABLE_SERVER_SIDE_CURSORS': True,
        'OPTIONS': {
            'pool': True,
        },
    }
}

And these requirements:
# Infrastructure
Django==5.1.3
adrf==0.1.8
gunicorn==20.1.0
uvicorn==0.22.0
# Metrics, Monitoring && logging
django-prometheus==2.3.1
prometheus-client==0.21.0
# db
psycopg==3.2.3
psycopg-pool==3.2.4

I noticed the problem using this command: hey -c 1 -n 100 https://127.0.0.1/api/v1/test/
With this view:
from adrf.views import APIView  # assuming adrf's async APIView, since adrf is in the requirements and the handler is async
from rest_framework import status
from rest_framework.response import Response


class TestView(APIView):
    permission_classes = ()

    async def get(self, request) -> Response:  # type: ignore
        del request
        origins = await Origin.objects.alast()
        if not origins:
            return Response(status=status.HTTP_204_NO_CONTENT)
        return Response(OriginGetResponseSerializer({'origins': [origins]}).data, status=status.HTTP_200_OK)

Where Origin is a simple model:
from uuid import uuid4
from django.db import models
class Origin(models.Model):
    id = models.UUIDField(
        primary_key=True,
        default=uuid4,
        editable=False,
        unique=True,
        blank=False,
        null=False,
    )
    name = models.CharField(
        unique=True,
        max_length=32,
    )
    active_android = models.BooleanField(default=False)
    active_ios = models.BooleanField(default=False)
    active_web = models.BooleanField(default=False)
    active_new_item = models.BooleanField(default=False)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    objects = models.Manager()  # To comply with mypy

Note that using django.db.backends.postgresql works as expected, steadily. Removing the OPTIONS key also avoids the issue.
When running the benchmark with hey -c 1 -n 100 https://127.0.0.1/api/v1/test/, I get a report like this:
Summary:
Total: 0.7236 secs
Slowest: 0.0113 secs
Fastest: 0.0063 secs
Average: 0.0072 secs
Requests/sec: 138.2019
Total data: 24400 bytes
Size/request: 244 bytes
Response time histogram:
0.006 [1] |■
0.007 [25] |■■■■■■■■■■■■■■■■■■■■■■■■■■
0.007 [39] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.008 [21] |■■■■■■■■■■■■■■■■■■■■■■
0.008 [9] |■■■■■■■■■
0.009 [4] |■■■■
0.009 [0] |
0.010 [0] |
0.010 [0] |
0.011 [0] |
0.011 [1] |■
Latency distribution:
10% in 0.0066 secs
25% in 0.0068 secs
50% in 0.0071 secs
75% in 0.0076 secs
90% in 0.0081 secs
95% in 0.0084 secs
99% in 0.0113 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0000 secs, 0.0063 secs, 0.0113 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0000 secs
req write: 0.0000 secs, 0.0000 secs, 0.0001 secs
resp wait: 0.0071 secs, 0.0062 secs, 0.0086 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0001 secs
Status code distribution:
[200] 100 responses

Then the Requests/sec starts to decrease in subsequent runs:
Run 1:
Requests/sec: 53.0024
Run 2:
Requests/sec: 138.1491
Run 3:
Requests/sec: 128.2628
Run 4:
Requests/sec: 110.5167
Run 5:
Requests/sec: 114.9347
Run 6:
Requests/sec: 108.7823
Run 7:
Requests/sec: 103.1321
Run 8:
Requests/sec: 90.7509
Run 9:
Requests/sec: 91.4321
Run 10:
Requests/sec: 83.5613
Run 11:
Requests/sec: 86.0125
Run 12:
Requests/sec: 76.2282
Run 13:
Requests/sec: 78.4934
Run 14:
Requests/sec: 70.1448
Run 15:
Requests/sec: 68.1487

Note I used this cute command to run the test: for i in {1..15}; do echo "Run $i:"; hey -c 1 -n 100 https://127.0.0.1/api/v1/test/ | awk '/Requests\/sec/'; echo ""; done
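One way to check whether server-side connections pile up between runs is to watch the connection count in pg_stat_activity while the loop above is running. A minimal sketch, assuming the same DB_* environment variables as in the DATABASES settings above and direct access to the database:

# Hypothetical watcher: prints the number of open connections every few seconds.
import os
import time

import psycopg

conninfo = (
    f"host={os.environ['DB_HOST']} port={os.environ['DB_PORT']} "
    f"dbname={os.environ['DB_NAME']} user={os.environ['DB_USER']} "
    f"password={os.environ['DB_PASSWORD']}"
)

# autocommit=True so every query sees a fresh pg_stat_activity snapshot instead of
# reusing the one taken at the start of a long-running transaction.
with psycopg.connect(conninfo, autocommit=True) as conn:
    for _ in range(60):
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM pg_stat_activity WHERE datname = current_database()")
            print("open connections:", cur.fetchone()[0])
        time.sleep(5)

A connection count that keeps growing across runs would point at connections being opened but never returned or closed.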
I am not sure why this happens, but it seems to be related to how connections and cursors are handled (in this file of the repo?).
I have found this Django ticket where they mention that connections must be closed explicitly when not run inside Django itself, e.g. in other services like Celery. Could it be that the Prometheus instrumentation is not closing the connections?
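If the suspicion is that connections are not being returned to the pool, the pool's own statistics could confirm it. A minimal sketch, assuming Django 5.1's psycopg backend exposes the underlying psycopg_pool pool as connection.pool when 'OPTIONS': {'pool': True} is set, and noting that it has to run inside the same worker process (each process has its own pool), e.g. as a temporary debug view:

# Hypothetical debug view dumping pool statistics from inside the running worker.
from django.db import connection
from django.http import JsonResponse


def pool_stats(request):
    pool = connection.pool  # assumption: populated only when pooling is enabled
    if pool is None:
        return JsonResponse({"pool": None})
    # psycopg_pool's get_stats() returns counters such as pool_size and requests_num;
    # a pool that keeps growing, or requests queuing up waiting for a connection,
    # would suggest connections are not being closed or returned properly.
    return JsonResponse(pool.get_stats())

Hitting such an endpoint between benchmark runs would show whether the pool state degrades the same way the Requests/sec numbers do.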
Just checked this with the 2.4.1 release and it still happens.