These four optimizations improve the performance, concurrency, and response speed of a FastAPI API running in production with Docker.
Gunicorn acts as a process manager on top of Uvicorn. It launches multiple workers (independent processes) so the API can handle several requests in parallel instead of sequentially.

Main impact: improved concurrency under load. With 4 workers you can handle 4 requests truly in parallel, plus async concurrency within each worker.
Add to requirements.txt:

```
gunicorn==23.0.0
```
docker-compose.yml
```yaml
command: >
  gunicorn main:app
  -w 4
  -k uvicorn.workers.UvicornWorker
  --bind 0.0.0.0:8000
  --access-logfile -
  --log-level info
```
Environments: development vs production
Docker Compose (production)
```yaml
command: gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000
```
Docker Compose Override (development, applied automatically)
```yaml
services:
  quierolibros_back_api:
    command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```
```shell
# Development (automatically applies the override)
docker compose up

# Production (ignores the override)
docker compose -f docker-compose.yml up
```
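The `-w 4` above is a reasonable starting point; a common rule of thumb from the Gunicorn documentation is (2 × CPU cores) + 1 workers. A minimal sketch for computing that number (the `workers` variable name is just for illustration):

```python
import multiprocessing

# Gunicorn's documented rule of thumb: (2 x cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1
print(workers)
```

You could drop this line into a `gunicorn.conf.py` so the worker count adapts to the host instead of being hard-coded.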
Log duplication fix
With multiple workers, each log line appears N times. To clean them up, add this at the start of main.py:
```python
import logging

logging.getLogger('uvicorn.access').propagate = False
logging.getLogger('uvicorn.error').propagate = False
```
Note: per-worker log duplication is normal, since each process is independent. The fix only affects the visual format.
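If you still want to tell the workers apart in the combined output, one option is to include the process ID in the log format. A minimal stdlib sketch (the format string is an assumption, not the app's actual config):

```python
import logging

# %(process)d adds the worker's PID, so each process's lines are identifiable.
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [pid %(process)d] %(levelname)s %(name)s: %(message)s',
)

logging.getLogger('app').info('worker ready')
```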
Database Connection Pool
Instead of opening and closing a new connection for every query, the pool maintains a set of open, reusable connections. This eliminates the latency of the TCP handshake and authentication on every request.
Primary impact: reduced query latency. With 4 workers and pool_size=20 + max_overflow=40, the DB can receive up to 4 × (20 + 40) = 240 simultaneous connections at peak.
```python
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    SQLALCHEMY_DATABASE_URL_ASYNC,
    pool_pre_ping=True,   # detects dropped connections
    pool_recycle=300,     # recycles old connections (5 min)
    pool_size=20,         # permanent connections in the pool
    max_overflow=40,      # extra under peak, then closed
    pool_timeout=30,      # seconds to wait for a free connection
)
```
Explained Parameters
| Parameter | Default | Recommended | Purpose |
|---|---|---|---|
| pool_size | 5 | 20 | Connections kept permanently open |
| max_overflow | 10 | 40 | Extra connections under peak, then closed |
| pool_timeout | 30 | 30 | Avoids waiting forever for a connection |
| pool_recycle | — | 300 | Recycles old connections |
| pool_pre_ping | — | True | Detects dropped connections |
Configure MariaDB to handle connections
With 4 workers × (20 + 40) = 240 maximum connections, the database must allow at least that many. Check the current limit:
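The capacity arithmetic above can be expressed as a one-liner; worth keeping handy if you later change the worker count or pool settings (the function name is illustrative):

```python
def max_db_connections(workers: int, pool_size: int, max_overflow: int) -> int:
    # Each worker process has its own pool, so totals multiply.
    return workers * (pool_size + max_overflow)

print(max_db_connections(4, 20, 40))  # 240
```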
```sql
-- Check the current limit
SHOW VARIABLES LIKE 'max_connections';
```
If it’s less than 240, increase it in the docker-compose.yml of the MariaDB service:
```yaml
mariadb_quierolibros_back:
  command: --max-connections=300
```
Rapid JSON Serialization — orjson
What is it?
orjson is a drop-in replacement for Python's native JSON serializer. Written in Rust, it is 3-5x faster than the standard library. The difference becomes especially noticeable with large lists of books or articles.
Main impact: 3-5x speedup in serializing JSON. This is particularly noticeable in endpoints that return large lists of records.
Add to requirements.txt:
```
orjson==3.10.18
```
Configuration in main.py
```python
from fastapi import FastAPI
from fastapi.responses import ORJSONResponse

app = FastAPI(
    title='API V2',
    description='REST API',
    default_response_class=ORJSONResponse,  # single line
)
```
Note: A single change in main.py affects all endpoints automatically. No need to modify individual endpoints.
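As a sanity check of what changes under the hood: orjson serializes to bytes rather than str. The sketch below keeps the runnable part stdlib-only and shows the orjson call in a comment, since the package may not be installed; the `books` payload is made up for illustration:

```python
import json

# Hypothetical payload: a list of records like the API might return.
books = [{'id': i, 'title': f'Book {i}'} for i in range(3)]

# The stdlib serializer returns a str:
as_str = json.dumps(books)

# orjson (pip install orjson) returns bytes and is typically 3-5x faster:
#   import orjson
#   as_bytes = orjson.dumps(books)
#   assert isinstance(as_bytes, bytes)

assert json.loads(as_str) == books
```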
Use httpx asynchronously to avoid blocking the event loop while waiting for responses from external APIs.

Bad (blocking):

```python
import requests

@app.get('/data')
def get_data():
    r = requests.get('https://api.externa.com')
    return r.json()
```
Good (non-blocking):

```python
import httpx

@app.get('/data')
async def get_data():
    async with httpx.AsyncClient() as client:
        r = await client.get('https://api.externa.com')
        return r.json()
```
Parallel calls with asyncio.gather
If an endpoint makes multiple independent external calls, make them in parallel instead of one by one:
```python
# Sequential: each call waits for the previous one
r1 = await client.get('https://api1.com')  # 300 ms
r2 = await client.get('https://api2.com')  # 300 ms
# Total: 600 ms
```
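With asyncio.gather the two awaitables run concurrently, so the total is roughly the slowest call rather than the sum. A runnable sketch that simulates the two 300 ms calls with asyncio.sleep instead of real httpx requests (names and delays are made up):

```python
import asyncio
import time

async def fake_call(name: str, delay: float) -> str:
    # Stand-in for `await client.get(...)`: sleeps instead of doing real I/O.
    await asyncio.sleep(delay)
    return name

async def parallel() -> list:
    # Both coroutines run concurrently; total ~0.3 s, not 0.6 s.
    # With httpx this would be:
    #   r1, r2 = await asyncio.gather(client.get(url1), client.get(url2))
    return await asyncio.gather(fake_call('api1', 0.3), fake_call('api2', 0.3))

if __name__ == '__main__':
    start = time.perf_counter()
    results = asyncio.run(parallel())
    print(results, f'{time.perf_counter() - start:.2f}s')
```

gather preserves the order of its arguments, so the results come back in the order you passed the calls, regardless of which finished first.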
