PHP-FPM Pool Tuning
FPM pool settings control worker count and lifecycle. Tuning starts with measured traffic, request latency, and process memory rather than copied numbers. A configuration that works for a small admin tool may be wrong for an API handling bursts of traffic.
Measure Before Changing Limits
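- Understand the three process management modes: static keeps a fixed number of workers, dynamic scales the number of workers between configured limits, and ondemand starts workers only when requests arrive and stops them after an idle timeout.
- Estimate a safe worker count from the available memory and the measured memory usage per worker.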
- Monitor queueing, saturation, slow requests, and restarts.
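As a sketch of the monitoring step, the following Python snippet reduces one FPM status snapshot to a few signals worth tracking over time. It assumes the status page is enabled through pm.status_path and read in its JSON form. The sample payload, its values, and the helper name summarise_status are illustrative, and the field names should be checked against the output of your own FPM version.

```python
import json

# Sample payload shaped like the JSON form of the FPM status page.
# All values are invented for illustration; verify the field names
# against your own FPM version before relying on them.
SAMPLE_STATUS = """
{
  "pool": "www",
  "process manager": "dynamic",
  "accepted conn": 12840,
  "listen queue": 4,
  "max listen queue": 31,
  "idle processes": 0,
  "active processes": 20,
  "total processes": 20,
  "max children reached": 3,
  "slow requests": 7
}
"""


def summarise_status(status: dict, max_children: int) -> dict:
    """Reduce one status snapshot to queueing and saturation signals.

    max_children is passed in separately because it comes from the
    pool configuration (pm.max_children), not from the status output.
    """
    return {
        "saturation": status["active processes"] / max_children,
        "queued_now": status["listen queue"],
        "queued_peak": status["max listen queue"],
        "hit_worker_limit": status["max children reached"] > 0,
        "slow_requests": status["slow requests"],
    }


summary = summarise_status(json.loads(SAMPLE_STATUS), max_children=20)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A saturation close to 1.0 together with a non-zero queue suggests that requests are waiting for a free worker. A single snapshot is only a hint, so these values are more useful when collected at intervals.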
Change One Variable At A Time
- Measure a baseline.
- Change one setting in staging.
- Load-test and observe the same signals again.
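One way to keep the steps above honest is to check that a candidate pool configuration differs from the baseline in exactly one setting before load-testing it. The sketch below assumes the pool settings have already been loaded into plain dictionaries. The helper name changed_settings and all values are hypothetical.

```python
def changed_settings(baseline: dict, candidate: dict) -> dict:
    """Return {name: (old, new)} for every setting that differs."""
    names = baseline.keys() | candidate.keys()
    return {
        name: (baseline.get(name), candidate.get(name))
        for name in sorted(names)
        if baseline.get(name) != candidate.get(name)
    }


# Hypothetical baseline pool settings, keyed by directive name.
baseline = {
    "pm": "dynamic",
    "pm.max_children": 20,
    "pm.start_servers": 5,
    "pm.min_spare_servers": 5,
    "pm.max_spare_servers": 10,
}

# Candidate for the staging test: only pm.max_children is changed.
candidate = {**baseline, "pm.max_children": 24}

diff = changed_settings(baseline, candidate)
if len(diff) != 1:
    raise SystemExit(f"expected exactly one changed setting, got {len(diff)}")

for name, (old, new) in diff.items():
    print(f"{name}: {old} -> {new}")
```

If two settings change at once, a difference in the load-test results cannot be attributed to either of them with confidence.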
Read The Symptoms
- Too many workers can exhaust memory and push the host into swapping or out-of-memory kills.
- Too few workers increase queue latency, because requests wait for a free worker.
- Average memory can hide high-percentile spikes.
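The last point can be illustrated with a small sketch that compares the mean with a high percentile of per-worker memory. The samples are invented, the percentile helper uses the nearest-rank method as one reasonable choice among several, and how the samples are collected is left open.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical resident memory per worker in MiB, sampled under load.
worker_rss_mib = [
    42, 44, 45, 45, 46, 47, 48, 48, 50, 52,
    53, 55, 58, 60, 62, 64, 70, 95, 140, 180,
]

avg = mean(worker_rss_mib)
p95 = percentile(worker_rss_mib, 95)

print(f"average: {avg:.1f} MiB")
print(f"95th percentile: {p95} MiB")
```

In this invented sample a few heavy requests push the 95th percentile to more than twice the average, so a worker count sized from the average would overcommit memory whenever several workers handle heavy requests at once.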
Capacity Estimate
safe workers ~= (memory budget for PHP workers) / (measured high-percentile memory per worker), rounded down
Then verify the estimate against real traffic, queue latency, saturation, and restart metrics.
Record the measured worker memory, host memory budget, expected traffic shape, and observed queue behaviour. A capacity estimate is a starting point for a staging test, not proof that production is safe.
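The estimate can be worked through as a short calculation. In this sketch the host size, the memory reserved for everything other than PHP workers, and the per-worker figure are all assumed numbers, and estimate_safe_workers is a hypothetical helper name.

```python
def estimate_safe_workers(host_memory_mib, reserved_mib, worker_p95_mib):
    """Divide the memory budget for PHP workers by the high-percentile
    memory of one worker, rounding down to a whole worker."""
    budget_mib = host_memory_mib - reserved_mib
    if budget_mib <= 0 or worker_p95_mib <= 0:
        raise ValueError("memory budget and worker memory must be positive")
    return int(budget_mib // worker_p95_mib)


# Assumed inputs: an 8 GiB host, 2 GiB kept back for the OS, web server,
# and other services, and a measured 95th percentile of 140 MiB per worker.
record = {
    "host_memory_mib": 8192,
    "reserved_mib": 2048,
    "worker_p95_mib": 140,
}
record["estimated_safe_workers"] = estimate_safe_workers(**record)

for name, value in record.items():
    print(f"{name}: {value}")
```

Keeping the inputs and the result in one record makes it easier to see later which measurement the estimate was based on. The result is a candidate value for a staging test, not a setting to apply directly.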
Practice: Create An FPM Tuning Plan
An application starts queueing HTTP requests during traffic spikes. Write a tuning plan that gathers evidence before changing the pool limits.
Requirements
- State which process management mode (dynamic, static, or ondemand) the pool uses and why.
- Estimate a safe worker count from the available memory and the measured memory usage per worker.
- Name the signals to monitor: queueing, saturation, slow requests, and restarts.
- Measure a baseline before any change.
- Change one setting at a time, in staging.
- Load-test and observe the same signals after the change.
Solution
Measure request latency, FPM queueing, active workers, high-percentile worker memory, host memory, and restart behaviour. Estimate a safe worker budget from high-percentile worker memory rather than from the average alone.
Change one pool setting in staging, repeat a representative load test, and compare the same signals. Increasing workers is not automatically an improvement: it can trade queue latency for host-wide memory pressure.
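The comparison of signals before and after the change can be sketched as follows. The signal names and all numbers are hypothetical, and the sketch assumes that a lower value is better for every listed signal.

```python
# Signals for which a lower value is better. Names are hypothetical.
LOWER_IS_BETTER = (
    "p95_latency_ms",
    "max_listen_queue",
    "peak_memory_used_pct",
    "worker_restarts",
)


def compare_signals(before: dict, after: dict) -> dict:
    """Label each signal as improved, regressed, or unchanged."""
    verdicts = {}
    for name in LOWER_IS_BETTER:
        if after[name] < before[name]:
            verdicts[name] = "improved"
        elif after[name] > before[name]:
            verdicts[name] = "regressed"
        else:
            verdicts[name] = "unchanged"
    return verdicts


# Invented results from the same load test, before and after raising
# the worker limit in staging.
before = {"p95_latency_ms": 900, "max_listen_queue": 31,
          "peak_memory_used_pct": 62, "worker_restarts": 0}
after = {"p95_latency_ms": 320, "max_listen_queue": 4,
         "peak_memory_used_pct": 91, "worker_restarts": 2}

verdicts = compare_signals(before, after)
for name, verdict in verdicts.items():
    print(f"{name}: {before[name]} -> {after[name]} ({verdict})")
```

In this invented run latency and queueing improve while memory use and restarts get worse, which is the trade-off described above. Whether the change is acceptable depends on how much memory headroom the host needs.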