Connection Draining (Deregistration Delay) in AWS Load Balancers
Overview
Connection Draining (for Classic Load Balancers) or Deregistration Delay (for Application Load Balancers and Network Load Balancers) is a feature that ensures in-flight requests are completed before an instance is removed from service. This mechanism prevents abrupt connection drops when an instance is deregistered or marked unhealthy.
How It Works
- When an instance enters draining mode, the Elastic Load Balancer (ELB) stops routing new requests to it.
- Existing connections are given time (draining period) to complete their active requests.
- Once the draining period expires or all connections are closed, the instance is fully deregistered.
- Any new incoming requests are routed to other available instances.
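The steps above can be sketched as a small state model. This is only an illustrative toy, not how ELB is implemented internally: the names `State`, `Instance`, `LoadBalancer`, `start_draining`, `route`, and `tick` are hypothetical, routing is assumed to be simple round robin, and time is a simulated clock passed in by the caller.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    IN_SERVICE = "in_service"
    DRAINING = "draining"
    DEREGISTERED = "deregistered"


@dataclass
class Instance:
    name: str
    state: State = State.IN_SERVICE
    active_requests: int = 0
    drain_deadline: float = 0.0  # simulated time at which the draining period ends


class LoadBalancer:
    """Toy model of draining behaviour (hypothetical, for illustration only)."""

    def __init__(self, instances, drain_timeout=300.0):
        self.instances = list(instances)
        self.drain_timeout = drain_timeout
        self._next = 0  # round-robin counter

    def start_draining(self, instance, now):
        """Put an instance into draining mode and record its deadline."""
        instance.state = State.DRAINING
        instance.drain_deadline = now + self.drain_timeout

    def route(self):
        """Send a new request to the next in-service instance only."""
        candidates = [i for i in self.instances if i.state is State.IN_SERVICE]
        if not candidates:
            raise RuntimeError("no in-service instances available")
        chosen = candidates[self._next % len(candidates)]
        self._next += 1
        chosen.active_requests += 1
        return chosen

    def tick(self, now):
        """Finish deregistration once connections close or the period expires."""
        for inst in self.instances:
            if inst.state is State.DRAINING and (
                inst.active_requests == 0 or now >= inst.drain_deadline
            ):
                inst.active_requests = 0  # anything still open is cut off here
                inst.state = State.DEREGISTERED
```

In this sketch a draining instance never appears in `route`, and `tick` deregisters it as soon as either condition from the list is met.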
Example Scenario
- Imagine you have three EC2 instances behind an ELB.
- One instance is placed in draining mode.
- Existing users connected to that instance finish their requests within the draining period.
- The ELB ensures that new users are directed to the remaining healthy instances.
Configuring Connection Draining
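You can configure the Connection Draining timeout on a Classic Load Balancer between 1 and 3,600 seconds; the Deregistration Delay on an Application or Network Load Balancer target group accepts 0 to 3,600 seconds. The default in both cases is 300 seconds (5 minutes).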
- Set to 0 → Disables draining, immediately closing all active connections.
- Low values (e.g., 30 seconds) → Best for short-lived requests (e.g., small API calls).
- High values (e.g., several minutes) → Suitable for long-running requests (e.g., file uploads, streaming).
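As a rough illustration of applying these settings programmatically, the sketch below builds the request arguments for the boto3 calls `modify_target_group_attributes` (ALB/NLB, `elbv2` client) and `modify_load_balancer_attributes` (Classic, `elb` client). The helper names are hypothetical, and the range checks encode an assumption about the AWS limits (0 to 3,600 seconds for target groups; 1 to 3,600 for Classic, with 0 treated here as "disable draining"). The ARN and load balancer name are supplied by the caller.

```python
def target_group_delay_args(target_group_arn: str, seconds: int) -> dict:
    """Build kwargs for elbv2.modify_target_group_attributes (ALB/NLB)."""
    if not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    return {
        "TargetGroupArn": target_group_arn,
        "Attributes": [
            # Attribute values are passed as strings
            {"Key": "deregistration_delay.timeout_seconds", "Value": str(seconds)}
        ],
    }


def classic_draining_args(load_balancer_name: str, seconds: int) -> dict:
    """Build kwargs for elb.modify_load_balancer_attributes (Classic)."""
    if seconds == 0:
        draining = {"Enabled": False}  # assumption: 0 means "turn draining off"
    elif 1 <= seconds <= 3600:
        draining = {"Enabled": True, "Timeout": seconds}
    else:
        raise ValueError("draining timeout must be between 1 and 3600 seconds")
    return {
        "LoadBalancerName": load_balancer_name,
        "LoadBalancerAttributes": {"ConnectionDraining": draining},
    }


def apply_target_group_delay(target_group_arn: str, seconds: int):
    """Send the change to AWS; needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builders above work without the SDK

    client = boto3.client("elbv2")
    return client.modify_target_group_attributes(
        **target_group_delay_args(target_group_arn, seconds)
    )
```

Keeping the argument builders separate from the network call makes the values easy to inspect or test before anything is sent to AWS.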
Key Considerations
- Shorter draining times help remove instances quickly but may interrupt long requests.
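To make this trade-off concrete, the following minimal sketch checks which in-flight requests would be cut off by a given draining time. The function names and the sample durations are hypothetical; in practice the remaining request times would have to come from your own metrics.

```python
import math

MAX_TIMEOUT = 3600  # upper bound on the draining period, in seconds


def interrupted(remaining_seconds, drain_timeout):
    """Return the remaining request times that outlast the draining period."""
    return [r for r in remaining_seconds if r > drain_timeout]


def smallest_safe_timeout(remaining_seconds):
    """Smallest whole-second timeout that lets every listed request finish,
    capped at the maximum allowed draining period."""
    if not remaining_seconds:
        return 0
    return min(MAX_TIMEOUT, math.ceil(max(remaining_seconds)))


# Hypothetical in-flight requests: a quick API call, a report, a large upload
in_flight = [2, 45, 400]
print(interrupted(in_flight, 30))        # requests cut off by a 30-second period
print(smallest_safe_timeout(in_flight))  # timeout that would let all of them finish
```

With these sample values, a 30-second period interrupts the two longer requests, while covering all three requires a timeout of at least 400 seconds.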