**One-line summary:** When gateway queue exceeds some threshold,
randomly pick a connection to notify, with probability roughly
proportional to how much bandwidth it's been using and optionally
proportional to the size of the packet being notified. Notification can
consist of marking a packet (if the transport protocol will cooperate
and use the mark as a hint) or dropping it otherwise.

- Motivations/goals:
- Only gateways have "unified" view of congestion over time
- Want to avoid bias against bursty traffic (DECbit has this problem since a burst on a single connection is more likely to cause the congestion bit to be set)
- Want to avoid "global synchronization" (if you drop packets from every connection, as in DropTail, then everyone backs off at once; would prefer to target "aggressive" users)
- Above 2 items can be achieved by separating congestion detection from choice of which connection to notify.
- Want to be able to control congestion even in case of non-cooperating transport protocol (by dropping)

- RED basic algorithm: On each packet arrival,
- If queue size is below low watermark MinTh, do nothing.
- If between MinTh and MaxTh, notify on this packet with probability Pa (see below);
- If above MaxTh, notify with probability 1.
- Let Pb vary from 0 to Pmax as the average queue size varies from MinTh to MaxTh, i.e. Pb=Pmax(avg-MinTh)/(MaxTh-MinTh).
- Pa is then Pb/(1-count x Pb), where count measures how many packets ago the last mark occurred.
- To measure queue size in bytes, let Pb=Pb x (PacketSize/MaxPacketSize). This makes large packets more likely to be marked than small ones.

- Various simulations indicate that:
- RED is good at keeping average queue size steady, even with heavy congestion;
- RED achieves higher throughput with shorter queues than DropTail;
- Under extreme congestion, RED still achieves about 0.6 link utilization in each direction for the bottleneck link, suggesting that the "global synchronization" is being avoided.

- Calculating exponential weighted moving average queue length:
- avg = (1-w)avg + wQ, where Q is current instantaneous measurement. How to pick w? If too large, averaging procedure will be susceptible to "transient" congestion (fast changes in queue length); if too small, RED will respond too slowly to real congestion.
- Upper bound: Form the geometric series that represents successive computations of avg over L packet arrivals; this gives avg in terms of L and w. Then simply set avg<MinTh to get upper bound on w.
- Lower bound: assume queue remains at 1 packet always. Then it takes -1/ln(1-w) arrivals until avg reaches 1-1/e=0.63. Determine how many arrivals we'd like this to take, this gives lower bound on w. Authors use w=0.002 for most simulations.

- Calculating packet-marking probability: Let each packet be
marked with probability Pb, and let X be the "intermarking" time.
- Method 1: X is a geometric random variable; Prob[X=n] =
(1-Pb)
^{n-1}Pb, E[x]=1/Pb. - Method 2: X is a uniform random variable in {1,2,...,1/Pb}. Prob[X=n]=Pb, E[X]=1/2 + 1/2Pb.
- Method 1 results in higher "clustering" of marked packets. Authors used method 2.

- Method 1: X is a geometric random variable; Prob[X=n] =
(1-Pb)
- Fairness: "not well defined" as a goal, but RED does not discriminate either against particular connections or classes of connections.
- Rules of thumb for getting effective performance:
- Pick w intelligently.
- Set MinTh high enough to maximize network power (power=throughput/delay ratio).
- Make MaxTh-MinTh large enough to avoid "global synchronization" effect.

- Identifying aggressive/misbehaving users: We know packets are marked with probability roughly proportional to their share of the bandwidth, so if some connection has a large fraction of recently-marked packets, it is likely to be misbehaving.

- Verbose: could have been half as long.