Characterizing End-to-End Packet Delay and Loss in the Internet
Jean-Chrysostome Bolot, INRIA
One-line summary:
UDP round-trip measurements and some clever analysis based on queueing
theory show that end-to-end delays and packet losses are correlated at
short time scales and essentially random at long time scales.
Overview/Main Points
- UDP probes ping-ponged between INRIA and UMD; each packet carries a
  unique timestamp and sequence number. Only round-trip times (RTTs)
  were measured. (A probe-tool sketch appears after this list.)
- Analysis strategy: much is known about the system under study, so
  don't fit plain autoregressive or moving-average models to the RTTs
  in isolation; instead, use a model of two streams entering a FIFO
  server queue, where one stream with a fixed delay models the fixed
  part of the probe RTT, the other stream is randomly distributed
  Internet traffic, and queueing at the server models the variable
  part of the probe RTT. (A simulation sketch of this model appears
  after this list.)
- For low traffic, W_{n+1} = W_n + e_n, where W_n is the waiting time
  of packet n and e_n is a zero-mean random variable with small
  variance.
  - Points will then lie along the diagonal rtt_{n+1} = rtt_n in the
    phase plot (a scatter plot of successive RTT pairs
    (rtt_n, rtt_{n+1})).
- When d (the interarrival time of probes) is small and a very heavy
  burst of other traffic, B bits, arrives, then W_{n+1} = W_n + B/m,
  where m is the queue service rate, i.e. rtt_{n+1} - rtt_n = B/m. If
  no Internet traffic arrives between probe packets n+1 and n+k, then
  probe packets n+1 through n+k leave the queue P/m seconds apart
  (P is the probe packet size), so that
  rtt_{n+2} - rtt_{n+1} = P/m - d.
  - So points will lie along that "lower" diagonal,
    rtt_{n+1} = rtt_n + P/m - d, in the phase plot.
- When d is very large, queueing delays are small compared to the
  interarrival time, successive probes see essentially independent
  queue states, and the points are scattered all over the phase plane.
- Using the Lindley recurrence W_{n+1} = max(0, W_n + Y_n - X_n)
  (where Y_n is the service time of packet n and X_n is the
  interarrival time between packets n and n+1; see the "graphical
  proof" in the paper):
  - Assume that Internet traffic contributes b_n bits during slot n
    (i.e. the time interval [nd, (n+1)d]).
  - Apply Lindley's recurrence twice and substitute, to get
    W_{n+1} = W_n + (P + b_n)/m - d, or, rewriting,
    b_n = m(W_{n+1} - W_n + d) - P. That is, we can estimate the
    Internet traffic from the measured round-trip times! (See the
    estimator sketch after this list.)
  - Note that the above property doesn't hold if the queue empties
    during the time slot [nd, (n+1)d], since the max(0, .) then clips
    the recurrence. So we must make d small enough, i.e. make m*d
    smaller than the typical amount of work P + b arriving per slot.
- Applying this analysis to the observed W_n for d = 20 ms, the
  distribution of estimated traffic is multimodal: we can see probes
  that got stuck behind a single FTP packet, probes that got stuck
  behind two FTP packets, etc.
- As d gets bigger, correlations between successive probes weaken and
  the graph loses some structure. (For d = 20 ms it's clearly
  multimodal; for d = 100 ms it's noisier.)
- Packet loss analysis: for large values of d, the unconditional loss
  probability (ulp) and the conditional loss probability (clp, the
  probability that probe n+1 is lost given that probe n was lost)
  approach each other. I.e., losses are random at large time scales
  but reasonably well correlated (ulp and clp are quite distinct) at
  shorter scales. (See the loss-statistics sketch after this list.)
- Unexplained result: even for enormous d, ulp is observed to be as
high as 10%.
- The loss gap (mean number of consecutive losses) stays close to 1
  even for large values of d, i.e. most losses are isolated; so, e.g.,
  audio apps should be able to rely on FEC to repair most lost-packet
  errors.
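For concreteness, here is a minimal sketch (in Python) of the kind of
probe sender described above. It assumes a cooperating UDP echo
server; the host name, port, packet size, and pacing are made-up
illustration values, not the paper's actual tool.

    import socket
    import struct
    import time

    # All names and values are illustrative, not from the paper's tool.
    ECHO_HOST, ECHO_PORT = "echo.example.net", 9999  # hypothetical UDP echo server
    D = 0.020       # probe interarrival time d (seconds)
    P = 32          # probe packet size (bytes)
    N_PROBES = 1000

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(2.0)

    rtts = {}  # sequence number -> measured round-trip time (seconds)
    for seq in range(N_PROBES):
        # Each probe carries a sequence number and a send timestamp.
        payload = struct.pack("!Id", seq, time.monotonic())
        payload += b"\x00" * (P - len(payload))  # pad probe to P bytes
        sock.sendto(payload, (ECHO_HOST, ECHO_PORT))
        try:
            data, _ = sock.recvfrom(4096)
            echoed_seq, sent_at = struct.unpack("!Id", data[:12])
            rtts[echoed_seq] = time.monotonic() - sent_at
        except socket.timeout:
            pass  # no echo came back: count the probe as lost
        time.sleep(D)  # crude pacing; a real tool sends and receives asynchronously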
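The two-stream FIFO model is also easy to simulate. The sketch below
drives Lindley's recurrence with exponentially distributed cross
traffic per slot (all parameter values invented for illustration);
plotting the successive pairs (rtt_n, rtt_{n+1}) reproduces the
phase-plot structure described above, with points hugging the diagonal
under light load and heavy bursts pushing them off it.

    import random

    # Illustrative parameters, not taken from the paper.
    M = 1e6 / 8    # bottleneck service rate m (bytes/second, i.e. 1 Mb/s)
    P = 32         # probe packet size (bytes)
    D = 0.020      # probe interarrival time d (seconds)
    FIXED = 0.070  # fixed component of the RTT (seconds)

    def simulate(n_probes, mean_cross_bytes):
        """Probe RTTs from Lindley's recurrence with exponential cross traffic."""
        w, rtts = 0.0, []
        for _ in range(n_probes):
            rtts.append(FIXED + w + P / M)                  # RTT = fixed + wait + service
            b = random.expovariate(1.0 / mean_cross_bytes)  # cross-traffic bytes in slot n
            w = max(0.0, w + (P + b) / M - D)               # the queue may empty between probes
        return rtts

    rtts = simulate(5000, mean_cross_bytes=500.0)
    phase_points = list(zip(rtts, rtts[1:]))  # (rtt_n, rtt_{n+1}) pairs for the phase plot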
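Inverting the recurrence gives the traffic estimator. A sketch using
the note's symbols (m is the service rate, P the probe size); the
fixed RTT component cancels in the difference, so
rtt_{n+1} - rtt_n = W_{n+1} - W_n:

    def estimate_cross_traffic(rtts, d, m, p):
        """Estimate cross traffic b_n per slot from successive probe RTTs.

        b_n = m * (rtt_{n+1} - rtt_n + d) - p. Valid only while the bottleneck
        queue stays busy between probes (otherwise the max(0, .) in Lindley's
        recurrence clips the relation), hence the need for a small d.
        """
        return [m * (rtts[n + 1] - rtts[n] + d) - p for n in range(len(rtts) - 1)]

    # For d = 20 ms, a histogram of these estimates is multimodal: one peak per
    # number of FTP packets the probe got stuck behind.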
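Finally, the loss statistics fall out of which sequence numbers came
back. A sketch, where ulp = P(probe lost), clp = P(probe n+1 lost |
probe n lost), and the loss gap is taken as the mean length of a run
of consecutive losses (a reasonable gloss; the paper's exact
definition may differ):

    def loss_stats(received_seqs, n_sent):
        """Unconditional/conditional loss probabilities and mean loss gap."""
        lost = [seq not in received_seqs for seq in range(n_sent)]
        ulp = sum(lost) / n_sent  # P(probe n lost)
        after_loss = [lost[i + 1] for i in range(n_sent - 1) if lost[i]]
        clp = sum(after_loss) / len(after_loss) if after_loss else 0.0
        # Mean loss gap: average length of a run of consecutive losses.
        bursts, run = [], 0
        for is_lost in lost + [False]:  # trailing sentinel flushes the last run
            if is_lost:
                run += 1
            elif run:
                bursts.append(run)
                run = 0
        gap = sum(bursts) / len(bursts) if bursts else 0.0
        return ulp, clp, gap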
Relevance
Another neat trick: indirect observation plus a clever model to deduce
properties of the end-to-end behavior of a large system.
Flaws
- Internet traffic offered during each queue slot is modeled as an
  exponential random variable. Is this reasonable?