Packet Loss Correlation in the MBone Multicast Network
Maya Yajnik, Jim Kurose, and Don Towsley, UMass Amherst
One-line summary: Losses on the backbone, or between the network
interface (NI) and the receiving process, are rare; the average loss
event is a single packet, but the average lost byte belongs to a burst
loss; most packets are dropped close to the source.
Overview/Main Points
- MBone is a two-level "mesh of stars". Main backbone routers
operated by long-haul service providers are connected by
redundant unicast tunnel routes; each backbone router is the hub
of a star of unicast tunnels to local mcast-aware routers.
- Data collected by counting dropped packets for specific MBone
sessions at multiple receiver sites, then comparing the loss traces.
- Expected packet loss along a link:
- Suppose A is a backbone router downstream of B, and a packet is
seen at all of B's regional routers, but not at any of A's. Then
the packet either got lost between B and A, or else every copy
independently got lost between A and its regional routers.
- If Na and Nb are the number of packets lost by all receivers
downstream of A and B respectively, then Na-Nb is the number of
packets seen at B's regional routers but not at A's, so the
expected probability of loss along link AB is (Na-Nb)/(N-Nb),
where N is the total number of packets sent by the source (so N-Nb
approximates the number of packets that reached B).
- Applying this to the top-level MBone hierarchy graph based on
the data collected, the inter-backbone-router links have expected
loss probabilities (ELP) well below 1% (except for a bottleneck link
leaving UCB, which has an ELP of about 5%, and the transatlantic
link from U. Maryland to France, about 6.5%).
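The link-loss estimate above can be sketched in a few lines. This is only an illustration, not the authors' code: the function name, the receiver names, and the toy loss sets are hypothetical, and it assumes that the receivers downstream of B include those downstream of A, and that each receiver's losses are recorded as a set of sequence numbers.

```python
def link_loss_estimate(lost, below_a, below_b, n_sent):
    """Estimate loss on link B->A as (Na - Nb) / (N - Nb).

    lost:    dict mapping receiver name -> set of lost sequence numbers
    below_a: non-empty list of receivers downstream of A
    below_b: non-empty list of receivers downstream of B
             (assumed here to include everything in below_a)
    n_sent:  N, the total number of packets sent by the source
    """
    # Na / Nb: packets lost by *every* receiver below A / below B.
    n_a = len(set.intersection(*(lost[r] for r in below_a)))
    n_b = len(set.intersection(*(lost[r] for r in below_b)))
    reached_b = n_sent - n_b  # packets seen by at least one receiver below B
    if reached_b == 0:
        raise ValueError("no packet reached any receiver below B")
    return (n_a - n_b) / reached_b


# Hypothetical toy traces: a1 and a2 sit below A; b1 hangs off B directly.
toy_lost = {
    "a1": {1, 2, 3, 10},
    "a2": {1, 2, 3, 11},
    "b1": {1, 20},
}
toy_estimate = link_loss_estimate(
    toy_lost, ["a1", "a2"], ["a1", "a2", "b1"], n_sent=100
)
```

In the toy data, packets 1, 2 and 3 are lost by both of A's receivers (Na = 3), but only packet 1 is also lost by b1 (Nb = 1), so the estimate is 2/99.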
- End-host loss (i.e. loss in the network stack) was measured by
correlating losses from hosts connected to the same local mcast
router, and found to be negligible.
- Spatial correlation among receivers that simultaneously lose a
given packet:
- 47% of all packets sent were dropped by at least one
receiver!
- Temporal independence of loss is assumed.
- Metric: let M be the number of receivers that simultaneously
lose a given packet; compute the probability P(M=m) for each
possible value m.
- The expected P(M=m) was computed for three model topologies:
star (packet source is hub of star), modified star
(packet source is spoke of star), and "full" (spatially
correlated, as in backbone example given above), in each
case with 11 receivers (the number of receivers
instrumented for data collection).
- The distributions of M for the "full" and modified-star
topologies were closest to the measured data. These both have
the property that if a loss occurs close to the source, all
receivers will experience that loss, suggesting that this is a
dominant MBone loss mode.
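As a rough sketch of the P(M=m) metric, the code below computes the empirical distribution of M from per-receiver loss sets, together with the distribution a plain star model would predict. The star model here is an assumption made for illustration: each receiver is taken to lose packets independently with its own probability, which gives a Poisson-binomial distribution. The summary does not say exactly how the paper parameterized its models, and all names and data below are hypothetical.

```python
from collections import Counter


def empirical_m_distribution(lost, n_sent):
    """Return [P(M=0), ..., P(M=R)] for R receivers.

    lost:   dict mapping receiver name -> set of lost sequence numbers
    n_sent: number of packets sent, assumed numbered 0 .. n_sent-1
    """
    receivers = list(lost)
    counts = Counter()
    for seq in range(n_sent):
        # M for this packet: how many receivers lost it.
        m = sum(1 for r in receivers if seq in lost[r])
        counts[m] += 1
    return [counts[m] / n_sent for m in range(len(receivers) + 1)]


def star_m_distribution(loss_probs):
    """P(M=m) if each receiver loses independently (assumed star model)."""
    dist = [1.0]  # with zero receivers, M = 0 with probability 1
    for p in loss_probs:
        nxt = [0.0] * (len(dist) + 1)
        for m, prob in enumerate(dist):
            nxt[m] += prob * (1 - p)   # this receiver gets the packet
            nxt[m + 1] += prob * p     # this receiver loses the packet
        dist = nxt
    return dist


# Hypothetical toy data: packet 1 is lost by all three receivers.
toy_lost = {"r1": {0, 1}, "r2": {1, 2}, "r3": {1}}
toy_empirical = empirical_m_distribution(toy_lost, n_sent=4)
toy_star = star_m_distribution([0.5, 0.5])
```

A measured distribution with noticeably more mass at large m than the independent model predicts is the kind of signature that points to shared loss near the source.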
- Spatial loss correlation between receiver pairs:
- Average correlation coefficient varies between .271 and
.666.
- But, when the loss that is common to all receivers
is removed from dataset, correlation practically vanishes!
- Again, the conclusion is that spatial pairwise correlation is
rare, except for the effects of loss close to the source.
- Temporal loss correlation at a single receiver:
- Solitary (single-packet) losses predominate, but
- packets lost due to long burst losses constitute a
significant fraction of total bytes lost.
- Compare: "The average file is small, but the average byte
lives in a big file."
- Periodic losses seen every 30 seconds; a paper by Floyd
and Jacobson is cited (The Synchronization of Periodic
Routing Messages) to explain it.
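The contrast between solitary losses and burst losses can be made concrete with a small sketch. It assumes a single receiver's trace is available as a sequence of 0/1 flags (1 = lost) and that all packets are the same size, so lost packets stand in for lost bytes. The function names, the `long_burst` threshold, and the toy trace are hypothetical.

```python
from itertools import groupby


def burst_lengths(trace):
    """Lengths of consecutive runs of lost packets (1 = lost, 0 = received)."""
    return [sum(1 for _ in run) for is_lost, run in groupby(trace) if is_lost]


def burst_summary(trace, long_burst=2):
    """Return (fraction of loss events that are solitary,
               fraction of lost packets that fall in bursts >= long_burst)."""
    bursts = burst_lengths(trace)
    if not bursts:
        return None  # no loss in this trace
    solitary_events = sum(1 for b in bursts if b == 1) / len(bursts)
    lost_in_long = sum(b for b in bursts if b >= long_burst) / sum(bursts)
    return solitary_events, lost_in_long


# Hypothetical toy trace: three solitary losses and one burst of six.
toy_trace = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0]
toy_bursts = burst_lengths(toy_trace)
toy_summary = burst_summary(toy_trace)
```

In the toy trace, three of the four loss events are single packets, yet six of the nine lost packets belong to the one long burst.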
Relevance
Many proposed mcast error control protocols recover by interacting with
nearby receivers rather than requesting source retransmission, so it's
worth profiling whether loss patterns make it reasonable to expect that
a nearby receiver will have gotten the packet if you didn't get it.
(Paper concludes that it is reasonable to expect this.)