Back to index
(full WWW-9 proceedings)
Nina Bhatti, Anna Bouch, and Allan Kuchinsky
One-line summary: Users get annoyed with an e-commerce website
if page views take >11s (though variance is large); they are more tolerant of
delays if incremental page loading is used; and they become less tolerant of
delay the longer their session with the Web site. The authors define a
"utility function" to measure how acceptable an overall session's
performance is to the user, but they don't seem to use their definition for
anything.
Overview/Main Points
- There is significant prior research in the cognition literature on various
aspects of this; this paper makes some findings that are generally
consistent with that literature.
- There are also correlations between latency tolerance and other factors,
including how sophisticated the users are and other elements of their user
profiles. For this study, the authors used males aged 18-68 who spend
>2hrs/week on the Internet, have bought at least 2 things online in the
last year, and rank their own skills as "intermediate".
- Experiment: users were presented a website identical to HP Shopping (a
production ecommerce store), but whose latencies of page delivery could be
tuned. Goal was to measure users' perceptions of quality of service
under various conditions:
- can user tolerance of delay be objectively quantified?
- what factors does that tolerance depend on?
- can we identify secondary effects of these perceptions on users'
likelihood to return to a site?
- Experiments:
- All users completed an identical set of tasks: look at a category of
items to compare a few choices, select an item, add it to shopping cart,
proceed to checkout (view cart).
- Using buttons on the web page, users would classify each page view as
"sufficiently fast", "OK", or "unacceptably
slow". For one group of users, the delays were chosen
randomly from a predetermined set that was previously found to capture
"subjective" differences in latency; for the control group, the
delays were varied "smoothly" [not clear what this means, but
I assume it means that the difference in delay between any 2 successive
page loads is small]. In this experiment, the complete delay would
elapse before any content of the new page was shown.
- There was also an "I'm fed up" button that would
instantaneously finish loading the page. [I know, it seems like
knowing about this would swing the results...]
- In a second set of experiments, the new page would load incrementally (banner
first, then text, then graphics). The total delay (for
whole page views) was the same as in the previous experiment, but the delay
before seeing the first element was a lot less [the paper didn't seem to
specify how much].
- There were also focus group discussions and "verbal
protocols" (users would "think aloud" during experiment)
to assess secondary effects of QOS perception.
- Summary of relevant results:
- When pages load all-at-once, the threshold for users to judge delay as
"slow/unacceptable" is around 11 sec, consistent with earlier
results suggesting that ~10s is the threshold at which a user becomes
distracted from the task at hand. However, the variance is large
enough that the authors cannot conclude that users will tolerate a specific
amount of latency before complaining.
- Users are much more tolerant of overall latency under incremental
loading: approximately speaking, users said "good enough" for
latencies of up to 39 sec under incremental loading (vs up to 5
sec for all-at-once), and the "unacceptable" threshold was as
high as 56s (vs 11s for all-at-once).
- The longer a user interacts with the site ("session length",
measured in number of page views), the lower their tolerance for
latency. The finding is statistically significant. [OTOH,
it's based on the time until the user clicks the "I'm fed up"
button. It may be that once users learn they can use it without
incurring a penalty, they simply use it more often as the experiment
proceeds.] The graph is reproduced below, showing clear plateaus when
you plot number of page views vs. percent of users who find a fixed
delay acceptable.
- "Qualitative data" [sic] shows that although users are less
likely to abandon their shopping cart for delay reasons, they are still
pissed off and therefore less likely to come back in the
future.
- User tolerance of delay is affected by their expectations of
delay. In particular, users are less tolerant of delay if they are
"on a mission" vs. just browsing, and they "expect"
some operations to be faster than others (i.e. going back to a
previously viewed page or adding something to shopping cart "should
be" fast, whereas comparing several items "should be"
slower.)
- Authors define a "utility function" of a session based on user
tolerance for delay. For a session consisting of page views v1, v2,
..., vN, utility (-1 <= utility <= 1) is defined as the mean difference
between the user's tolerance threshold for access i and the actual
latency of access i, for i=1...N. Utility=0 is
"just acceptable", 1 is "exceeds expectations", -1 is
"unacceptable relative to expectations". [This is weird
since it means faster-than-acceptable accesses can compensate for
slower-than-acceptable ones... speaking as a user I don't think I'd forgive
slow accesses just because some other accesses were faster than
expected!] Recall that the threshold for access i may decrease
as i increases (result 3 above). Their table illustrating this
was very confusing, and overall the "utility" definition doesn't
seem to have been used to generate any specific neat result.
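The utility definition can be sketched in code roughly as follows. The paper doesn't spell out how the mean difference is scaled into [-1, 1], so the normalization constant `max_diff` and the example numbers below are my assumptions, not values from the paper:

```python
def session_utility(latencies, thresholds, max_diff):
    """Hypothetical sketch of the paper's session utility function.

    latencies[i]  -- observed latency of page view i (seconds)
    thresholds[i] -- user's tolerance threshold for page view i; per the
                     paper this may decrease as i grows (result 3 above)
    max_diff      -- assumed normalization constant keeping each term
                     (and hence the mean) in [-1, 1]
    """
    # Mean of the normalized (threshold - latency) differences; positive
    # terms (faster than tolerated) can offset negative ones, which is
    # exactly the odd averaging behavior noted above.
    diffs = [(t - l) / max_diff for l, t in zip(latencies, thresholds)]
    mean = sum(diffs) / len(diffs)
    return max(-1.0, min(1.0, mean))

# A session where every page loads exactly at the user's threshold scores
# 0 ("just acceptable"); uniformly faster pages push toward +1,
# uniformly slower ones toward -1.
print(session_utility([11, 11, 11], [11, 11, 11], 11.0))  # → 0.0
```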
Relevance to ROC
- There seems to be a real threshold of "user pain" such that if
we get recovery time below that threshold, we can keep user happy.
This represents a concrete target bound for MTTR.
- This threshold goes down the longer the session. If there's a way to
prioritize user requests, users whose sessions are longest should get
highest priority. (They're probably the ones closest to completing a
purchase anyway, and may be more likely to return if they remember a
positive experience.)
- Incremental loading seems to help, so structure the system such that task
granularity is smaller and the task can be completed incrementally. Then,
failure-and-recovery as a "blip" during task completion may be
more forgivable, since the task started making visible progress
towards completion very rapidly.
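The prioritize-long-sessions idea above can be sketched with a simple priority queue, serving the request from the longest session first since (per result 3) those users have the lowest latency tolerance. All names here are illustrative assumptions, not anything from the paper:

```python
import heapq

class RequestQueue:
    """Hypothetical sketch: dispatch requests longest-session-first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: FIFO among equal session lengths

    def enqueue(self, request, session_page_views):
        # Negate page views so the longest session pops first from the
        # min-heap; _seq keeps requests with equal priority in order.
        heapq.heappush(self._heap, (-session_page_views, self._seq, request))
        self._seq += 1

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = RequestQueue()
q.enqueue("new visitor", session_page_views=1)
q.enqueue("about to check out", session_page_views=9)
print(q.dequeue())  # → about to check out
```

This also lines up with the observation that long-session users are probably closest to completing a purchase, so favoring them loses the least revenue when capacity is scarce.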
Flaws
- User study size was quite small, but it is hard to do these things so I
forgive them.
- Difficult to factor out confounding factors specific to individual users,
hence high variance in delay-tolerance threshold.
- Having the "I'm fed up" button seems like it would distort
results once users become aware they can use it without incurring any
penalty. To me, this makes the "latency tolerance decreases over
time" result somewhat suspect.