Back to index
(full WWW-9 proceedings)
Nina Bhatti, Anna Bouch, and Allan Kuchinsky
One-line summary: Users get annoyed with an e-commerce website
if page views take >11s (though variance is large); they are more tolerant of
delays if incremental page loading is used; and they become less tolerant of
delay the longer their session with the Web site. The authors define a
"utility function" to measure how acceptable an overall session's
performance is to the user, but they don't seem to use their definition for
anything.
Overview/Main Points
- There is significant prior research in the cognition literature on various
aspects of this; this paper makes some findings that are generally
consistent with that literature.
- There are also correlations between latency tolerance and other factors,
including how sophisticated the users are and other elements of their user
profiles. For this study, the authors used males aged 18-68 who spend
>2hrs/week on the Internet, have bought at least 2 things online in the
last year, and rank their own skills as "intermediate".
- Experiment: users were presented a website identical to HP Shopping (a
production ecommerce store), but whose latencies of page delivery could be
tuned. Goal was to measure users' perceptions of quality of service
under various conditions:
- can user tolerance of delay be objectively quantified?
- what factors does that tolerance depend on?
- can we identify secondary effects of these perceptions on users'
likelihood to return to a site?
- Experiments:
- All users completed an identical set of tasks: look at a category of
items to compare a few choices, select an item, add it to shopping cart,
proceed to checkout (view cart).
- Using buttons on the web page, users would classify each page view as
"sufficiently fast", "OK", or "unacceptably
slow". For one group of users, the delays were chosen
randomly from a predetermined set that was previously found to capture
"subjective" differences in latency; for the control group, the
delays were varied "smoothly" [not clear what this means, but
I assume it means that the difference in delay between any 2 successive
page loads is small]. In this experiment, the complete delay would
elapse before any content of the new page was shown.
- There was also an "I'm fed up" button that would
instantaneously finish loading the page. [I know, it seems like
knowing about this would swing the results...]
- In a second set of experiments, the new page would load incrementally (banner
first, then text, then graphics). The total delay (for
whole page views) was the same as in the previous experiment, but the delay
before seeing the first element was a lot less [the paper didn't seem to
specify how much].
- There were also focus group discussions and "verbal
protocols" (users would "think aloud" during experiment)
to assess secondary effects of QOS perception.
- Summary of relevant results:
- When pages load all-at-once, the threshold for users to judge delay as
"slow/unacceptable" is around 11 sec, consistent with earlier
results suggesting that ~10s is the threshold at which a user becomes
distracted from the task at hand. However, the variance is large
enough that the authors cannot conclude that users will tolerate a specific
amount of latency before complaining.
- Users are much more tolerant of overall latency under incremental
loading: approximately speaking, users said "good enough" for
latencies of up to 39 sec under incremental loading (vs up to 5
sec for all-at-once), and the "unacceptable" threshold was as
high as 56s (vs 11s for all-at-once).
- The longer a user interacts with the site ("session length",
measured in number of page views), the lower their tolerance for
latency. The finding is statistically significant. [OTOH,
it's based on the time until the user clicks the "I'm fed up"
button. It may be that once users learn they can use it without
incurring a penalty, they simply use it more often as the experiment
proceeds.] The graph is reproduced below, showing clear plateaus when
you plot number of page views vs. percent of users who find a fixed
delay acceptable.
- "Qualitative data" [sic] shows that although users are less
likely to abandon their shopping cart for delay reasons, they are still
pissed off and therefore less likely to come back in the
future.
- User tolerance of delay is affected by their expectations of
delay. In particular, users are less tolerant of delay if they are
"on a mission" vs. just browsing, and they "expect"
some operations to be faster than others (i.e. going back to a
previously viewed page or adding something to shopping cart "should
be" fast, whereas comparing several items "should be"
slower.)
- Authors define a "utility function" of a session based on user
tolerance for delay. For a session consisting of page views v1, v2,
..., vN, utility (-1 <= utility <= 1) is defined as the mean difference
between the user's tolerance threshold for access i and the actual
latency of access i, for i=1...N. Utility=0 is
"just acceptable", 1 is "exceeds expectations", -1 is
"unacceptable relative to expectations". [This is weird
since it means faster-than-acceptable accesses can compensate for
slower-than-acceptable ones... speaking as a user I don't think I'd forgive
slow accesses just because some other accesses were faster than
expected!] Recall that the threshold for access i may decrease
as i increases (result 3 above). Their table illustrating this
was very confusing, and overall the "utility" definition doesn't
seem to have been used to generate any specific neat result.
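The utility definition can be sketched in code roughly as follows. The paper doesn't spell out how the mean difference is scaled into [-1, 1], so the normalization constant `max_diff` and the example numbers below are my assumptions, not values from the paper:

```python
def session_utility(latencies, thresholds, max_diff):
    """Hypothetical sketch of the paper's session utility function.

    latencies[i]  -- observed latency of page view i (seconds)
    thresholds[i] -- user's tolerance threshold for page view i; per the
                     paper this may decrease as i grows (result 3 above)
    max_diff      -- assumed normalization constant keeping each term
                     (and hence the mean) in [-1, 1]
    """
    # Mean of the normalized (threshold - latency) differences; positive
    # terms (faster than tolerated) can offset negative ones, which is
    # exactly the odd averaging behavior noted above.
    diffs = [(t - l) / max_diff for l, t in zip(latencies, thresholds)]
    mean = sum(diffs) / len(diffs)
    return max(-1.0, min(1.0, mean))

# A session where every page loads exactly at the user's threshold scores
# 0 ("just acceptable"); uniformly faster pages push toward +1,
# uniformly slower ones toward -1.
print(session_utility([11, 11, 11], [11, 11, 11], 11.0))  # → 0.0
```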
Relevance to ROC
- There seems to be a real threshold of "user pain" such that if
we get recovery time below that threshold, we can keep user happy.
This represents a concrete target bound for MTTR.
- This threshold goes down the longer the session. If there's a way to
prioritize user requests, users whose sessions are longest should get
highest priority. (They're probably the ones closest to completing a
purchase anyway, and may be more likely to return if they remember a
positive experience.)
- Incremental loading seems to help, so structure the system such that task
granularity is smaller and the task can be completed incrementally. Then,
failure-and-recovery as a "blip" during task completion may be
more forgivable, since the task started making visible progress
towards completion very rapidly.
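The prioritize-long-sessions idea above can be sketched with a simple priority queue, serving the request from the longest session first since (per result 3) those users have the lowest latency tolerance. All names here are illustrative assumptions, not anything from the paper:

```python
import heapq

class RequestQueue:
    """Hypothetical sketch: dispatch requests longest-session-first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: FIFO among equal session lengths

    def enqueue(self, request, session_page_views):
        # Negate page views so the longest session pops first from the
        # min-heap; _seq keeps requests with equal priority in order.
        heapq.heappush(self._heap, (-session_page_views, self._seq, request))
        self._seq += 1

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

q = RequestQueue()
q.enqueue("new visitor", session_page_views=1)
q.enqueue("about to check out", session_page_views=9)
print(q.dequeue())  # → about to check out
```

This also lines up with the observation that long-session users are probably closest to completing a purchase, so favoring them loses the least revenue when capacity is scarce.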
Flaws
- User study size was quite small, but it is hard to do these things so I
forgive them.
- Difficult to factor out confounding factors specific to individual users,
hence high variance in delay-tolerance threshold.
- Having the "I'm fed up" button seems like it would distort
results once users become aware they can use it without incurring any
penalty. To me, this makes the "latency tolerance decreases over
time" result somewhat suspect.