Researching Usability

Realism in testing search interfaces

Posted on: October 5, 2010

When carrying out usability studies on search interfaces, it’s often better to favour interview-based tasks over pre-defined ‘scavenger-hunt’ tasks. In this post I’ll explain why this is the case and why you may have to sacrifice capturing metrics in order to achieve this realism.

In 2006, Jared Spool of User Interface Engineering wrote an article entitled Interview-Based Tasks: Learning from Leonardo DiCaprio. In it he explains that it often isn’t enough to create test tasks that ask participants to find a specific item on a website. He calls such a task a scavenger-hunt task. Instead he introduces the idea of interview-based tasks.

When testing the search interface for a library catalogue, a scavenger-hunt task might read:

You are studying Russian Literature and you will be reading Leo Tolstoy soon. Find the English version of Tolstoy’s ‘War and Peace’ in the library catalogue.

I’ll refer to this as the Tolstoy Task in this post. Most of your participants (if they’re university students) should have no trouble understanding the task. But it probably won’t feel real to any of them. Most of them will simply type ‘war and peace’ into the search and see what happens.

Red routes

The Tolstoy Task is not useless; you’ll probably still witness things of interest. So it’s better than having no testing at all.

But it answers only one question: when users know the title of the book, the author, and how to spell both correctly, how easy is it to find the English version of Leo Tolstoy’s War and Peace?

A very specific question like this can still be useful for many websites. For example, a car insurance company could ask: when users have all of their vehicle documents in front of them, how easy is it for them to get a quote from our website?

Answering this question would give them a pretty good idea of how well their website was working. This is because it’s probably the most important journey on the site. Most websites have what Dr David Travis calls Red Routes – the key journeys on a website. When you measure the usability of a website’s red routes you effectively measure the usability of the site.

However, many search interfaces, such as that of a university library catalogue, don’t have one or two specific tasks that are more important than any others. It’s possible to categorise tasks, but difficult to introduce them into a usability test without sacrificing a lot of realism.

Interview-based tasks

The interview-based task is Spool’s answer to the shortcomings of the scavenger-hunt task. With an interview-based task, you create the task with the participant’s input and agree on what successful completion will mean before they begin.

When using search interfaces, people often develop search tactics based upon the results they are being shown. As a result they can change tactics several times. They can change their view of the problem based upon the feedback they are getting.

Whilst testing the Aquabrowser catalogue for the University of Edinburgh, participants helped me to create tasks that I’d never have been able to devise on my own. Had we not done this, I wouldn’t have been able to observe their true behaviour.

One participant used the search interface to decide her approach to an essay question. Together we created a task scenario where she was given an essay to write on National identity in the work of Robert Louis Stevenson.

She had decided that the architecture in Jekyll and Hyde, whilst set in London, reminded her more of Edinburgh. She searched for sources that referred to Edinburgh’s architecture in Scottish literature, opinion on architecture in Stevenson’s work, and opinion on architecture in national identity.

The level of engagement she had in the task allowed me to observe behaviour that a pre-written task would never have revealed.

It also made no assumptions about how she would use the interface. In the Tolstoy Task, I’d be assuming that people arrive at the interface with a set amount of knowledge. In an interview-based task I can establish how much knowledge they would have about a specific task before they use the interface: I simply ask them.

Realism versus measurement

The downside to using such personalised tasks is that it’s very difficult to report useful measurements. When you pre-define tasks you know that each participant will perform the same task. So you can measure the performance of that task. By doing this you can ask “How usable is this interface?” and provide an answer.

With interview-based tasks this is often impossible because the tasks vary in subject and complexity. It’s then often inappropriate to use them to provide an overall measure of usability.

Exposing issues

I believe that usability testing is more reliable at exposing issues than at providing a measure of usability. This is why I favour interview-based tasks in most cases.

It’s difficult to say how true to life the experience you’re watching is. If participants were sitting at home attempting a task, there’d be nobody watching them and taking notes. Nobody would be asking them to think aloud and showing interest in what they were doing. So if they fail a task in a lab, can you be sure they’d fail it at home?

But for exposing issues I feel it’s more reliable. If participants misunderstand something about the interface in a test, you can be fairly sure that someone at home will misunderstand it in the same way.

And it can never hurt to make something more obvious.


6 Responses to "Realism in testing search interfaces"

This is a really interesting article – I totally agree that setting tasks which are as natural as possible gives a truer picture of what the real usability issues are, rather than relying on the user effectively ‘role playing’ to a script (which could end up being more like running a badly designed user acceptance test than usability testing). I’ve got a couple of suggestions that might help with the fact that interview-based tasks can make the output less quantifiable:

1) Consider starting with a non-natural (i.e. scripted) task, preferably letting them start wherever they’d normally start, i.e. in the library catalogue or in Google. If the task is structured so that it should be an easy one to complete, then it can act as a good warm-up for the user and also flag unanticipated usability issues which arise from unexpected user behaviour. Then you can lead on to the more open, natural tasks.

2) Finish the usability interview by asking the user to fill in a basic questionnaire (5 questions max) which asks really high-level questions such as ‘How easy was it to find what you were looking for?’ (with a simple 5-point tick-box response). An open response box at the end which asks something as simple as ‘Any other comments you would like to add?’ or ‘If you could change one thing, what would it be?’ gives users the chance to add something they didn’t get the chance to mention while they were completing the task.

The benefit of adding in these two elements is that it introduces an element of quantifiability which can act as a sense check against the interviewer’s perception of how well the session went. It also provides some headline figures that can be added into the report alongside any additional analysis of how easily users found what they were looking for. If the main aim of a usability study is to convince others that a) changes are needed or b) the changes made are successful, then quantitative findings together with the users’ voice can work well (in my experience).

Of course there are sometimes projects where the findings of usability studies are overwhelmingly convincing but the identified improvements still never get implemented … but that is a whole other issue 🙂

Hi Helen, thanks for the comment. I’m glad you found it interesting.

I wouldn’t advise giving people a warm-up task before you hit them with an interview-based one in most cases. This is because your first task is going to be the freshest.

In reality, the participant is unlikely to do 6 or 7 tasks in one sitting at home. But this is what we essentially ask people to do in usability tests.

Your first task is the one where you can be confident you’re seeing more realistic behaviour. This is because the participant hasn’t just used the same interface seconds ago on a different task, picking up things that help them with the current one.

I know what you mean about improvements not being implemented. It’s hard to get your head around, but I accept that clients are not paying me to improve their websites. Instead they are paying me to help them find out how they can improve their websites.

If they choose not to do so, well that’s up to them. I wasn’t quite so diplomatic about it when I worked internally for an organisation though :0)

Hi David,

Good article, and good timing – I’ve been doing a lot of thinking lately about this interview-based approach to testing.

Just as you say, search interfaces are very exploratory in nature, and users will often adjust their view of the problem as they go along. I will employ this interview-based approach in order to test a completely new service with users: one that is both inherently exploratory and for which I have no way of knowing yet exactly how it will be used.

In that case, this kind of “organic” way to design tasks makes the most sense, I think.

When I read Spool’s article, I didn’t get the feeling that it *could* be applied in a quantitative test, you know. I think it is just another angle on qualitative testing. For me, that’s fine – that is the kind of testing I do the most, with a view to exposing usability issues, and not looking for absolute measures of usability.

Thanks for mentioning the link to David Travis’ article about “Red Routes” – I hadn’t seen that before. I wrote something about this subject a while ago, and it is good to read an article that explores the idea so well.

@Helen and @david – good luck with convincing clients to act on issues that research brings up! I think that’s an issue we all have to cope with, in our profession! 🙂

