Happiness tracking to measure the user experience

To measure the user experience of Deakin Library Search, I implemented a Happiness Tracking Survey. I adapted the approach from “HaTS: Large-scale In-product Measurement of User Attitudes & Experiences with Happiness Tracking Surveys” (Müller & Sedley, 2014), which Google deployed in several products. Like the authors, we set out to “collect attitudinal data at a large scale directly in the product and over time”. Analysis of this data can inform product decisions and measure progress towards product goals. Anyone who has worked with a library discovery system likely knows there are many decisions to make to optimise the user experience.

Adapting the questionnaire

Question 1

Satisfaction is measured through the question “Overall, how satisfied or dissatisfied are you with [product]?”

(Müller & Sedley, 2014)

For the [product] name we used Library Search. As Müller and Sedley recommend, I maintained the neutral wording of the question text, and used a 7-point scale “to optimize validity and reliability, while minimizing the respondents’ efforts”. However, the survey tool I chose doesn’t allow labeling every point on an opinion scale, only the mid-point and the polar extremes. It also numbers each point on the scale. I accepted these differences from how the authors constructed their response scale.

Overall how satisfied or dissatisfied are you with Library Search?
Figure 1: Overall satisfaction measurement with a partially labeled 7-point scale.

Other best practices adhered to in my implementation include:

  • minimizing order bias by displaying scale items horizontally and with equal spacing
  • minimizing the effect of satisficing by labeling the midpoint “Neither satisfied nor dissatisfied” (instead of “Neutral”)
  • allowing for a more natural mapping to how respondents interpret bipolar constructs by listing the negative extreme first

As in the authors’ case study, the satisfaction question is the only mandatory question. This ensures responses meet the primary objective: to track changes in users’ attitudes and to associate those shifts with changes in the product.

Likely to recommend question—not applicable

Where there are competitors and alternatives, it may make sense for Google to ask “How likely are you to recommend [product] to a friend or colleague?” However, a Net Promoter Score question of this kind is redundant for users of a library discovery system, so I left it out.

Open-ended questions gather qualitative data

To gather qualitative data about users’ experiences with a given product, HaTS also includes two open-ended questions.

(Müller & Sedley, 2014)
What, if anything, do you find frustrating or unappealing about Library Search? 
What new features would you like to see for Library Search?
Figure 2: Open-ended question about frustrations and new capabilities to be added to the product.

Müller and Sedley found “that asking about experienced frustrations and needed new capabilities in the same question increased the response quantity … and quality … and minimized the analysis effort as compared to using two separate questions”. Our implementation substituted the word “features” for “capabilities” in the interest of plain language. The survey tool I chose only shows one line at a time, and has a less intuitive instruction of SHIFT + ENTER to make a line break. Neither is ideal, as they may have resulted in shorter responses than a large multi-line text box would have.

Adding “(Optional)” at the beginning of the question maintains the number of responses to these questions and increases response quality.

To better identify opportunities, the question about frustrations is presented before the question about areas of appreciation.

What do you like best about Library Search?
Figure 3: Open-ended question capturing areas of appreciation.

Satisfaction with specific tasks

“HaTS also assesses different components of the user experience, in particular, … satisfaction with product-specific tasks.”

(Müller & Sedley, 2014)

To ensure reliable satisfaction scoring, HaTS asks respondents to first select tasks they have attempted over the last month. To avoid response order effects, the order of tasks should be randomised across respondents.

In the last month, which of the following tasks have you tried to accomplish with Library Search?
Figure 4: Product tasks selection question.
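As an illustration of randomising task order across respondents, here is a minimal Python sketch. It is not how the survey tool does it internally; the function name and the optional seeded generator are assumptions made for the example, and the task names are the six tasks reported later in this post.

```python
import random

# The six Library Search tasks used in the survey.
TASKS = [
    "access full text",
    "research a general topic",
    "download a journal article",
    "look for a specific book or journal article",
    "download an ebook",
    "request a book",
]


def randomised_tasks(tasks, rng=None):
    """Return a new list with the tasks in random order.

    The input list is left untouched so that every respondent
    starts from the same canonical list.
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(tasks)   # copy, so the canonical order is preserved
    rng.shuffle(shuffled)    # in-place shuffle of the copy
    return shuffled


# Each respondent gets an independently shuffled list.
for respondent in range(3):
    print(respondent, randomised_tasks(TASKS))
```

Passing a seeded `random.Random` makes the order reproducible, which is handy for testing but should not be used for real respondents.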

Using logic flows, only selected tasks appear in the subsequent satisfaction scoring questions. The survey tool I chose doesn’t have a conditional grid matrix to score several tasks at once. Instead, each selected task is presented one at a time. An advantage of this approach is that it avoids overwhelming respondents.

How satisfied or dissatisfied are you with doing the following tasks in Library Search? Access full text
Figure 5: Satisfaction with product tasks selected in the previous question.
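The logic flow can be pictured as a simple filter: one satisfaction question per task the respondent selected, and nothing for the rest. The Python sketch below only models that idea. The function name is hypothetical, and the real branching was configured in the survey tool rather than written as code.

```python
QUESTION_STEM = (
    "How satisfied or dissatisfied are you with doing "
    "the following tasks in Library Search?"
)


def follow_up_questions(presented_tasks, selected_tasks):
    """Build one satisfaction question per selected task.

    Tasks keep the order in which they were presented, and tasks
    the respondent did not select produce no question at all.
    """
    selected = set(selected_tasks)
    return [
        f"{QUESTION_STEM} {task.capitalize()}"
        for task in presented_tasks
        if task in selected
    ]


presented = ["access full text", "download an ebook", "request a book"]
for question in follow_up_questions(presented, ["request a book", "access full text"]):
    print(question)
```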

Respondent characteristics—not included

HaTS can be configured to ask users “to self-report some of their characteristics in the context of using the product being measured.” For example, “In the last month, on about how many days have you used [product]?” Rather than directly asking respondents about their product usage, “it is preferred to pipe data directly into the survey database.” However, such data is lacking from products that don’t require authentication. At the time of implementation, authentication was not required to use Deakin Library Search. Besides, this was our first time collecting masses of data on arguably the most important tool in a digital library. Layering on user data felt overwhelming, and like overkill for an academic library.

Intercept survey tool

The tool I chose for our HaTS was Typeform. Compared with its competitors, Typeform has a more conversational feel, offers easy-to-create logic jumps, and works well on every device.

Library Search - Happiness tracking survey questions and logic in Typeform.
Figure 6: Complete questionnaire built in Typeform.

However, while Typeform works great on mobile devices, configuring the invitation to display on the mobile view of Library Search proved impossible.

Invitation on search results page

Only the desktop view settings could be adjusted in Deakin Library Search. Working with the vendor, I added custom HTML, CSS, and JavaScript to make the invitation appear in a banner at the top of search engine results pages.

“Help us improve [product] Take our survey!” is the invitation banner to our happiness tracking survey.
Figure 7: Invitation link injected at the top of search results.

A browser cookie records whether someone takes the survey or hides the invitation. This means they won’t see another invitation for 12 weeks in the same browser and device. This is the best we could do to avoid over-sampling and survey fatigue. On public devices such as on-campus computers, people may still see repeat invitations. (Later, Single Sign On (SSO) for Library Search was scheduled for implementation. Requiring authentication could also enable random sampling from the entire user base.)
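The 12-week rule boils down to a small date comparison. The actual implementation was custom JavaScript running in the browser; the Python sketch below only models the decision, and every name in it is an assumption made for illustration.

```python
from datetime import datetime, timedelta, timezone

# How long an invitation stays hidden after someone takes the
# survey or dismisses the banner.
SUPPRESSION_PERIOD = timedelta(weeks=12)


def cookie_expiry(interaction_time):
    """When a cookie set at interaction_time should expire."""
    return interaction_time + SUPPRESSION_PERIOD


def should_show_invitation(last_interaction, now):
    """Decide whether to show the survey invitation.

    last_interaction is the time stored in the cookie, or None when
    no cookie exists (new browser, cleared cookies, expired cookie).
    """
    if last_interaction is None:
        return True
    return now >= cookie_expiry(last_interaction)


took_survey = datetime(2017, 8, 1, tzinfo=timezone.utc)
print(should_show_invitation(None, took_survey))                              # no cookie yet
print(should_show_invitation(took_survey, took_survey + timedelta(weeks=4)))  # still suppressed
print(should_show_invitation(took_survey, cookie_expiry(took_survey)))        # period has elapsed
```

Because the state lives in the browser, the same person on a different device, or a different person on a shared computer, is treated as a fresh visitor.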

Take our survey!

I designed the Typeform questionnaire to launch in a pop-up modal. This maintains the context of the product being evaluated.

Automating survey data storage

Then I set up a Typeform to Airtable Integration (via Zapier) to automate pushing survey data into an Airtable database. In August 2017, the first month of operation, 237 responses were submitted. This exceeded the limit of 100 automation tasks per month on a Zapier free plan.

Happiness tracking survey response rate declines throughout the sampling period and varies with business-related seasonal factors.
Figure 9: Frequency of responses quickly ramps down to fewer than 50 in October, and fewer still in November.

After the initial peak in August, the frequency of responses per month drops off quickly. The main reason for fewer responses in subsequent months is that people who had either taken the survey or dismissed the invitation would not see another invitation for 12 weeks. The academic calendar also causes seasonal variations in usage. Deakin University trimester 2 exams end by late October. Based on these factors, I expect the next surge of responses to coincide with early trimester 1 in 2018.

I recommend automating survey data storage. For instance, you can now send data from a Typeform PRO account directly to Airtable.

Satisfaction scale data

Over the four months, I gathered data to establish baselines for satisfaction overall and with product-specific tasks. The satisfaction scales run from extremely dissatisfied to extremely satisfied, numerically coded from 1 to 7.

Happiness tracking dashboard reporting the satisfaction with Library Search overall, and with 6 specific tasks.
Figure 10: Dashboard reporting trend of overall satisfaction with Library Search, and scores for 6 specific tasks.

Overall satisfaction scored an average of just over four, the mid-point of the scale. The general sentiment towards Library Search is neither satisfied nor dissatisfied. Analysing qualitative responses might hint at where there is room for improvement.
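For readers who want to reproduce this kind of baseline, here is a minimal Python sketch of the averaging. The function names and the sample responses are made up for illustration; they are not the survey's actual data or the dashboard's implementation.

```python
from collections import defaultdict
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # 1 = extremely dissatisfied, 7 = extremely satisfied


def baseline_score(responses):
    """Mean of the valid 1 to 7 responses, rounded to one decimal place.

    Returns None when there are no valid responses.
    """
    valid = [r for r in responses if SCALE_MIN <= r <= SCALE_MAX]
    if not valid:
        return None
    return round(mean(valid), 1)


def baseline_by_task(rows):
    """Group (task, score) pairs by task and average each group."""
    grouped = defaultdict(list)
    for task, score in rows:
        grouped[task].append(score)
    return {task: baseline_score(scores) for task, scores in grouped.items()}


# Illustrative responses only, not real survey data.
sample_overall = [4, 5, 3, 6, 4]
sample_tasks = [
    ("request a book", 6),
    ("request a book", 5),
    ("download an ebook", 4),
]
print(baseline_score(sample_overall))
print(baseline_by_task(sample_tasks))
```

A mean is convenient for tracking a trend over time, although reporting the distribution of responses alongside it shows whether an average near 4 reflects indifference or a split between satisfied and dissatisfied users.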

Monitoring satisfaction trends with product-specific tasks is critical to inform product decisions. The tasks and their baseline satisfaction scores are:

  • access full text 4.5
  • research a general topic 4.6
  • download a journal article 5.1
  • look for a specific book or journal article 4.5
  • download an ebook 4.4
  • request a book 5.7

Analysing qualitative feedback

Digging into responses about what people find frustrating has been most fruitful. Collaborating with colleagues in the Digital Library & Repositories team, I clustered similar pain points together to build themes. Analysis quickly led to actionable insights such as:

  • session expires before activities are complete
  • book reviews (rather than books themselves) were dominating results
  • confusing and overwhelming interface

Putting research into action

Armed with user research evidence and working with the vendor, we could address these issues. A session keeper app was developed specifically for Deakin to resolve the annoying timeout issue. Another enhancement was to exclude book reviews by default. This search feature was designed to resolve the issue of reviews cluttering results pages.

Want to start tracking happiness in your digital product?

Have you used intercept surveys for happiness tracking in a digital product? Tell us about it.

Would you like to start tracking happiness and learn how to optimise user experiences of your digital product? Would a database template help? Leave a comment.