Usability challenges in digital learning solutions

Usability is a key element in successful software. Ensuring the technical usability of a learning solution enables users to focus on their main task, learning. The purpose of this paper is to present the results of heuristic usability evaluations of digital learning solutions. Heuristic evaluations were conducted on 24 digital learning solutions from one country (Finland) and two country groups (Asian countries and Spanish-speaking countries), concentrating on the usability of the user interface of each evaluated solution. The main results of this study indicate that a few heuristics cover the majority of all usability problems (UPs) observed in learning solutions, but these heuristics contain a relatively low proportion of the UPs rated as severe. The results also indicate differences in the UPs observed between different types of digital learning solutions and between digital learning solutions from different countries or country groups.


I. INTRODUCTION
The use of digital learning solutions in learning and teaching has become more popular over the past decades (e.g. [12]). A wide variety of digital learning solutions is available, but digital solutions that were not originally designed for learning are also utilized [7]. In many cases digital solutions are used in ways their designers had not imagined [9]. Digital solutions that were not designed for educational use, such as social media tools [2], virtual worlds [28] and mobile devices [9], are also used in teaching and learning. The use of such solutions can lead to challenges with usability [15], particularly in light of usage purpose and context [9].
Evaluating the usability of a digital solution can be approached via various techniques. These include user testing methods and usability inspections conducted by usability experts. User testing methods range from simple user testing situations [8] to usability questionnaire techniques ([3], [27]). Usability inspection techniques are used mainly to assess the technical usability of a digital solution by means of heuristic usability evaluations [16], cognitive walkthroughs ([20], [29]), time-testing [25] and error counting [4]. Each of these methods has value in particular situations and for particular outcomes, and each can be used on various types of software.
Usability challenges have been explored on various devices, software and services, including medical devices [31], software for work contexts [19], e-learning platforms [5], digital textbooks [10] and e-learning courses [30]. Common usability challenges in devices, software and services cover various topics, including consistency, informing users about system status, providing feedback and more guidance to users, navigational structures and the aesthetic integrity of the user interface ([5], [10], [30]). Although the topics covered in previous research vary with the set of heuristics used, a commonly shared feature seems to be that the majority of usability challenges concentrate on a small number of key issues, such as consistency and informing the user about system status ([5], [19], [30], [31]).
Mayes and Fowler [13] argue that the usability of digital learning solutions cannot be measured in the same way as that of software aimed at work contexts. They point out a paradox in digital learning solutions: usability is not necessarily a prerequisite for deep learning. They argue that approaching learning as a conventional task can be misguided, since learning is commonly a "by-product of doing something else" and it is this "something-else" that should be supported [13]. However, Kukulska-Hulme [9] raises the issue that, for the most part, mobile learning happens on devices that have not been designed with educational use in mind. All devices and software, whether they are designed for educational use or not, could benefit from ensuring a basic level of technical usability, because it enables learners to focus on their learning tasks instead of tackling problems caused by technology [22].
The aim of this study is to further explore usability challenges in digital learning solutions. The paper is based on an ongoing Finnish research project, "Systemic Learning Solutions (Systech)", which aims at developing research-based principles for the design and use of digital learning solutions (see [6]); usability evaluation is part of the principles for the design of learning solutions. The main aim of the usability evaluation was to identify usability challenges or problems (UPs) and their severity with heuristic evaluations of digital learning solutions. The study also examined tentative differences in two background variables: firstly, between types of digital learning solutions and, secondly, between the countries in which the learning solutions were designed.
The following sections address these questions by explaining the nature of heuristic evaluations and outlining the empirical process of this study. The results are presented in terms of usability issue type and the percentage distribution of usability problems. Differences between country groups are considered in the discussion of the results, which in turn informs the conclusion. The conclusion focuses on existing heuristic evaluation methods and proposes improvements based on this study's findings.

II. HEURISTIC EVALUATION
Heuristic evaluation is a systematic method for evaluating the usability of a software user interface [16]. It is conducted by a small number of evaluators, who go through the interface and judge how well its design complies with commonly accepted usability principles called 'heuristics' ([1], [17]). Heuristic evaluation is one of the most commonly used usability inspection methods, due to its low cost in comparison with other testing methods and its intuitiveness of use [30].
Heuristic evaluations have been developed from extensive design principles [26] into more manageable sets of heuristics ([16], [22]) that can be used in conducting these evaluations (Table I). Heuristic evaluations are commonly conducted in a way similar to that suggested by Nielsen and Molich [16], which has been further developed by Nielsen ([17], [18], [20]). Furthermore, Nielsen's [20] work on improving the effectiveness and enhancing the explanatory power of heuristic evaluations has made heuristic evaluation a popular subject of study.

Heuristic | Description
Visibility of the system status | The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Match between system and the real world | The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom | Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
Consistency and standards | Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
Error prevention | Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
Recognition rather than recall | Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
Flexibility and efficiency of use | Accelerators, unseen by the novice user, may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalistic design | Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
Helping users recognize, diagnose, and recover from errors | Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
Help and documentation | Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.

One branch of heuristic evaluation study has focused on increasing the explanatory power of heuristics in analyzing the usability of digital learning solutions (e.g. [22], [23], [24]).
Various attempts have been made to create a set of heuristics that includes both technical [20] and pedagogical usability aspects [22]. The main aim of these heuristic sets, which combine technical and pedagogical usability, has been to emphasize the need to include pedagogical features when assessing the usability of digital learning solutions ([14], [22], [23]). In addition, Magoulas, Chen, and Papanikolaou [11] have integrated heuristic evaluation with layered evaluation of adaptive learning environments.

III. RESEARCH DESIGN
The main aim of the study was to evaluate the number and severity of usability problems (UPs) in digital learning solutions. In addition, the study aimed at exploring tentative differences between the country groups in which the evaluated digital learning solutions were designed and between digital learning solution types.

A. Evaluation procedure
The usability evaluation of the digital learning solutions was conducted via heuristic evaluation based on Nielsen's [21] ten usability heuristics (see Table I). The evaluations were conducted by two researchers, who independently evaluated each digital learning solution and reported their observations. Each observation consisted of: one or more heuristics to which it related; a description of the usability problem (UP); a rating of the severity of the problem; and a suggestion on how to fix the problem. The severity of each UP was rated as minor, moderate or major according to whether the digital learning solution could still be used or whether the UP prevented the use of the solution or a part of it.
The evaluators were researchers with substantial knowledge of usability and usability testing methods, but they differed in their other expertise. One of the researchers was experienced in the fields of usability, user experience and design. The other researcher was experienced in the fields of usability, education and the pedagogical use of information and communication technology.
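The structure of a single observation described above can be sketched as a small record type. This is only an illustrative sketch: the class and field names (`Observation`, `suggested_fix`, and so on) and the example observation are assumptions made here, not part of the study's report sheets.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Severity(Enum):
    MINOR = "minor"
    MODERATE = "moderate"
    MAJOR = "major"


# The ten heuristic names as listed in Table I.
HEURISTICS = (
    "Visibility of the system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalistic design",
    "Helping users recognize, diagnose, and recover from errors",
    "Help and documentation",
)


@dataclass(frozen=True)
class Observation:
    """One evaluator observation (hypothetical field names)."""
    heuristics: Tuple[str, ...]  # one or more violated heuristics
    description: str             # what the usability problem is
    severity: Severity           # minor, moderate or major
    suggested_fix: str           # how the problem could be fixed

    def __post_init__(self):
        # An observation must relate to at least one known heuristic.
        if not self.heuristics:
            raise ValueError("an observation needs at least one heuristic")
        unknown = set(self.heuristics) - set(HEURISTICS)
        if unknown:
            raise ValueError(f"unknown heuristics: {sorted(unknown)}")


# Invented example, not taken from the study's data.
obs = Observation(
    heuristics=("Consistency and standards", "Error prevention"),
    description="Date field accepts two formats without guidance",
    severity=Severity.MODERATE,
    suggested_fix="Show the expected format next to the field",
)
```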

B. Description of digital learning solutions
The heuristic evaluation was conducted for altogether 24 digital learning solutions from five countries. These digital learning solutions were selected based on suggestions from Systech research and company partners in five countries: Chile, Hong Kong, Finland, South Korea and Spain. These individual countries were later grouped, based on cultural similarity, into two country groups: Asian countries (Hong Kong and South Korea) and Spanish-speaking countries (Chile and Spain). Finland was left as an individual country, since the number of digital learning solutions available from Finland (10) exceeded the totals for each of the two country groups, Asian countries (8) and Spanish-speaking countries (6).
These digital learning solutions represent a diverse sample of technological learning solutions, with different use contexts (from classroom use to extracurricular activities), usage purposes, intended learning outcomes and user groups (from preschoolers to adult learners). They were divided into two groups, namely 1) content learning solutions (altogether 12 digital learning solutions), and 2) tools and platforms (12 digital learning solutions). Content learning solutions focused on teaching a particular preset of data or skills, with no or only minimal options for users to modify content. The selection of content learning solutions represented online learning environments for various subjects (e.g. mathematics, languages and music). They offered experiences in content enrichment, games and exercises. Tools and platforms were solutions for creating or distributing content from multiple sources, or they were collections of materials. The tools and platforms included software for creating course material and other content (e.g. routes), solutions for testing knowledge, video and game platforms, and platforms for applied learning, such as physics simulations or driver education.

C. Analysis
The data consisted of 24 heuristic evaluation report sheets, where one sheet combined all the observations made by the two evaluators about a digital learning solution. The evaluators' data was merged, and observations of the same usability problem were combined to remove redundancy. There were altogether 418 observed usability issues in the 24 evaluated digital learning solutions. Each observation consisted of a description of the issue, a severity rating, a suggested solution for the issue and the one or more heuristics it violated. One observation could be a violation of one or more heuristics, and these occurrences of heuristics were counted as usability problems (UPs). The total number of usability problems for all 10 heuristics was 509, which is higher than the number of observations (418), showing that there were numerous instances where an individual usability issue related to more than one heuristic.
The data was analyzed according to the number of UPs and the severity ratings for each heuristic. The number of UPs and the severity ratings were further analyzed according to the country group the digital learning solutions belonged to and the type of digital learning solution they represented.
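The counting rule above, where each heuristic attached to an observation counts as one UP, can be illustrated with a minimal sketch. The function name `count_ups` and the three sample observations are assumptions made for illustration; they are not the study's data. With the reported totals, 509 UPs over 418 observations corresponds to roughly 1.2 heuristics per observation.

```python
from collections import Counter


def count_ups(observations):
    """Count usability problems (UPs) per heuristic.

    `observations` is an iterable in which each item is the collection of
    heuristics violated by one observation. Each (observation, heuristic)
    pair counts as one UP, so the UP total can exceed the observation count.
    """
    counts = Counter()
    for heuristics in observations:
        # set() guards against the same heuristic being listed twice
        # for a single observation.
        counts.update(set(heuristics))
    return counts


# Invented sample: three observations, one of which violates two heuristics.
observations = [
    ("Consistency and standards",),
    ("Consistency and standards", "Error prevention"),
    ("Help and documentation",),
]
ups = count_ups(observations)
total_ups = sum(ups.values())
print(len(observations), total_ups)  # 3 observations yield 4 UPs
```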
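As a rough sketch of this kind of analysis, the severity distribution per heuristic can be computed from (heuristic, severity) pairs. The function name and the sample data are hypothetical; a breakdown by country group or solution type could be obtained by filtering the pairs before calling the function.

```python
from collections import Counter, defaultdict

SEVERITIES = ("minor", "moderate", "major")


def severity_distribution(ups):
    """Return {heuristic: {severity: percentage}} for (heuristic, severity) pairs."""
    counts = defaultdict(Counter)
    for heuristic, severity in ups:
        counts[heuristic][severity] += 1
    distribution = {}
    for heuristic, by_severity in counts.items():
        total = sum(by_severity.values())
        distribution[heuristic] = {
            s: round(100 * by_severity[s] / total) for s in SEVERITIES
        }
    return distribution


# Invented sample, not the study's data: four UPs under one heuristic.
sample = [
    ("Error prevention", "major"),
    ("Error prevention", "major"),
    ("Error prevention", "minor"),
    ("Error prevention", "moderate"),
]
print(severity_distribution(sample))
# {'Error prevention': {'minor': 25, 'moderate': 25, 'major': 50}}
```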

IV. RESULTS
A. Usability problems of digital learning solutions
1) Amount: The data analysis revealed large variation in the number and severity of usability problems across the ten heuristics (Table II). Five heuristics together covered 73 % of the observed usability problems.
The most frequent heuristic was 'consistency and standards' with 27 % of total UPs. The shares of the other four most frequent heuristics varied between 10 % and 12 %. For the remaining five heuristics the shares varied between 5 % and 7 %.
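The coverage figure reported above is the share of all UPs that falls under the k most frequent heuristics. A minimal sketch of that calculation follows; the function name and the counts are made up for illustration, since the per-heuristic counts of Table II are not reproduced here.

```python
def top_k_coverage(counts, k):
    """Percentage of all UPs covered by the k most frequent heuristics."""
    total = sum(counts.values())
    top = sorted(counts.values(), reverse=True)[:k]
    return 100 * sum(top) / total


# Invented counts for four placeholder heuristics (not the study's data).
counts = {"A": 50, "B": 20, "C": 20, "D": 10}
print(top_k_coverage(counts, 2))  # 70.0
```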
2) Severity: Most heuristics shared a similar distribution of severity ratings; only two showed a different distribution. Eight heuristics had a clear pattern: a high proportion of minor usability problems (54-74 %), a modest proportion of moderate UPs (12-31 %) and a relatively low proportion of major usability problems (3-16 %). Of these eight heuristics, only 'match between system and the real world' had more major (16 %) than moderate usability problems (12 %), while the others had more moderate (19-31 %) than major usability problems (3-16 %). The greatest difference in severity ratings could be observed in two heuristics, 'error prevention' and 'helping users recognize, diagnose, and recover from errors', which had 40-52 % major UPs, 26 % moderate UPs and 22-34 % minor UPs.
3) Cross-analysis of amount and severity: In each of the five most frequent heuristics, more than 59 % of the usability problems were rated as minor. The three heuristics with the lowest percentages of all observations show a similar trend, with more than 53 % of all observed usability problems rated as minor and under 16 % rated as major. The remaining two heuristics, which deal with errors ('error prevention' and 'helping users recognize, diagnose, and recover from errors'), both have more than 39 % of their usability problems rated as major, which will be discussed in more detail later in this paper.

B. Description of significant heuristics/usability problems
1) Heuristic category - Consistency and standards: 'Consistency and standards' was the most frequent heuristic, with 27 % of total UPs (see Table II). When looking at the three groups of countries (Asian countries, Finland and Spanish-speaking countries), some differences in the severity ratings can be observed (Table III). The distribution of severity ratings in the heuristic 'consistency and standards' shows that digital learning solutions from both Asian countries and Spanish-speaking countries have a high proportion of UPs rated as minor (82-85 %). A differing distribution can be observed in the Finnish solutions, where 60 % of UPs are minor and 35 % are of moderate severity. This difference could be further explored by looking at the distribution of usability problems within the heuristic 'consistency and standards' between the two types of digital learning solutions (Table IV). The overall trend in both content solutions and tools and platforms is similar when looking at UPs from all 24 digital learning solutions. Most of the UPs (70-78 %) are rated as minor, 20-26 % as moderate and 2-4 % as major.
2) Heuristic category - Preventing and recovering from errors: The heuristics 'helping users recognize, diagnose, and recover from errors' and 'error prevention' contain 5 % and 8 % of all UPs, respectively (Table II). Even though the number of UPs is relatively low in both heuristics, the proportion of UPs rated as major is high. 'Helping users recognize, diagnose, and recover from errors' and 'error prevention' have a distribution of 22-34 % minor, 26 % moderate and 40-52 % major UPs. UPs for the two heuristics consisted of issues with input formatting, password generation and recovery, nonfunctional items, and error situations and messages.
The comparison between Asian countries, Finland and Spanish-speaking countries shows some differences in the severity ratings of the heuristics 'helping users recognize, diagnose, and recover from errors' and 'error prevention' (Table III). In both Asian countries and Spanish-speaking countries, the two heuristics have a similar distribution within the country group. Digital learning solutions from Finland show clearly different distributions for these two heuristics. 'Error prevention' shows a pattern of severity ratings similar to that of the digital learning solutions from Asian countries, with each severity rating accounting for almost one third of all UPs. However, 'helping users recognize, diagnose, and recover from errors' shows a clearly different distribution, with 9 % minor, 18 % moderate and 73 % major UPs.
When comparing digital learning solution types (content solutions and tools and platforms) with respect to the two heuristics 'error prevention' and 'helping users recognize, diagnose, and recover from errors' (Table IV), some patterns emerge in the distribution of severity ratings. Content solutions have a similar pattern for both heuristics, with the percentages of minor (32-44 %) and major UPs (39-44 %) being similar and the proportion of moderate UPs being the smallest (11-29 %). Tools and platforms have a similar pattern in 'error prevention', with 40 % minor, 20 % moderate and 40 % major UPs, but not in 'helping users recognize, diagnose, and recover from errors', where they have a distribution of 7 % minor, 36 % moderate and 57 % major UPs.

V. DISCUSSION
The main results of this study confirm the finding of earlier research ([5], [19], [31]) that a few heuristics cover the majority of all usability problems. A significant share (27 %) of UPs was categorized under one heuristic, namely 'consistency and standards', and the five heuristics with the highest number of UPs covered 73 % of all UPs. However, even though these heuristics covered the majority of all UPs, more than half of the UPs in these heuristics were rated as minor. In general, UPs in these heuristics were considered by the evaluators as issues that may hinder the learnability and efficiency of use and the overall user experience, but do not necessarily prevent completing tasks with the digital learning solution.
The heuristics that showed the largest proportion of major usability problems were 'error prevention' and 'helping users recognize, diagnose, and recover from errors'. These two heuristics represent 12 % of all UPs, with more than half of the UPs rated as major UPs. This would suggest that UPs related to heuristics dealing with errors are mainly perceived as UPs that should be fixed most urgently. However, in this study the number of observations under the heuristics 'helping users recognize, diagnose, and recover from errors' and 'error prevention' is too low to draw conclusions about the differences between country groups and digital learning solution types. The results of this study suggest that there is a difference in the distribution of severity ratings of these two heuristics compared to the other eight heuristics, which could be further explored with additional research. Previous research has also indicated that the distribution of severity ratings might vary between heuristics ([19], [30]).
The two types of digital learning solutions, tools and platforms and content learning solutions, showed a similar distribution of amount and severity ratings in almost all of the heuristics analyzed in more detail. Only one heuristic, 'helping users recognize, diagnose, and recover from errors', demonstrated a shift, with tools and platforms having more major and moderate UPs than minor UPs. The category of tools and platforms consisted of a wide variety of digital learning solutions, and in future research it might be relevant to divide the digital learning solutions into more precise subcategories.
There are four major limitations to this study: the number of digital learning solutions, the digital learning solution types, the number of evaluators and the set of heuristics used. The first limitation is that the sample size from each country or country group is not the same (6-10 digital learning solutions), which hinders the cross-cultural analysis of the results. In future research the number of learning solutions from each country or country group should be the same. The second limitation concerns the variation of digital learning solution types from each country or country group; in future research each country should be represented by the same number of each learning solution type. Furthermore, the categorization of digital learning solutions might require additional research, since two large groups, content learning solutions and tools and platforms, might not be enough to explain the differences between digital learning solutions. The third limitation is the number of evaluators, which in this study was two, while the recommended number for heuristic evaluation is at least three evaluators [17]; in future research at least three usability experts will be used. The fourth limitation is the set of heuristics [21] used, which has been designed with technical usability in mind and does not take pedagogical concerns into account. Pedagogical concerns in digital learning solutions will be addressed by further research on the digital learning solutions with pedagogical experts.
The suggested minimum number of evaluators for heuristic evaluation is three, as reported by Nielsen [17]. However, as Nielsen's [17] results suggested, double specialists can find a significantly higher number of UPs than regular usability specialists. Double specialists in Nielsen's [17] study were usability experts who also had experience of the software type being evaluated. In this study the two usability researchers had further experience of either learning solutions or interface design, which would classify them as double specialists in their respective fields. This would in general support the use of only two usability experts. However, additional experts could have improved the overall coverage of UPs in the evaluated digital learning solutions, and therefore this matter should be addressed in future research.
In general, the set of ten heuristics [21] was considered by the evaluators to be useful, but for some usability problems it was difficult to find a suitable category, and a broader set of heuristics might be needed. The evaluators noted in particular that the current heuristics did not offer a suitable category for situations where errors had already occurred or features were not functioning at all. These types of observations were categorized under the closest suitable heuristic, such as 'error prevention', even though they do not completely fit the category.

TABLE I. NIELSEN'S [21] TEN USABILITY HEURISTICS FOR USER INTERFACE DESIGN

TABLE II. UPS AND SEVERITY RATINGS

TABLE III. DIFFERENCES IN USABILITY PROBLEMS FOR THREE HEURISTICS IN FINLAND AND TWO COUNTRY GROUPS

TABLE IV. USABILITY PROBLEMS IN CONTENT SOLUTIONS AND TOOLS AND PLATFORMS FOR THREE HEURISTICS