Analysing and Visualising Security and Usability in IRIS

Despite a long standing need to incorporate human factors into security risk analysis, taking a balanced approach to analysing security and usability concerns remains a challenge. Balancing security and usability is difficult due to human biases in security perception, and managing the sheer volume of data arising from risk and task analysis. This paper presents an approach for qualitatively and quantitively analysing and visualising the results of risk and task analysis. We demonstrate this approach using a realistic example, and we discuss how these techniques fit within the larger context of secure systems design.


I. INTRODUCTION
Risk analysis is a powerful tool for reasoning about the trade-offs involved in secure systems design, but there is a long standing need to incorporate human factors into this process. Unfortunately, risk and usability ratings for a system design are sufficiently coloured by analyst perceptions that human error can easily creep into any valuation. Without a means to rapidly score and visualise the current state of analysis, the security-usability balance can become uneven as risk analysis becomes more advanced.
To many engineers, usability is synonymous with user interface design [1]. Usability is, however, more than just about designing interfaces, it is a quality concerning the people interacting with these interfaces and how they use them to perform tasks. Recent work suggests that enforcing usability should be a common responsibility during analysis activities [2], but there are currently no tools, conceptual or otherwise, which measure the impact to usability of secure design decisions; without this information, enforcing responsibility is difficult as the sole focus will be on mitigating risks.
In [3], we introduced IRIS (Integrating Requirements and Information Security), an integrated framework for usable and secure software engineering. This paper discusses the specifics of how risk and usability are analysed within IRIS, and how the results of this analysis can be visualised. Section II describes related work on analysing and visualising risk and usability, before sections III and IV describe the process used by IRIS for qualitatively rating and quantitatively scoring risks and tasks. Section V describes how these ratings and scores are visualised in IRIS, before presenting an example of this approach in section VI. Finally, in section VII, we discuss how the techniques described in this paper can be applied within the larger context of secure systems design.

II. RELATED WORK
To many people, 'risk' is synonymous with some form of threat, danger or hazard. Techno-scientific approaches to risk expand on this synonym by incorporating the possibility of harm occurring; for example, Bradbury [4] defines risk as "the product of the probability and consequences (the magnitude and severity) of an adverse event". By extension, risk analysis is contingent on data about the likelihood of danger, and the impact this might have on a context of study. However, a weakness of existing quantitative approaches to risk analysis is that the nature of risk may become lost in precisely this data. When we think of an attack, we may consider the system as a whole, rather that subsystems of particular interest to an attacker or a defender. When stakeholders contribute data as part of a risk analysis exercise, they are likely to value particular assets over others. Moreover, when considering the protection of assets, or threats to them, stakeholders think in terms of particular properties which need to be safeguarded. Therefore, a framework for quantitative risk analysis data needs to be sensitive to the values participants place on assets, as well as security properties associated with threats, vulnerabilities and risk mitigation approaches.
Our work involves understanding how risk analysis can help or hinder the usability of work carried out by different people. Task analysis is concerned with the study of work performance, and aims to model how work, hereafter known as 'tasks', can be used to change the application domain [5]. Although different representations for task analysis exist, the most prevalent mechanism used for describing tasks in both HCI and Secure Software Engineering is the scenario, a textual narrative which describes how a stakeholder interacts with the domain under study. It has, however, been argued that scenarios make little reference to the stakeholders being described within them [6]. As introducing a security control impacts usability for people performing the task, rather than just the task per se, humanising scenarios is important. One means of capturing this human dimension is augmenting scenarios with personas; these provide a descriptive model of how archetypical users behave, think, what they wish to accomplish, and why [7]. By combining personas and scenarios, not only do we avoid introducing undue assumptions into analysis, we can realistically categorise the usability of tasks with respect to their indicative users.
Before stakeholders can measure the impact of usability of secure system design decisions, they need to identify the impact itself. Although there appears to be little work on visualising task and usability metrics, techniques from information visualisation have been applied to security risk analysis [8] [9], and risk analysis as part of the design process [10]. Hogganvik's recent thesis on the subject of visualising risk analysis [11] looked at how the colour and shape of model elements can be used to make information more accessible. Results from this work concluded that parsimony is important with respect to the number of model symbols and colours employed, and that colour may be a useful means of distinguishing the value of different risks.

Score
Threat Likelihood A risk rating can be assigned based on likelihood and severity tables in IEC 61508 [12] (see figure 1). However, this rating does not reflect values held about individual assets or threats. To score risks with respect to the perceived value of the assets threatened, we define a security property as a row vector c i a o , where c, i, a, and o represent the values held for confidentiality, integrity, availability and accountability respectively. Each element n is valued 0 ≤ n ≤ 3 based on whether the value held for that element is none, low, medium or high. The likelihood of the threat being realised, L r is computed using the equation where L t = the likelihood of the threat t associated with risk r, andm t is the mean likelihood value for the set of countermeasures mitigating the likelihood of L t occurring. The values of L t andm t exist within the range 0 ≤ n ≤ 5, and map to the likelihood categories in figure 1. The severity of the vulnerability exposed by risk r is computed using the equation where S v = the severity of the vulnerability v associated with risk r, andm s is the mean severity for the set of countermeasures mitigating the severity of S v . Like threat severity, vulnerability values exist within the range 0 ≤ n ≤ 4 and map to the vulnerability categories in figure 1.
Risk impact is described by a security property, representing the values held in the assets at risk from risk r. Risk impact is computed using the equation where P t = the security properties of the threat associated with risk r, P a = the security properties of the vulnerable or threatened assets at risk, andm p = the mean security properties for the countermeasures targeting the risk's threat or vulnerability.
Finally, the calculation for the Risk Score of risk r, R r , is computed, as the product of the threat likelihood, the severity of the vulnerability, and the risk impact to the threatened assets.
Each element of row vector is added together, and the sum is normalised to an integer between 1 and 9. If, during the above computations, negative numbers are calculated, these values are resolved to 0.

IV. TASK ANALYSIS
When defining tasks, four properties are set for each persona participating in a scenario. Each of these properties, described in figure 2 map to one of the usability components of ISO 9241-11 [13].
Each qualitative value x associated with a property maps to a natural number in the range 0 ≤ x ≤ 3, which correspond to the qualitative values of None, Low, Medium, and High. To ensure equal weighting for all 3 usability components, the usability of a task U t is computed using the equation  whereā +b 2 is the mean task efficiency,c is the mean task satisfaction, andd is the mean task effectiveness. Variables a, b, c, and d refer to the task duration, frequency, demands, and goal conflict respectively. The mean value is taken across all personas carrying out the task in question. In general, the higher the value of U t , the less usable a task is for the personas associated with it. Because values like low, medium, and high are ambiguous with respect to duration and frequency, more meaningful values are used. For duration, the qualitative ratings used are Seconds, Minutes, and Hours or Longer are associated with the values 1,2, and 3 respectively. For frequency, the ratings used are Monthly or less, Daily -Weekly, and Hourly or more.
When mitigating risks, one or more roles are associated with each mitigating countermeasure; these roles will, in some way, be directly affected by the countermeasure being designed. By associating roles with countermeasure within IRIS, candidate personas and their tasks can be identified. For each task-persona pairing, countermeasure usability properties can be specified. These properties, described in figure 3 are similar, but not identical to, those specified in figure 2.
Based on this countermeasure usability data, it is possible to calculate the countermeasure usability factor C t . The right hand side of the equation which computes C t is identical to U t , i.e.
however the values of are different.ā +b 2 is the mean contribution to task efficiency,c is the mean contribution to task satisfaction, andd is the mean contribution to task effectiveness. Like U t , the variables a, b, c, and d refer to the task duration, frequency, demands, and goal conflict respectively. The mean contributing value is taken across all countermeasures affecting the task in question. Unlike U t ,  Based on these equations, we compute the task summative usability SU t to be Like U t , the higher the score, the less usable the task is for the associated personas. After calculating U t and SU t , the score is normalised to a natural number in the range 0 ≤ n ≤ 9. Given the potential of a task to increase or decrease usability, this value remains unchanged irrespective of it being a high positive or negative number. The on-going results of risk and task analysis are visualised within IRIS using a risk analysis model. This model provides a quick-look view of the current analysis.

V. VISUALISING RISK AND TASK ANALYSIS
With so much quantitative and qualitative information associated with risk analysis, visual clutter can be problem as the model expands. Minimal distinctions in colour can be used to reduce visual clutter, and small contrasts enrich the visual signal increasing the number of possible distinctions [14]. Therefore, we map the normalised values for R r and SU t to the respective risk and task usability colour charts in figure 4. The higher the risk or task usability score, the deeper the hue of red or blue. Threat likelihood and vulnerability severity scores map to a similar colour chart to that of risk. An example of how these colours are applied to elements on the IRIS risk analysis model is provided in figure 5.  As figure 5 illustrates, information about the security properties is also coded within asset and threat elements. Histograms indicate whether or not values are held for each property and, if so, whether that property is low, medium, or high. The colours selected for the confidentiality, integrity, availability, and accountability histograms are the 3 primary colours, together with black; the use of black and primary colours provide the maximum differentiation between property types [15].

VI. EXAMPLE: SCORING AND VISUALISING THE UPLOAD OF CLINICAL DATA
In this example, we describe an example of how usability and risk can be scored based on the formulae in the previous sections. Our example looks at the usability and security associated with uploading sensitive data to a computational grid. We have previously defined a task called 'upload data', which is carried out by a pre-defined persona: Alice. This task begins with Alice anonymising a clinical data set, which involves removing as much personalised data as possible while still enabling her analysis software to work. Alice then uploads this data, tagging this as available only to members of her research group. In the example, Alice spends several minutes on this task, which she usually carries out each working day. Given the persona profile, this is a low demand task which generally supports her goals.
We now consider the risk of 'Unauthorised Certificate Access'. In this example, this risk is realised by a social engineer obtaining access to a client workstation and installing a user digital certificate. Such an attacker could be a journalist more interested in the newsworthiness of the attack than the value of the compromised data. The attacker exploits a vulnerability arising from the difficulty of obtaining digital certificates. Rather than undertaking the onerous task of legitimately obtaining a digital certificate, which involves obtaining permission from line management, filling out forms, and sending several confirmatory emails to a Certificate Authority, many users instead choose share digital certificates among themselves. Such insecure behaviour arises not from malevolence, but from a desire to carry out their work without undue hindrance. The attacker exploits this 'certificate ubiquity' vulnerability.
Realising the risk might involve a journalist obtaining site access, including access to a workstation, and masquerading as a new member of staff who, frustratingly, has not been given access to all the tools she needs to carry out her work. In [16], we elaborate this attack using a Misuse Case [17] to describe how an attacker pretends to be a new post-doctoral researcher whose new supervisor happens to be away from the office when she arrives for her first day at work.
This risk can be assessed both qualitatively and quantitatively. We can evaluate the risk qualitatively based on threat likelihood and vulnerability severity. These values were occasional and critical respectively, giving rise to a rating of Undesirable.
We assess this risk quantitatively by calculating its risk score. The likelihood and severity scores mapped to 2 and 3 respectively. The threat targeted only the confidentiality of these assets, but two assets were targeted by this threat; these were the client workstation ( 1 0 3 0 ) and a user certificate ( 2 0 3 1 ). The asset exposed by the vulnerability is the user certificate Using this information, we calculated the risk score R r : After rounding R r down, the normalised score resolved to 9.
In this example, we mitigate this risk by making user certificates host based, such that an issued certificate can only be used for a given client workstation. This countermeasure is considered highly effective at targeting the certificate Figure 6. IRIS Risk Analysis model example ubiquity vulnerability, motivating the value of 3 (High) for m s . This countermeasure also fosters, albeit only to a minor extent, values of confidentiality and accountability, giving rise to a score of 1 0 0 1 form p . Based on this information, the risk score can be re-evaluated.
These results show that while certain security properties remain threatened, the severity of the vulnerability was rendered inert, thereby reducing the risk score to the lowest possible value. As this countermeasure appeared to be effective, we chose to generate a new asset for this policy. The security properties of this new asset were based on the values placed on the countermeasure, which varied based on the contexts the new asset was situated in. This countermeasure also impacts the usability of uploading data. Because Alice now needs to make sure she uploads data only from a particular machine, introducing this countermeasure is a slight hindrance to her other goals.
As such, the summative task usability can now be evaluated.
As a consequence, the risk is mitigated however the task usability is slightly worsened due to the extra effort involved in complying with the new policy. A risk analysis model showing the results of this, and related analysis, in IRIS is provided in figure 6.

VII. DISCUSSION
The techniques we have described can support the specification of security requirements, and secure systems design in general. However, these need to be implemented as part of tool-support, and this tool-support needs to be integrated into the secure systems design process. Moreover, the design process needs to capture well-grounded data to ensure analysis is reflective of the environments within which the system will operate.
IRIS framework builds upon a meta-model for integrated usability, requirements, and risk analysis, which is founded in best practice in HCI, Security Requirements Engineering, and Information Security. In related work [3], we have demonstrated tool support which implements the techniques described by this paper. Initial validation of this tool involved a retrospective analysis of a recently completed UK e-Science project.
We have devised a design process to elicit and specify the requirements of secure and usable software systems. This process incorporates techniques from Contextual Design [18], Goal Oriented Requirements Engineering [19], Security Requirements Engineering [17], and HCI Security [20]. Applying this process involves carrying out a scoping workshop with stakeholders, followed by Contextual Interviews [18] with representative users in their work contexts. Data from these sessions are used to elicit candidate tasks, assets, threats, and vulnerabilities, and generate personas [21]. When Contextual Inquiry is complete, AEGIS risk analysis workshops [20] are held with representative stakeholders. The techniques described by the paper are used to inform security requirements and design decisions in real-time. A critical infrastructure case study evaluating this design process is on-going, and future work will share the results of this study.

VIII. CONCLUSION
Reasoning about security and usability is a challenge during risk analysis, not least because analyst bias and data explosion can occur when analysis becomes developed. This challenge motivates the need to rapidly analyse and visualise the impact that task and risk analysis can have on each other. In this paper has introduced techniques for analysing risks and tasks, and visualising the results of this analysis. We have illustrated this approach with a working example, and discussed how this work fits into the larger context of secure systems design.