A Privacy Awareness System for Software Design

There have been concerted policy and legal initiatives to mitigate the privacy harm resulting from badly designed software technology. But one main challenge to realizing these initiatives is the difficulty in translating proposed principles and regulations into concrete and verifiable evidence in technology. This is partly due to the lack of systematic techniques and tools to address privacy in software design, which makes it difficult for the designer to measure disclosure risk in an intuitive way, taking into account the privacy objective that matters to each end user. To bridge this gap, we propose a framework for verifying the satisfaction of user privacy objectives in software design. Our approach is based on the (un)awareness that users acquire when information is disclosed, as it relates to the communication properties of objects in a design. This property is used to determine the expected privacy utility that users will derive from the design for a specified privacy objective. We demonstrate through case studies how this approach can help designers determine which design decisions undermine users' privacy expectations and identify better design alternatives.


Introduction
Seemingly innocuous design decisions during software engineering can unintentionally affect user privacy. This is aggravated as ubiquitous systems such as wearables, cars and services that we depend on become privacy threats because of their ability to seamlessly communicate with each other [25]. Bad software design may undermine information protection by distorting user expectations and obscuring privacy harm. When this happens, users are tempted to give up on privacy and lose confidence in the technology. For example, studies have shown that most adults do not believe online service providers will keep their data private and secure [39].
It is common for software engineers to design systems without considering privacy, treating it instead as the end users' problem for not being able to exercise control. Even worse is where privacy is only considered as an afterthought later in the development life cycle. In part, this is due to a lack of incentives to invest in secure design, since end users cannot readily tell at the point of sale which software preserves privacy and which does not [2]. This can make designers comfortable ignoring or indulging in privacy-corrosive design and hoping for the best [25]. A more defensible reason is that designers often lack appropriate tools and analysis techniques to determine the extent to which a design representation preserves privacy [40,41]. Irrespective of the rationale for ignoring privacy, it demonstrates the impact of the lack of systematic consideration of privacy during software design.
It is also common practice to align users' privacy needs and regulatory compliance through privacy policies describing the data collection, processing and distribution activities carried out by the software. The specification of such policies is largely business-driven and carried out by legal experts with little or no insight from the software engineer. This makes it difficult to link policy statements with demonstrable evidence of their satisfaction in software design [4]. A further undesirable observation is that these statements have become more complicated over the years and are now rarely read by users. For example, Google's privacy policy statement has grown from 600 to 4000 words over the past 20 years, and its adjustments in response to GDPR have seen a 30% increase in the number of words (see https://policies.google.com/privacy/archive?hl=en-US). The use of policy-based solutions in this way is symptomatic of masking deeply rooted privacy design problems in software with superficial privacy policy statements as solutions.
Consequently, the privacy by design paradigm has become a vital part of the dialog on what counts as good and effective privacy, and is often used as a slogan for systems built with privacy considered from the onset. Its opposite is responding to privacy harm after it has occurred. This initiative was passed by the Privacy Commissioners and Data Protection Authorities as an essential component of fundamental privacy protection. The objective was to help companies protect privacy by embedding a set of seven foundational principles into the design specification of technologies, business practices and physical infrastructures [14]. One of the principles even argues for the need to support privacy promises with systematic and demonstrable evidence.
But the challenge with privacy by design is the lack of any underpinning or engineering understanding of those principles during software design [15,25]. This research is therefore motivated by the increased need for demonstrable evidence of privacy preservation in software design. The expectation is that such evidence will help software designers make appropriate design choices. Specifically, this paper investigates a framework for demonstrating that interactions between objects in a software design enhance or abate the ability to realize a privacy objective. Such evidence can then be used by designers to distinguish between a good and a bad design from a privacy standpoint. This research is timely. Given the multitude of design patterns used in designing software, it remains unclear how the interaction between objects impacts the ability of the software to preserve privacy; in particular, service-oriented software used for "wiring up" everyday objects to be part of the Internet of Things, social networks, mobile systems and e-commerce.
I. Omoronyia, U. Etuk & P. Inglis
The central thesis of this research is that a software design that preserves privacy uses the appropriate disclosure protocol to enable interaction between objects. Such disclosure protocol maximizes the satisfaction of a privacy objective. The framework for investigating this assumption in a software design is shown in Fig. 1.
The framework takes as input a software design representing the behavior of a system. This input is used to generate an intermediate interaction model. The model identifies interacting objects and provides insights on the interaction history that the design will generate when implemented. In Sec. 4.1, we demonstrate how this history differs from the perspective of each object, and how it evolves based on the dynamic assignment of roles as information subject, sender or recipient. Differing interaction history implies that the (un)awareness of interacting parties (and therefore the extent of privacy realized) also differs. The interaction model represents this distinct (un)awareness in the memory of objects in Sec. 4.2. We demonstrate how a spectrum of (un)awareness ranging from being fully unaware to fully aware can be measured. When actions specified in a design are invoked, there is a transformation in the memory of associated objects along this spectrum. From an analytical viewpoint, these actions are synonymous with information-flow transactions defined by an object either requesting, consenting, sending or notifying another object about information. In Sec. 5, we outline how these memory transformations occur. Typically, information disclosure will consist of an ordered set of one or more transactions that define a disclosure protocol. Section 6 outlines precedence rules for transactions in a disclosure protocol, and also the consistency of object memory after transformation.
We implemented PriSAT as a tool for reasoning about privacy in software design. The tool generates an intermediate interaction model representing a software design and facilitates the mapping of disclosure protocols to interactions between objects specified in the design. Given a privacy objective and assigned disclosure protocols, PriSAT generates the expected privacy utility that each object will derive from the design. In this manner, the designer can compare different design alternatives to make an informed design choice.
Our evaluation consists of two case studies. In Sec. 7.2, a followership network design on Twitter is reverse engineered. The case study focuses on the scenario where users interact with their followers privately. Our choice is based on the view that Twitter represents an example of a system where the traditional assumption that the impact of privacy can be localized to avoid contagion on other users' privacy becomes difficult to hold [58]. This is mainly due to the temporal and spatial distributions of objects, as well as the autonomous nature of associated users. On Twitter, it is easy for information once disclosed to reach unintended recipients, and users may be unsure if an information-flow path will ultimately lead to privacy violation. The study showed that the design's ability to realize a privacy objective on Twitter varied as information flowed in the network. This suggests that using the same disclosure protocol to foster interaction between any objects introduces asymmetry in the level of privacy each gets on the network. It was also observed that there is a maximum limit on the level of privacy that a design can offer. We demonstrated how a lightweight refactoring of the design can improve this limit.
The second case study in Sec. 7.3 presents a scenario analysis involving a family of interaction patterns for designing service-oriented software systems. These range over patterns where interactions between the information producer and consumer are facilitated or mediated via a broker using a push or pull mechanism, or a hybrid of both, with or without binding. Popular frameworks and standard specifications such as Message Queuing Telemetry Transport (MQTT) [18] for facilitating interaction between objects in the Internet of Things, RESTful APIs [31] for e-commerce applications and messaging systems [33] are designed using one or more of these patterns. Using scenario analysis, our aim is to highlight the limits of the privacy-preserving capabilities of these patterns. This study demonstrates how design choices in a service-oriented software system inhibit or enhance the ability of an information producer, broker and/or consumer to realize a privacy objective. We show how the choice of a design pattern can be determined based on a balance of functional and privacy expectations. On the whole, this research will benefit academia, industry and policy makers with an interest in addressing privacy by design challenges of real-world software systems.

Background
Software design and privacy are the core concepts of this research. There are numerous views on what constitutes a good software design [25,48,56,12]. Broadly, software design comprises the specification of how a system is architected, how it functions, how it communicates and how the architecture, function and communication affect users. A design is judged to be good if it leads to software that is correct (does what it should), robust (tolerant of misuse), flexible (adaptable to shifting requirements), reusable, efficient, reliable and usable [16,22]. Ultimately, a good design depends on the software engineer making the right design decisions. This is important since every design decision reflects an intent on how the software is to function or be used, as well as users' expectations as to how the software is compatible with contextual norms.
Likewise, there are diverse views of privacy in software. But we focus on a viewpoint that is based on (un)awareness, as it relates to communication properties of objects in a design and how end users interact via the objects. Such properties imbue object states in the design, which directly impacts the ability of the software to preserve privacy. This is a consequence of the information disclosed to users when an object enters or exits a state. Broadly, (un)awareness is central to the manner by which we foster interactions in social settings. We coordinate and regulate our disclosure behavior based on our awareness about those we interact with and what information we want them to be aware or unaware of. In a software-mediated setting, if a user was previously unaware of a fact, then a subsequent disclosure action may evolve the memory of objects representing the user, making the user aware of that fact. It is assumed here that the (un)awareness that is modeled in an object is perceived in the same way by its user. Hence, the two factors that can influence a user's disclosure behavior are: (1) the user's current (un)awareness; and (2) the desired (un)awareness that other users should have after disclosure. Privacy is thus the ability of a user to regulate the evolution of its (un)awareness, and that of others, during information disclosure. The privacy objective during such regulation may be to increase awareness (information visibility) or unawareness (information secrecy) for one or more users.
Hence, verifying that a software design preserves privacy centers on how objects interact in the design and whether such interaction satisfies a privacy objective. We are of the view that a systematic approach to considering the disclosure protocols used for interaction between objects can (1) provide insights on the extent to which a privacy objective is satisfied and (2) be used to select a better design from a set of alternatives. There are many such disclosure protocols. For example, in order to satisfy a privacy objective, it may be essential that when information is requested by a recipient, the sender is required to seek consent from the subject before the information is sent to the recipient. For other cases, it may be that a granted consent is acknowledged by the sender, and the subject is notified when information is sent to the recipient [38]. Indeed, there are numerous ways to combine information request, consent, send and notice actions to define a disclosure protocol, with each combination likely to generate a different level of privacy satisfaction since it results in a unique (un)awareness transformation in the memory of associated objects.
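To make the first example concrete, a disclosure protocol can be treated as an ordered list of transactions and checked against the stated requirement. The sketch below is illustrative only: the `Transaction` type, the action names and the `consent_precedes_send` check are assumptions introduced here, not the framework's own definitions or the precedence rules of Sec. 6.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names for the four actions mentioned in the text.
ACTIONS = ("request", "consent", "send", "notice")


@dataclass(frozen=True)
class Transaction:
    action: str  # one of ACTIONS
    source: str  # role initiating the transaction: "su", "s" or "r"
    target: str  # role at the receiving end of the transaction

    def __post_init__(self) -> None:
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")


def consent_precedes_send(protocol: List[Transaction]) -> bool:
    """Return True if every send to the recipient follows a consent by the subject."""
    consented = False
    for t in protocol:
        if t.action == "consent" and t.source == "su":
            consented = True
        elif t.action == "send" and t.target == "r" and not consented:
            return False
    return True


# Recipient requests, subject consents, sender sends, subject is notified.
with_consent = [
    Transaction("request", "r", "s"),
    Transaction("consent", "su", "s"),
    Transaction("send", "s", "r"),
    Transaction("notice", "s", "su"),
]
# Same request, but the sender discloses without seeking consent.
without_consent = [
    Transaction("request", "r", "s"),
    Transaction("send", "s", "r"),
]
```

Under these assumptions, `consent_precedes_send(with_consent)` holds while `consent_precedes_send(without_consent)` does not, which mirrors the distinction the paragraph draws between protocols that do and do not seek consent before sending.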

Related Works
Hoepman [29] defined a design strategy as an approach to achieve a certain design goal. This favors certain structural organizations of the design or schemes over others, and contains properties that enable its distinction from other approaches that achieve the same goal. Hoepman also extended this view to define a privacy design strategy as a design strategy that achieves (some level of) privacy protection as its goal. We leverage this view to investigate how the structure of a design and the limitations on disclosure protocols can vary across software designs, as well as to provide insights into the effectiveness of an alternative.
This research contributes to achieving privacy by design in software. While this paradigm has gained traction in policy circles, its actual integration into the design of software remains an open research question [24,5]. A review by Bernsmed [9] summarized different approaches to operationalize privacy by design into existing software engineering processes. These include the Information and Privacy Commissioner of Ontario industry report on operationalizing privacy by design as a guide to implementing strong privacy practices [15] and the OASIS Privacy Management Reference Model and Methodology (PMRM) [47] for software engineering teams to analyze the system from a privacy perspective and to help them identify necessary technical and process mechanisms that should be implemented to support privacy. Similarly, Microsoft [46] and NOKIA [45] have, respectively, presented their engineering methodologies to bridge the gap between privacy laws and principles and techniques to foster the realization of privacy by design. One novel contribution is in the area of privacy impact assessments [54]; specifically, to enable the designer to carry out an assessment of designed software platforms to determine the level of privacy risk that users are exposed to, and any associated mitigation measures.
Developer-centered security is an emerging research area focused on how to get developers to build more secure systems from the start [44,57]. While the traditional focus of cybersecurity research has been on developing new technologies and systems, in recent years this has been shifting to understanding software engineers and how they are supported in creating secure products [53,23]. One central theme is to explore and improve the tool support and techniques that are available to software builders. While process-centric tools exist for understanding the relationship between the software engineering process and privacy [1,35], there is very little research on how to provide the developer with insights on the privacy-preserving capability of the product itself. This research contributes to this endeavor by investigating an analysis technique that enables developers to understand how their design approach to achieving information disclosure in software impacts user privacy.
Normative research has been applied in reasoning about privacy. Breaux et al. [11] used description logic to analyze privacy in data-flow specification with multi-party expectations. Similarly, Barth et al. [7] proposed the use of a temporal logic framework for expressing and reasoning about normative protocols in privacy legislation. Calikli et al. [13] used inductive logic programming to learn privacy norms in social software. Furthermore, Aucher et al. [6] applied modal and deontic logic in reasoning about obligations, permissions, knowledge and information exchange in the context of privacy policy compliance. A similar technique is applied by He and Antón [26] in the modeling of privacy requirements in role engineering. Our reasoning mechanism complements these existing approaches. We applied an awareness model based on possible world semantics to understand the nature of interaction between interacting objects in a software design and the privacy implications.
Finally, this work relates to access control in computing, which typically involves the need to divulge information to authorized objects only [8]. An object here is a generic term that refers to an active agent capable of initiating or performing a computation of some sort. Access modes are broadly categorized into read, write and execute privileges granted to an object. Our technique provides a means to investigate the underlying engineering actions that lead to granted privileges and associated privacy risk; for example, the manner in which information is requested, consent is sought, information is sent and the user is notified. Indeed, our proposed approach provides a mechanism to determine the extent to which the defined access control policies help preserve privacy.

Modeling Awareness, Unawareness and Privacy
The privacy threat envisaged relates to a networked setting where privacy is dependent not only on an individual's action but also on those of other users. If this interdependence is ignored during software design, it can lead to end-user perception of loss of control over their personal information after disclosure. Our threat model therefore builds on the Communication Privacy Management Theory, which defines information disclosure management in terms of privacy ownership, control and turbulence [43]. This theory is based on the principle that users believe they own and have a right to control their private information, and such control is achieved using personal privacy rules. When others are given access to a user's private information, they become co-owners of that information. Such co-owners need to negotiate mutually agreeable privacy rules about telling others. Privacy boundary turbulence occurs when co-owners do not effectively negotiate and follow mutually held privacy rules, subsequently providing the perception of control loss.
The focus of this paper is to mitigate the threat of privacy turbulence resulting from inappropriate information disclosure in a software design. Hence, an interaction model for analyzing software design is necessary to reveal the relationship between a privacy objective, the disclosure protocols used to enable interaction between objects and the resulting privacy utility for end users. In this section, we discuss the components of this model.

Behavior modeling with role-based interaction history
General knowledge modeling involves creating a computer interpretable model of knowledge [34]. Adopting a similar approach, our aim is to create a model that represents the behavior exhibited when users disclose or receive information via their surrogate objects. This is a role-based interaction described by the information flow $t_i$, involving the disclosure of the proposition $\varphi$ about a subject ($su$) from a sender ($s$) to a recipient ($r$), where
• $U_i$ is the set of objects associated with $t_i$;
• $R = \{su, s, r\}$ are the roles that an object in $U_i$ can assume;
• $\mathit{roles} : U_i \to 2^R$ is a function mapping each object $u_j$ to a set of roles, $\mathit{roles}(u_j) \subseteq R$.
When the sender discloses $\varphi$ about itself to the recipient, then $s$ and $su$ refer to the same object. Alternatively, before $\varphi$ is disclosed by the sender to a recipient, it is either generated by the sender or the sender is granted custody of it by the subject. In this case, $s$, $su$ and $r$ refer to different objects. Finally, an object can also assume the roles of subject and recipient. This is when the sender sends $\varphi$ about a subject to that subject. Here, $su$ and $r$ refer to the same object. The first and last scenarios highlight the duality of roles during information flow. The sender cannot send $\varphi$ to itself. This makes it impossible for $s$ and $r$ to reference the same object.
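The role constraints above can be checked mechanically. The following is a minimal sketch, assuming a single information flow is given as a dictionary from object names to sets of roles; the function name `validate_roles` and the requirement that every role is assigned at least once are assumptions made for illustration.

```python
from typing import Dict, Set

ROLES = {"su", "s", "r"}  # subject, sender, recipient


def validate_roles(roles: Dict[str, Set[str]]) -> None:
    """Raise ValueError if a role assignment for one flow breaks the stated constraints."""
    assigned = set()
    for obj, obj_roles in roles.items():
        if not obj_roles <= ROLES:
            raise ValueError(f"{obj} has unknown roles: {obj_roles - ROLES}")
        # The sender cannot send to itself, so s and r never share an object.
        if {"s", "r"} <= obj_roles:
            raise ValueError(f"{obj} cannot be both sender and recipient")
        assigned |= obj_roles
    # Assumption of this sketch: a flow needs a subject, a sender and a recipient.
    if assigned != ROLES:
        raise ValueError(f"unassigned roles: {ROLES - assigned}")


# The three scenarios described in the text.
self_disclosure = {"u1": {"su", "s"}, "u2": {"r"}}      # s and su coincide
custody = {"u1": {"su"}, "u2": {"s"}, "u3": {"r"}}      # all roles distinct
back_to_subject = {"u1": {"su", "r"}, "u2": {"s"}}      # su and r coincide

for flow in (self_disclosure, custody, back_to_subject):
    validate_roles(flow)  # none of these raises
```

An assignment such as `{"u1": {"s", "r"}, "u2": {"su"}}` is rejected, since it would have the sender sending to itself.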
Furthermore, when $\varphi$ is disclosed by multiple objects during interaction, an information-flow path is formed. Formally, a path is a sequence of information flows represented by $t_{0<i\le n}$, where $n$ is the path length and $U_i \subseteq U$. When $s \in \mathit{roles}(u_j)$ at $t_1$, then $u_j$ is the path source. Whereas, if $u_j$ is the recipient of $\varphi$ at $t_1$, then the information flow at $t_2$ depends on $u_j$ switching its role from recipient to sender. This switching of roles propagates from $t_1$ up to $t_n$. A backtrack occurs along a path when the sender at $t_i$ discloses $\varphi$ to a sender at $t_{k<i}$; for instance, an interaction involving $u_1$ and $u_2$ where $u_1$ sends $\varphi$ to $u_2$, then $u_2$ sends $\varphi$ back to $u_1$. A path terminates at $t_n$ when the intended destination of $\varphi$ is reached without backtracking. Alternatively, a path terminates at $t_n$ when a backtrack occurs. At this point, a cycle is formed and terminated to prevent paths of infinite length over $\varphi$.
Disclosed information may be attributed to one or more subjects. For the latter, attribution can be predefined statically, with all the subjects in $\varphi$ determined before disclosure at $t_1$. Alternatively, attribution is determined dynamically as the information flows along a path. In this case, an object that assumes the role of a sender at $t_{n-1}$ may become a subject at $t_n$. The role of a subject is permanent once assigned and cannot be switched over the history of a path. Whereas, since a path terminates once a backtrack is observed, an object can switch the roles of sender and recipient no more than once and twice, respectively, along a path. Finally, information may flow concurrently along any two paths $t^1_{0<i\le n}$ and $t^2_{0<i\le m}$. The result is an interaction network consisting of all information-flow paths.
Assume an interaction between the set of objects $u_1$-$u_5$ over $\varphi$ about $u_1$. First, $u_1$ discloses $\varphi$ to $u_2$ and $u_3$. Subsequently, $u_2$ and $u_3$ disclose $\varphi$ to $u_4$ and $u_5$, respectively. Finally, $u_4$ sends $\varphi$ back to $u_1$ and $u_2$ and also discloses $\varphi$ to $u_5$. The graph diagram, interaction network (represented as an adjacency matrix), resulting information-flow paths and role-based interaction histories for static and dynamic subject attributions are illustrated in Fig. 2. In extracting paths from the interaction network, the order in which the disclosure actions are executed is inconsequential and the longest possible path that satisfies the backtracking constraint is assumed. It can be observed that each object in the network generates a distinct interaction history based on its roles during each information flow. An alternative information-flow setting may result in a different interaction history. Our objective is to understand how these variations in disclosure behavior can be used to articulate the level of (un)awareness that objects attain about disclosed information. Subsequently, we explore how such (un)awareness can be regulated in a software design to preserve privacy.
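The five-object example can be sketched in code as an adjacency matrix plus a path enumeration that stops at the first backtrack. This is an illustrative reading of the termination rules in the text, not a reproduction of Fig. 2; the helper names `adjacency_matrix` and `paths_from` are assumptions.

```python
from typing import Dict, List, Tuple

OBJECTS = ["u1", "u2", "u3", "u4", "u5"]
# (sender, recipient) pairs taken from the example in the text.
FLOWS: List[Tuple[str, str]] = [
    ("u1", "u2"), ("u1", "u3"),
    ("u2", "u4"), ("u3", "u5"),
    ("u4", "u1"), ("u4", "u2"), ("u4", "u5"),
]


def adjacency_matrix(objects: List[str], flows: List[Tuple[str, str]]) -> List[List[int]]:
    """Row = sender, column = recipient; 1 marks a disclosure."""
    index = {u: k for k, u in enumerate(objects)}
    matrix = [[0] * len(objects) for _ in objects]
    for sender, recipient in flows:
        matrix[index[sender]][index[recipient]] = 1
    return matrix


def paths_from(source: str, flows: List[Tuple[str, str]]) -> List[List[str]]:
    """Enumerate maximal paths from source, ending each path at its first backtrack."""
    out: Dict[str, List[str]] = {}
    for sender, recipient in flows:
        out.setdefault(sender, []).append(recipient)
    results: List[List[str]] = []

    def extend(path: List[str]) -> None:
        recipients = out.get(path[-1], [])
        if not recipients:               # destination reached without backtracking
            results.append(path)
            return
        for recipient in recipients:
            if recipient in path:        # recipient was a sender earlier: backtrack
                results.append(path + [recipient])
            else:
                extend(path + [recipient])

    extend([source])
    return results


for path in paths_from("u1", FLOWS):
    print(" -> ".join(path))
```

Each printed path lists the objects in the order the information reaches them. Treating every object already on the path as an earlier sender is what makes the enumeration terminate on cycles.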

The (un)awareness of objects
Early works on reasoning about awareness in interactive settings assume limited rationality of objects [17,20,27]. This implies that when information is disclosed, the awareness obtained by objects would vary within a spectrum from being fully aware of the information to fully unaware. In this subsection, we leverage this assumption to set the foundations for reasoning about the relationship between software design and the awareness generated by the interaction properties of objects in that design. To achieve this, the memory of an object based on its interaction with other objects about $\varphi$ is represented using possible world semantics for describing alternative worlds (modes) and accessibility relations between such worlds [49,28]. When $\varphi$ is disclosed in an information flow, then the subject, sender and recipient who were previously unaware may become aware of $\varphi$ and its related propositions. Thus, the (un)awareness contained in the memory of an object about $\varphi$ can be modeled using $M(\varphi) = (W, p, R, I, w_c)$, where
• $W$ is a set of possible worlds, each considering a unique viewpoint on $\varphi$;
• $p$ is referred to as the principal and represents the object whose memory contains the (un)awareness of $\varphi$;
• $R \in 2^{\{su, s, r\}}$, $\forall su \in SU$, is a set of reference objects whose (un)awareness about $\varphi$ may be considered by the principal. Each subject, sender or recipient may assume the role of a principal and/or a reference object;
• $I \subseteq W \times W$ is the accessibility relation on $W$. Given two worlds $w_1, w_2 \in W$, the principal finds $\varphi$ in $w_1$ indistinguishable from $\varphi$ in $w_2$;
• $w_c$ is the current world of $p$.
To distinguish between an object being aware and one that is unaware, we define an awareness instance in the memory of an object to consist of a principal and the optional reference objects, represented by the modal operator $A$. Similarly, an unawareness instance contains a principal with optional reference objects, represented by the operator $\neg A$. A composite instance contains more than one operator. Otherwise, the instance is atomic. The utilization of this operator precludes the valuation of the truth of a proposition as expected in reasoning about knowledge. In the following, we generate four (un)awareness classes in the memory of $p$ based on $M$.
[A1] The first class represents a principal being aware or unaware of the atomic proposition $f$ (i.e. $\varphi = f$). Instances of this class only contain the principal with no reference object, rendering $R$ an empty set. Determining whether $p$ is aware or unaware of $f$ can be modeled by considering $W$ as a set of two possible worlds. The first world is $w_1$, where $p$ does not consider $f$ possible, written as $\neg f$. The second world, $w_2$, is the world where $p$ considers $f$ possible. The accessibility relations between these two worlds are illustrated in Fig. 3. From $w_1$, both $w_1$ and $w_2$ are accessible, whereas from $w_2$ only $w_2$ is accessible. The unshaded world in Fig. 3 represents the current world of $p$. When $p$'s current world is $w_1$, then $p$ is unaware of $f$, represented as $\neg A_p f$. This is because, in this world, $\neg f$ in $w_1$ is indistinguishable from $f$ in $w_2$. Conversely, when $p$'s current world is $w_2$, then $p$ is aware of $f$, represented as $A_p f$. This is because $f$ is uniquely distinguishable from $w_2$, since $p$ no longer considers $w_1$ accessible. In summary, let $M_p$ represent the memory of principal $p$. When reasoning about $p$'s (un)awareness of $f$, then $\neg A_p f$ and $A_p f$ exist in the memory of $p$, thus $A1_p = \{\neg A_p f, A_p f\}$ and $\forall a \in A1_p,\ a \in M_p$.

[A2] The second class enables the principal to consider whether a reference object is aware or unaware of $f$. In this case, $\varphi$ is a composite proposition. Hence, A2 is a class of (un)awareness that can be used to reason about instances in A1. Instances of A2 can only contain one reference object. Thus, $R$ may contain either the subject, sender or recipient. We refer to this potential reference object as $r'$. Determining whether $p$ is aware or unaware that $r'$ is aware or unaware of $f$ can be realized by first generating the (un)awareness of $r'$ about $f$ as $A1_{r'} = \{\neg A_{r'} f, A_{r'} f\}$. The next step is to consider every instance in $A1_{r'}$ as $\varphi$ and apply similar heuristics to those used in realizing $p$'s (un)awareness of $f$, as highlighted in A1, for each $\varphi$. The result is a set of possible worlds, their accessibility relations and $p$'s (un)awareness given its current world and $\varphi$, as shown in Fig. 4. In summary, when considering whether $p$ is (un)aware that a reference object $r'$ is (un)aware of $f$, then $\neg A_p A_{r'} f$, $A_p A_{r'} f$, $\neg A_p \neg A_{r'} f$ and $A_p \neg A_{r'} f$ exist in the memory of $p$, thus $A2_p = \{\neg A_p A_{r'} f, A_p A_{r'} f, \neg A_p \neg A_{r'} f, A_p \neg A_{r'} f\}$ and $\forall a \in A2_p,\ a \in M_p$.

[A3] The third class enables the principal to consider the (un)awareness that a reference object may have about the principal. Hence, A3 also involves a composite proposition and represents the class of (un)awareness that can be used to reason about instances in A2. Instances of A3 contain one reference object and a principal. The reference can either be the subject, sender or recipient. Again, we refer to the potential first reference object as $r'$. Determining whether $p$ is aware or unaware that $r'$ is aware or unaware that $p$ is aware or unaware of $f$ is realized by first generating the (un)awareness of $r'$ about $p$ being aware or unaware of $f$. Thus $A2_{r'} = \{\neg A_{r'} A_p f, A_{r'} A_p f, \neg A_{r'} \neg A_p f, A_{r'} \neg A_p f\}$. The next step is to consider every instance in $A2_{r'}$ as $\varphi$ and apply similar heuristics to those used in realizing $p$'s (un)awareness of $f$, as highlighted in A1, for each $\varphi$. The result is a set of possible worlds, their accessibility relations and $p$'s (un)awareness given its current world and $\varphi$, as shown in Fig. 5. In summary, when considering whether $p$ is (un)aware that a reference object $r'$ is (un)aware that $p$ is (un)aware of $f$, then the memory of $p$ is defined by adding every instance of $A3_p$ to $M_p$. Thus $\forall a \in A3_p,\ a \in M_p$.

[A4] This final class enables the principal to consider the (un)awareness that a reference object has about other references. Hence, A4 is also a class of (un)awareness that can be used to reason about instances in A2, where $\varphi$ is a composite proposition. Instances of this class contain two reference objects; these are any ordered pair of subject, sender and recipient. We refer to this ordered pair as $r_1$ and $r_2$. Determining whether $p$ is aware or unaware that $r_1$ is aware or unaware that $r_2$ is aware or unaware of $f$ can be realized by first generating the (un)awareness of $r_1$ about $r_2$ being aware or unaware of $f$ and vice versa. Thus $A2_{r_1} = \{\neg A_{r_1} A_{r_2} f, A_{r_1} A_{r_2} f, \neg A_{r_1} \neg A_{r_2} f, A_{r_1} \neg A_{r_2} f\}$ and $A2_{r_2} = \{\neg A_{r_2} A_{r_1} f, A_{r_2} A_{r_1} f, \neg A_{r_2} \neg A_{r_1} f, A_{r_2} \neg A_{r_1} f\}$. The next step is to consider every instance in $A2_{r_1}$ and $A2_{r_2}$ as $\varphi$ and apply similar heuristics to those used in realizing $p$'s (un)awareness of $f$, as highlighted in A1, for each $\varphi$. The result is a set of possible worlds, their accessibility relations and $p$'s (un)awareness given its current world and $\varphi$, as shown in Fig. 6. In summary, when considering whether $p$ is (un)aware that a reference object $r_1$ is (un)aware that another reference object $r_2$ is (un)aware of $f$, then the memory of $p$ is defined by adding every instance of $A4_p$ to $M_p$. Thus $\forall a \in A4_p,\ a \in M_p$.
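The two-world model described for class A1 can be sketched directly. The code below assumes that a principal is aware of f exactly when f holds in every world accessible from its current world; this reading, and the names `WORLDS`, `ACCESS` and `is_aware`, are assumptions used for illustration rather than the paper's formal semantics.

```python
from typing import Dict, Set, Tuple

# Whether the principal considers f possible in each world (w1: not f, w2: f).
WORLDS: Dict[str, bool] = {"w1": False, "w2": True}

# Accessibility relation from the text: w1 sees w1 and w2, w2 sees only w2.
ACCESS: Set[Tuple[str, str]] = {("w1", "w1"), ("w1", "w2"), ("w2", "w2")}


def accessible(world: str) -> Set[str]:
    """Worlds the principal cannot distinguish from the given world."""
    return {target for (origin, target) in ACCESS if origin == world}


def is_aware(current: str) -> bool:
    """Assumed reading: aware of f when f holds in all accessible worlds."""
    return all(WORLDS[w] for w in accessible(current))


def a1_instance(principal: str, current: str) -> str:
    """Render the atomic instance that holds for the principal's current world."""
    operator = "A" if is_aware(current) else "¬A"
    return f"{operator}_{principal} f"


print(a1_instance("p", "w1"))  # principal cannot tell w1 from w2
print(a1_instance("p", "w2"))  # only w2 remains accessible
```

With current world w1 the principal cannot separate the world without f from the world with f, so the unawareness instance is produced; with current world w2 only w2 is accessible and the awareness instance is produced.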

Object memory
The subject, sender or recipient can assume unique or dual roles as the principal and/ or reference object in an instance of a class. When the roles are unique, then two consecutive (un)awareness operators cannot refer to the same object in that instance. For this case, there are 42 (un)awareness instances from A1, A2, A3 and A4, respectively, in the memory of a principal as illustrated in Table 1. These instances capture all the (un)awareness that a subject, sender or recipient will acquire about disclosed information and the related (un)awareness propositions. For each class, there is a set of (un)awareness instances with the same principal and reference object(s). For example, labels 3, 4, 5 and 6 in column M su of Table 1 constitute the set of all (un)awareness instances of type A2 in the memory of the subject with the sender as a reference object. Whereas, labels 7, 8, 9 and 10 are the set of all (un) awareness instances of type A2 in the memory of the subject with the recipient as a reference object. This uniqueness of roles in an information°ow represents the (un) awareness an object has about the (un)awareness of other parties during disclosure. When roles are not unique, then a principal can also be a reference object. In this case, two consecutive (un)awareness operators may refer to the same object. This duality of roles during information°ow depicts self-awareness where an object is (un)aware of its own (un)awareness. This phenomenon is normally assumed in standard models of belief and knowledge as positive and negative introspections [10,55]. An object is positively introspective if it is aware that it is aware of whenever it is aware of . Similarly, an object is negatively introspective if it is aware that it is unaware of whenever it is unaware of . This observation does not a®ect the instances of A1 since it only contains a principal. Whereas, for instances of A2 and A3, the principal and the reference are the same object. 
Hence, instance labels 3, 7 and 4, 8 in A2, respectively, depict the ability/inability of an object to positively introspect given that the object becomes (un)aware that it is aware of f. Also, labels 5, 9 and 6, 10, respectively, represent the ability/inability of an object to negatively introspect given that the object is (un)aware that it is unaware of f. Similar views of positive and negative introspection are observed for labels 11-26 in A3, where an object is (un)aware that it is (un)aware that it is (un)aware of f. For A4, although the two reference objects remain a unique pair, the principal and one element of the pair can refer to the same object. This results in positive and negative introspection for labels 27-42, where an object becomes (un)aware that it is (un)aware that another object is (un)aware of f.
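The count of 42 instances under unique roles can be checked with a short enumeration. The sketch below is illustrative only: the role abbreviations, the tuple encoding and the ordering of instances are assumptions made here and are not meant to reproduce the labels of Table 1. It assumes that A3 instances return to the principal (the principal is (un)aware that a reference object is (un)aware that the principal is (un)aware of f) and that A4 instances range over ordered pairs of the two other objects.

```python
from itertools import product

ROLES = ("su", "s", "r")        # subject, sender, recipient (abbreviations assumed)
STATES = ("aware", "unaware")


def instances(principal):
    """Enumerate (class, reference objects, operator chain) for one principal.

    Roles are assumed unique, so consecutive operators never refer to
    the same object.
    """
    others = [o for o in ROLES if o != principal]
    out = []
    # A1: the principal is (un)aware of f.
    out += [("A1", (), ops) for ops in product(STATES, repeat=1)]
    # A2: the principal is (un)aware that a reference object is (un)aware of f.
    out += [("A2", (ref,), ops)
            for ref in others for ops in product(STATES, repeat=2)]
    # A3: the principal is (un)aware that a reference object is (un)aware
    # that the principal is (un)aware of f.
    out += [("A3", (ref,), ops)
            for ref in others for ops in product(STATES, repeat=3)]
    # A4: the principal is (un)aware that r1 is (un)aware that r2 is
    # (un)aware of f, over ordered pairs of the two other objects.
    out += [("A4", (r1, r2), ops)
            for r1 in others for r2 in others if r1 != r2
            for ops in product(STATES, repeat=3)]
    return out


memory_su = instances("su")
per_class = {c: sum(1 for cls, _, _ in memory_su if cls == c)
             for c in ("A1", "A2", "A3", "A4")}
print(len(memory_su), per_class)
# 42 {'A1': 2, 'A2': 8, 'A3': 16, 'A4': 16}
```

The per-class sizes agree with the label ranges quoted in the text: 8 instances for A2 (labels 3-10), 16 for A3 (labels 11-26) and 16 for A4 (labels 27-42).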

Awareness differential
At any moment in time, an (un)awareness instance may or may not be tenable in the memory of an object. An instance becomes untenable when a principal no longer considers it reasonable as a result of a disclosure action, and vice versa. For example, it is tenable for the recipient to consider that it is aware of f after it receives f from the sender. Conversely, it is untenable for a recipient to consider that it is unaware of f after it has received f from the sender. The transition of an awareness instance from a tenable to an untenable state, or vice versa, is based on the disclosure protocol executed during information flow, as detailed in Sec. 6. Categorizing each (un)awareness instance in the memory of a principal in this manner enables the distinction between different levels of (un)awareness resulting from an information flow.

Definition 1. A principal p is fully unaware of a proposition $\varphi$ if, for every tenable instance a with respect to $\varphi$ in the memory of p, the actual world of p is $w_1$.
Otherwise, information flow generates full or partial awareness of $\varphi$, whose severity is determined by the number of tenable instances in $w_1$ or $w_2$. This measure depicts the level of doubt that the principal may have about disclosed information, and is computed as the awareness differential $\mathrm{ADiff}(p, A_i)$ from three quantities: the number of tenable instances of class $A_i$ with the same reference object(s) in $M_p$; the number of instances of class $A_i$ with the same reference object(s) in $M_p$; and the number of tenable unawareness instances of class $A_i$ with the same reference object(s) in $M_p$, which acts as a discriminator between full awareness and full unawareness.
Assume for the class A2 that it is only tenable for the principal to consider that it is aware that $r_1$ is aware of f. Then the awareness differential for this class in the memory of the principal is zero, since this is the only active instance from A2 relating to $r_1$. If, in addition, it is tenable for the principal to consider that it is aware that $r_1$ is unaware of f, then the awareness differential becomes 0.33. Alternatively, when it is only tenable that the principal considers that it is unaware that $r_1$ is aware of f and that it is unaware that $r_1$ is unaware of f, then the principal is fully unaware with respect to A2, since for both tenable instances the actual world of the principal is $w_1$. In this case, the awareness differential of the principal with respect to A2 is 2.3. Overall, the awareness differential tends towards zero as the principal becomes fully aware.
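The three worked values above can be reproduced with a small calculation. The closed form used in this sketch is an assumption, not the paper's stated definition: it is one expression over the three quantities named in the text (tenable instances, total instances and tenable unawareness instances for a class and reference object) that happens to yield 0, 0.33 and approximately 2.3 for the A2 example with four instances relating to $r_1$. The function and parameter names are hypothetical.

```python
def awareness_differential(tenable, total, tenable_unaware):
    """Assumed form: (tenable - 1) / (total - 1) + tenable_unaware.

    tenable         -- tenable instances of the class for the reference object(s)
    total           -- all instances of the class for the reference object(s)
    tenable_unaware -- tenable instances in which the principal itself is unaware
    """
    if tenable == 0 or total == 1 or tenable > total:
        raise ValueError("expects 1 <= tenable <= total and total >= 2")
    return (tenable - 1) / (total - 1) + tenable_unaware


# A2 with reference object r1 has four instances (labels 3-6 for the subject).
only_aware_aware = awareness_differential(tenable=1, total=4, tenable_unaware=0)
plus_aware_unaware = awareness_differential(tenable=2, total=4, tenable_unaware=0)
fully_unaware = awareness_differential(tenable=2, total=4, tenable_unaware=2)

print(round(only_aware_aware, 2), round(plus_aware_unaware, 2), round(fully_unaware, 2))
# 0.0 0.33 2.33
```

Under this assumed form, the third term is what separates full unawareness (values of 1 or more) from full awareness (zero), which matches the discriminator role the text assigns to the count of tenable unawareness instances.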
Definition 2. A principal p is fully aware of $\varphi$ if, for every tenable instance a with respect to $\varphi$ in the memory of p, the actual world of p is $w_2$ and $\mathrm{ADiff}(p, A_i) = 0$.

For instance, when the sender discloses information about the subject to a recipient, the subject attains full awareness when only the instances labeled 1, 3, 7, 11, 19, 27 and 35 are tenable in $M_{su}$. This implies that the subject considers it tenable that it is aware of f and aware that both the sender and recipient are aware of f. It also considers it tenable that it is aware that the sender and recipient are aware of its awareness of f. In addition, the subject considers it tenable that it is aware that the sender is aware that the recipient is aware of f. Finally, the recipient is aware that the sender is aware of f. At full awareness, the awareness differential for su over all awareness classes is therefore $\mathrm{ADiff}(su, A_i) = 0$.

Memory Transformation
The advantage of modeling (un)awareness using possible world semantics is the ability to highlight how disclosure protocols transform the memory of objects as information flows from one object to another. To illustrate this, we represent the disclosure protocol that results in (un)awareness transformations using transactions. A transaction contains processes responsible for updating the memory of an object regarding what it is (un)aware of in certain states. These transactions are labeled Request, Consent, Sent and Notice. Transactions and associated processes are executed in sequence to ensure that the memory of objects does not end up in invalid (un)awareness states. We assume at the beginning of a path, before an information flow occurs ($t_0$), that the subject, sender and recipient are fully unaware. Based on this assumption, tenable instances in the memory of the subject, sender and recipient in Table 1 are

A disclosure protocol constitutes a sequence of transactions that enables the sender, subject and recipient of an information flow to gain a level of awareness ranging from partially aware to fully aware after disclosure. The consequence of a disclosure protocol in the memory of a principal at $t_1$, after disclosure, is twofold. First, one or more previously tenable (un)awareness instances of the same class and reference object(s) become untenable. Second, for the same class and reference object(s), one or more previously untenable (un)awareness instances become tenable. This section highlights these memory transformations for Request, Consent, Sent and Notice transactions.

Request
This transaction involves a sender and a recipient, and is required to give the sender an opportunity to consider whether a recipient should gain awareness of information. This is representative of a setting where privacy is achieved by enabling an object to accept or reject another object's request to gain awareness of information. The semantics involve a recipient requesting information about a subject from a sender with a request process. Tenable awareness instances in the memory of the recipient and sender as a result of this transaction are shown in Table A.1 of Appendix A. There are two scenarios in which information can be requested by a recipient from a sender. In the first, the sender generates the requested information about the subject and is therefore the custodian of information about the subject. In the alternative scenario, the sender is not in possession of the requested information, and the request can only be fulfilled if the information is sought from the subject. For example, in an event-driven system, a component (recipient) may request event notifications about another component (subject) from the event broker (sender). At the time of the request, the broker is unaware of such notifications from the subject. Hence, the transaction can only be fulfilled when notifications are subsequently pushed by the subject component to the broker.

Consent
This transaction involves the sender and subject, and the aim is for a sender to disclose information to a recipient only when granted permission to do so by the subject. Obtaining Consent may be implicit, where initiating the operation signals the willingness to disclose; otherwise it may be explicit, where the user is clearly presented with an option to agree or disagree (opt in or opt out) to the collection or disclosure of personal information [21]. Consent can be specific or generic. The former involves a sender seeking consent, or the subject granting consent, to disclose information to a specified recipient. Hence, the sender has identified the recipients for which it is seeking or granting consent. The subject is also informed of, or able to identify, the recipients to which information will be sent at the time the Consent transaction is executed. Conversely, generic Consent involves the sender seeking, or the subject granting, consent to disclose information to unknown recipients. The semantics are as follows: the sender first seeks consent from the subject to disclose information to a recipient with seekConsent. Subsequently, the sought consent may be granted by the subject with grantConsent. Tenable awareness instances in the memory of the subject and sender as a result of this transaction for specific consent are shown in Table A.4 of Appendix A.
A seekConsent process can also be executed in one of two ways. In the first case, the sender is the custodian of information about the subject for which the disclosure consent is being sought. Hence, the sender is aware of the information to be disclosed at the point seekConsent is executed. Alternatively, the subject is the source of the information for which the consent is being sought. In this case, the subject becomes aware of the information to be disclosed at the point seekConsent is executed.
After grantConsent is executed, it becomes tenable for the subject and sender to consider that they are aware of the information for which the consent is being granted. Hence, it is impossible for the subject to grant consent to disclose information of which it is not aware. Also, it is impossible for the sender to remain unaware of the information it has been granted consent to disclose. Tenable instances of A2, A3 and A4 also depend on whether the sender generates the information for which the consent is being granted, or the subject is the source of the information. Finally, when the subject does not grant consent for the information generated by the sender, but instead responds to the sender with an acknowledgment of the sought consent, it is tenable for the subject to consider that it is aware of the information for which the consent is denied. Hence, it is impossible for the subject to acknowledge sought information of which it is not aware (cf. Sec. 5.5).

Sent
This transaction enables a recipient to gain awareness of information about a subject and involves a sender and a recipient. The semantics entail the sender disclosing information to the recipient with a send process. After the execution of this process, the recipient becomes aware of the disclosed information, with the corresponding transformation in tenable A1-A4 instances in the memory of the sender and recipient as shown in Table A.2 of Appendix A.

Notice
A Notice transaction is used by an object to notify other objects of its state of (un)awareness. There are primarily two types of notifications: prominent and discoverable [37]. A prominent notice is designed to catch the user's attention and invite the user to inspect the outcome of their privacy options and choices, whereas a discoverable notice is one that the user has to find. Regardless, they both aim to achieve privacy by ensuring transparency, which refers to openness to users in the manner personal information is manipulated [3]. This includes which information is sent, who the receiver is and where the information came from. To achieve transparency, each object is obliged to inform the other parties involved in an information flow of its (un)awareness. Realizing full transparency across the parties involved in an information flow requires six forms of Notice transactions, one for each ordered pair of sender, subject and recipient, as demonstrated in Table A.3 of Appendix A. For example, the transaction Notice:s-su is triggered by the sender to a subject, and results in transforming the memory of the subject.

Process acknowledgments
An acknowledgment provides the guarantee of information receipt. In a social setting, the mere act of a subject granting consent to disclose, or the sender disclosing the information, may provide enough guarantee that the sender (respectively, recipient) is aware of the information. But this may not hold in a socio-technical setting. For example, the information may be held up in a message buffer not yet accessed by the reference or principal object. Hence, to guarantee delivery, an object may provide an acknowledgment on receipt of the information. An acknowledgment can occur after the execution of Request-, Sent-, Notice- and Consent-related processes, as shown in Table A.5 of Appendix A.

Disclosure Protocol Suite
The transactions in a disclosure protocol depend on its inherent processes:
- There are two variants of Request. These are Request1, containing processes where the recipient requests information from the sender, followed by the acknowledgment of the request by the sender, and Request2, which only contains a process where the recipient requests information from the sender, without any acknowledgment.
- There are eight variants of Consent. The first is Consent1, where the sender seeks consent and then the subject grants consent. The second is Consent2, which involves a process where the subject grants consent without the sender initially seeking consent. The third variant is Consent3, where the sender seeks consent but it is never granted by the subject. Furthermore, Consent4 and Consent5 extend Consent1 and Consent2, respectively, with the acknowledgment of granted consent by the sender, whereas Consent6, Consent7 and Consent8 extend Consent1, Consent3 and Consent4, respectively, with the acknowledgment of sought consent by the subject.
- There are two variants of Sent. The first is Sent1, which contains processes where the sender discloses information to the recipient, and receipt is then acknowledged by the recipient. The second is Sent2, which only consists of a process where the sender discloses information to the recipient, without any acknowledgment of receipt.
- Finally, there are two variants for each pair of Notice transactions. For example, Notice:s-su1 contains processes where a sender notifies the subject of its (un)awareness and is thereafter acknowledged by the subject, whereas Notice:s-su2 only consists of a process where the sender notifies the subject of its (un)awareness, without any acknowledgment.
Altering the manner in which these different transactions are combined leads to disclosure protocols that uniquely transform the memory of an object. Ensuring that the memory of objects remains consistent after the execution of a disclosure protocol requires the assurance that only legitimate sequences of transactions are allowed in a disclosure protocol. Furthermore, each transaction may transform instances of an (un)awareness class differently. Hence, when two transactions are executed in sequence, a set of rules is necessary to determine how transformations resulting from one transaction override another.

Memory consistency
The legitimate sequences of transactions that ensure the memory of objects remains in a valid state are specified using the precedence (adjacency) matrix in Fig. 7. A Request transaction can only occur before Sent, since the recipient can only request information of which it is unaware. In contrast, a Request can occur before or after any variant of Consent or Notice. A disclosure protocol cannot contain more than one variant of Consent, and Consent can only occur before Sent. This is important since seeking and/or granting consent when the information is already disclosed invalidates the purpose of Consent. Similarly, a protocol cannot contain more than one variant of Sent, which cannot occur before a Request or Consent. Finally, any variant of Notice can occur before or after any other transaction in a disclosure protocol. The resulting matrix is a state space of disclosure protocols, with traces containing a minimum of one and a maximum of nine transactions.
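The ordering rules stated above can be expressed as a simple trace check. The following is a minimal sketch, not the precedence matrix of Fig. 7 itself: the function names are hypothetical, transaction names are taken as plain strings, and the limit of one variant per transaction family is stated in the text only for Consent and Sent and is assumed here for Request and each Notice pair (which is consistent with the stated maximum of nine transactions). The check does not verify that a name denotes a known transaction.

```python
import re


def family(transaction):
    """Strip the trailing variant number: 'Consent4' becomes 'Consent'."""
    return re.sub(r"\d+$", "", transaction)


def is_legitimate(trace):
    """Check a trace against the ordering rules described in the text."""
    families = [family(t) for t in trace]
    # Between one and nine transactions per trace.
    if len(families) not in range(1, 10):
        return False
    # At most one variant per family (assumed for Request and Notice).
    if len(set(families)) != len(families):
        return False
    # Request and Consent, when present, must precede Sent.
    if "Sent" in families:
        sent_at = families.index("Sent")
        for earlier in ("Request", "Consent"):
            if earlier in families and families.index(earlier) > sent_at:
                return False
    # Notice variants are unconstrained in position.
    return True


print(is_legitimate(("Request1", "Sent2")))              # True
print(is_legitimate(("Sent2", "Request1")))              # False
print(is_legitimate(("Consent1", "Consent2", "Sent1")))  # False
```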
When two transaction processes are executed in sequence and the resulting transformations for instances of the same class and reference object(s) differ, the process transformation that contains fewer negations and achieves a lower awareness differential overrides the other. The general rules applied to determine how process transformations override each other are described in Table A.6 of Appendix A. When any variant of Notice occurs before a process, the process overrides Notice. Conversely, when a process occurs before any variant of Notice, the process is overridden by Notice. Interacting objects can also end up with different awareness differentials depending on the disclosure protocol. This difference is indicative of the varying levels of awareness that transactions in the protocol generate in each object. Based on this view, the memories of parties in an information flow are only consistent with each other when they are all fully aware. Partial awareness, by contrast, may introduce some level of inconsistency based on varying awareness differentials, and accounts for the uncertainty that objects may have about disclosed information.

Properties of disclosure protocols
The objective of privacy-preserving information disclosure is to transform the memory of the parties involved to varying levels of awareness, ranging from fully unaware to fully aware. Assume a simple information flow characterized by a sender (Bao) and a recipient (Oz), where Bao generates and discloses information about a subject (Gor). Also, consider that the recipient is revealed at the point consent is being sought and/or granted. Then each disclosure protocol trace may enable each party to achieve more or less awareness of the information. For example, the trace $\tau$ = (Consent1, Sent1, Notice:r-su, Notice:s-su, Notice:s-r) would enable Gor to achieve more awareness compared to Bao and Oz, whereas the trace $\tau$ = (Consent2, Notice:su-r, Request, Sent1, Notice:r-s) enables Oz and Bao to achieve more awareness compared to Gor. We use three metrics to characterize the impact of disclosure protocols on the memory of objects.

Unawareness level
This is a measure of the extent to which an object remains unaware of information after disclosure. This metric, $\mathrm{UL}(p)$, is determined by comparing the awareness differential in the memory of the object at $t_0$ (i.e. full unawareness) with the differential at $t_n$ (after disclosure), taken over the classes $A_i \in \{A1, A2, A3, A4\}$. As the awareness differential in the memory of an object at $t_n$ tends towards zero, the unawareness level of the object also tends towards zero. Hence the information is more visible to the user. Otherwise, the object is less able to determine its awareness of disclosed information and/or the awareness of other objects about the disclosed information.

The cost of a disclosure protocol
This is a measure of the frequency at which instances in the memory of an object are switched from untenable to tenable, or vice versa, by a disclosure protocol relative to another. Given a trace $\tau_i$, the cost $\mathrm{Cost}(p, \tau_i)$ for an object p is determined relative to $\tau_{max}$, the disclosure protocol trace that generates the maximum number of memory transformations at $t_n$, and to the size of the interaction history. Here, $HSize_p$ is the number of entries in p's interaction history. This represents the impact of p's evolving roles as the subject, sender and/or recipient of information up to $t_n$ using $\tau_i$. Similarly, $HSize_{max}$ is the maximum number of entries that can exist in the interaction history of an object at $t_n$ given $\tau_i$. This cost is an indication of the actual overhead associated with using a disclosure protocol in designing interaction between objects in software. From an implementation viewpoint, the cost of a disclosure protocol indicates the resources and effort required to realize its design. From an end-user viewpoint, more costly disclosure protocols may also be more disruptive. Finally, the actual cost can be reduced by a discount factor that ranges between zero and the actual cost, which indicates the trade-off for increased or reduced unawareness levels.
The utility derived when the software is designed to enable interaction between objects based on a disclosure protocol depends on the privacy objective. When this objective is to increase awareness and the visibility of disclosed information at low cost, the privacy utility is computed as

$$\mathrm{Util}_{vis}(p, \tau_i) = [1 - \mathrm{UL}(p)] - \mathrm{Cost}(p, \tau_i).$$

Alternatively, the objective may be to increase unawareness and the secrecy of disclosed information at low cost, for which a corresponding secrecy utility is computed.

Degree of freedom
Constraining the disclosure behavior of users to a single protocol may inhibit the usability of the software. A design that implements a functional requirement using only one protocol provides minimum flexibility, since users can only interact in one way. Likewise, the software design becomes more flexible if a functional requirement can be achieved using a set of alternative disclosure protocols that provide the same or an acceptable range of privacy satisfaction. For example, the two traces $\tau_a$ = (Consent2, Sent1, Notice:su-r) and $\tau_b$ = (Consent1, Sent1, Notice:r-su) will generate similar unawareness levels and costs, at $\mathrm{UL}(Gor) = 0.08$ with $\mathrm{Cost}(Gor, \tau_a) = 0.69$ and $\mathrm{Cost}(Gor, \tau_b) = 0.69$, respectively. Hence, $\tau_a$ and $\tau_b$ provide the same level of privacy to Gor at the same cost. The number of disclosure protocols that can be used in software design to achieve an object's privacy objective is referred to as the degree of freedom (DoF), $\mathrm{DoF}(p)$, and is determined over $DP_{matrix}$, the set of disclosure protocols from the precedence matrix in Fig. 7.
The function $\mathrm{Util}_{obj}$ is the privacy utility realized by $\tau_i \in DP_{matrix}$ when the objective is to enhance the visibility or secrecy of disclosed information in the memory of p, whereas $min \le \mathrm{Util}_{lim} \le max$ is the threshold on the acceptable range of privacy utility. When the threshold is infinite, every disclosure protocol trace in $DP_{matrix}$ can be used to enable object interaction in the software design. In this case, $|DP_{matrix}| = 36{,}913{,}048$ with $\mathrm{DoF}(p) = 1$. This is a relatively large state space and suggests the plethora of design options available to implement interaction between objects. But this state space can be pruned based on search constraints; for instance, by limiting the maximum and/or minimum number of transactions in a disclosure protocol, or by selecting protocols that contain, exclude, start with and/or end with a specific transaction. For example, if the designer is only interested in disclosure protocols with a maximum of four transactions that contain any variant of Sent and a Consent1, then the awareness system will generate a reduced state space with $|DP_{matrix}| = 1580$.
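The statement that an infinite threshold gives DoF(p) = 1 suggests that the degree of freedom can be read as the share of traces in the matrix whose utility falls inside the acceptable range. The sketch below illustrates that reading only; it is an assumption rather than the paper's formula, the function name is hypothetical, and the trace labels and utility values are made up for illustration.

```python
import math


def degree_of_freedom(utilities, lower=-math.inf, upper=math.inf):
    """Fraction of traces whose utility lies within [lower, upper].

    utilities -- mapping from a trace (any hashable label) to the privacy
                 utility it yields for one object under one objective.
    """
    if not utilities:
        raise ValueError("at least one trace is required")
    acceptable = [trace for trace, value in utilities.items()
                  if value >= lower and upper >= value]
    return len(acceptable) / len(utilities)


# Hypothetical utilities for three traces; not values reported in the paper.
example = {"trace_a": 0.5, "trace_b": 0.3, "trace_c": -0.2}

print(degree_of_freedom(example))             # 1.0 with an infinite threshold
print(degree_of_freedom(example, lower=0.0))  # two of three traces qualify
```

Tightening the threshold lowers the fraction, which mirrors the idea that a stricter privacy requirement leaves the designer fewer interchangeable protocols.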

Case Studies
Overall, designing software that preserves privacy is a balance between the information-flow settings and the disclosure protocol(s) used to enable interaction in the design. These settings are defined by properties such as the roles of objects associated with each flow along a path, and the duality/uniqueness of such roles; whether the subject is the source of the information being disclosed or the information is generated by, or in the custody of, the sender; and finally, whether the subject and/or sender has identified the information recipients at the point consent is sought and/or granted. The degree of freedom, in turn, broadly indicates the flexibility in the manner software can be designed. In this section, we present two case studies to investigate the impact of these factors on real-world system implementation. The first study is a reverse engineering of the Twitter followership network to understand the highlighted factors in its design. The second study is a scenario analysis involving a family of design patterns for realizing service-oriented software systems.

Methodology
Given a design that highlights the expected behavior of a system, we map observable interaction patterns noted in the software design to disclosure protocol traces. We achieve this by the systematic analysis of the functional specification as follows:
(1) Identify objects and their actions from behavioral specifications.
(2) Abstract roles of objects from identified interactions.
(3) Map actions to transactions to identify the associated disclosure protocol.
After the information-flow settings and disclosure protocols are discovered from a software design, the effectiveness of the design can be investigated using PriSAT. A designer can specify the disclosure protocols and information-flow settings over an interaction network. PriSAT then determines the extent to which a privacy objective is satisfied. Alternatively, given an information-flow setting, PriSAT determines the appropriate set of disclosure protocols that can be used to realize a privacy objective. We applied this methodology to evaluate the privacy-preserving capabilities in the design of private interactions on Twitter [51] and service-based software.

The design of private interactions on Twitter
Twitter is a social networking platform where users interact predominantly via a followership network [51]. Users are expected to register with the service before they can interact with other users. Once registered, interaction is fostered by a user following another user to gain visibility of the messages they tweet, retweet, like or reply to. The relationship between a followed user and the follower is not symmetric. Also, disclosure behavior depends on whether the users choose to interact privately or publicly in their privacy configurations. In this study, we focus on a scenario where a user interacts with its followers privately, as shown in Fig. 8. When a user follows another user, the follower is the recipient and the followed user is the sender. This action implies that the recipient is requesting visibility of all actions executed by the sender on a message. When the followed user account is set as private, a followership request from the follower has to be explicitly confirmed by the followed user. This corresponds to a Request1 transaction, considering that a followership request from a recipient matches the request process, while a confirmation from the sender is ackRequest. When a message is tweeted, the follower is the recipient while the followed user is the source of the tweet and therefore both the sender and subject. The follower may like a private tweet, while a reply is considered a new message and is not bound by the privacy settings of the followed user. A "like" action can be initiated by the followed user or a follower, who then assumes the role of the sender. If initiated by a followed user, its followers are the recipients. The visibility of an action on a tweet that is initiated by a follower, in contrast, depends on whether or not the tweet is private. For a private tweet, the followed user and its other confirmed followers each assume the role of a recipient.
These actions correspond to a Sent2 transaction, since the propagation of a message by a sender is not acknowledged by the respective recipients. Hence, the corresponding disclosure protocol for interacting privately on Twitter is $\tau_1$ = (Request1, Sent2), which matches the scenario where a private user is followed by other users, and then the followed user or a follower tweets, likes, replies to or retweets a message.

Followership design on Twitter
There are diverse followership scenarios in a private setting. A user may choose to follow or unfollow another user at any time, and a tweeted message does not always attract the same number of likes from a set of followers. This results in a changing followership network for every message that is tweeted. The scenario in Fig. 9 represents a user $F_0$ that tweets a private message to five of its followers $F_1$-$F_5$. This results in five information-flow paths with an interaction history where $F_0$ assumes a dual role of the subject and sender five times, with each follower being a recipient once. At this point, the followership design in Fig. 8 yields a maximum expected positive utility of 0.2 for $F_0$ when $\tau_1$ is used to enable interaction and the privacy objective is to reduce unawareness levels, whereas the followers $F_1$-$F_5$ all derive negative utility with the mean of 0.01 across all users. When $F_1$ subsequently likes the tweet, the interaction network is extended to nine information-flow paths, with $F_0$ assuming an additional role of a recipient while $F_1$ assumes the role of a sender five times. This action improves the mean privacy utility to 0.1 for all users in the network. Overall, it is observed that the unawareness level of users reduces and privacy utility increases with each like action on a tweet. A maximum mean utility of 0.2 is reached when all the followers have liked the tweet. At this point, a total of 1305 paths would have been generated.
Based on the outlined scenario, the research question is whether there is an alternative followership design that improves the expected utility for users given a privacy objective. We note that an objective to maximize information visibility cannot necessarily be realized by using a trace with a high number of transactions and processes. For example, the trace $\tau_{max}$ = (Request1, Consent8, Sent1, Notice:s-su1, Notice:su-s1, Notice:s-r1, Notice:r-s1, Notice:r-su1, Notice:su-r1) results in $\mathrm{UL}(F_0) = 0$, $\mathrm{Cost}(F_0, \tau_{max}) = 1$ and $\mathrm{Util}_{vis}(F_0, \tau_{max}) = -1$. This represents a state where $F_0$ achieves full awareness but at the maximum cost, and hence the worst negative utility. Utilizing $\tau_{max}$ for interaction is only viable when $F_0$ discounts its associated cost of interaction. Likewise, once the information is disclosed, it is impossible for a privacy objective of maximizing information secrecy to be realized with a utility of 1. This is because any disclosure protocol used will result in some unawareness reduction in the subject, sender and/or recipient. Thus, the impact of these contending factors can be mitigated by using the utility values to determine the extent to which a privacy objective is satisfied.

Fig. 9. Twitter interaction after a private message is tweeted from $F_0$ to its five followers and the subsequent like action executed by each follower using $\tau_1$ and discount = 0.
To investigate an alternative design, DoF analysis was carried out on the outlined scenario using PriSAT. The analysis focused on identifying potential lightweight refactorings where the inherent transactions in the design are preserved but augmented with variants of Consent or Notice. In this way, the functional properties of the followership network remain unchanged. PriSAT was used to search the precedence matrix for traces that start with Request1, may contain any variant of Consent and end with Sent2. The outcome was a reduced state space with $|DP_{matrix}| = 9$, which includes $\tau_1$ and the traces (Request1, Consent[1|2|...|8], Sent2). Also, since the followership network is bound by a confirmation of the follow request by the followed user, it was assumed that the subject and sender have identified the information recipient at the point the consent is sought. Figure 10 illustrates the outcome of the DoF analysis based on traces that matched our Consent search criteria. For $F_0$, the traces (Request1, Consent[2|3|5|7], Sent2) generated utilities greater than that of $\tau_1$, ranging between 0.4 and 0.6, whereas (Request1, Consent[1|6], Sent2) offered utilities similar to $\tau_1$, and (Request1, Consent[4|8], Sent2) did not offer better utilities compared to $\tau_1$. For $F_1$, the traces (Request1, Consent[1|4|6|8], Sent2) generated lesser utility values ranging between 0 and 0.1, while (Request1, Consent[2|3|5|7], Sent2) yielded the same utility as $\tau_1$. Likewise, for $F_2$-$F_5$, the traces that contained variants of Consent resulted in the same utility as $\tau_1$. A similar pattern of utility variance with DoF was observed irrespective of the number of followers associated with $F_0$. It is therefore concluded that augmenting the Twitter followership design with variants of Consent transactions only enhances privacy utility for the subject that tweets a private message, whereas the utility for its followers is not improved.

Fig. 10. The DoFs in realizing categories of privacy utilities for a private message tweet from $F_0$ to $F_1$-$F_5$ with every follower responding with a like action. The dotted marker indicates the DoF classification for $\tau_1$, and the DP matrix contains the traces where $\tau_1$ is augmented with a variant of the Consent transaction.
The second refactoring involved augmenting the existing design with a combination of acknowledged Notice transactions. Hence, PriSAT was used to search the precedence matrix for traces that start with Request1, followed by Sent2 and may end with a combination of one or more forms of Notice transactions without acknowledgment. This generated a state space with |DP matrix| = 64, which contained disclosure protocol 1 and the traces (Request1, Sent2, 2^X), where X = {Notice:s-su2, Notice:su-s2, Notice:r-su2, Notice:su-r2, Notice:s-r2, Notice:r-s2}. The performance of protocol 1 against Notice-related traces is shown in Fig. 11. For F0, all traces generated utility values greater than that of protocol 1. Furthermore, three traces consisting of (Request1, Sent2, 2^(X+)), where X+ = {Notice:su-r2, Notice:su-s2}, generated utility values ranging between 0.3 and 0.4. The remaining 60 traces provided significantly improved utility values that ranged between 0.7 and 0.9. Likewise, for F1-F5, 15 traces consisting of (Request1, Sent2, 2^(X++)), where X++ = {Notice:s-su2, Notice:su-s2, Notice:r-su2, Notice:su-r2}, had no improved performance over protocol 1.

Fig. 11. The DoFs in realizing categories of privacy utilities for a private message tweet from F0 to F1-F5 with every follower responding with a like action. The dotted marker indicates the DoF classification for protocol 1 and the DP matrix contains traces where protocol 1 is augmented with combinations of Notice transactions.

The remaining 48 traces, in contrast, provided better utility values that ranged between 0.4 and 0.6. Again, a similar pattern of utility variance with DoF was observed irrespective of the number of followers associated with F0. Thus, given an objective to minimize users' unawareness, privacy utility in the Twitter followership design for exchanging private tweets can be enhanced by augmenting the inherent disclosure protocol with DP_twitter = (Request1, Sent2, DP_au), where DP_au represents the combination of Notice transactions from the set DP_au = X − X++ and X+ X++.
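The trace counts reported for the two refactorings can be reproduced with a short enumeration. The following is a minimal sketch, not PriSAT itself: it assumes that each Notice-augmented trace corresponds to one subset of X (the empty subset being the unaugmented protocol) and that ordering among Notice transactions is ignored. The function and variable names are hypothetical.

```python
from itertools import combinations

def power_set(items):
    """Return every subset of items (including the empty set) as frozensets."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Notice sets as given in the text.
X = {"Notice:s-su2", "Notice:su-s2", "Notice:r-su2",
     "Notice:su-r2", "Notice:s-r2", "Notice:r-s2"}
X_PLUS = {"Notice:su-r2", "Notice:su-s2"}
X_PLUS_PLUS = {"Notice:s-su2", "Notice:su-s2", "Notice:r-su2", "Notice:su-r2"}

# Assumption: one trace per subset of X; the empty subset is (Request1, Sent2).
notice_traces = power_set(X)
augmented = [s for s in notice_traces if s]

# Traces built only from X+ (lower gain for F0) or only from X++ (no gain for F1-F5).
only_x_plus = [s for s in augmented if s <= X_PLUS]
only_x_plus_plus = [s for s in augmented if s <= X_PLUS_PLUS]

# Consent refactoring: the unaugmented protocol plus one trace per Consent variant.
consent_traces = [("Request1", "Sent2")] + [
    ("Request1", f"Consent{i}", "Sent2") for i in range(1, 9)
]

print(len(consent_traces))                         # 9
print(len(notice_traces))                          # 64
print(len(only_x_plus))                            # 3
print(len(augmented) - len(only_x_plus))           # 60
print(len(only_x_plus_plus))                       # 15
print(len(augmented) - len(only_x_plus_plus))      # 48
```

Under these assumptions the counts agree with the figures in the text: 9 Consent-related traces, 64 Notice-related traces, and the 3/60 and 15/48 splits observed for F0 and F1-F5.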

Discussion
A key insight is that the maximum privacy utility inherent in the Twitter followership design is marginal compared to an alternative design that is augmented with a subset from a combination of Notice transactions. This makes the latter a preferred design when the objective is to broadly maximize the visibility of information in the network. For example, the relative improvements in privacy utilities for F0-F5 can be observed in Fig. 12, where disclosure protocol 1′ = (Request1, Sent2, Notice:s-r2) ∈ DP_twitter is used for interaction, compared to disclosure protocol 1 in Fig. 9. A maximum mean utility of 0.6 is reached when all the followers have liked a tweet using protocol 1′, compared to 0.2 when using protocol 1. We note that refactoring an existing design to achieve a privacy objective may further require domain-specific design choices. For instance, refactoring the followership design in Fig. 8 to realize protocol 1′ will require that a sender not only disclose a message to the recipient, but also inform the recipient of other recipients to which it discloses the same message (see the design extension in Fig. 13).

Fig. 12. Twitter interaction after a private message is tweeted from F0 to its five followers and the subsequent like action executed by each follower using protocol 1′ = (Request1, Sent2, Notice:s-r2).

Alternatively, a privacy objective may be to reduce visibility, for example, making information less visible for all users in the network or a subset of users. For the former, protocol 1 is the preferred disclosure protocol compared to protocol 1′. Both protocols yield mean positive and negative utilities of 0.35 and −0.07, respectively, when the privacy objective is to maximize information secrecy. When the privacy objective is instead to achieve varying information visibilities, a followership design which uses a single disclosure protocol to foster interaction does not necessarily provide an equal amount of privacy utility for each user on the network. This is illustrated in Figs. 9 and 12, where the privacy utility for the subject F0 tends to differ significantly from that of other users.
These findings suggest that privacy in Twitter's software design can be enhanced in one of two ways. The first is informing users of their changing privacy utility as their personal information flows from one user to another in the network. Users can then adjust their disclosure behavior to mitigate emerging privacy concerns. Second, variability in privacy utility can be managed by enabling users to specify their privacy objectives and expected utilities. Disclosure protocols are then dynamically selected by the platform during interaction to satisfy a privacy objective. We have only considered lightweight refactorings where inherent transactions in the design are preserved but augmented with variants of Consent or Notice transactions. There are other refactoring options, however; for example, considering a combination of Notice transactions before and after Sent1 is executed, as well as traces that contain Consent and Notice with varying precedence. Again, we assumed that a message has only one subject and that, after a tweet, the authorship and attribution of the message do not change. It is nevertheless possible to have a message associated with multiple subjects; for example, when the message mentions another user via a UserTag. The structural/semantic changes to the message make the user that is tagged a co-owner and also a subject.

The privacy analysis of service-based software design
Interactions between components in distributed software applications are often orchestrated as services. A service is a discoverable software entity that can exist as a single instance and interacts synchronously or asynchronously with applications and other services through a loosely coupled communication model [42]. This concept is based on a software architectural style that defines an interaction between three primary entities: the service producer, who publishes a service description and provides the implementation for the service; the service consumer, who uses the service; and the service broker, who enables interaction between the producer and consumer [30]. There are two main interaction patterns involving the identified entities [19,50]. The first is a message pattern where all communication (information flow) that occurs between a service producer and a consumer is mediated by a service broker [42]. The alternative pattern is where information flow occurs in a peer-to-peer fashion following the "register-find-bind-execute" paradigm [36]. The producer registers a service contract in a public registry that exists on the broker. This registry is queried by consumers to find services that match certain criteria. If the registry has such a service, the broker provides the consumer with the contract and an endpoint address to bind directly with the producer. Both interaction patterns are often constrained by Quality-of-Service (QoS) assurances to satisfy certain nonfunctional requirements. More importantly, these patterns give rise to different interaction scenarios amongst service objects. In this subsection, we articulate a subset of these scenarios to gain insights into their privacy-preservation capabilities.

Broker-mediated interaction pattern
In broker-mediated service interaction, information flow is initiated via the broker using a push/pull mechanism or a hybrid of both, as shown in Fig. 14. When the producer pushes a message to the broker, the producer grants consent for the broker to disclose the message to any set of consumers. Where the QoS necessitates that the broker responds with acknowledgment after the consent is granted, the matching transaction is Consent2; otherwise it is Consent5. Alternatively, the broker may pull the message from the producer. In this case, the broker is seeking consent from the producer to disclose the message to any set of consumers, with consent subsequently granted by the producer. Typically, the broker and producer have no knowledge of the consumer at this point. The matching transaction is either Consent1, Consent4, Consent6 or Consent8, depending on whether or not the QoS guarantees acknowledgment. The broker may then send the message to the consumer via a push mechanism. The matching transaction is either Sent1 or Sent2 and again depends on the associated QoS assurances. Alternatively, the consumer may pull the message from the broker. In this case, the consumer first requests a message about the producer from the broker. This matches a variant of the Request transaction. Subsequently, the broker sends the requested message to the consumer, matching a Sent1 or Sent2 transaction depending on QoS assurances. MQTT, popularly used to implement IoT device-to-device interaction, is an example of a standard messaging specification based on the broker-mediated interaction model [18]. Other examples include the Java Message Service (JMS) [33] and its implementations such as Apache ActiveMQ [52]. There are a number of interaction scenarios based on this pattern. These are as follows.
Scenario 1. Interaction between parties is achieved strictly via a push mechanism. The producer grants message disclosure consent to the broker. Subsequently, the message is disclosed by the broker to the consumer. As illustrated in Fig. 14(a), interaction between parties is realized using one of the four disclosure protocols from (Consent[2|5], Sent[1|2]).

Scenario 2.
Interaction between parties is achieved strictly via a pull mechanism. The broker first seeks consent to disclose a message from the producer. Subsequently, consent is granted by the producer. Afterwards, the consumer requests the message from the broker. Finally, the message is disclosed by the broker to the consumer. Interaction is realized using one of the 16 disclosure protocols from (Consent[1|4|6|8], Request[1|2], Sent[1|2]) as illustrated in Fig. 14(b).

Scenario 3.
Interaction between parties is achieved via a hybrid push-pull mechanism. First, the producer grants message disclosure consent to the broker. Afterwards, the consumer requests the message from the broker. Finally, the message is disclosed by the broker to the consumer. For this case, the interaction is realized using one of the eight disclosure protocols from (Consent[2|5], Request[1|2], Sent[1|2]) as illustrated in Fig. 14(c).

Scenario 4.
Interaction between parties is achieved via a hybrid pull-push mechanism. The broker first seeks consent to disclose a message from the producer. Subsequently, consent is granted by the producer. Finally, the message is disclosed by the broker to the consumer. As illustrated in Fig. 14(d), the interaction between parties is realized using one of the eight disclosure protocols from (Consent[1|4|6|8], Sent[1|2]).
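The number of disclosure protocols available in each broker-mediated scenario is the size of the Cartesian product of the transaction variants permitted at each step. The sketch below illustrates this for Scenarios 1-4; the dictionary layout and helper names are hypothetical and only the variant indices come from the text.

```python
from itertools import product

def variants(name, indices):
    """Build transaction labels such as 'Consent2' from a name and indices."""
    return [f"{name}{i}" for i in indices]

# Each scenario is an ordered list of steps; each step lists its allowed variants.
SCENARIOS = {
    "Scenario 1 (push)": [
        variants("Consent", [2, 5]), variants("Sent", [1, 2])],
    "Scenario 2 (pull)": [
        variants("Consent", [1, 4, 6, 8]), variants("Request", [1, 2]),
        variants("Sent", [1, 2])],
    "Scenario 3 (push-pull)": [
        variants("Consent", [2, 5]), variants("Request", [1, 2]),
        variants("Sent", [1, 2])],
    "Scenario 4 (pull-push)": [
        variants("Consent", [1, 4, 6, 8]), variants("Sent", [1, 2])],
}

def disclosure_protocols(steps):
    """Enumerate every protocol: one variant chosen per step, order preserved."""
    return list(product(*steps))

counts = {name: len(disclosure_protocols(steps))
          for name, steps in SCENARIOS.items()}

for name, count in counts.items():
    print(name, count)
# Scenario 1 (push) 4
# Scenario 2 (pull) 16
# Scenario 3 (push-pull) 8
# Scenario 4 (pull-push) 8
```

Calling `disclosure_protocols(SCENARIOS["Scenario 1 (push)"])` returns the four concrete traces, for example `("Consent2", "Sent1")`, which a designer could then score individually.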

Broker-facilitated interaction pattern
When interaction occurs in a peer-to-peer fashion, message passing only takes place during binding and execution. The broker facilitates this by enabling the producer and consumer to discover each other via a push or pull mechanism, as well as a hybrid of both. This is initiated when the producer pushes to the broker a contract, which typically contains the services it produces and its uniform resource identifier. This is synonymous with a producer notifying the broker of its capabilities. Depending on whether there exist QoS guarantees of acknowledgment, the matching disclosure transaction is a variant of Notice:su-s where the subject is a producer and the sender a broker. Alternatively, the broker can initiate a pull request for contracts from the producer, who then responds by publishing the contracts on the broker. In this case, the matching disclosure transaction is a variant of Request, which is then followed by a variant of Notice:su-s. Likewise, the consumer discovers service contracts by initiating a pull request on the broker, matching variants of Request, followed by a Notice:s-r where the broker acts as a sender and the consumer a recipient. The broker may also push service contracts to the consumer without prior request.
Once the consumer discovers a service contract, it then binds with the producer. At this point, the consumer invokes or initiates an interaction with the producer using the binding details in the service contract to locate, contact and invoke the service. The matching disclosure transaction is a variant of Request. The successful invocation of a service typically results in a functional execution by the producer and the result is returned to the consumer. This matches Sent1 or Sent2 transactions, depending on whether there exist QoS guarantees of acknowledgment. Examples of distributed software frameworks and standards based on broker-facilitated interaction patterns include the Simple Object Access Protocol (SOAP), Representational State Transfer (REST) for Web services and its reference implementations such as Java API for RESTful Web Services (JAX-RS) [31] and Jersey [32]. Interaction scenarios based on this service model include the following.

Scenario 5.
Interaction between parties is achieved using a push with binding. The producer first notifies the broker of a service contract. This is followed by the broker notifying a consumer of the producer's service contract. The consumer then makes a message request to the producer based on the service contract. Finally, the message is disclosed by the producer to the consumer. As illustrated in Fig. 15(a), interaction between parties is realized using one of the 16 disclosure protocols from (Notice:su-s[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]).

Scenario 6.
Interaction between parties is achieved using a pull with binding. The broker first requests a contract from the producer, who responds by notifying the broker of a service contract. Next, the consumer requests a matching contract from the broker's registry, with the broker responding by notifying the consumer of the contract offered by the producer. The consumer then makes a message request to the producer based on the service contract. Finally, the message is disclosed by the producer to the consumer. This interaction scenario is realized using one of 64 disclosure protocols from (Request[1|2], Notice:su-s[1|2], Request[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]) as illustrated in Fig. 15(b).

Scenario 7.
Interaction between parties is achieved using a hybrid push-pull with binding. The producer first notifies the broker of a service contract. Next, the consumer requests a matching contract from the broker's registry, with the broker responding by notifying the consumer of the contract offered by the producer. The consumer then makes a message request to the producer based on the service contract. Finally, the message is disclosed by the producer to the consumer. This is illustrated in Fig. 16(a), and realized using one of the 32 disclosure protocols from (Notice:su-s[1|2], Request[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]).

Scenario 8.
Interaction between parties is achieved using a hybrid pull-push with binding. The broker first requests a contract from the producer, who responds by notifying the broker of a service contract. The broker then notifies the consumer of the contract offered by the producer. The consumer then makes a message request to the producer based on the service contract. This ends with the message being disclosed by the producer to the consumer. This is illustrated in Fig. 16(b), and realized using one of the 32 disclosure protocols from (Request[1|2], Notice:su-s[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]).
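The bracket notation used for the broker-facilitated scenarios can also be expanded mechanically. The following sketch parses pattern strings written as `Name[i|j|...]` steps separated by commas and expands them into concrete disclosure protocols. The parser, its regular expression and the `expand` function are illustrative assumptions, not part of the paper's tooling.

```python
import re
from itertools import product

# One step looks like "Notice:su-s[1|2]": a transaction name, then variant indices.
STEP = re.compile(r"\s*([A-Za-z:\-]+)\[([0-9|]+)\]\s*")

def expand(pattern):
    """Expand a comma-separated pattern into the list of concrete protocols."""
    steps = []
    for part in pattern.split(","):
        match = STEP.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized step: {part!r}")
        name, indices = match.groups()
        steps.append([f"{name}{i}" for i in indices.split("|")])
    return list(product(*steps))

# Patterns for Scenarios 5-8 as stated in the text.
PATTERNS = {
    5: "Notice:su-s[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]",
    6: ("Request[1|2], Notice:su-s[1|2], Request[1|2], "
        "Notice:s-r[1|2], Request[1|2], Sent[1|2]"),
    7: "Notice:su-s[1|2], Request[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]",
    8: "Request[1|2], Notice:su-s[1|2], Notice:s-r[1|2], Request[1|2], Sent[1|2]",
}

sizes = {scenario: len(expand(pattern)) for scenario, pattern in PATTERNS.items()}
print(sizes)  # {5: 16, 6: 64, 7: 32, 8: 32}
```

Because every step here has two variants, the count doubles with each additional transaction, which is why the pull with binding pattern (six steps) has the largest protocol space.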

Discussion
A key insight from carrying out the scenario analysis of service interaction patterns relates to the DoF in realizing a design. Scenarios 1-4 can be realized using one of the 4, 16, 8 and 8 disclosure protocols, respectively. Whereas, Scenarios 5-8 can be realized using one of the 16, 64, 32 and 32 disclosure protocols, respectively. Hence, it can be inferred that the flexibility in designing a service-based distributed software is dependent on whether the interaction is modeled using push or pull mechanism or a combination of both, and also with or without message binding. Furthermore, there is a limit to the privacy-preserving capability of each interaction pattern. It is observed that the choice to design a software system based on a pattern may inhibit or enhance the ability of producer, broker and/or consumer to realize a privacy objective. For example, the plot in Fig. 14(a) shows privacy utilities realized by the producer, broker and consumer for the visibility and secrecy objectives in Scenario 1. When the objective is to maximize message visibility, then the maximum privacy utility realized by the producer is 0.01, and is achieved using (Consent5, Sent[1|2]). This is insignificant, compared to the mean utilities of 0.33 and 0.43 realized by the broker and consumer across all the disclosure protocols that can be used in the scenario. Conversely, the same producer would realize a maximum privacy utility of 0.63 using (Consent2, Sent[1|2]) when the objective is to maximize secrecy. The broker and consumer also realize relatively significant privacy utilities.

Hence, it is concluded that when a service-oriented design is achieved strictly using a push mechanism, then the maximum privacy utility is realized when the objective is to maximize message secrecy across interacting parties.
Again, the plot in Fig. 14(b) illustrates the privacy utilities realized by parties in Scenario 2. It is observed that any disclosure protocol used in this scenario offers a negative privacy utility to the producer when the objective is to maximize message visibility. The broker and consumer, in contrast, will satisfy the same objective irrespective of the disclosure protocol used, with mean utilities of 0.19 and 0.42, respectively, across the 16 disclosure protocols that can be used in this scenario. This suggests that a pull mechanism is less suitable for a service-oriented design when the objective is to maximize the extent to which the information is visible to the producer. Conversely, when the objective is to maximize secrecy, mean privacy utilities of 0.56, 0.07 and 0.30 are realized by the producer, broker and consumer, respectively. This suggests that it is less efficient to achieve a service-oriented design using a pull mechanism when the privacy objective is to maximize the extent to which the message remains secret to the broker. Similar reasoning can be applied to Scenarios 3-8 to evaluate their suitability in realizing a privacy objective in a software design.

Fig. 16. Analysis of broker-facilitated architectural models for service-based interaction, panels (a) and (b). Privacy utilities are determined without cost discount on the disclosure protocols.
The observed variance in privacy utilities implies that instantiating a pattern in a service-oriented design may involve some compromise in privacy by the producer, broker and/or consumer. A plot of mean utilities across all the disclosure protocols for each analyzed scenario is illustrated in Fig. 17. When the software is designed based on pull, push or a hybrid of both (marked S1-S4, respectively, in Fig. 17), the broker and consumer would know significantly more about the message than the message producer. Similarly, when the design strategy is to ensure that the message being disclosed is least visible to the broker, the appropriate design choice is a push with binding interaction pattern (marked S5 in Fig. 17). Alternatively, the design strategy may be to ensure that the message disclosed is equally visible to the parties involved. The appropriate design choice is then a hybrid push-pull with binding interaction pattern (marked S7 in Fig. 17). This pattern offers the least variance in privacy utility between interacting parties. This exploration of service-oriented design patterns provides insights on how a designer can select a design pattern based on a privacy objective, the privacy utility that a pattern provides and the satisfaction of desired functional requirements.
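The per-party comparison described above can be illustrated with the only complete set of mean utilities stated in the text, namely those for Scenario 2 under the secrecy objective. The sketch below is illustrative: the helper names are hypothetical, population variance is an assumed choice of spread measure, and the values plotted in Fig. 17 are not reproduced here.

```python
from statistics import pvariance

# Mean privacy utilities reported in the text for Scenario 2 (strict pull)
# when the objective is to maximize secrecy.
scenario2_secrecy = {"producer": 0.56, "broker": 0.07, "consumer": 0.30}

def least_served(utilities):
    """Return the party that derives the lowest mean utility."""
    return min(utilities, key=utilities.get)

def spread(utilities):
    """Population variance of utilities across parties (assumed measure)."""
    return pvariance(list(utilities.values()))

print(least_served(scenario2_secrecy))  # broker
print(round(spread(scenario2_secrecy), 4))
```

Applying `spread` to the mean utilities of each scenario and picking the smallest value is one way a designer might operationalize the "least variance between interacting parties" criterion.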

Threats to Validity and Future work
In our case studies, we leveraged alternative documentation available in the public domain and observed the behavior of running systems to build the interaction models. In future work, we intend to automate this task by augmenting the design artifacts familiar to designers in their daily work with insights on privacy implications. We expect that such automation will reduce the knowledge gap required to consider privacy during early-stage software design.
Our technique does not account for factors such as the adversarial or cooperative tendencies of objects or the level of sensitivity of disclosed information. Also, memory transformations during object interaction and implied awareness are solely determined by the executed disclosure protocols. This means that by only relying on transactions in the disclosure protocol suite, there are memory states that cannot be reached from an assumed initial state of full unawareness. For example, it is impossible to realize a disclosure protocol that renders all elements in the memory of a principal tenable. It is easy to see, however, that such a memory state is unintuitive from a socio-technical viewpoint, since it would imply that an object denies its awareness after disclosure even though it is tenable that it is aware. Hence, it can be concluded that a disclosure protocol that makes an object realize such a state should not be allowed in a software design. The open research question is to determine whether all unreachable memory states are also unintuitive from a socio-technical context, and therefore not relevant for privacy management.
The analysis of a disclosure protocol's state space is a reachability problem of determining whether there is a disclosure protocol that makes a certain awareness state reachable from an initial state of full unawareness. Addressing this problem requires: (1) identifying all memory states that an object can assume based on Table 1; (2) determining which identified states are intuitive from a socio-technical viewpoint; (3) inferring whether there is a disclosure protocol that makes the state reachable; and finally (4) assessing the impact that such a reached/unreachable state has on privacy. A memory leak then exists when there is an unreachable state that is intuitive from a socio-technical viewpoint. Otherwise, the disclosure protocol suite can be considered as complete. Therefore, we do not claim in this research that the disclosure protocol suite is complete and prevents all memory leaks. Making such a claim by addressing the highlighted tasks is beyond the scope of this paper and the focus of future work.
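The reachability question in step (3) can be framed as a graph search over memory states. The sketch below uses breadth-first search over a deliberately simplified toy model: a state is the set of parties aware of a message, and the two transitions are hypothetical stand-ins. It is not the memory model of Table 1 nor the paper's transaction semantics.

```python
from collections import deque

def find_protocol(initial, target, transitions):
    """Breadth-first search for the shortest transaction sequence reaching target.

    Returns a tuple of transaction names, or None if target is unreachable.
    """
    queue = deque([(initial, ())])
    seen = {initial}
    while queue:
        state, protocol = queue.popleft()
        if state == target:
            return protocol
        for name, step in transitions.items():
            nxt = step(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, protocol + (name,)))
    return None

# Toy transitions (assumptions for illustration only).
def request(state):
    # A request makes the sender aware.
    return state | {"sender"}

def sent(state):
    # A message can only be sent once the sender is aware.
    return state | {"recipient"} if "sender" in state else None

TOY_TRANSITIONS = {"Request": request, "Sent": sent}
FULL_UNAWARENESS = frozenset()

reachable = find_protocol(FULL_UNAWARENESS,
                          frozenset({"sender", "recipient"}), TOY_TRANSITIONS)
unreachable = find_protocol(FULL_UNAWARENESS,
                            frozenset({"recipient"}), TOY_TRANSITIONS)
print(reachable)    # ('Request', 'Sent')
print(unreachable)  # None
```

In this toy model, a state where only the recipient is aware is unreachable; whether such a state is also unintuitive from a socio-technical viewpoint is exactly the kind of judgment that step (2) calls for.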
A broader picture of the relationship between end users, software designers and regulations is multi-dimensional, whereas this research only sets the foundation for understanding this relationship from a designer's viewpoint. Finally, although we assume that the (un)awareness modeled in objects is the same as their users', this deterministic assumption may sometimes not hold.

Conclusions
This paper presents a technique for analyzing the privacy-preserving capabilities of software design. First, possible world semantics are used to demonstrate how the memory of user objects in a behavioral design representation transforms during interaction, where an object's memory is defined in terms of what its respective users are aware or unaware of regarding the disclosed information. The more unaware an object is about the disclosed information, the more secret the information is to its user. Conversely, the more aware an object is about the disclosed information, the more visible the information is to its user. Privacy engineering during software design then involves determining the appropriate disclosure protocol that can be used in the design to ensure a level of information secrecy or visibility when objects are interacting. Hence, we define a disclosure protocol that constitutes information Request, Consent, Sent and Notice transactions, and characterize ensuing object memory transformations when any of the transactions is triggered as part of information disclosure. Finally, given a privacy objective to maximize information visibility or secrecy, a privacy awareness system is used to determine the privacy utility that users derive when interaction between objects is designed based on a disclosure protocol.
Our approach was evaluated based on two case studies. First, we carried out an analysis of the followership design on the Twitter social networking platform. We demonstrated the variability in the range of privacy utilities that a followed user and its followers can derive as a message is tweeted, retweeted or liked in the network. We then investigated a refactoring of the followership design with variants of Consent and Notice. The results showed that with a privacy objective to maximize message visibility, refactoring the Twitter followership with Consent did not significantly improve privacy utility. Whereas, for the same privacy objective, refactoring the design with Notice showed a significant improvement in privacy utility across interacting parties. The broader insight is that the design of software where user objects are associated with emergent properties and therefore changing privacy objectives needs to be adaptive privacy ready [41].
The second case study involved the scenario analysis of service-based software design patterns. Two categories involving a broker either mediating or facilitating interaction between information producers and service providers were analyzed. Our study results showed that flexibility in designing a privacy-preserving service-oriented system is also dependent on the design pattern used to mediate or facilitate interactions. These patterns include pull, push as well as their hybrid combined with or without binding. We demonstrated the strengths and weaknesses of each pattern in satisfying a privacy objective. The broader insight is the relationship between software design patterns and the satisfaction of a privacy objective.
The use of the proposed technique in practice depends on two steps. First, the software designer articulates the features to be implemented using a model-driven design technique. Second, the designer defines a privacy objective to be realized in the design. This is specified in terms of the desired level of information secrecy or visibility. A subset of disclosure protocols that minimize privacy risk in the design is then proposed.
Appendix A