Risk mitigation services in cyber insurance: optimal contract design and price structure

As the cyber insurance market is expanding and cyber insurance policies continue to mature, the potential of including pre-incident and post-incident services into cyber policies is being recognised by insurers and insurance buyers. This work addresses the question of how such services should be priced from the insurer’s viewpoint, i.e. under which conditions it is rational for a profit-maximising, risk-neutral or risk-averse insurer to share the costs of providing risk mitigation services. The interaction between insurance buyer and seller is modelled as a Stackelberg game, where both parties use distortion risk measures to model their individual risk aversion. After linking the notions of pre-incident and post-incident services to the concepts of self-protection and self-insurance, we show that when pricing a single contract, the insurer would always shift the full cost of self-protection services to the insured; however, this does not generally hold for the pricing of self-insurance services or when taking a portfolio viewpoint. We illustrate the latter statement using toy examples of risks with dependence mechanisms representative in the cyber context. Supplementary Information The online version contains supplementary material available at 10.1057/s41288-023-00289-7.


Motivation and approach
Cyber insurance is still a relatively new, but steadily expanding market. The reasons for its ongoing growth in demand are manifold: the dynamically expanding and evolving cyber-threat landscape (ENISA 2021; tenable 2021), extensive media coverage of severe cyber incidents (PartnerRe 2017, 2018; Marotta et al. 2017), ubiquitous introduction of stricter legislation (Anchen and Pain 2017; Marotta et al. 2017), and increased awareness of companies about their augmented dependence on information technology. On the first point, the growing professionalism and economic potential of the ransomware "industry" in particular are addressed, e.g. in ENISA (2021). As of 2020, cyber incidents were ranked the number one peril to businesses worldwide (Allianz 2020) and their perilousness can hardly be expected to have diminished since, as the COVID-19 pandemic and its effects (e.g. extensive ad-hoc shifts to remote work without adequate time to amend IT security measures and practices) have been labelled by some experts "the largest-ever cybersecurity threat" (Munich Re 2021). Many insurers are already actively participating in the global cyber insurance market, while still working towards a firm understanding of this new and dynamic type of risk and its underlying drivers. Far from being solved is the question of how to adequately assess and price cyber risk given the various challenges, e.g. scarcity of historical data, non-stationarity of claims, association between claims, and strategic motivations of threat actors. Many academic works have recently been devoted to understanding and modelling these challenges in cyber risk. We, therefore, deliberately refrain from providing an exhaustive overview and refer to the surveys (Marotta et al. 2017; Awiszus et al. 2023).
In most established insurance lines, insurers have multiple years of claims experience and established technical expertise to quantify risks. In contrast, assessing and pricing cyber risks is particularly challenging due to the dynamically evolving threat landscape and the high complexity of modern IT systems. Therefore, insurers strive to collaborate with specialised IT security service providers (consider Bosch CyberCompare as an example or Advisen for a market overview), who not only support insurers in accurately assessing to-be-insured risks, but collaborate in providing services that aim at mitigating the insured risk as part of an insurance policy. Such cyber-assistance services can be divided into pre-incident services, such as network security, back-up of critical systems and data, and patch management, and post-incident services, such as restoration of data, forensic services, and legal advice (see Munich Re 2021). The former typically serve to decrease the probability of a cyber incident, while the latter support mitigation of the loss size in case an incident has occurred. In practice, the effects of both types of service are naturally intertwined, and additionally, all types of cyber assistance can also serve to provide insurers with additional information, i.e. to enhance their cyber-risk assessment practices or simply to obtain supplementary data (see also Remark 1 below). A recent survey (Munich Re 2021) indicates that the majority of (prospective) buyers believes that such services should be covered by holistic cyber insurance solutions, indicating that both the supply and demand side have realised that cyber insurance coverage should encompass more than pure compensation for financial losses. One type of service which is not yet explicitly advertised on the market, but holds great potential, is the insurer's ability to use the interdependence of cyber incidents to all parties' benefit by offering additional risk mitigation services.
To the best of our knowledge, established actuarial pricing approaches for these new policies are yet to be developed. The aim of this work is to propose a mathematical framework to study the optimal price structure of such insurance contracts, in particular to start addressing the question if (and under which circumstances) an insurer is economically incentivised to subsidise risk reduction services within an insurance policy. As part of this question, the issue of the optimal combination of insurance and risk mitigation (depending on their prices) from an insurance buyer's point of view is also studied. A further point, which is particularly relevant in the cyber context, is that for an insurer, it is not exhaustive to consider every single policyholder separately, but due to the potential interconnectedness of cyber losses, a portfolio viewpoint considering dependencies needs to be taken into account.
Our approach is based on the work of Bensalem et al. (2020), by using the framework of distortion risk measures and stochastic ordering of loss distributions, respectively, to capture risk assessment of all parties and the effects of risk mitigation services, and by modelling the interaction between insurer and insurance buyer(s) as a Stackelberg game. We extend their setting to a bivariate problem for the insurer, allowing her to choose the price for both risk transfer and risk mitigation, and analyse the results of the corresponding buyer's problem [which is conceptually similar to Bensalem et al. (2020)] in the cyber insurance context. Furthermore, we transcend from the study of an interaction with a single buyer to examples of (sequential or simultaneous) interactions with several buyers with dependent losses.

Related literature
A concise overview of academic studies on the interaction between risk reduction and insurance in the cyber context is given in Xiang et al. (2021). As mentioned therein, many of these studies rely on very simplified assumptions regarding the distribution of random cyber losses or the interplay between costs of prevention and consequence on the reduction of risk. Most often, the optimal combination of security provisions and insurance from an insured's point of view is studied, see, e.g. the early game-theoretic contribution of Pal and Golubchik (2010), the work of Young et al. (2016), and subsequently Mazzoccoli and Naldi (2020), or Yang and Lui (2014), Chase et al. (2017), and Mazzoccoli and Naldi (2021) who investigate optimal security investments under the presence of cyber insurance in a heterogeneous network, in a cloud computing environment, and for a multi-branch firm with correlated vulnerabilities, respectively. Zhang and Zhu (2021) use a dynamic moral hazard type of principal-agent model with Markov decision processes to capture decisions on self-protection of the insured and Skeoch (2022) expands the Gordon-Loeb model (Gordon and Loeb 2002) for cybersecurity to a cyber insurance context. Pal et al. (2014, 2017) more generally study synergies between cybersecurity and the (existence of a then nascent) cyber insurance market.
Fewer studies emphasise the insurer's role in designing cyber insurance contracts, e.g. by choosing premium and contractual indemnity (Dou et al. 2020), employing a bonus-malus system (Xiang et al. 2021), or trying to mitigate moral hazard by means of risk preference design (Liu and Zhu 2022).
The problem of combining different strategies of coping with risk, in particular the combination of risk reduction by investing in prevention measures and risk transfer by purchasing insurance, is of course not specific to cyber and has been the interest of many earlier studies. A good starting point is the survey (Courbage et al. 2013) on the economic literature on prevention and precaution. As differentiated therein, prevention activities encompass self-protection, i.e. modifying the probability of a loss, and self-insurance, i.e. shaping the potential loss size. The seminal work by Ehrlich and Becker (1972) examined the relationship of both activities to market insurance, and many authors have subjected these results to various model changes (for an overview, see Courbage et al. 2013), see, e.g. Dionne and Eeckhoudt (1985) and Hiebert (1989). Most aforementioned models use an Expected Utility (EU) framework and consider only two states (i.e. a loss occurs = "bad" state or no loss occurs = "good" state). 1 Another model of behaviour under risk, namely Rank Dependent Expected Utility (RDEU), has been considered for the study of prevention, e.g. in Konrad and Skaperdas (1993), Bleichrodt and Eeckhoudt (2006), Etner and Jeleva (2013). Courbage (2001) considered the relationships between market insurance, self-insurance, and self-protection in the context of Yaari's Dual Theory.
Our work is conceptually most closely related to Bensalem et al. (2020), who model the interaction between insurer and insurance buyer as a so-called Stackelberg game (see, e.g. Osborne and Rubinstein 1994; Fudenberg and Tirole 1991), a setting recently used to describe the interaction between reinsurer(s) and insurer(s), e.g. in Bai et al. (2022), Chen and Shen (2018), Chen et al. (2020), and Cheung et al. (2019). 2 Recently, some authors have also studied equilibria in sequential optimisation games in an insurance-reinsurance setting, see, e.g. Boonen and Ghossoub (2022), Boonen et al. (2021) and Boonen and Zhang (2022). Let us also mention that in the cyber insurance domain, some works employ different game-theoretic approaches including the insurer and insured as parties, sometimes additionally featuring malicious third parties (cyber attackers), see, e.g. Zhang et al. (2017) and Yin et al. (2021). One aspect of the usual (principal-agent) problem between an insurer (acting as principal) and an insurance buyer (responding as agent) is the problem of moral hazard, i.e. the fact that the (risk reduction) actions of the agent are unobservable to the principal (see, e.g. Holmstrom 1979). This complicates matters, as static principal-agent problems involving moral hazard are typically hard to solve (see, e.g. Rogerson 1985; Jewitt 1988). Many of the above-mentioned works incorporate, or at least mention, the issue of asymmetric information in their studies, e.g. Liu and Zhu (2022), Boonen et al. (2021), and Zhang and Zhu (2021). 3 The popular framework of risk measures to model risk preferences of both the insurance buyer and insurer has recently been used by, e.g. Bensalem et al. (2020), Cheung et al. (2019), Boonen and Ghossoub (2022), and Balbás et al. (2011), mostly in an insurance-reinsurance context.
In the insurance context, an axiomatic characterisation of insurance prices as Choquet integrals (see Denneberg 2013) with respect to distorted probabilities was introduced in Wang et al. (1997) and studied further, e.g. in Bellini and Caperdoni (2007) and Wang (2000). 4 The first explicit connection of distortion risk measures and insurance pricing was made by introducing the proportional hazard transform (Wang 1995, 1996, 1998). Wang et al. (1997) described an axiomatic characterisation of insurance prices as Choquet integrals and Wang (2000) introduced another particular distortion in the general setting of Wang (1996), later called Wang transform, with the aim of connecting the pricing of insurance and financial risks. Finally, let us mention that many questions that arise from the practical usage (due to corresponding regulatory frameworks) of the value-at-risk (VaR) and average value-at-risk (AVaR) measures are subsequently studied for a more general class of distortion risk measures, e.g. backtesting methods [see, e.g. Christoffersen and Pelletier (2004) and Ziggel et al. (2014) for VaR, Emmer et al. (2015) and Kratz et al. (2018) for AVaR, and Bettels et al. (2022) for general distortion risk measures and an extensive overview of works on VaR and AVaR backtesting] or risk sharing [see, e.g. Galichon (2010) for VaR, Embrechts et al. (2018) for quantile-based risk measures (range value-at-risk), and Wang (2016), resp. Weber (2018), for more general (resp. VaR-type) distortion risk measures].
3 Indeed, in other insurance domains, if incentive programmes exist (e.g. discounts on health insurance for participating in fitness regimes), they often give rise to moral-hazard issues, i.e. the insurer needs to ensure that the insured actually complies with the agreed-upon level of effort. In the cyber context, however, moral hazard does not seem to be a major concern for two reasons: first, due to the novelty and dynamics of cyber risk and the high complexity of technical systems, it is likely that neither of the parties (insurer and insured) has a full understanding of the underlying risk, i.e. the main problem is a lack of information for both parties rather than information being withheld. Due to the necessity for up-to-date technical expertise, insurers collaborate with specialised IT service providers to assess and monitor the insured risks and recommend or employ risk mitigation measures. Thus, in our framework, we assume both risk transfer and risk reduction are offered through the insurer (principal), i.e. risk reduction services are part of the insurance contract and therefore their uptake (ex-ante) and upholding (ex-post) are observable to the insurer. Second, as e.g. reputational risk from cyber events or losses from threats classified as war actions are not fully insurable but substantial risks in practice, the insured has an intrinsic motivation to mitigate such risks, even if an insurance policy to transfer other financial losses is in place.
4 Such distortion risk measures result from the properties of law-invariant, coherent risk measures if the property of sub-additivity for all random variables is replaced by additivity for comonotone random variables (see, e.g. Föllmer and Schied (2016) and Dhaene et al. (2012) for a detailed exposition and Dhaene et al. (2006, 2011) for a general review on (distortion) risk measures and their relation to comonotonicity). The sub-class of distortion risk measures with concave distortion functions used in this study can furthermore be shown to be coherent (see Wirch and Hardy 1999), i.e. it is a sub-class of law-invariant, coherent risk measures.

Contribution
This paper extends the landscape of previous studies on the combination of risk reduction and risk transfer by bestowing the insurer with a more central role, namely controlling the cost of both risk transfer and risk mitigation. This relates to the real-world situation in cyber insurance, where insurers have started to endow insurance policies (risk transfer) with so-called cyber-assistance services (risk mitigation). We consider a monopolistic, profit-maximising, risk-averse or risk-neutral insurer using a concave distortion risk measure and study separately the cases of cyber-assistance services relating to the concepts of self-protection and self-insurance. 5 The interaction between the insurer and the insurance buyer(s), 6 who are risk averse and also use a concave distortion risk measure, is modelled as a Stackelberg game, where the "inner" optimisation problem corresponds to the insurance buyer's response to a given price structure by the insurer and the "outer" optimisation problem corresponds to the insurer's problem of determining prices for (cyber) risk transfer and (cyber) assistance services. In particular, we derive the following insights:
• The "The insurer's problem: single-contract case" section addresses the insurer's problem in the single-contract case, studying in which cases an insurer is incentivised to encourage risk reduction in her policyholders by sharing the cost of risk reduction measures. We find that under the above assumptions, the insurer would never share the cost of risk reduction in a single-contract, pure self-protection scenario (Theorem 1 and case study in section A.5 in the electronic supplementary information). This does not hold in a single-contract, pure self-insurance scenario, where the optimal share of risk mitigation cost the insurer chooses to bear may depend e.g. on the parameters of the loss size distribution and both parties' risk aversions (Remark 11 and case study in section A.6 of the electronic supplementary information).
• The "The insurer's problem: portfolio viewpoint" section extends the insurer's study of the pure self-protection scenario from a single-contract view to bivariate examples of insurance buyers facing dependent cyber losses under dependence mechanisms relevant for cyber (loss propagation, common events). We demonstrate that the finding from the single-contract case does not carry over, i.e. already for these small toy portfolios, the insurer may have an incentive to subsidise risk mitigation in some policyholders. The study is extended to an example of a larger (N ≥ 2) portfolio in section A.7.3 in the electronic supplementary information, illustrating the increasing importance of taking a portfolio viewpoint for dependent risks.
• The "Solution to the insurance buyer's problem" section addresses the insurance buyer's solution to his problem of choosing an optimal combination of insurance and risk mitigation for a given price structure by the insurer (Corollary 3) and deduces the potentially complementary nature of the two activities (Corollary 4).
5 While both types of services can have intertwined effects and relate to gaining information via risk assessment services, the issues of moral hazard / asymmetric information and the prospect of gaining additional information are excluded from the mathematical analysis in the main part of this paper. A discussion of how to potentially address the effect of risk assessment services is provided in section A.1 in the electronic supplementary information.
6 We consider a single buyer during the first part of the paper and extend this to examples of two (resp. N ≥ 2) buyers with dependent cyber-loss occurrences in the "The insurer's problem: portfolio viewpoint" section (resp. section A.7.3 in the electronic supplementary information).
In summary, the contribution offers threefold insights, regarding the viewpoints of insurers, (prospective) insurance buyers, and the general (cyber insurance) market. For insurers, the study of the insurer's bivariate optimisation problem offers a first guidance to the optimal pricing of insurance policies including risk mitigation services (under specific assumptions). For insurance buyers, it is also invaluable to better understand how different contracts would be optimally priced by an insurer, in particular that the price structure a prospective policyholder is offered (and the included incentive for risk reduction) may not only depend on his own characteristics, but on the insurer's existing portfolio and the (assumed prospective) dependence between losses. 7 The study of the insurance buyer's problem on the optimal combination of risk transfer and risk mitigation is not conceptually new, but its detailed consideration offers valuable insights. Next to naturally providing guidance on the recommended course of action for insurance buyers, it may serve to theoretically explain the insurance gap observed in the cyber insurance market (see, e.g. Shetty et al. 2018), an offer-demand mismatch caused by the fact that potential buyers often look for insurance against extreme cyber events and tend to perceive asked prices of such coverage as excessive, while insurers seek to limit their liabilities from unprecedented cyber losses either by limiting coverage or by charging heavy risk premiums. One way to mitigate this mismatch, where no premium acceptable to both parties can be found for the original risk, is to equip insurance policies with (potentially subsidised) risk reduction services which help to alter the risk in a way that allows the insurer to reduce premiums and offer desired coverage at an acceptable (from the buyer's viewpoint) premium.
The remainder of this paper is structured as follows: in the "Model set-up and assumptions" section, the model assumptions and set-up are explained; in the "Solution to the insurance buyer's problem" and "The insurer's problem: single-contract case" sections the insurance buyer's and insurer's optimisation problems, respectively, are studied in the single-contract setting; the "The insurer's problem: portfolio viewpoint" section addresses the insurer's problem in simple portfolio settings with dependent losses. The "Conclusion" section summarises and outlines future research opportunities.

Risk mitigation services in cyber insurance (cyber assistance)
We first consider a model involving one profit-maximising, risk-averse insurer ('she') and one risk-averse (insurance) buyer ('he'). Before detailing the model setup and the mechanics of the sequential optimisation game, we give some compelling arguments for considering risk mitigation services in conjunction with cyber insurance policies and subsume types of risk mitigation services into three categories:
(R1) Reduction of loss probability after initial risk assessment: Insurers often work with specialised IT service providers (SP) who help them to thoroughly classify a prospective client's IT security. After the effort of such an assessment is invested, the SP and the assessed company share a common understanding of the company's IT security standpoint and potential need for action. Given that the risk is deemed insurable, a joint offer by SP and insurer to the company is in everyone's interest: the company receives insurance protection and high-quality IT security maintenance services as a joint package without the necessity of extra effort to ensure complying with the insurer's requirements, which is especially relevant for small companies. The insurer does not forfeit the upfront investment for risk assessment and has certainty about the maintenance and potential improvement of the IT security according to the SP's assessment. The SP has certainty about the company's willingness to comply with recommendations in order not to jeopardise insurance coverage, and about insurance coverage with a trusted "counterparty" who will not doubt their work in case a cyber event still occurs. 8
(R2) Reduction of loss magnitude in a cyber event: Among the insured's obligations within a typical cyber insurance contract is the immediate notification of the insurer in case of a (suspected) cyber event. This allows the insurer to supply immediate technical and legal support in order to mitigate economic losses. Naturally, it is in both the company's and insurer's interest for these experts to already have a good understanding of the company's IT security landscape and to be available immediately, both of which can be guaranteed by including these services (to be performed by a service provider collaborating with the insurer) in an insurance contract.
(R3) Use of insurer's knowledge about current cyber-loss landscape: While many businesses dedicate their attention to describing current cyber-threat trends, insurers have invaluable knowledge about economic losses currently suffered by their portfolio of clients. Companies are usually obliged by contract to notify their cyber insurer about cyber events, while naturally being reluctant to voluntarily share this information publicly or with external parties (e.g. researchers) in order to avoid reputational damage. Therefore, insurers have an information advantage regarding current threats and their common causes (e.g. a new trend in phishing mails or a vulnerability in a software used by companies of a specific industry sector) and can make use of this extra knowledge to warn other policyholders who are particularly prone to similar threats and vulnerabilities (e.g. all policyholders from the same industry sector or all using some vulnerable software). The benefit of doing so is reducing the probability of additional cyber losses from the same cause in their portfolio. This is especially relevant for large companies with sophisticated IT security (who may already work with external SPs) which might not find it necessary to additionally take advantage of (R1) and (R2) as part of insurance coverage. For the insurer, this type of mitigation helps to reduce the impact of systemic events and, thus, accumulation risk in the portfolio.
Remark 1 (Link between theoretical and marketed types of risk reduction service) The types of service currently offered on the cyber insurance market and suggested above map quite naturally onto the concepts of self-protection and self-insurance:
(R1) Describes pre-incident services which are self-protection activities. Examples are network security, back-up of critical systems and data, anti-malware tools, identity and access management, IT security consulting, employee awareness measures, patch management, and mobile device management (Munich Re 2021).
(R2) Describes post-incident services which are self-insurance activities, such as restoration of data, 24h help hotlines, forensic post-breach services, legal advice, and consulting in case of extortion (Munich Re 2021).
(R3) Describes a type of self-protection activity not yet advertised on the market, as contracts are typically viewed stand-alone. However, using the insurer's portfolio knowledge to install such warning mechanisms would be an important way to use dependencies (and information) between risks to the insurer's and insureds' advantage.
Of course, the above categorisation simplifies reality regarding several points: pre- and post-incident services are usually not offered disjointly, but as a complete "cyber assistance" service package, and each service activity within the above categories can have beneficial effects on both cyber-loss probability and severity. For example, anti-malware tools not only serve their primary purpose, i.e. to deter malware from entering the system (preventing a cyber incident completely), but as a side effect (in case malware circumvents the protection) may help to identify the source of a cyber incident more efficiently and reduce the time until system functionality is restored (reducing the economic impact of an occurred cyber incident). Nevertheless, from a mathematical viewpoint, it is convenient (and in line with previous academic work) to study the two concepts separately and therefore it is helpful to keep in mind the types of "real-world cyber assistance activities" they relate to. 9 One aspect of cyber assistance which is purposely omitted here is risk-assessment services (see section A.1 in the electronic supplementary information). This includes, e.g. extensive IT audits conducted by an IT service provider collaborating with the insurer to analyse a company's IT security provisions, to identify vulnerabilities, and to provide recommended courses of action.

Model prerequisites
Following the framework of Bensalem et al. (2020), we assume that over a given policy year, the buyer faces a random loss represented by a non-negative random variable (r.v.) $X$ from a family of distributions $F_s$ indexed by a parameter $s \in [0, \infty)$. 10 For $X \sim F_s$, we denote the corresponding survival function by $\bar{F}_{X,s}(x) = \mathbb{P}_s(X > x)$, $x \in \mathbb{R}$, and its generalised inverse, the tail quantile function, by
$$q_{X,s}(u) = \inf\{x \in \mathbb{R} : \bar{F}_{X,s}(x) \le u\}, \quad u \in (0, 1).$$
To formalise the relationship between the parameter $s$ and the distributions $F_s$, we assume a decreasing order in the sense of first-order stochastic dominance ($\le_{FSD}$), i.e. for any $0 \le s_1 < s_2 < \infty$ and $X_1 \sim F_{s_1}$, $X_2 \sim F_{s_2}$ it holds that $X_2 \le_{FSD} X_1$. This is equivalent (see Müller and Stoyan (2002), Theorem 1.2.8) to assuming
$$\mathbb{E}_{s_2}[f(X)] \le \mathbb{E}_{s_1}[f(X)]$$
for any non-decreasing 11 function $f: \mathbb{R} \to \mathbb{R}$ for which both expectations exist. We furthermore assume that $\mathbb{E}_s[X] > 0$ for all $s \in [0, \infty)$, meaning that no risk reduction can ever completely eliminate the possibility of a positive loss.
The decreasing order in the sense of FSD of $F_s$ implies that
$$q_{X,s_2}(u) \le q_{X,s_1}(u), \quad \forall u \in (0, 1), \; 0 \le s_1 < s_2 < \infty. \tag{A1}$$
This means that increasing $s$ alters the risk $X$ in such a way that for any probability level, the minimum loss amount that is exceeded by $X$ with this probability does not increase.
Assumption 1 (Convexity of tail quantile in s). Furthermore, we assume that
$$s \mapsto q_{X,s}(u) \text{ is convex on } [0, \infty) \text{ for all } u \in (0, 1). \tag{A2}$$
10 The parameter $s$ denotes the amount of risk mitigation service, whose categories were detailed above.
11 Throughout, we use the term non-decreasing for a real-valued function that fulfils $\forall x, y: x < y \implies f(x) \le f(y)$ and increasing if the order in the implication is strict. The terms non-increasing and decreasing are used analogously.
This assumption can be interpreted as a decrease in marginal effect of service, i.e. the impact per unit of s on the risk X in the sense of (A1) does not increase as the baseline level of s increases, which is a very natural economic assumption.
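To make the two requirements on the loss family concrete, the following minimal sketch checks them numerically for one hypothetical example: an exponentially distributed loss whose mean shrinks as mu0 / (1 + s) with the service level s. Both the exponential family and this particular mean function are assumptions chosen purely for illustration and are not taken from the model above.

```python
import math

def mean_loss(s, mu0=1.0):
    """Hypothetical mean of an exponential loss after buying s units of service."""
    return mu0 / (1.0 + s)

def tail_quantile(u, s, mu0=1.0):
    """Tail quantile: smallest x with P_s(X > x) <= u for an exponential loss."""
    return -mean_loss(s, mu0) * math.log(u)

def is_nonincreasing_and_convex(values, tol=1e-12):
    """Check equally spaced samples via first and second finite differences."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return all(d <= tol for d in first) and all(d >= -tol for d in second)

service_levels = [0.25 * k for k in range(21)]   # s = 0, 0.25, ..., 5
probability_levels = [0.01, 0.1, 0.5, 0.9]

# For each probability level u, test the map s -> q_{X,s}(u) on the grid.
checks = {
    u: is_nonincreasing_and_convex([tail_quantile(u, s) for s in service_levels])
    for u in probability_levels
}
print(checks)
```

In this example each additional unit of service lowers every tail quantile, but by less and less, which is the decreasing marginal effect described above.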
We assume that both parties evaluate risk by using law-invariant, coherent risk measures, whose properties are recalled in section A.2 of the electronic supplementary information. An important class of risk measures are so-called distortion risk measures (see Wang et al. 1997), defined for a real-valued r.v. $X$ as the usual Choquet integral that simplifies for non-negative $X$ to
$$\rho(X) = \int_0^\infty \psi\big(\bar{F}_X(x)\big)\,\mathrm{d}x = \int_0^1 q_X(u)\,\mathrm{d}\psi(u), \tag{1}$$
where $\psi: [0, 1] \to [0, 1]$ is a distortion function 12 and $q_X(u)$, $u \in (0, 1)$, is the tail quantile function. From Eq. (1), one can directly see that the distortion risk measure for a.s. non-negative losses represents a distorted expectation of $X$.
Assumption 2 (Concavity of distortion function). Concavity of the distortion function is a natural economic assumption. As it corresponds to assigning a higher weight to small probability events, it describes risk aversion of the decision maker, a standard assumption and indeed a prerequisite for the existence of insurance. Therefore, we will restrict our analysis to distortion risk measures with concave distortion, a class of coherent, law-invariant risk measures. 13
Remark 2 [Distortion risk measures and stochastic dominance, e.g. Dhaene et al. (2006)] Any distortion risk measure preserves first-order stochastic dominance, i.e. for any a.s. non-negative r.v. $X_1$, $X_2$, it holds that $X_1 \le_{FSD} X_2 \implies \rho(X_1) \le \rho(X_2)$. Table 1 lists some commonly used distortion risk measures and their corresponding distortion functions. In the case studies of our later analysis, we focus on the proportional hazard transform.
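As a small numerical illustration of a distortion risk measure, the sketch below integrates the distorted survival function of a non-negative loss under the proportional hazard distortion u ↦ u^r. The exponential loss, the parameter values, the integration cut-off, and all function names are assumptions made for this example only. For an exponential loss with mean μ the integral has the closed form μ / r, which the approximation should reproduce.

```python
import math

def ph_distortion(u, r):
    """Proportional hazard distortion u**r, concave for r in (0, 1]."""
    return u ** r

def distortion_risk_measure(survival, distortion, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of distortion(survival(x)) on [0, upper]."""
    h = upper / steps
    return h * sum(distortion(survival((i + 0.5) * h)) for i in range(steps))

def exponential_survival(mean):
    """Survival function x -> P(X > x) of an exponential loss with the given mean."""
    return lambda x: math.exp(-x / mean)

r = 0.5  # hypothetical risk-aversion parameter; smaller r means stronger distortion

# Distorted valuation of a loss with mean 1 (closed form: 1 / r).
rho = distortion_risk_measure(exponential_survival(1.0), lambda u: ph_distortion(u, r), upper=200.0)
# With r = 1 the distortion is the identity and the plain expectation is recovered.
expectation = distortion_risk_measure(exponential_survival(1.0), lambda u: ph_distortion(u, 1.0), upper=200.0)
# A loss that is smaller in the FSD sense (mean 0.5) receives a smaller valuation.
rho_reduced = distortion_risk_measure(exponential_survival(0.5), lambda u: ph_distortion(u, r), upper=200.0)
print(rho, expectation, rho_reduced)
```

The gap between `rho` and `expectation` can be read as the risk loading a decision maker with this distortion attaches to the loss.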

Example 1
The above assumptions on the risk measures and loss distributions [in particular (A2)] are convenient insofar as they imply that the map $s \mapsto \rho_s(X)$ (and, as a special case, $s \mapsto \mathbb{E}_s[X]$) is non-increasing and convex [see Bensalem et al. (2020) and section A.3 in the electronic supplementary information].
13 By the properties of the Choquet integral (see Denneberg 2013), any distortion risk measure fulfils 1., 2., 4., and 5. in Definition 1 (Section A.2 of the electronic supplementary information) and additionally 3. if the distortion function is concave (and the underlying probability space has no atoms), see, e.g. Wirch and Hardy (1999).
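The way the risk measure inherits its dependence on the service level from the tail quantile can be sketched in a few lines. The notation is assumed here for illustration: $\rho_s$ denotes the distortion risk measure of the loss under $F_s$ and $\psi$ the (non-decreasing) distortion function. The first inequality uses the FSD ordering of the family, the second the convexity of the tail quantile in $s$.

```latex
\begin{align*}
\rho_s(X) &= \int_0^1 q_{X,s}(u)\,\mathrm{d}\psi(u), \\
\rho_{s_2}(X) - \rho_{s_1}(X)
  &= \int_0^1 \bigl(q_{X,s_2}(u) - q_{X,s_1}(u)\bigr)\,\mathrm{d}\psi(u) \le 0,
  \qquad 0 \le s_1 < s_2, \\
\rho_{\lambda s_1 + (1-\lambda) s_2}(X)
  &\le \int_0^1 \bigl(\lambda\, q_{X,s_1}(u) + (1-\lambda)\, q_{X,s_2}(u)\bigr)\,\mathrm{d}\psi(u)
  = \lambda\,\rho_{s_1}(X) + (1-\lambda)\,\rho_{s_2}(X),
  \qquad \lambda \in [0, 1].
\end{align*}
```

Choosing $\psi(u) = u$ gives the same two statements for the expected loss.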

Interaction between cyber-insurance buyer and insurer
We now describe how the interaction between insurance buyer and insurer in the case of a cyber insurance contract is modelled as a Stackelberg game, i.e. a sequential optimisation game between two parties. One party (the leader) moves first by choosing her strategy, and the other party (the follower) moves second by choosing his strategy depending on the selected strategy of the leader. Both parties seek to maximise a gain or utility function or, equivalently, to minimise a loss function. For a general introduction to Stackelberg games, see Fudenberg and Tirole (1991) and Osborne and Rubinstein (1994). A common tool to solve a Stackelberg game is backward induction (see Fudenberg and Tirole 1991), i.e. first solving the follower's problem for any possible choice of the leader's strategy and then, knowing all the follower's responses, solving the leader's problem. The search for a solution (and its existence) therefore depends on the specific formulation of both problems, which we now detail in our case.
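The two steps of backward induction can be sketched generically on finite action sets. The function name, the action grids, and the toy loss functions below are assumptions made purely for illustration and are not the model of this paper.

```python
def solve_stackelberg(leader_actions, follower_actions, leader_loss, follower_loss):
    """Backward induction on finite action sets; both parties minimise a loss."""
    best = None
    for a in leader_actions:
        # Step 1: the follower's best response to the leader's action a.
        b_star = min(follower_actions, key=lambda b: follower_loss(a, b))
        # Step 2: the leader evaluates her loss at the anticipated response.
        value = leader_loss(a, b_star)
        if best is None or value < best[2]:
            best = (a, b_star, value)
    return best


# Toy game (hypothetical): the follower wants b close to 4 - a,
# the leader earns a * b and therefore minimises -a * b.
leader_choice, follower_choice, leader_value = solve_stackelberg(
    leader_actions=range(5),
    follower_actions=range(5),
    leader_loss=lambda a, b: -a * b,
    follower_loss=lambda a, b: (b - (4 - a)) ** 2,
)
```

In the toy game the follower answers a with b = 4 - a, so the leader effectively minimises -a(4 - a) and picks a = 2.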

Common (correct) knowledge of initial loss distribution
The prospective insurance buyer approaches the insurer to inquire about offered prices for cyber insurance policies (in person or by entering data into an online calculation system). In order to receive price quotes, he needs to provide information that allows the insurer (with the help of an IT service provider) to classify his risk profile given his characteristics (e.g. industry sector, company size, IT security measures). We assume he provides the information truthfully and to the best of his knowledge, such that buyer and insurer have a common, unambiguous view of the original loss distribution, denoted F_0. 14 The real-world uncertainty of either party's knowledge of the unknown initial loss distribution is not studied here. Naturally, the question of accurate cyber-risk assessment has gained increased practical importance and expresses itself, e.g. in the increasing number of service providers in this domain, see, e.g. Bosch CyberCompare as an example or Advisen for a market overview. For a seminal discussion of cyber-risk assessment services and a proposal of how to approach them mathematically, see section A.1 in the electronic supplementary information.

Table 1 (fragment): Beta DRM (Wirch and Hardy 2000); Proportional Hazard (PH) transform RM (Wang 1995), distortion function u^r, concave: yes, r ∈ (0, 1], special case of Beta DRM.

14 F_0 denotes the loss distribution of the buyer given his initial characteristics, including his existing IT security measures. The subscript 0 indicates that no additional services to reduce the risk have yet been acquired following the initial risk assessment. As the initial IT security level (and other characteristics) vary between prospective buyers, the initial risk assessment yields inhomogeneous F_0. Note that for some companies, the risk assessment as part of the insurance take-up process may be the first comprehensive analysis of the cybersecurity level of their organization. While not every inquiry about insurance prices leads to the closure of a cyber insurance contract, the process may serve as a wake-up call for the acquisition of (additional) risk reduction measures within or without an insurance policy.

Price quotes by the insurer
Given the buyer's original risk X ∼ F_0, the insurer offers price quotes Π for a range of contracts, where each offered contract is characterised by the included level of risk mitigation service s ∈ [0, ∞). 15 Assume that the price of entering a contract with service level s ∈ [0, ∞) is given by
\[ \Pi(s) = (1+\theta)\,\mathbb{E}_s[X] + \gamma\, c(s), \]
where the first term represents the risk premium according to the expected value principle with loading θ and the second term denotes the service premium. Here, we assume that providing service at level s ∈ [0, ∞) requires a monetary cost of c(s) for the insurer, of which a proportion γ ∈ [γ̲, 1], γ̲ > 0, 16 is charged to the insured; the remaining proportion (1 − γ) can thus be regarded as a subsidy by the insurer to incentivise risk reduction. Analogously to (A1) and (A2), s ↦ c(s) is assumed to be increasing, strictly convex, and continuous with c(0) = 0 and lim_{s→∞} c(s) = ∞. The cost incurred by the insurer can be understood, e.g. as the internal cost charged by the IT service provider for providing pre- or post-incident services (i.e. (R1) and (R2)) or the administrative cost of monitoring and evaluating loss data to warn policyholders about imminent threats (i.e. (R3)). Thus, the insurer's task is to choose a combination (θ, γ) ∈ [0, ∞) × [γ̲, 1], which then defines price quotes for all feasible contracts.

Choice of a contract by the buyer (or opt-out)
Given a family of prices Π(s) for all feasible contracts, the buyer selects a contract by choosing a proportional insurance share α ∈ {0, 1} (to opt into full insurance, α = 1, or to not buy insurance, α = 0) and the amount of risk mitigation service s ∈ [0, ∞). We assume that the purchase of (additional) service at any level s is also feasible outside of an insurance contract, but at a higher cost γ_o c(s) with γ_o > 1. This can be understood as the cost of buying service directly through an IT service provider (without a discount offered for insurance customers) or from the insurer herself at a mark-up. 17 In summary, given the prices for all feasible contracts as offered by the insurer, the insurance buyer's problem consists of choosing (α, s) ∈ {0, 1} × [0, ∞). We detail in Remark 4 how the insurance buyer's choice encapsulates three classical ways of dealing with risk (acceptance, reduction, transfer), see, e.g. Marotta et al. (2017).

15 One might argue that s should rather be chosen from a discrete set {s_1, …, s_n}, n ∈ ℕ (a potentially interesting combinatorial optimisation problem), representing all feasible combinations of service packages offered by the insurer. This is reasonable and we regard this as a mathematically different version of the problem whose analysis is not the present focus.

16 As γ does not depend on s (a potential generalisation for future studies), we do not allow the insurer to give away service for free, as otherwise the cost of service γ c(s) would not increase with its amount, which is unnatural.

Solution by backward induction
To find both parties' optimal solution, we use backward induction (see, e.g. Osborne and Rubinstein 1994) by first finding the buyer's optimal response (α*, s*) to any insurer's choice of (θ, γ) and second, given all optimal buyer's responses, finding the insurer's optimal choice (θ*(α*, s*), γ*(α*, s*)). In order to formulate and solve the game, we state below the loss functions of buyer and insurer, respectively.

Remark 3
We highlight some similarities and distinctions between the present work and the study of Bensalem et al. (2020), whose framework was our inspiration: as indicated above, the choice of risk measures and the ordering of loss distributions follows Bensalem et al. (2020) and from the insurance buyer's point of view, the risk reduction service s fulfils a very similar role to the effort considered in Bensalem et al. (2020), yielding related optimisation problems for the buyer within the Stackelberg game. In the present study, however, the insurer's role is more central, as she controls the cost of risk mitigation service within an insurance contract (via the share of administrative cost charged to the insured). This implies that the insurer has to solve a two-dimensional problem (choosing a combination of risk premium and service premium optimally), and circumvents the moral-hazard problem that often occurs in studies on prevention and insurance. As in the present setting the risk mitigation service is offered through the insurer, the challenge of ensuring that the buyer actually complies with the agreed-upon optimal level of risk reduction (according to which insurance is priced) does not arise. Furthermore, we extend the study of the interaction with one insurance buyer to toy examples of interactions with a portfolio of dependent buyers, a particularly relevant issue in the cyber context.

Formalisation of the Stackelberg game
We now combine the assumptions of the above sections to formulate the optimisation problems of both parties within a Stackelberg game. For the reader's convenience, all parameters and functions appearing within the optimisation problems are summarized in Tables 2 and 3. The insurance buyer's objective is to minimise a coherent and law-invariant risk measure ρ_1 associated to his total position including insurance, while the insurer's objective is to minimise, given the buyer's optimal response, another coherent and law-invariant risk measure ρ_0 associated to her (negative) total loss.

Table 2 (fragment): Service without insurance is more expensive than service combined with insurance.

Table 3 Overview of functions in problems (BP) and (IP)

where we have used that both risk measures are cash-additive and positively homogeneous. It is obvious that the insurer's loss depends on (θ, γ) directly as well as via the buyer's optimal response, denoted (α*(θ, γ), s*(θ, γ)).
Remark 4 (Interpretation of insurance buyer's choice) The buyer's options correspond to three classical ways of dealing with risk:
• Risk acceptance: A choice α = 0, s = 0 is equivalent to opting out of buying insurance or services and just retaining and accepting the original risk.
• Pure risk transfer: A choice α = 1, s = 0, meaning that the buyer opts for fully insuring the original risk.
• Pure risk reduction: A choice α = 0, s > 0, i.e. the buyer opts out of risk transfer but chooses to reduce the original retained risk by purchasing risk reduction services (from the insurer outside of a policy or from a service provider directly). 18
• Combination of risk transfer and risk reduction: A choice α = 1, s > 0 yields L_1(1, s) = (1 + θ) 𝔼_s[X] + γ c(s) and means that the buyer chooses an insurance policy with risk mitigation services included, i.e. opts for insuring a reduced risk.
Remark 5 (Buyer's and insurer's optimal attainable loss)
• Note that as the insurance buyer starts out by facing the non-negative random loss X, by assumption L_1(α*, s*) > 0, i.e. the insurance buyer can never completely eliminate his risk or even make a profit.
• On the contrary, we naturally assume that the insurer only offers a contract if it is profitable, i.e. only if she can obtain a negative loss L_0(θ*, γ*) < 0. Otherwise, she would refrain from offering a contract by refusing to quote a price.

18 If one does not want to allow the interpretation that such contracts are offered by the insurer outside of an insurance policy (e.g. due to legal restrictions), the insurer's loss function should be formulated in a way that makes these contracts unprofitable (e.g. as done here by restricting γ ∈ [γ̲, 1]). If one wants to allow such contracts (one could argue that such a contract could be closed in the cyber domain with a client that has other contracts with the same insurer), a choice of γ > 1 would allow the insurer to sell her services at a mark-up (one could argue that this might be profitable for an insurer who has the appropriate infrastructure in place anyway for the rest of her portfolio). In our analysis, we stick to the interpretation that these outside service contracts are offered by third parties, i.e. service providers, and their price is externally given and higher than any within-insurance price (i.e. γ_o > 1, see above).

Solution to the insurance buyer's problem
As the analysis of (BP) is an extension of Bensalem et al. (2020), this section focuses on the additions to their analysis originating from the new formulation of (IP) and the interpretation of all results in the cyber insurance context. Derivations and proofs are outlined in section A.3 of the electronic supplementary information. First, one determines the set of values of s such that full insurance is demanded (i.e. α*(s) = 1, denoted I) and its complement (no insurance is demanded, α*(s) = 0, N ∶= I^c).
Note that for fixed s, the choice α* ∈ {0, 1} depends only on the sign of the expression in the last bracket of (BP), such that it follows:
\[ \alpha^*(s) = \mathbb{1}\{G_\gamma(s) \ge 1+\theta\}, \qquad G_\gamma(s) := \frac{\rho_{1,s}(X)}{\mathbb{E}_s[X]} + (\gamma_o - \gamma)\,\frac{c(s)}{\mathbb{E}_s[X]}. \]
On the sets I and N, the buyer's loss function is a sum of convex functions:
\[ L^{\theta,\gamma}_{1,I}(s) = (1+\theta)\,\mathbb{E}_s[X] + \gamma\, c(s), \qquad L_{1,N}(s) = \rho_{1,s}(X) + \gamma_o\, c(s). \]
Therefore, one considers (BP) separately on I and N and compares the resulting local minima to obtain a global minimum. To this end, one first needs to study I and N for given (θ, γ), i.e. the behaviour of s ↦ G_γ(s) with respect to the threshold (1 + θ). We know that by assumption and Lemma 1 (see section A.3 in the electronic supplementary information), s ↦ G_γ(s) is continuous and its second summand is non-negative and increasing. 19 In this study, we consider two cases:
• Self-protection: In a self-protection scenario (Ehrlich and Becker 1972), i.e. if service only affects the probability of a loss, the map s ↦ ρ_{1,s}(X) / 𝔼_s[X] is monotone non-decreasing (see Bensalem et al. 2020, Lemma 3.2, and section A.3 in the electronic supplementary information). Economically, this means that increased risk reduction has a larger impact on (reducing) the price of insurance than on (reducing) the risk. 20 Mathematically, this implies increasingness of the entire map s ↦ G_γ(s), meaning that G_γ(s) could intersect (for given θ and γ) the threshold (1 + θ) at most once, making I and N straightforward to determine. This setting will be considered in the following.
• Special case of self-insurance: Bensalem et al. (2020) argue that in a scenario of self-insurance, i.e. in the present context if service only affects the severity of a cyber loss, for some standard loss distributions (e.g. Pareto, Weibull, or Log-Normal), s ↦ ρ_{1,s}(X) / 𝔼_s[X] is monotone non-increasing. This does not lead to a straightforward expression of I and N, as monotonicity of s ↦ G_γ(s) is not implied and there is a priori no limit for the number of times it crosses a given threshold (1 + θ) for s ∈ [0, ∞), such that no general results for this case can be stated. In section A.6 in the electronic supplementary information, we study the particular case of a Pareto-distributed loss whose severity is affected by risk reduction service. Here, under mild assumptions, G_γ(s) turns out to be strictly convex (with lim_{s→∞} G_γ(s) = ∞), yielding only one additional case compared to the self-protection case, namely G_γ(s) intersecting the level (1 + θ) exactly twice.

19 This follows immediately as by assumption 𝔼_s[X] > 0, γ_o − γ > 0, and s ↦ c(s) is non-negative and increasing, while by Lemma 1, s ↦ 𝔼_s[X] is non-negative and non-increasing.

20 This can be seen even more clearly by rewriting Equation (A3) in terms of elasticity with respect to s [as used in economics for e.g. the price-elasticity of demand, see e.g. Parkin et al. (2002)] for 0 < s_1 < s_2 < ∞, yielding that the expectation is more elastic with respect to service than the risk measure.
As outlined above, we now consider a scenario of self-protection (Ehrlich and Becker 1972), i.e. an a.s. non-negative loss X which stems from a family of zero-inflated distributions of the form
\[ F_s(x) = \big(1 - p(s)\big) + p(s)\,F_Y(x), \qquad x \ge 0, \tag{4} \]
where s ↦ p(s) ∈ [0, 1] is decreasing and F_Y is the c.d.f. of an a.s. positive r.v. Y. This means that a positive loss with c.d.f. F_Y (which could describe a single loss or be a compound distribution describing a cumulative loss) occurs with a probability that can be lowered by purchasing services while the severity distribution remains untouched, relating to (R1) and (R3) above. Ansatz (4) only assumes s ↦ p(s) to be decreasing (which is natural, as increased service should decrease the loss probability). A standard economic assumption (e.g. Courbage et al. 2013) is that s ↦ p(s) is convex (decreasing marginal impact); this alone does not necessarily imply (A2). Therefore, we assume another sufficient condition to ensure convexity of s ↦ ρ_s(X) for distributions of the form (4), namely that both the objective loss probabilities p(s) and the subjective loss probabilities ψ(p(s)) are decreasing in a convex way (see Bensalem et al. 2020, Lemma 3.3, and section A.3 in the electronic supplementary information).
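The buyer's decision in the self-protection setting can be sketched numerically by minimising his loss with and without insurance on a grid and comparing the two minima. Everything concrete below is an assumption for illustration: a loss of deterministic size L occurring with probability p(s) = p0·exp(−s), cost c(s) = s², the PH distortion ψ(u) = u^r1 for the buyer, the symbol names theta (loading), gamma (charged share of service cost) and gamma_o (outside cost factor), and all numerical values.

```python
import math


def buyer_best_response(theta, gamma, gamma_o=1.5, L=10.0, p0=0.2, r1=0.5,
                        s_max=5.0, n=500):
    """Return (alpha, s, loss): alpha = 1 for full insurance, 0 for no insurance."""
    p = lambda s: p0 * math.exp(-s)   # loss probability, decreasing and log-convex
    c = lambda s: s ** 2              # service cost, strictly convex with c(0) = 0

    # With insurance: risk premium plus charged share of the service cost.
    loss_insured = lambda s: (1 + theta) * L * p(s) + gamma * c(s)
    # Without insurance: distorted expectation of the retained risk plus outside cost.
    loss_uninsured = lambda s: L * p(s) ** r1 + gamma_o * c(s)

    grid = [s_max * i / n for i in range(n + 1)]
    s_I = min(grid, key=loss_insured)
    s_N = min(grid, key=loss_uninsured)
    if loss_insured(s_I) <= loss_uninsured(s_N):
        return 1, s_I, loss_insured(s_I)
    return 0, s_N, loss_uninsured(s_N)


cheap_contract = buyer_best_response(theta=0.1, gamma=1.0)
expensive_contract = buyer_best_response(theta=50.0, gamma=1.0)
```

In this toy parametrisation the buyer accepts the contract with the low loading and opts out when the loading is very high, in which case he buys service from the outside provider instead.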

Example 2
As ψ is concave, s ↦ p(s) must be "sufficiently" convex for the concatenation to be convex; e.g. for the common choice of distortion function ψ(u) = u^r, r ∈ (0, 1], a sufficient condition for the convexity of ψ(p(s)) = p(s)^r would be for s ↦ p(s) to be logarithmically convex (see section A.5 in the electronic supplementary information).
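The sufficiency of logarithmic convexity can be seen in two lines. The following is a short sketch of the standard argument, not a reproduction of section A.5; the exponential family with rate λ is a hypothetical example.

```latex
% Assume p > 0 and s \mapsto \log p(s) convex (log-convexity), r \in (0, 1].
\[
  p(s)^r = \exp\big(r \log p(s)\big),
\]
% r \log p is convex because r > 0, and \exp is convex and increasing,
% so the composition s \mapsto p(s)^r is convex.
%
% Illustration with p(s) = p_0 e^{-\lambda s}, \lambda > 0 (log-linear, hence log-convex):
\[
  \frac{\mathrm{d}^2}{\mathrm{d}s^2}\, p(s)^r
  = (r\lambda)^2\, p_0^{\,r}\, e^{-r\lambda s} \;\ge\; 0 .
\]
```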
Increasingness of s ↦ G (s) for any ∈ [ , 1] in the self-protection case allows a convenient expression of the sets I and N . In the latter case, both maps ↦ s B ( , ) and ↦ s B ( , ) are increasing.
Remark 6 (Interpretation of Corollary 1) Case (1) states that if the loading is lower than a given constant level 0 , the buyer would purchase insurance already for the original risk (at s = 0 ) and therefore at any level s (recall that increasing s reduces the price more than the risk). Case (2), illustrated in Figure 1, corresponds to a situation where the loading is too high for the buyer to insure the original risk, but by adding a service level of at least s B ( , ) (which depends on as well as its relative cost ), an insurance contract with loading becomes acceptable for the buyer.
This directly relates to the insurance gap on the cyber insurance market: for the pure risk transfer ( s = 0 ) policies offered with loading , it may not be acceptable for the buyer to insure the original risk at the price the insurer demands. To make an insurance contract possible, either would have to be lowered to at most a level 0 (move from case (2) to case (1)) or risk reduction services equivalent to a level s B would have to be offered as part of the policy (in case (2), enable a move from N to I).
Lastly, it is intuitive that if the risk premium or service premium increase, the with-insurance solution becomes relatively more expensive for the buyer, and the interval corresponding to N (resp. I ) becomes larger (resp. smaller).
To solve the buyer's problem, first note that L 1,N (s) , resp. L , 1,I (s) , each admit a unique global minimiser on [0, ∞) , denoted s N resp. s I ( , ). In the latter case, the following hold: (i) For any ∈ [ , 1] , the map ↦ s I ( , ) is increasing.
Remark 7 (Interpretation of Corollary 2) Part 1.: As the loading increases, the set N (no insurance) expands, i.e. the boundary s B ( , ) increases (shift to the right in Fig. 1). The value N ( ) is the smallest loading such that the global minimiser of L 1,N (s) lies in N Part 2.: For fixed service cost , as increases, it becomes relatively more expensive to transfer risk, which makes it economically rational to reduce the to-beinsured risk by increasing service. Vice versa, for fixed risk loading , as increases, and thus, service becomes relatively more expensive, it is economically rational to decrease the purchased amount of service.  Remark 8 (Interpretation of Corollary 3) For any choice of , there is a maximum loading R ( ) the insurance buyer is willing to accept: if it is not exceeded, he subscribes to full insurance with service level s I ( , ) ; else, he refrains from purchasing insurance and buys service at level s N from an outside provider. The maximum acceptable loading decreases as the share of service cost increases, which is intuitive as the buyer accepts the contract if his total loss with insurance does not exceed his (fixed) total loss without insurance.
The relationship between risk loading and service demand is summarised in Corollary 4.   have in many cases concluded that given the availability of cyber insurance, individuals' willingness to invest in self-protection decreases and it is, thus, generally not possible to design insurance as a means to reach socially optimal levels of investment. Corollary 4 emphasises the much more optimistic perspective that in case of self-protection, the existence of insurance can indeed lead to higher optimal levels of risk reduction at least for individual policyholders. While we do not consider negative externalities of interdependent security investments, it is reasonable to postulate that by subscribing to insurance with a high service level, policyholders inadvertently benefit other agents in their network, e.g. by reducing the risk of cyberattacks being propagated through their systems or by providing loss data the insurer can use to warn other policyholders.
Furthermore, Corollary 4 allows another understanding of the cyber insurance gap: as the optimal service demand within insurance can be higher than without insurance, for a given combination ( , ) that an insurer demands in practice, if the service that can be offered is limited (e.g. due to technical constraints or due to limited contracts between insurers and service providers), the optimal within-insurance service level may not be attainable and the company may prefer the no-insurance solution. A way to close (or narrow) the gap would be to either decrease the premium or to increase the amount of available service within an insurance policy to make s I ( , ) attainable.
Having found the insurance buyer's optimal response to any combination ( , , o ) , we address the insurer's problem of choosing ( , ) to minimise her loss over all optimal responses of the buyer.

The insurer's problem: single-contract case
Given the results of Corollary 3, (IP) reduces to a minimisation over a compact set:
\[ \min_{\gamma \in [\underline{\gamma}, 1],\ \theta \in [0, \theta_R(\gamma)]} L_0(\theta, \gamma), \qquad L_0(\theta, \gamma) = \rho_{0, s_I(\theta,\gamma)}(X) - (1+\theta)\,\mathbb{E}_{s_I(\theta,\gamma)}[X] + (1-\gamma)\, c\big(s_I(\theta,\gamma)\big), \tag{5} \]
assuming that the obtainable objective value of (5) is negative. This corresponds to a choice (θ, γ) yielding full risk transfer with service level s_I(θ, γ) ≥ 0 as the buyer's optimal response. In case the insurer could not obtain a negative objective value in (5), she abstains from offering risk transfer by choosing θ > θ_R(γ) in (IP). In this case, the buyer's optimal response is (α*, s*) = (0, s_N(γ_o)), i.e. to buy service at level s_N(γ_o) outside an insurance policy. 21 Note that the special case γ = 1, where the insurance buyer carries the full cost of self-protection, has already been studied previously, the difference here being that the self-protection measures can be obtained cheaper within an insurance contract, increasing the maximum risk premium chargeable by the insurer.
We now state that in the self-protection case, choosing γ = 1 is also a solution to the more general problem (5). The steps leading to this result are outlined subsequently; proofs are postponed to section A.4 in the electronic supplementary information.
Theorem 1 (Solution of (5) in the self-protection case) Let the assumptions of Lemma 2 (self-protection, see section A.3 in the electronic supplementary information) hold. Then, a solution (θ*, γ*) to the minimisation problem (5) lies in the compact set {(θ, 1) ∶ θ ∈ [0, θ_R(1)]}. This means that in the self-protection case, i.e. if service only affects the loss probability, it is always optimal for the insurer to shift the full service cost to the insured.
Example 3 (Zero-inflated Pareto loss) The solution to (5) cannot be characterised further without more structure. Details for the special case of a zero-inflated Pareto-distributed loss are given in section A.5 of the electronic supplementary information. In this case, the insurer's loss can be shown to be monotone in θ for γ = 1, yielding the solution θ* = θ_R(1) (see Bensalem et al. 2020). Combining this with Theorem 1 means that for a Pareto-distributed loss whose occurrence probability can be lowered by risk reduction services, an optimal solution for the insurer is given by shifting the full cost of service to the insured and charging the maximum acceptable loading, i.e. (θ*, γ*) = (θ_R(1), 1).

21 As mentioned above, one could theoretically allow the insurer to offer "service-only" contracts by solving
\[ \min_{\gamma \le \gamma_o}\ (1-\gamma)\, c\big(s_N(\gamma)\big), \tag{6} \]
which certainly yields a non-positive objective value. It might be feasible to assume that the insurer would be able to offer such services cheaper than other market participants, as she might have certain service infrastructures (contracts with IT experts, warning mechanisms) in place already for her insurance clients. One might also assume that the insurer has initially solved this problem, thus determining γ_o, and the upper bound in (6) is the next-cheapest outside option. Under no circumstance would we find it realistic to allow the insurer to simultaneously compare (negative) objective values of (5) and (6) and choose the lower one. In other words, for a prospective buyer where risk transfer is profitable, the insurer should not check whether it could be more profitable to offer only services and then choose a solution that discourages the buyer from buying risk transfer.
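A brute-force version of the insurer's problem can be sketched as a grid search over loadings and cost shares, with the buyer's best response recomputed for every candidate pair. The setting is a hypothetical toy model rather than the Pareto case of section A.5: deterministic loss size L, loss probability p(s) = P0·exp(−s), cost c(s) = s², PH distortions with exponents R0 (insurer) and R1 (buyer), and an insurer's loss of zero when no contract is closed. All names and values are assumptions.

```python
import math

L, P0, R0, R1, GAMMA_O = 10.0, 0.2, 0.9, 0.5, 1.5
GRID = [5.0 * i / 500 for i in range(501)]  # candidate service levels s


def p(s):
    return P0 * math.exp(-s)  # loss probability under service level s


def c(s):
    return s ** 2  # cost of providing service level s


def buyer_response(theta, gamma):
    """Buyer's choice (alpha, s) given loading theta and charged cost share gamma."""
    s_I = min(GRID, key=lambda s: (1 + theta) * L * p(s) + gamma * c(s))
    s_N = min(GRID, key=lambda s: L * p(s) ** R1 + GAMMA_O * c(s))
    insured = (1 + theta) * L * p(s_I) + gamma * c(s_I)
    uninsured = L * p(s_N) ** R1 + GAMMA_O * c(s_N)
    return (1, s_I) if insured <= uninsured else (0, s_N)


def insurer_loss(theta, gamma):
    """Risk measure of the insured loss minus risk premium plus subsidised cost."""
    alpha, s = buyer_response(theta, gamma)
    if alpha == 0:
        return 0.0  # assumption: no contract, no loss and no profit
    return L * p(s) ** R0 - (1 + theta) * L * p(s) + (1 - gamma) * c(s)


def solve_insurer(thetas, gammas):
    """Return (loss, theta, gamma) minimising the insurer's loss on the grid."""
    return min((insurer_loss(t, g), t, g) for t in thetas for g in gammas)


best_loss, best_theta, best_gamma = solve_insurer(
    thetas=[0.1 * i for i in range(51)],
    gammas=[0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
)
```

Inspecting best_gamma gives a rough numerical impression of the statement of Theorem 1 in this toy setting. Because both grids are coarse, the search is an illustration only and not a verification of the theorem.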
Remark 10 Theorem 1 does not make a statement about uniqueness of the solution, as uniqueness only holds whenever the maximum attainable loading θ_R(γ) is larger than the minimum loading θ_I(γ) that makes pure risk transfer undesirable to the insured compared to a combination of risk reduction and risk transfer (i.e. leads to a solution s_I(θ, γ) > 0, see the proof of Corollary 2). This holds true under quite general assumptions on the function s ↦ c(s), e.g. if its right-side derivative at 0 vanishes, i.e. c′(s)|_{s=0+} = 0.
We use the (implicit) definition of the maximum feasible loading for any share of service cost R ( ) from the proof of Corollary 3, given as which is well-defined for any ∈ [ , 1] , as the map ↦ L , 1,I (s I ( , )) is increasing with L 0, 1,I (s I (0, )) < L 1,N (s N ) . Furthermore, it is shown that for any ≥ 0 (resp. > I ( ) ), the map ↦ L , 1,I (s I ) is non-decreasing (increasing) such that ↦ R ( ) is non-increasing (decreasing). By denoting ∶= R (1) and ̄∶ = R ( ) , it holds L , I (s I ( , )) < L N (s N ) for any ∈ [0,̄] , such that one can likewise define for any such the constant denoting the maximum feasible share of service cost such that the contract is accepted for a given loading. The map ↦ M ( ) is by definition non-increasing on ∈ [0,̄] . As a corollary of Lemma 2, we deduce that for ≥ 0 fixed, the insurer's loss is monotone in the share of service cost .

Proposition 1 (Monotonicity of insurer's loss in γ) Under the conditions of Lemma 2 (self-protection) and under the necessary condition of profitability for the insurer, i.e. if L_0(θ, γ) < 0, the map γ ↦ L_0(θ, γ) is monotone non-increasing for any θ ≥ 0.
Proposition 1 states that for any (fixed) loading , an optimal solution for the insurer is to choose the maximum possible service cost M ( ) acceptable to the buyer, or equivalently that the insurer has no incentive to subsidise risk reduction through a rebate on services. This implies that an optimal solution to problem (5)  higher risk loading while offering a contract the buyer will accept. The following proposition states that the insurer's loss on this set is monotone in , leading to the statement of Theorem 1.
Proposition 2 (Monotonicity of insurer's loss in γ with maximum feasible risk premium) Under the conditions of Lemma 2 (self-protection), the map γ ↦ L_0(θ_R(γ), γ) is non-increasing.
Remark 11 (Self-insurance) A central property leading to the above results for the self-protection case is non-decreasingness of s ↦ ρ_s(X) / 𝔼_s[X]. In case of self-insurance, this assumption does not necessarily hold; indeed, for some standard loss distributions (e.g. Pareto, Weibull, or Log-Normal), the converse holds true, i.e. s ↦ ρ_s(X) / 𝔼_s[X] is non-increasing (see Bensalem et al. 2020). In section A.6 in the electronic supplementary information, we study the particular case of a Pareto-distributed loss whose severity is affected by risk reduction service. We find that in this self-insurance case, the insurer can indeed have an incentive to subsidise service cost (i.e. offer contracts with γ* < 1), where the optimally subsidised share (1 − γ*) increases with the insurer's risk aversion. In particular, if the risk aversions of insurer and insurance buyer are similar (i.e. r_0 ↘ r_1 for the PH transform risk measure), a mutually acceptable contract may only exist if the cost is shared (0 < γ < 1). This further implies that the insurer's optimal solution, i.e. the price structure the insurance buyer is offered, may depend on his choice of risk measure, even if the initial risk assessment is equivalent.
So far, we scrutinised the interaction between the insurer and a single insurance buyer as an isolated problem. This is often reasonable, as in practice insurers usually price individual risks on a stand alone basis without taking into account the existing portfolio. However, the failure of the independence assumption between risks is one of the central challenges in cyber insurance, as cyber incidents at different firms can be dependent, e.g. due to common underlying vulnerabilities (e.g. Böhme et al. 2018;Zeller and Scherer 2022) or due to propagation for worm-type viruses. Therefore, one could argue that rather than finding price structures ( , ) by considering problem (5) separately for each customer, the insurer should jointly optimise the risk measure for the entire portfolio against the sum of all premiums received (note that distortion risk measures are in most situations not additive for non-comonotonic risks).
In the "The insurer's problem: portfolio viewpoint" section, we illustrate that already for portfolios of two dependent losses, the results of Theorem 1 do not necessarily hold anymore, i.e. when optimising from a portfolio viewpoint, indeed the insurer can have an incentive to subsidise self-protection measures for some policyholders.

The insurer's problem: portfolio viewpoint
In the self-protection case, a central property is that for any single contract in a portfolio of n policyholders with risks X i , i ∈ {1, … , n} , for any feasible loading i , i ∈ {1, … , n} , the reduction in price for increased service outweighs the reduction in the insurer's risk measure 0,s i (X i ), i ∈ {1, … , n} for each single risk, i.e. However, ordering of the relevant sensitivities is not necessarily preserved in a portfolio context, i.e. when adding a new policyholder to an existing portfolio, the reduction of the overall portfolio risk measure 0,s (X) may outweigh the price reduction of the additional contract, i.e. for some i ∈ {1, … , n}: where s ∶= (s 1 , … , s n ) and X = ∑ n i=1 X i is the aggregated loss. This may imply a situation where the insurer has an economic incentive to subsidise risk reduction for some policyholders in the self-protection case, as we will now analyse in a toy example of two policyholders with dependence mechanisms representative for cyber risk: (directed) loss propagation, common cyber events, and copula approaches. While these bivariate examples will already be sufficient to work out the structural difference to the univariate case, we provide one exemplary extension to a general multivariate setting in section A.7.3 of the electronic supplementary information.

(Directed) loss propagation
A popular way of modelling dependencies between cyber losses is to consider a model of epidemic spreading in an underlying network, i.e. a directed or undirected graph whose nodes are interpreted as companies (or machines) and whose edges are interpreted as connections between these companies (or machines) through which a state of "infectiousness" can be passed on. These models, often originating from mathematical biology, have been extensively studied in the cyber context over the last few years, see, e.g. Fahrenwaldt et al. (2018), Xu and Hua (2019), Xu et al. (2015) or the surveys Marotta et al. (2017) and Awiszus et al. (2022).
Interpretations of such models are worm-type viruses spreading between connected machines or a state of business interruption propagating through a supply chain.
Example 4 (Bivariate model with one directed edge) For illustration purposes, we consider a portfolio of two firms with one directed edge between them and we understand the "infected" state as a loss occurrence, i.e. assume a loss occurrence in firm 1 can cause a loss in firm 2 with probability q ∈ [0, 1] , but not vice versa. 22 If a loss occurs, the loss sizes are deterministic; w.l.o.g. 0 < L 1 ≤ L 2 < ∞ . We assume that the events of the occurrence of a loss in firm 1, its propagation, and the occurrence of a non-propagated loss in firm 2 are independent. This implies that, depending on the chosen service levels s i , i ∈ {1, 2} , the loss r.v.s X i , i ∈ {1, 2} , take the values where s ↦ p i (s) are continuous, non-increasing functions with lim s→∞ p i (s) > 0 for i ∈ {1, 2} . Let X ∶= X 1 + X 2 denote the portfolio loss, such that the insurer's portfolio risk measure, using (u) = u r 0 , r 0 ∈ (0, 1] , is given by (see section A.7.1 in the electronic supplementary information): where the dependence on s i , i ∈ {1, 2} , is suppressed for notational convenience and s ∶= (s 1 , s 2 ). Figure 3 illustrates that (7) may hold in the above example, which indicates that the insurer can have a financial incentive to subsidise service.
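The portfolio loss in the two-firm propagation model has at most four atoms, so the insurer's PH-transform risk measure can be evaluated directly. The sketch below follows the independence assumptions stated in the example; the function names and the numerical probabilities p1, p2 and q are hypothetical, while L_1 = 5, L_2 = 10 and r_0 = 0.8 are taken from the caption of Fig. 3.

```python
def ph_risk_measure(atoms, r):
    """PH-transform risk measure of a non-negative discrete loss.

    atoms is a list of (value, probability) pairs; the distortion is psi(u) = u**r.
    """
    rho, prev, tail = 0.0, 0.0, 1.0  # tail = P(X >= current atom)
    for x, prob in sorted(atoms):
        rho += (x - prev) * max(tail, 0.0) ** r
        prev = x
        tail -= prob
    return rho


def portfolio_atoms(p1, p2, q, L1=5.0, L2=10.0):
    """Distribution of X = X1 + X2 with directed propagation from firm 1 to firm 2."""
    # Given a loss at firm 1, firm 2 is hit by its own event or by propagation.
    p2_given_1 = p2 + (1 - p2) * q
    return [
        (0.0, (1 - p1) * (1 - p2)),
        (L1, p1 * (1 - p2_given_1)),
        (L2, (1 - p1) * p2),
        (L1 + L2, p1 * p2_given_1),
    ]


# Hypothetical loss probabilities at fixed service levels.
with_propagation = portfolio_atoms(p1=0.2, p2=0.1, q=0.5)
without_propagation = portfolio_atoms(p1=0.2, p2=0.1, q=0.0)

expected_portfolio_loss = ph_risk_measure(with_propagation, r=1.0)
rho_with = ph_risk_measure(with_propagation, r=0.8)
rho_without = ph_risk_measure(without_propagation, r=0.8)
```

Lowering p1 in this sketch reduces both the first firm's own loss probability and the propagated losses at the second firm, which is the mechanism behind the portfolio effect discussed in the text.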

Remark 13
By very similar calculations as for the Pareto case, one can show that for a loss of deterministic severity, $\theta \mapsto L_{0,i}(\theta, 1)$ is monotone non-increasing, such that the insurer's optimal solution to the minimisation problems (8), where the superscript 'ind' denotes individual contract pricing, is to shift the full cost of service to the buyers and charge the maximum feasible loading, respectively.
We now consider the insurer's optimisation problem from a portfolio viewpoint in a two-contract set-up, where, interestingly, one has to distinguish whether the contracts with the buyers are closed sequentially or simultaneously. Let us commence by assuming that the two contracts are closed sequentially and that firm 2 is insured first.
Example 5 (Interpretation of sequential contract closure) Sequential contract closure could be interpreted as a situation where for a prospective policyholder, a loss could be caused by an occurrence at another firm (e.g. a supplier) outside the insurer's portfolio, but insuring the other firm is not feasible (yet).

Fig. 3 Comparison of derivatives with respect to $s_1$ of single-contract and portfolio risk measures as well as the price of insurance (at a feasible loading $\theta_1 = 0.35$). Curves shown: derivative of $\rho_1(X_1)$, derivative of $\rho_0(X_1)$, derivative of $(1 + \theta_1)\,\mathbb{E}[X_1]$, derivative of $\rho_0(X)$. Note that Equation (7) holds: The decrease in price outweighs the decrease in both single-contract risk measures, but is outweighed by the reduction in the insurer's portfolio risk measure. The parameters for this example are chosen as $r_0 = 0.8$, $r_1 = r_2 = 0.3$, $L_1 = 5$, $L_2 = 10$, $p_1(s_1) = 1$ …

Remark 14 (Insurer's problem: sequential optimisation, first policy) The results for firm 2, being insured first, are analogous to the single-contract case: In her initial risk assessment, we assume that the insurer correctly assesses the loss probability (given service level $s_2$) as

$$\mathbb{P}_s(X_2 = L_2) = p_2(s_2) + (1 - p_2(s_2))\, q\, p_1(s_1), \tag{9}$$

which depends (due to loss propagation) on the unknown loss probability of firm 1.²³ For this study, we assume that firm 1 has not subscribed to insurance yet, but has solved the minimisation problem for the no-insurance case correctly, such that in Eq. (9) we set $s_1 = s_{N1}$. As remarked above, we know that the solution to the insurer's problem (8) for $i = 2$ is given by $(\theta_2^*, \alpha_2^*) = (\theta_{R,2}(1), 1)$ and given (9), we can proceed analogously to Sect. 3 to deduce firm 2's optimal service level $s_{N2}$ without insurance and $s_{I2}(\theta_2^*, \alpha_2^*)$ within insurance.
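The comparison described in the caption of Fig. 3 can be mimicked with finite differences. The sketch below is illustrative only: r0, r1, L1, L2 and the loading 0.35 follow the caption, whereas the propagation probability Q and the map p(s) = 1/(s + A) + B with A = 2, B = 0.1 (used for both firms) are assumptions made here, since the caption's specification of p_1 is not fully legible. The closed form for the portfolio risk measure is the one implied by the independence assumptions of Example 4.

```python
# r0, r1, L1, L2 and theta1 follow the caption of Fig. 3.
R0, R1 = 0.8, 0.3
L1, L2 = 5.0, 10.0
THETA1 = 0.35
# Assumptions for illustration: propagation probability and service map.
Q = 0.5
A, B = 2.0, 0.1


def p(s):
    """Assumed loss probability given service level s (non-increasing, limit B > 0)."""
    return 1.0 / (s + A) + B


def rho1_x1(s1, s2):
    """Buyer 1's risk measure of X1 (power distortion with exponent R1)."""
    return L1 * p(s1) ** R1


def rho0_x1(s1, s2):
    """Insurer's single-contract risk measure of X1."""
    return L1 * p(s1) ** R0


def price_x1(s1, s2):
    """Insurance price (1 + theta1) * E[X1], excluding any service cost."""
    return (1.0 + THETA1) * L1 * p(s1)


def rho0_x(s1, s2):
    """Insurer's portfolio risk measure of X = X1 + X2 (directed-edge model)."""
    p1, p2 = p(s1), p(s2)
    p_any = 1 - (1 - p1) * (1 - p2)
    p_firm2 = p2 + (1 - p2) * Q * p1
    p_both = p1 * (p2 + (1 - p2) * Q)
    return L1 * (p_any ** R0 + p_both ** R0) + (L2 - L1) * p_firm2 ** R0


def d_ds1(f, s1, s2, h=1e-6):
    """Central finite-difference derivative with respect to s1."""
    return (f(s1 + h, s2) - f(s1 - h, s2)) / (2 * h)


def derivatives(s1, s2):
    funcs = {"rho1_X1": rho1_x1, "rho0_X1": rho0_x1,
             "price": price_x1, "rho0_X": rho0_x}
    return {name: d_ds1(f, s1, s2) for name, f in funcs.items()}


if __name__ == "__main__":
    for name, value in derivatives(1.0, 1.0).items():
        print(f"{name:>8s}: {value:+.4f}")
```

For these assumed values at s1 = s2 = 1, the price decreases faster in s1 than either single-contract risk measure, but more slowly than the insurer's portfolio risk measure; other parameter choices need not reproduce this ordering.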
The striking observation is as follows: By incentivising a higher service level in a subsequent contract with firm 1, the insurer not only improves the to-be-insured risk in that contract, but also the already priced risk in the existing contract with firm 2, as the probability of a propagated loss decreases.²⁴

Remark 15 (Insurer's problem: sequential optimisation, second policy) If the insurer prices each contract as if the risks were independent (or the propagation potential is undetected), she would solve (8) for $i = 1$, yielding $(\theta_1^*, \alpha_1^*) = (\theta_{R,1}(1), 1)$. However, if she correctly takes the effect on the portfolio risk into account, to find $(\theta_1^*, \alpha_1^*)$ she instead considers the problem

$$\min_{(\theta_1, \alpha_1)} L^{\mathrm{seq}}_{0,1}(\theta_1, \alpha_1) = \rho_{0,\, s_{I1}(\theta_1, \alpha_1),\, s_{I2}(\theta_{R,2}(1), 1)}(X) - (1 + \theta_1)\, \mathbb{E}_{s_{I1}(\theta_1, \alpha_1)}[X_1] - (1 + \theta_2)\, \mathbb{E}_{s_{I2}(\theta_{R,2}(1), 1)}[X_2] + (1 - \alpha_1)\, c(s_{I1}(\theta_1, \alpha_1)) + (1 - \alpha_2)\, c(s_{I2}(\theta_{R,2}(1), 1)), \tag{10}$$

where the superscript 'seq' denotes sequential contract closure and $X = X_1 + X_2$.²⁵

Remark 16 Sequential contract closure in the reverse order can be studied analogously. It is, however, obvious from the set-up of directed loss propagation that the insurer has no additional incentive to subsidise service for firm 2, independently of whether firm 1 is part of the portfolio, i.e. this analysis would not yield different results from the single-contract case and is, thus, omitted.

We now assume that both contracts are priced simultaneously.

Example 6 (Interpretation of simultaneous contract closure) In practice, simultaneous contract closure could be interpreted as two firms jointly inquiring about insurance (e.g. companies along a supply chain or parent company and subsidiary) or the insurer approaching both before the first contract is closed.

In this case, the insurer's problem reads

$$\min_{(\theta_1, \alpha_1, \theta_2, \alpha_2) \in A} L^{\mathrm{sim}}_{0}(\theta_1, \alpha_1, \theta_2, \alpha_2) = \rho_{0,\, s_{I1}(\theta_1, \alpha_1),\, s_{I2}(\theta_1, \alpha_1, \theta_2, \alpha_2)}(X) - (1 + \theta_1)\, \mathbb{E}_{s_{I1}(\theta_1, \alpha_1)}[X_1] - (1 + \theta_2)\, \mathbb{E}_{s_{I2}(\theta_1, \alpha_1, \theta_2, \alpha_2)}[X_2] + (1 - \alpha_1)\, c(s_{I1}(\theta_1, \alpha_1)) + (1 - \alpha_2)\, c(s_{I2}(\theta_1, \alpha_1, \theta_2, \alpha_2)). \tag{11}$$

The results of numerically solving the above optimisation problems are given in Fig. 4 for the propagation probability $q \in [0, 1]$, which in this set-up governs the dependence between the risks.²⁶

²⁶ The calculation of the gradients, used in the numerical optimisation routine, is detailed in section A.7.1 of the electronic supplementary information.

Remark 18 (Interpretation of results for directed loss propagation)
• Panel 4(a) depicts the optimal pricing parameters $(\theta_1^*, \alpha_1^*)$ of the contract offered to firm 1 (the "source of propagation"). If the contract with firm 2 is priced first, the insurer may subsidise service (i.e. choose $\alpha_1^* < 1$) in the subsequent contract with firm 1, as this reduces the insured risk in contract 2 (without having to adjust the premium of firm 2). This subsidy $(1 - \alpha_1^*)$, as well as the loading $\theta_1^*$, increases with the dependence between the risks. The same effect occurs, but to a smaller extent, if the contracts are priced simultaneously: by subsidising service for firm 1, the insured risk in firm 2 is reduced, but this now has to be reflected in a decreased chargeable premium for that contract. Therefore, the incentive to subsidise service for firm 1 is smaller relative to the case where the price of contract 2 is fixed first.
• Panel 4(b) depicts the optimal parameters $(\theta_2^*, \alpha_2^*)$ of the contract offered to firm 2. As the service level of firm 2 has no additional effect on firm 1, the insurer's problem for firm 2 is always analogous to the single-contract case, and thus, service cost is never subsidised ($\alpha_2^* = 1$). However, the risk loading depends on the loss probability $\mathbb{P}_s(X_2 = L_2)$, which differs between the cases as it depends on $s_1^*$ and therefore on whether firm 1 is insured already (and under which parameters).
• Panel 4(c) depicts the insurer's optimally attainable negative loss (gain) $L_0(\theta_1^*, \alpha_1^*, \theta_2^*, \alpha_2^*)$, which decreases with increasing dependence between the risks, while the additional gain from pricing contracts "correctly", i.e. using the portfolio risk measure, increases with the dependence. Analogous observations hold for the insurer's portfolio risk, see Panel 4(d).

Cyber events at multiple 'targets'
Another way to understand dependence between cyber losses is to consider the presence of common (systemic) vulnerabilities which allow cyber threats to affect multiple companies simultaneously (see, e.g. Böhme et al. 2018; Zeller and Scherer 2022). Realistic examples for systemic events causing incidents in multiple firms are the accidental outage or the malicious exploitation of a vulnerability in commonly used software or operating systems, leading to, e.g. data breaches or fraudulent activity (e.g. ransomware claims).²⁷

Remark 19 (Buyer's vs. insurer's perspective on common events) In this setting, each company faces incidents from systemic events as well as idiosyncratic incidents occurring independently from other firms, e.g. the loss or theft of hardware or negligent employee behaviour leading to involuntary data disclosure or business interruption. From the viewpoint of each company (insurance buyer), both types of incidents are indistinguishable in the sense that they aggregate to one loss arrival process, i.e. the company simply monitors if a loss occurs (disregarding its source) without knowing (or caring) if others may be simultaneously affected. From the insurer's portfolio viewpoint, however, the two types of incidents are viewed differently: incidents from systemic events are particularly worrisome as they entail accumulation risk, whereas idiosyncratic incidents are "desirable" in the sense that they constitute (if correctly priced) the basis of the insurance business and can be "diversified away" in a large portfolio.
Example 7 (Bivariate model with common events) Consider as model for the risks $X_1$ and $X_2$:

$$X_1 = L_1 \mathbb{1}_{\{\min\{E_1, E_{12}\} \le T\}}, \qquad X_2 = L_2 \mathbb{1}_{\{\min\{E_2, E_{12}\} \le T\}},$$

with $E_1 \sim \mathrm{Exp}(\lambda_1)$, $E_2 \sim \mathrm{Exp}(\lambda_2)$, and $E_{12} \sim \mathrm{Exp}(\lambda_{12})$ independent with $\lambda_1, \lambda_2, \lambda_{12} \ge 0$, s.t. $\lambda_i + \lambda_{12} > 0$, $i \in \{1, 2\}$, and w.l.o.g. $0 < L_1 \le L_2 < \infty$. $E_1$ and $E_2$ model the arrival times of an idiosyncratic incident to firm 1 and 2, respectively, whereas $E_{12}$ models the arrival time of a common event causing simultaneous incidents in both firms, with deterministic loss sizes $L_1$ and $L_2$, respectively. Let $T$ denote the time horizon of the policy under consideration (w.l.o.g. $T = 1$ in what follows) and let

$$\lambda_{I} := \lambda_1 + \lambda_{12}, \qquad \lambda_{II} := \lambda_2 + \lambda_{12}$$

denote the overall marginal arrival rates of incidents to firms 1 and 2, respectively.²⁸ It follows that the buyers' risk measure and expected loss are given by

$$\rho_1(X_1) = L_1 (1 - e^{-\lambda_{I}})^{r_1}, \quad \mathbb{E}[X_1] = L_1 (1 - e^{-\lambda_{I}}), \qquad \rho_2(X_2) = L_2 (1 - e^{-\lambda_{II}})^{r_2}, \quad \mathbb{E}[X_2] = L_2 (1 - e^{-\lambda_{II}}),$$

while the insurer's portfolio risk measure is given by (see section A.7.2 in the electronic supplementary information)

$$\rho_0(X) = L_1 \left[ (1 - y_{00})^{r_0} + \big(1 - (y_{00} + y_{10} + y_{01})\big)^{r_0} \right] + (L_2 - L_1) \big(1 - (y_{00} + y_{10})\big)^{r_0},$$

where $y_{00} := e^{-(\lambda_1 + \lambda_2 + \lambda_{12})}$, $y_{10} := (1 - e^{-\lambda_1})\, e^{-(\lambda_2 + \lambda_{12})}$, $y_{01} := (1 - e^{-\lambda_2})\, e^{-(\lambda_1 + \lambda_{12})}$ are the probabilities of none (subscript 00) or exactly one (subscripts 10 and 01) of the companies experiencing a loss.²⁹

Remark 20 (Interpretation: Self-protection by prevention of systemic events) We now consider the effect of self-protection services, which can be divided into the categories described in Table 4. In the following, we scrutinise one possible type of effect we regard as particularly interesting in the cyber context, namely the prevention of systemic events: as the existence of common vulnerabilities (e.g. use of the same software) is regarded as the source of dependence between losses, it is first crucial for a cyber insurer to identify such common factors among policyholders and offer services which prevent the manifestation of a loss from a systemic event for the policyholder itself (e.g. timely patch management for standard software). Second, it is in the insurer's interest to use knowledge about an incident (or so-called near miss, i.e. a threat that did not lead to an incident due to adequate controls) at one insured company to immediately warn other policyholders about the imminent threat and, thus, hopefully increase the chance of averting a loss manifestation for them. Thus, the total portfolio loss in case of a systemic event could be reduced or, if all policyholders are warned on time, the manifestation of the systemic event could even be prevented.³⁰

Remark 21 (Insurer's problem: sequential optimisation, first policy) Assume again sequential contract closure, where w.l.o.g. the contract with firm 2 is closed first and its chosen service level affects the rate $\lambda_{II}$ via a decreasing map $s_2 \mapsto \lambda_{II}(s_2)$.³¹ Recall that by Lemma 3 (see section A.3 in the electronic supplementary information) a sufficient condition for convexity of the insurance buyer's optimisation problem is to choose the map $s_2 \mapsto \lambda_{II}(s_2)$ in such a way that the subjective loss probability $s_2 \mapsto \big(\mathbb{P}_s(X_2 = L_2)\big)^{r_2} = (1 - e^{-\lambda_{II}(s_2)})^{r_2}$ is convex. For simplicity, we choose analogously to above (however, for the rate, not the loss probability directly)

$$s_2 \mapsto \lambda_{II}(s_2) = \frac{1}{s_2 + a_2} + b_2,$$

with $a_2, b_2 > 0$ such that the above convexity condition is fulfilled.

In this sequential set-up, there is no distinction between idiosyncratic incidents and incidents from systemic events yet, as firm 1 is not yet part of the portfolio; in other words, the overall rate $\lambda_{II} = \lambda_2 + \lambda_{12}$ can be observed, but no distinction is made yet between $\lambda_2$ and $\lambda_{12}$.

³⁰ Our model implicitly equates incident arrival times (e.g. $Z_1 := \min\{E_1, E_{12}\}$) with loss occurrence times, which would not allow time for a warning mechanism as all losses occur instantly and simultaneously. In reality, however, the discovery and exploitation of the same vulnerability in different firms can be delayed over time, see again, e.g. the ProxyShell exploit case (Born 2021). As we do not take into account discounting over the policy year and therefore do not need to explicitly model a delayed loss occurrence time after the incident arrival time, we assume the warning mechanism to directly prevent the incident arrival.
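The common-event model of Example 7 is easy to simulate, which offers a quick plausibility check of the probabilities y_00, y_10, y_01 and of the portfolio risk measure built from them. The following is a minimal sketch: the rates 0.2, 0.3 and 0.1, the seed and the sample size are hypothetical values chosen for illustration, and the function names are not taken from the paper.

```python
import math
import random


def common_event_probs(lam1, lam2, lam12):
    """Probabilities of no loss (y00) and of exactly one loss (y10, y01) for T = 1."""
    y00 = math.exp(-(lam1 + lam2 + lam12))
    y10 = (1 - math.exp(-lam1)) * math.exp(-(lam2 + lam12))
    y01 = (1 - math.exp(-lam2)) * math.exp(-(lam1 + lam12))
    return y00, y10, y01


def portfolio_risk(lam1, lam2, lam12, L1, L2, r0):
    """Insurer's portfolio risk measure with power distortion u -> u**r0 (L1 <= L2)."""
    y00, y10, y01 = common_event_probs(lam1, lam2, lam12)
    return (L1 * ((1 - y00) ** r0 + (1 - (y00 + y10 + y01)) ** r0)
            + (L2 - L1) * (1 - (y00 + y10)) ** r0)


def simulate_frequencies(lam1, lam2, lam12, n, seed=0, T=1.0):
    """Empirical frequencies of the four loss patterns (firm 1, firm 2)."""
    rng = random.Random(seed)
    counts = {"00": 0, "10": 0, "01": 0, "11": 0}
    for _ in range(n):
        e1 = rng.expovariate(lam1)    # idiosyncratic incident, firm 1
        e2 = rng.expovariate(lam2)    # idiosyncratic incident, firm 2
        e12 = rng.expovariate(lam12)  # common (systemic) event
        hit1 = min(e1, e12) <= T
        hit2 = min(e2, e12) <= T
        counts[f"{int(hit1)}{int(hit2)}"] += 1
    return {pattern: c / n for pattern, c in counts.items()}


if __name__ == "__main__":
    lam1, lam2, lam12 = 0.2, 0.3, 0.1  # hypothetical rates
    print(common_event_probs(lam1, lam2, lam12))
    print(simulate_frequencies(lam1, lam2, lam12, n=100_000, seed=1))
    print(portfolio_risk(lam1, lam2, lam12, 5.0, 10.0, 0.8))
```

From the buyers' viewpoint only the marginal rates λ_1 + λ_12 and λ_2 + λ_12 matter, whereas the insurer's risk measure additionally depends on how the marginal rates split into idiosyncratic and common parts.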

Table 4 (excerpt, one row): Decreases; Decreases; Decreases. Service prevents the manifestation of a loss from a systemic event for the policyholder and allows the insurer to prevent a potential loss from the same source in other companies in the portfolio. An example is timely patch management for all common software where additionally all near misses are immediately reported to and analysed by the insurer (or her service provider), allowing them to identify current threats and warn other firms.

Böhme (2005) analyses the similar idea of premium discrimination between users of a dominant and an alternative platform (e.g. representing an operating system) to estimate the extent to which insurance premiums can motivate "ecosystem diversification" and counterbalance market processes that converge to a "monoculture" of installed systems.

Remark 22 (Insurer's problem: sequential optimisation, second policy) At subsequent contract offering to firm 1, we assume that the service level of firm 1 influences the rate $\lambda_{I}(s_1)$ via a decreasing map

$$s_1 \mapsto \lambda_{12}(s_1) = \frac{1}{s_1 + a_{12}} + b_{12},$$

with $a_{12}, b_{12} > 0$ such that the subjective loss probability $s_1 \mapsto \big(\mathbb{P}_s(X_1 = L_1)\big)^{r_1}$ is convex and it must hold $\lambda_{12}(s_1) \le \lambda_{II}(s_2^*)$ for any $s_1 \ge 0$. The marginal rates for both firms are then given by (now the incidents can be classified as idiosyncratic or systemic)

$$\lambda_{I}(s_1) = \lambda_1 + \lambda_{12}(s_1), \qquad \lambda_{II}(s_1, s_2^*) = \lambda_2(s_2^*) + \lambda_{12}(s_1),$$

for some constant $\lambda_1 > 0$, implying that the choice of $s_1$ affects the marginal distributions of both risks as well as the dependence between them, e.g. expressed by $s_1 \mapsto 1 - \lambda_1 / \lambda_{I}(s_1)$.³² Therefore, when offering a contract to firm 1, the insurer should again consider problem (10) to correctly take the dependence into account, as opposed to solving (8) for $i = 1$.

Remark 23 (Results for prevention of systemic events) Numerical results of solving (10) are given in Fig. 5 for varying degree of dependence between the two risks.³³ We observe that if the contract of firm 1 is priced using (10), it can be optimal for the insurer to choose $\alpha_1^* < 1$, leading to an increased risk loading, an increased optimal service level $s_{I1}$ within the insurance policy, a decreased loss probability for both policyholders, and an increased gain and decreased portfolio risk for the insurer. These effects increase with the dependence between the two risks.³⁴

³² Note that in this set-up, neither independence nor comonotonicity can be reached, as $b_{12} > 0$ and …

³³ The gradients used for the numerical optimisation are given in section A.7.2 in the electronic supplementary information. Due to the symmetrical set-up of the dependence, we do not consider the reverse order of contract closures.

³⁴ Note that contrary to the last example, the x-axis does not start at $\lambda_{12}(0) = 0$ representing (initial) independence, resulting in $\alpha_1^* < 1$ for the whole depicted range $\lambda_{12}(0) \in \{0.15, 2\}$.

Copula approaches
Copula approaches have become a widely popular method to assess and describe dependence between random variables, as they allow the decomposition of a multivariate distribution function (c.d.f.) $F$ of a random vector $(X_1, \ldots, X_d)$ into marginal c.d.f.s $F_1, \ldots, F_d$ and an object representing the dependence structure, called copula $C$, which itself is a multivariate c.d.f. with standardised uniform marginals (see section A.2 in the electronic supplementary information). In empirical research on cyber-risk modelling, one starts with observations of cyber losses that are conjectured not to be independent. As the main goal of many empirical studies is the description and analysis of the observed data, bottom-up approaches that seek to mimic the mechanism underlying the dependence between cyber losses may not yet be available for a statistical investigation. Rather, a top-down approach is often preferred (due to numerical tractability): the multivariate observations are analysed by fitting (parametrically or non-parametrically) univariate distributions to the marginals, choosing a flexible parametric copula family, and fitting its parameter(s) to the observed data.
In the cyber context, e.g. Eling and Jung (2018) study the cross-sectional dependence of data breach losses (cross-industry and cross-breach type) using a Gaussian copula, among others. Previously, Böhme and Kataria (2006) and Herath and Herath (2011) proposed models for cyber risk using the t-copula and the Archimedean copula family (Clayton and Gumbel), respectively. More recently, Peng et al. (2018) studied the multivariate dependence exhibited by real-world cyber attack data using a Copula-GARCH model with vine copulas.
Example 8 (Bivariate Gumbel copula) An example akin to the ones above would be, for the bivariate case, $(X_1, X_2) \sim F_s$ with

$$F_s(x_1, x_2) = C_{\vartheta(s)}\big(F_{1,s_1}(x_1), F_{2,s_2}(x_2)\big), \quad x_1, x_2 \in \mathbb{R},$$

$$C_{\vartheta(s)}(u, v) = \exp\left( - \left[ (-\ln(u))^{\vartheta(s)} + (-\ln(v))^{\vartheta(s)} \right]^{1/\vartheta(s)} \right), \quad \vartheta(s) \in [1, \infty),\ u, v \in [0, 1],$$

where $F_{i,s_i}$ are the marginal c.d.f.s of the single risks depending on the chosen service levels $s_i$ (for example, zero-inflated Pareto distributions as considered in Appendix 7.5 in the electronic supplementary information) and $C_{\vartheta(s)}(u, v)$ is the bivariate Gumbel copula (see Gumbel 1960), which seems a suitable choice in the cyber-risk context as it allows for capturing upper tail dependence and is the only member of the Archimedean family which is also an extreme-value copula.³⁵ The dependence is governed by the parameter $\vartheta(s) \in [1, \infty)$, ranging between the independence copula for $\vartheta(s) = 1$ and perfect positive dependence (i.e. converging to the comonotonicity copula) for $\vartheta(s) \to \infty$.³⁶

³⁵ Extreme-value copulas allow capturing the dependence structure between certain rare events; for details see, e.g. Mai and Scherer (2017). The necessity of dealing adequately with extreme events in the cyber context has been emphasised by many authors, e.g. the comprehensive data-driven analysis of cyber losses by Eling and Wirfs (2019) advocated for distinguishing between "cyber risks of daily life" and "extreme cyber risks".

³⁶ Note that generally, an Archimedean copula is not parametrised by a parameter $\vartheta$, but by the so-called (Archimedean) generator $\varphi = \varphi_\vartheta$, a non-increasing function $\varphi: [0, \infty) \to [0, 1]$ with $\varphi(0) = 1$ and $\lim_{x \to \infty} \varphi(x) = 0$. The Gumbel copula is obtained by using the parametric family $\varphi_\vartheta(x) = \exp(-x^{1/\vartheta})$; for brevity, we use the notation $C_\vartheta$ instead of $C_{\varphi_\vartheta}$.

Remark 24 (Effects of service on portfolio risk in the copula setting) Again, different assumptions about how the chosen service levels $s = (s_1, s_2)$ of insurance buyers influence the (joint) portfolio risk can be postulated:
• If service only influences the marginal distribution of the insured risk, i.e. via $s_i \mapsto F_{i,s_i}$, $i \in \{1, 2\}$, inducing a decreasing order in the sense of the "Model set-up and assumptions" section, the analysis does not differ from the univariate case. For examples in the cyber context, see the first row of Table 4.
• If service only affects the dependence between the risks via a map $s \mapsto \vartheta(s)$ (decreasing in some suitable (partial) ordering) without altering the marginals, it is obvious that no insurance buyer would have an economic incentive to purchase such service (compare the last case in Table 4) and another (interesting!) question would arise, namely, how much the insurer should optimally spend on giving away service (as a free addition to risk transfer) to favourably (in her risk measure) alter the dependence structure of her portfolio.
• If service affects both the marginal distribution(s) and the dependence structure, an example where both parties agree to share the cost of service could be constructed. For interpretations in the cyber context, compare the second and third row of Table 4.
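The copula construction of Example 8 can be sketched in a few lines. The code below evaluates the bivariate Gumbel copula, combines it with marginal c.d.f.s into a joint c.d.f., and reports the standard upper tail dependence coefficient 2 − 2^(1/ϑ) of the Gumbel family. The zero-inflated Pareto marginal uses a Lomax-type parametrisation (loss probability, shape, scale) that is an assumption made here for illustration and need not coincide with the parametrisation in the supplementary information; all function names and numerical values are hypothetical.

```python
import math


def gumbel_copula(u, v, theta):
    """Bivariate Gumbel copula C_theta(u, v) for theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    if u <= 0.0 or v <= 0.0:
        return 0.0  # copulas are grounded: C(0, v) = C(u, 0) = 0
    a = (-math.log(u)) ** theta
    b = (-math.log(v)) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))


def upper_tail_dependence(theta):
    """Upper tail dependence coefficient of the Gumbel copula."""
    return 2.0 - 2.0 ** (1.0 / theta)


def zero_inflated_pareto_cdf(x, p_loss, shape, scale):
    """Assumed marginal: no loss with probability 1 - p_loss, otherwise a
    Lomax (Pareto type II) distributed loss size."""
    if x < 0.0:
        return 0.0
    return (1.0 - p_loss) + p_loss * (1.0 - (1.0 + x / scale) ** (-shape))


def joint_cdf(x1, x2, theta, marg1, marg2):
    """Joint c.d.f. F(x1, x2) = C_theta(F_1(x1), F_2(x2))."""
    return gumbel_copula(marg1(x1), marg2(x2), theta)


if __name__ == "__main__":
    # Hypothetical marginals; in the paper's setting they would depend on s_1, s_2.
    f1 = lambda x: zero_inflated_pareto_cdf(x, p_loss=0.3, shape=2.0, scale=5.0)
    f2 = lambda x: zero_inflated_pareto_cdf(x, p_loss=0.2, shape=1.5, scale=10.0)
    for theta in (1.0, 2.0, 10.0):
        print(theta, joint_cdf(4.0, 8.0, theta, f1, f2), upper_tail_dependence(theta))
```

Setting theta = 1 recovers the product of the marginals (independence), while large values of theta push the joint c.d.f. towards the minimum of the two marginal c.d.f.s (comonotonicity), in line with the range described in Example 8.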
As remarked above, however, the main drawback of such a top-down modelling approach is that it is not based on an attempt to causally understand the dependence between cyber losses; instead, its merit is based on the analytical decomposition in Theorem 2 (see section A.2 in the electronic supplementary information) and its tractability in statistical inference. This is a somewhat questionable foundation in the cyber context due to scarcity, limited reliability, and suspected non-stationarity of available data, limiting the informativeness of models estimated on past data for the prediction of future losses. Therefore, we do not go into more detail on this example, but reiterate that in principle it provides the same flexibility regarding the effect of risk reduction services in insurance policies as the examples treated in detail above.

Conclusion
In recent years, with demand for cyber insurance increasing tremendously, cyber insurance markets around the world have been growing and the range of available cyber policies has been continuously expanding. As policies continue to mature, many prospective insurance buyers and external cyber experts agree that pure risk transfer cannot be an optimal cyber-risk management solution. Instead, companies, insured or not, have to make ongoing efforts to keep their cybersecurity measures up-to-date, given the evolving cyber-threat landscape. Therefore, there is mutual benefit (for all stakeholders) in the combination of risk transfer and risk reduction measures, leading to the (prospective) ubiquitous offering of pre-incident and post-incident services.
In this study, we have dealt with this combination of risk reduction and risk transfer in the cyber insurance context, and in particular addressed the question of how such risk reduction services should be optimally priced from an insurer's viewpoint. We have illustrated how common services within cyber insurance can be classified into the concepts of self-protection and self-insurance, and have argued how insurers should make use of their unique position regarding knowledge about the current cyber-loss landscape to offer additional pre-incident (warning) services to their policyholders.
We have shown that in the univariate case, i.e. when pricing a single contract alone, an insurer using a distortion risk measure with concave distortion (i.e. being risk-neutral or risk-averse) never has an economic incentive to subsidise pure self-protection services (i.e. only considering the effect on loss probability, factoring out potential cross-effects on loss sizes and the prospect of gaining additional information) and will, thus, always shift their full cost to the insurance buyer. Interestingly, this does not generally hold for the pricing of self-insurance services or when taking a multivariate (portfolio) viewpoint, in which case it can be optimal (and in some cases even mandatory to find an acceptable contract for both parties) to share the cost of risk reduction service between insurer and policyholder. We illustrate this finding using toy examples of two risks with dependence mechanisms representative of the cyber context and one exemplary extension to a larger multivariate setting.
From the insurance buyers' point of view, the study serves to illustrate how their initial risk (when approaching the insurer) and their choice of (distortion) risk measure as well as the existing portfolio of the insurer can influence the insurance price offered to them for different contracts (i.e. how much risk reduction is implicitly incentivised for them by the insurer's choice of price structure).
Some interesting aspects, however, remain for future research. We restricted the insurance buyer's options to full or no insurance (as is customary for primary insurance in the cyber context), but one could extend this to more general payout functions (e.g. proportional at any share in [0, 1] or excess-of-loss per risk at different priorities and limits).³⁷ Furthermore, we have mentioned that in the cyber context, part of the risk should be considered non-insurable (e.g. reputational risk), an aspect that could generalise the modelling of the insurance buyer's optimisation problem.
From the insurer's point of view, the pricing of self-protection and self-insurance services has been studied disjointly, whereas in practice, the combination of both types of services within a policy is customary. Furthermore, we have only illustrated the insurer's portfolio viewpoint in bivariate examples and an exchangeable extension. Fully exploring the question of optimal offering of cyber services using an insurer's more general multivariate viewpoint on a portfolio of dependent policyholders comprises many interesting questions for future work.
In addition, especially due to the potential for extreme cyber losses resulting from single large losses or accumulation risk from a large cyber event, many insurers work with reinsurance providers to limit their exposure and manage their portfolio risk. This opens the potential to analyse a suitable Stackelberg game between insurer and reinsurer(s) or even a set-up involving all three parties (insurance buyer(s), insurer, and reinsurer(s)). In this context, interesting questions about optimal risk sharing also arise.
Moreover, we have argued that the understanding of the dependence between cyber losses is crucial for insurers, as purely top-down dependence modelling approaches may not be suitable in the highly dynamic, non-stationary cyber domain. Therefore, more empirical research on the dependence structures underlying cyber risk, e.g. to more accurately determine underlying common factors leading to simultaneous exposure to a certain cyber event, is certainly necessary to better understand the evolving cyber-threat landscape. Lastly, it should be mentioned that many related questions arise from a viewpoint that is not purely mathematical. For example, economically and legally, it needs to be investigated how to ideally set up cyber insurance policies including services such that all parties (insurer, insureds, and IT security experts as service providers) draw synergies from the collaboration. From a technical viewpoint, one important issue is how to effectively quantify (and monitor) the IT security landscape of a potentially highly complex enterprise for actuarial applications. These issues emphasise the importance of interdisciplinary collaboration and research in the cyber insurance domain in order to tackle this challenging risk.

This article is complemented by an electronic supplement (Appendix) containing a seminal discussion of risk-assessment services, mathematical preliminaries, proofs, case studies and extended calculations.