Preference disaggregation method for value-based multi-decision sorting problems with a real-world application in nanotechnology

We consider a problem of multi-decision sorting subject to multiple criteria. In the newly formulated decision problem, besides performances on multiple criteria, alternatives get evaluations on multiple interrelated decision attributes involving preference-ordered classes. We propose a dedicated method for dealing with such a problem, incorporating a threshold-based value-driven sorting procedure. The Decision Maker (DM) is expected to holistically evaluate a subset of reference alternatives by indicating the quality or risk level on a pre-defined scale of each decision attribute. Based on these evaluations, we construct a set of interrelated preference models, one for each decision attribute, compatible with intra- and inter-decision constraints imposed by such indirect preference information. We also formulate a new way of dealing with potentially non-monotonic criteria by discovering local monotonicity changes in different performance scale regions. The marginal value functions for criteria with unknown monotonicity are represented as a sum of two value functions assuming opposing preference directions, one non-decreasing and the other non-increasing. This permits obtaining an aggregated marginal value function with an arbitrary non-monotonic shape. The practical usefulness of the approach is demonstrated on a case study concerning risk management related to handling nanomaterials (i.e., their production, use, manipulation, and processing) in different conditions. We analyze the expert judgments and discuss the inferred preference models, which can be applied to support health and safety managers in reducing the possible risk associated with the respective exposure scenario. © 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Multiple Criteria Decision Analysis (MCDA) concerns decision problems where a set of alternatives is evaluated on a family of criteria, which represent all relevant, heterogeneous viewpoints on the quality of alternatives [1,2]. Many such decision problems fall into the general category of classification, where the alternatives need to be assigned to distinct classes [3]. If the classes are completely ordered, one deals with ordinal classification or, equivalently, sorting problems [4]. They are considered, e.g., in the ABC analysis, which is a type of inventory categorization method where high-, mid-, and low-value alternatives need to be identified [5], in medical diagnosis, where high-risk patients need to be distinguished from the low-risk ones [6], in business failure prediction, where firms are sorted into healthy, uncertain, and close to bankruptcy [7], or in nanomanufacturing, where synthesis processes of nanomaterials can be sorted according to their greenness level [8].

* Correspondence to: Institute of Computing Science, Poznań University of Technology, Piotrowo 2, 60-965 Poznań, Poland.
Decision aiding sorting methods aim to provide recommendations to the Decision Maker (DM) regarding the assignment of alternatives to pre-defined and ordered classes [9]. There are various approaches serving this purpose, though they differ with respect to the underlying assumptions and characteristics of delivered results. In particular, various methods expect the DM to provide different types of preference information through a co-constructive elicitation process led by a decision analyst. On the one hand, some methods assume that the DM would directly specify values for a set of parameters of an assumed sorting model [10,11]. However, this is rather unrealistic because (s)he usually has difficulties with keeping consistency between the supplied values and the model output [4]. On the other hand, preference disaggregation procedures have been proposed to prevent such difficulties [12]. They aim at deriving compatible model parameters from the analysis of the DM's comprehensive judgments (assignment examples) concerning a subset of reference alternatives. This allows generalizing the DM's policy to an entire set of alternatives through the use of a suitably parametrized sorting model. Specifically, UTADIS disaggregates the DM's assignment examples into marginal value functions and class thresholds separating the consecutive decision classes on a scale of comprehensive value [13,14]. This idea was found appealing in fields as varied as finance [15], energy management [16], and stock portfolio analysis [17].
Motivated by the complexity of real-world sorting problems, UTADIS has been extended in various ways. First, different procedures for dealing with inconsistency of assignment examples with an assumed model have been proposed [14,18]. Second, a hierarchical classification approach, called MHDIS, has been introduced in [19]. Third, the Multiple Criteria Hierarchy Process has been adapted to UTADIS to allow handling preference information and deriving recommendations at both comprehensive and intermediate levels of the hierarchy of criteria [20]. Moreover, novel preference modeling procedures have been designed to admit the specification of desired class cardinalities, assignment-based pairwise comparisons [21], or valued assignment examples [22]. Furthermore, frameworks for robustness analysis have been proposed [4,23,24] to exploit infinitely many instances of the preference model (e.g., value functions and class thresholds) compatible with the DM's holistic decisions. While all approaches mentioned above considered a single DM, [25] introduced a group decision framework, called UTADIS^GMS-GROUP, investigating the spaces of agreement and disagreement between sorting recommendations obtained for different DMs. Some other recent methodological advancements of the UTADIS method concern the form of an employed value-based model. In this regard, an additive value function has been extended in [26] to account for positively and negatively interacting pairs of criteria. The other stream concerned dealing with the non-monotonicity of preferences on a per-criterion level. In particular, [27] defined a broad spectrum of non-monotonic shapes that could be considered along with the gain- and cost-type criteria. Moreover, [28] and [29] introduced models admitting non-monotonicity of marginal value functions while not restraining their complexity.
In turn, [29,30] and [31] minimized, respectively, the variation in the slope and the number of monotonicity changes in the shape of marginal value functions to ensure the most interpretable sorting model.

The contribution of this paper is three-fold. First, we introduce a new problem of multi-decision sorting in MCDA and propose a dedicated method for dealing with it. In this problem, each alternative is evaluated in terms of multiple decision attributes involving preference-ordered classes. We expect the DM to assign a subset of reference alternatives to classes of each decision attribute by indicating a quality or risk level on a scale pre-defined for all decision attributes. A similar setting has been considered in [32] and [33] in the context of credit rating problems. On the one hand, [32] adapted the UTADIS method to infer a single threshold-based value-driven sorting model compatible with the ratings provided by Moody's and Standard & Poor's, hence providing a precise recommendation based on potentially conflicting inputs for the same alternative. On the other hand, [33] used three credit rating agencies' recommendations to form an interval rating that was subsequently used as a potentially imprecise reference benchmark to be reproduced by the ELECTRE TRI-nC method [34].
The multi-decision sorting problem and dedicated approach introduced in this paper are original in the sense of requiring construction of a set of interrelated preference models, one for each decision attribute. Such a requirement contrasts with the inference of a single sorting model that would align with multiple classifications at the same time [32] or an imprecise assignment built on multiple ratings for the same alternative [33]. Specifically, we propose a threshold-based value-driven sorting method. It involves a set of intra- and inter-decision constraints. The former ensure appropriate relations between comprehensive values of different alternatives for an individual value function used to derive the assignments for a single decision attribute. The latter correspond to the relations between comprehensive values of the same alternative for multiple value functions employed for classifying this alternative given various decision attributes. This makes sense when the classes of various decision attributes correspond to the same default categories, having the same scope and interpretation.

This paper's second contribution derives from presenting the results of a case study concerning risk management related to handling nanomaterials in different conditions [35]. The production, processing, and use of nanomaterials may lead to exposure endangering health or life. Depending on the particular exposure scenario, different types of precautions or safety measures can be used to counteract the respective risk [36]. Also, some precautions meet general hazards, whereas others are dedicated to dealing with some specific dangers. Each of the precaution types (e.g., incorporation of some personal protective equipment, engineering controls, or work practices) can be interpreted as a decision attribute with pre-defined preference-ordered classes representing different levels of risk [37].
When facing hazards that particular nanomaterials carry with them, some precautions are required, and others are optional or unnecessary [31]. However, different precautions are not independent, being defined on the same set of criteria and related in terms of their interpretation.
There is a need for a method dealing with multiple interrelated preference-ordered decision attributes to tackle such a problem. In this regard, MCDA has little to offer. This, in turn, implies that such complex problems would typically be decomposed into independent ones. This would allow for modeling intra-decision dependencies while neglecting the inter-decision relations, which could negatively affect the results' usefulness. Other candidate solutions derive from multi-label classification, such as the label powerset [38] or probabilistic classifier chains [39]. The label powerset generates a vast number of classes and requires many examples so that each class has a sufficient number of representatives. The latter requirement is difficult to satisfy for the case study. Probabilistic classifier chains offer different solutions depending on the order of the decisions under consideration and require repeated solving of the same problem. The limitations of the existing approaches motivated the development of a dedicated multiple criteria sorting method. In the context of the considered case study, the information coming from the proposed approach will help the DMs in assessing the risk related to the treatment of nanomaterials in different conditions. Specifically, it can be used for recommending the level of need for the use of specific personal protective equipment, engineering controls, or work practices.
Our third contribution consists of proposing a new way of dealing with potentially non-monotonic criteria. Non-monotonic criteria appear in an MCDA problem when, for some attributes, neither a univocal preference direction nor the non-monotonic shape of the respective marginal value function can be specified a priori. This happens in our case study. Then, such a shape needs to be inferred from data describing the multiple criteria problem and the DM's holistic judgments. In particular, the method should verify whether a monotonic relationship exists, whether it is of gain- or cost-type, or whether the monotonicity is not global. The latter scenario could reveal some local positive or negative relationships in different parts of the investigated performance scale [40]. To perform this task, we propose a new approach that attempts to discover local monotonicity changes without requiring the DM to fix the preference directions for all criteria. Specifically, we represent the marginal value functions of potentially non-monotonic criteria as a sum of marginal value functions assuming opposing preference directions, one non-decreasing and the other non-increasing. This permits obtaining an aggregated marginal value function with an arbitrary non-monotonic shape. However, the two monotonic components remain easy to interpret.
The remainder of the paper is organized in the following way. Section 2 is devoted to the new method dealing with multi-decision sorting problems and handling potentially non-monotonic criteria. Section 3 illustrates the use of the proposed method on a didactic instance. In Section 4, we report the results of a case study concerning the analysis of exposure scenarios related to the treatment of nanomaterials in various conditions. The last section concludes and outlines the ideas for future work.

Notation and problem statement
We use the following notation:

• A = {a 1 , . . . , a n } – a finite set of n alternatives;
• G = {g 1 , . . . , g m } – a family of m evaluation criteria;
• X j – the set of performance values attained on criterion g j by the alternatives from A, where n j (A) = |X j | and n j (A) ≤ n; consequently, X = ∏ m j=1 X j is the performance space;
• D = {D 1 , . . . , D l } – a set of l decision attributes, each involving the same p preference-ordered classes C 1 , . . . , C p , with C p being the most preferred;
• A R ⊆ A – a subset of reference alternatives assessed by the DM.

In what follows, we discuss the employed preference model and preference information. We present the mathematical constraints that allow dealing with multi-decision sorting problems while originally handling potentially non-monotonic criteria. The latter are interpreted as criteria with unknown monotonicity. This means that the decision analyst and the DM cannot specify a preference direction for them, and they admit that such a direction may not exist. Moreover, they accept that the shapes of marginal value functions for these criteria will be inferred through disaggregating the DM's holistic preferences. The resulting shape will determine whether a monotonic relation can be imposed on the entire performance space of a given criterion and, if not, what the local relationships of monotonicity are in different regions of this space.

Preference model
For each decision attribute D s ∈ D, a comprehensive quality of each alternative a i ∈ A is quantified using an additive value function defined as the sum of marginal values u Ds j (a i ) on all criteria g j , j = 1, . . . , m:

U Ds (a i ) = ∑ m j=1 u Ds j (a i ). (1)

Alternatives are evaluated in terms of three types of criteria: gain g j ∈ G g , cost g j ∈ G c , and potentially non-monotonic g j ∈ G n (G g ∪ G c ∪ G n = G). For the gain-type criteria, greater performances are more preferred than smaller ones. This implies the following monotonicity and normalization constraints:

u Ds j (x k j ) ≥ u Ds j (x k−1 j ), k = 2, . . . , n j (A), and u Ds j (x 1 j ) = 0.

Analogously, for the cost-type criteria, smaller performances are more preferred:

u Ds j (x k j ) ≤ u Ds j (x k−1 j ), k = 2, . . . , n j (A), and u Ds j (x n j (A) j ) = 0.

Example marginal value functions for the gain- and cost-type criteria are presented in Fig. 1. Note that these functions are, respectively, non-decreasing and non-increasing. The marginal value function for the potentially non-monotonic criterion g j ∈ G n is modeled as the sum of marginal values derived from the non-decreasing and non-increasing components contributing to the comprehensive assessment of alternatives from this particular viewpoint:

u Ds j (a i ) = u Ds j,nd (a i ) + u Ds j,ni (a i ) − t Ds j ,

where the monotonicity of u Ds j,nd and u Ds j,ni is modeled in a standard way. Then, any non-monotonic shape of the marginal value function u Ds j can be obtained. However, summing the two components alone may yield a marginal value function which is not equal to zero for the worst performance on the non-monotonic criterion. Such a situation is undesired because it is hard to interpret such a model and, moreover, the scale of values attained by the comprehensive model gets reduced. To prevent such a scenario, the marginal value function is normalized with a non-negative bias t Ds j so that ∃ x k j ∈ X j such that u Ds j (x k j ) = 0. The marginal value functions obtained in this way are arbitrarily non-monotonic. Such functions can capture the local relationships of monotonicity that can be positive in some part of the considered performance space and negative in the other part of the same space.
Since in the proposed approach these functions are inferred from the assignment examples, their complexity (i.e., the non-monotonic character, the number of monotonicity changes, or differences in slopes in various regions of the performance space) depends on the regularities observed in the performances of alternatives and the input preference information.
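The decomposition described above can be sketched in a few lines of Python. This is an illustrative sketch with assumed component values, not the authors' implementation: given a non-decreasing and a non-increasing component defined on a discrete performance scale, their sum minus the bias yields a (possibly) non-monotonic marginal value function whose least preferred level receives value zero.

```python
# Sketch (assumed values): a potentially non-monotonic marginal value
# function modeled as the sum of a non-decreasing and a non-increasing
# component, minus a bias that zeroes the worst performance level.

def aggregate_marginal(u_nd, u_ni):
    """Combine monotone components and subtract the bias t so that
    the least preferred performance level gets marginal value zero."""
    raw = [nd + ni for nd, ni in zip(u_nd, u_ni)]
    t = min(raw)                      # bias t_j^{Ds} >= 0
    return [v - t for v in raw], t

# Hypothetical discrete performance levels 0..4 of a criterion g_j:
u_nd = [0.0, 0.1, 0.2, 0.35, 0.5]    # non-decreasing component
u_ni = [0.4, 0.2, 0.1, 0.05, 0.0]    # non-increasing component

u, t = aggregate_marginal(u_nd, u_ni)
# u is V-shaped: it first decreases, then increases, and min(u) == 0.
```

Both components stay individually monotone and hence interpretable, while their aggregate can change its local preference direction anywhere on the scale.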
To assign the alternatives to pre-defined and ordered classes for decision attribute D s ∈ D, we will apply a threshold-based sorting procedure. It derives the assignment of alternative a i ∈ A from the comparison of U Ds (a i ) with a set of thresholds b Ds h , h = 0, . . . , p, such that for D s ∈ D: b Ds 0 = 0, b Ds p−1 ≤ 1 − ε, and b Ds p = 1 + ε, where ε is an arbitrarily small positive value. In this way, the values of the worst and the best thresholds are set to, respectively, zero and greater than one. Moreover, there is a difference between the extreme thresholds delimiting each class so that it could accommodate some alternatives. Then, a i is assigned to class C h , h = 1, . . . , p, if b Ds h−1 ≤ U Ds (a i ) < b Ds h , i.e., if a i is at least as good as the respective lower threshold and strictly worse than the corresponding upper threshold. Such a threshold-based sorting procedure is illustrated in Fig. 3. Eqs. (1)–(10) form a core component of a larger set of linear programming constraints defining a set of instances of an assumed sorting model that are compatible with the DM's preference information. We will refer to it as E BASE .
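The threshold-based assignment rule can be sketched as follows. This is a minimal illustration; the threshold values are assumptions, not taken from the paper.

```python
# Sketch: assign a_i to class C_h when b_{h-1} <= U(a_i) < b_h,
# given the thresholds of a single decision attribute.
import bisect

def assign_class(value, thresholds):
    """thresholds = [b_0, b_1, ..., b_p] with b_0 = 0 and b_p > 1.
    Returns h in 1..p such that b_{h-1} <= value < b_h."""
    # bisect_right counts how many thresholds are <= value,
    # which is exactly the index h of the class interval.
    return bisect.bisect_right(thresholds, value)

# Five classes: b_0 = 0, four assumed separating thresholds, b_5 = 1 + eps.
b = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0001]
assert assign_class(0.4762, b) == 3   # lands in C_3
assert assign_class(0.0, b) == 1      # the worst value still lands in C_1
```

Because the upper bound is strict and b_p exceeds one, every comprehensive value in [0, 1] falls into exactly one class.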

Preference information
We expect the DM to specify the desired assignments for a subset of reference alternatives a * ∈ A R ⊆ A on each decision attribute D s ∈ D, denoted C Ds DM (a * ). Note that the classes provided by the DM for different decision attributes D s ∈ D can be different for the same reference alternative. Such holistic preference information is used in a twofold way. On the one hand, we need to reproduce the desired assignments on each decision attribute, i.e., for all a * ∈ A R with C Ds DM (a * ) = C h :

U Ds (a * ) ≥ b Ds h−1 − M · v(a * ) and U Ds (a * ) < b Ds h + M · v(a * ), [E R (intra-D s )]

where M is a sufficiently large positive constant and v(a * ) ∈ {0, 1}. In case v(a * ) = 0, U Ds (a * ) falls in the range corresponding to class C Ds DM (a * ). Then, the assignment provided for a * on D s is reproduced. If v(a * ) = 1, the respective constraints are always satisfied, being relaxed. The binary variables v(a * ), a * ∈ A R , will subsequently be used to minimize the prediction distance of the inferred model from the reference data in case the sorting model is not able to align with all assignment examples.
On the other hand, in line with the specificity of the multi-decision sorting problem, we will compare the desired assignments for each reference alternative a * ∈ A R for different pairs of decision attributes. Recall that both the number and interpretation of classes are the same for all decision attributes. In this way, the classes specified by the DM determine an order of labels associated with each reference alternative. If C Ds DM (a * ) is more preferred than C Dt DM (a * ), this can be interpreted as the label D s being more desired for a * than label D t . Consequently, a comprehensive value of a * associated with D s should be greater than its respective value associated with D t , i.e.:

U Ds (a * ) ≥ U Dt (a * ) + ε − M · v(a * ), [E R (inter-D)]

where M is a sufficiently large positive constant. Analogously as in E R (intra-D s ), binary variable v(a * ) implies that the respective constraints associated with the assignments of a * are either instantiated (when v(a * ) = 0) or relaxed (when v(a * ) = 1).
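As a sketch, the inter-decision requirement can be verified for an already-inferred model as follows. The data layout, function name, and the assumption that equal classes impose no constraint are illustrative choices, not the paper's specification; `eps` plays the role of ε.

```python
# Sketch: check that inferred comprehensive values respect the
# inter-decision constraints -- if the DM put a* in a better class on
# D_s than on D_t, then U^{Ds}(a*) must exceed U^{Dt}(a*).
from itertools import combinations

def inter_decision_ok(dm_classes, values, eps=1e-4):
    """dm_classes, values: dicts mapping a decision-attribute name to
    the DM's class (1..p, higher = better) and the comprehensive value."""
    for ds, dt in combinations(dm_classes, 2):
        if dm_classes[ds] > dm_classes[dt] and not values[ds] >= values[dt] + eps:
            return False
        if dm_classes[dt] > dm_classes[ds] and not values[dt] >= values[ds] + eps:
            return False
    return True

# a_3 from Section 3: C_5 on D_1 and C_2 on D_2, with U^{D1} > U^{D2}.
assert inter_decision_ok({"D1": 5, "D2": 2}, {"D1": 0.6162, "D2": 0.2886})
assert not inter_decision_ok({"D1": 5, "D2": 2}, {"D1": 0.2, "D2": 0.5})
```

In the inference model itself, violations of this check are absorbed by setting the binary variable v(a*) to one rather than rejecting the alternative outright.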

Compatible sorting model
We aim to infer a sorting model that would be compatible with the provided assignment examples while respecting the assumptions on additivity, monotonicity, and normalization, as well as the intra- and inter-decision constraints. The model is composed of a set of interrelated additive value functions and vectors of class thresholds such that a single function is associated with a single vector of thresholds corresponding to each decision attribute. We admit that the reference assignments may be burdened with some error, though we would like the model to be compatible with as many assignments of reference alternatives as possible. For this purpose, we solve the following Mixed-Integer Linear Programming (MILP) model:

Minimize f w = ∑ a * ∈A R v(a * ) + (1 / (r · l)) ∑ D s ∈D ∑ g j ∈G n t Ds j , subject to E BASE , E R (intra-D s ), and E R (inter-D).

The primary goal of the above objective function f w is to minimize the number of reference alternatives for which the desired assignments are inconsistent with an assumed preference model, i.e., ∑ a * ∈A R v(a * ). Since each bias does not exceed one, the sum of biases is constrained by r · l, i.e., ∑ D s ∈D ∑ g j ∈G n t Ds j ≤ r · l, where r = |G n | is the number of potentially non-monotonic criteria, and l is the number of decision attributes. As a result, the above model always favors a lesser number of reference alternatives a * ∈ A R that need to be removed to restore consistency. Specifically, among the models for which such a number is minimal, we favor the one for which the sum of biases on all criteria with unknown monotonicity is as small as possible.
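Collecting the pieces above, the inference program can be summarized as the following sketch. The grouping of constraints into the three named sets follows the text; the explicit listing of variable domains is our reconstruction.

```latex
\begin{align*}
\min\ f_w \;=\; & \sum_{a^* \in A^R} v(a^*) \;+\; \frac{1}{r \cdot l} \sum_{D_s \in D} \sum_{g_j \in G^n} t_j^{D_s} \\
\text{s.t.}\quad & E^{BASE} && \text{(additivity, monotonicity, normalization, thresholds)}\\
& E^{R}(\mathit{intra}\text{-}D_s) && \text{for all } D_s \in D,\ a^* \in A^R\\
& E^{R}(\mathit{inter}\text{-}D) && \text{for all pairs } D_s, D_t \in D,\ a^* \in A^R\\
& v(a^*) \in \{0,1\},\ a^* \in A^R; \qquad t_j^{D_s} \ge 0,\ g_j \in G^n,\ D_s \in D.
\end{align*}
```

Because the bias term is scaled to stay below the cost of misclassifying a single reference alternative, the two goals are effectively ordered lexicographically: first minimize the number of relaxed assignment examples, then minimize the total bias.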

Illustrative example
In this section, we illustrate the use of the proposed method on a simple didactic example composed of a pair of scenarios, denoted as Scenarios 1 and 2. The considered problem involves ten alternatives, which are evaluated on the following three criteria: g 1 of gain type, g 2 of cost type, and g 3 being potentially non-monotonic. For the performance matrix, see Table 1. The alternatives are comprehensively evaluated using two decision attributes (D 1 and D 2 ) with five preference ordered classes C 1 −C 5 such that C 5 and C 1 are, respectively, the most and the least preferred ones.
Let us first consider Scenario 1, for which the reference assignments are provided in Table 1. For example, a 1 is assigned to C 4 on D 1 and to C 3 on D 2 , whereas the order of classes for a 5 is inverse. The inferred model is able to reconstruct the assignments for eight alternatives (see column v(a * ) (Scenario 1) of Table 2). The comprehensive judgments for a 6 and a 10 could not be reproduced. The assignments of a 6 were contradictory with those of a 7 . Specifically, a 7 is more preferred than a 6 on g 1 and g 2 , while attaining the same performance on g 3 . However, the classes of a 7 are worse than the respective classes of a 6 , which contradicts the dominance principle. Similarly, a 10 dominates a 1 , being at least as good on all monotonic criteria and having the same performance on the non-monotonic criterion. However, the desired class of a 10 on D 1 is worse than that of a 1 , while they are both assigned to the same class on D 2 . Also in this case, the assignments of a 10 and a 1 could not be reproduced jointly.

[Table 1: Performance matrix and the reference assignments considered in the two scenarios in the illustrative example; columns: Alternative, Criteria, Scenario 1, Scenario 2.]
The comprehensive values of all ten alternatives and the assignments generated with the inferred sorting model are presented in Table 2. For the separating class thresholds, see Table 3. Let us first discuss the assignments of reference alternatives which agree with the ones specified by the DM. For example, on D 1 , a 1 is assigned to C 4 and a 3 is assigned to C 5 . Not only does a 3 attain a greater comprehensive value than a 1 , but the comprehensive values of these alternatives also fall in the ranges delimited by the respective class thresholds (compare Tables 2 and 3). Similarly, on D 2 , a 1 is assigned to a more preferred class than a 3 , which is reflected in the relationship between their comprehensive values (U D 2 (a 1 ) = 0.4762 > U D 2 (a 3 ) = 0.2886). However, the inferred comprehensive values also respect the desired inter-decision relationships. For example, a 3 was assigned to C 5 on D 1 and to C 2 on D 2 . As a result, its comprehensive value on D 1 (U D 1 (a 3 ) = 0.6162) is greater than that on D 2 (U D 2 (a 3 ) = 0.2886). In the same spirit, since a 8 was assigned to C 4 and C 5 on, respectively, D 1 and D 2 , we have U D 1 (a 8 ) = 0.6078 < U D 2 (a 8 ) = 0.6162. When it comes to the reference alternatives for which the desired assignments were not fully reproduced, the resulting class of a 6 on D 2 and a 10 on D 1 was C 1 , as compared to, respectively, C 2 and C 5 in the DM's judgments.

[Table 2: Comprehensive values U(a * ), respective class assignments on decision attributes D 1 and D 2 , and values of binary variables v(a * ) for the two scenarios in the illustrative example.]

[Table 3: The class thresholds for the two scenarios in the illustrative example.]
The marginal value functions inferred for Scenario 1 are presented in Fig. 4. The imposed monotonicity constraints are respected for the gain (g 1 ) and cost (g 2 ) criteria and the monotonic components of g 3 . The shapes of marginal value functions on the different decisions are similar. The differences concern a slightly greater marginal value assigned to performance 3 on g 1 for D 2 and to performances 0-2 on g 2 and 3-4 on g 3 for D 1 . The non-increasing component for g 3 was the same for both decision attributes. The marginal value function's overall course for g 3 took a V-shape, with 1 being the least preferred performance.
As the other scenario (Scenario 2), let us consider slightly modified desired assignments (see Table 1). When compared to Scenario 1, the assignments of a 7 on D 2 and a 6 on D 1 were changed to, respectively, C 3 and C 1 . Now, only the assignments of alternative a 1 could not be reproduced with an assumed model (see Table 2, column v(a * ) for Scenario 2). The comprehensive values and the respective assignments are presented in Table 2, and the class thresholds are given in Table 3. As for Scenario 1, we can observe that the classifications on the two decision attributes and the relationships between comprehensive values attained by each alternative on D 1 and D 2 are preserved. For example, a 2 is assigned to C 2 on D 1 and to C 1 on D 2 . However, the primary motivation for considering Scenario 2 is to show the impact of eliminating a bias for a non-monotonic criterion (for the marginal value functions, see Fig. 5). Indeed, when summing up the non-decreasing and non-increasing components for g 3 for decision D 2 , all performances would be assigned positive marginal values. To ensure that the worst performance on g 3 (for this scenario - g D 2 3 (a) = 2) was assigned zero, the constructed model subtracted a bias t D 2 3 = 0.4444. In this way, a comprehensive value of the anti-ideal alternative was also zeroed, while not affecting the relative comparison of existing alternatives.
To support the comprehension of different types of constraints defining a set of inter-related sorting models, in Table 4, we illustrate the use of these constraints in the context of an example presented in this section. Specifically, for nine different constraint types, we provide their general form, an example constraint for a specific decision attribute, alternative, criterion, performance, or class, and the values assigned in this example constraint to the variables by the sorting models inferred for Scenario 2.

Multi-decision sorting in the context of exposure management of nanomaterials
Nanomaterials are particles with a size of several dozen nanometers and physicochemical properties being significantly different from the materials of larger sizes composed of the same atoms [35]. Due to these specific properties, they can improve the performance of products in several application areas, including energy production and storage [41], water treatment [42], healthcare [43], and food preservation [44], to name a few. The production of nanomaterials is based on the manipulation of materials at the nanoscale (1-100 nanometers), which requires caution and protective measures to guarantee their safe handling.
Since nanotechnology is a relatively new scientific field, all the potentially harmful effects of individual nanomaterials and threats resulting from their production and employment are not yet precisely known [45,46]. The research on this subject is still ongoing, but the safety standards used in nanomaterials production are currently mainly adopted from similar chemical production processes [37,47,48]. Nevertheless, the safety of nanomaterials production processes is a pressing issue in the area of nanotechnology [48,49]. In this perspective, the development of guidelines for the appropriate selection of precautions for nanomanufacturing would be a beneficial contribution.

Criteria
When evaluating nanomanufacturing exposure scenarios, there are several characteristics of the nanomaterials and operating conditions that need to be accounted for. In this case study, we will consider the following ten criteria, which are common descriptors for this type of scenario [31,35,48]:

• Particle size (g 1 ) - in general, the smaller the size, the easier the nanomaterial gets through any filter. Nevertheless, since nanomaterials have different toxicological profiles according to their size, it is not yet possible to generalize a monotonic dependency between size and harmfulness [50].
• Toxicity (g 2 ) determines the type of effect the nanomaterial has on human health [51].
• Airborne capacity (g 3 ) characterizes the engineered nanomaterials' capacity to spread in the workplace through the air stream. It is scored on a 4-point scale from none to high, with none and high being, respectively, the most and the least preferred performances [52].
• Detection limit (g 4 ) specifies the capacity of the instruments used for exposure assessment to detect the nanomaterials. The better the detection limit is, the safer the exposure scenario is assumed to be [35].
• Exposure limit (g 5 ) indicates the limit of exposure, expressed on five ranges, for a given exposure scenario. The lower this limit, the less risk concerning a given exposure is [35].
• Quantity (g 6 ) of a nanomaterial (in kg) handled in a given scenario. Smaller quantities are preferred as they imply a smaller chance of exposure [53].

• Number of employees (g 8 ) indicates the number of people required to handle a given exposure scenario. One cannot define a priori how this attribute is associated with the risk of an exposure scenario [35].
• Duration of exposure (g 9 ) is negatively associated with the risk, i.e., the shorter duration is deemed to be less risky [53].
• Multiple exposures (g 10 ) is related to the frequency of exposure (a scenario is safer when the number of exposures is lower) [54].
The measurement units, preference directions, performance scales, and encoding of performances for all criteria are provided in Table 5. In general, there are six criteria of cost type, a single gain criterion, and three criteria for which the preference direction is unknown.

Alternatives
The considered set of alternatives is composed of exposure scenarios for nanomanufacturing generated by the JMP software [35]. They correspond to the existing and future types of nanomaterials and manufacturing processes. To demonstrate the proposed method's applicability, we consider a set of 45 exposure scenarios, denoted by a 1 − a 45 . For their performances, see Tables 6 and 7.

Multi-decision sorting
The alternatives are holistically evaluated in terms of four decision attributes that could be considered individually. However, the multiplicity of these attributes, the reference to the same aspect of production, i.e., safety, and the resulting inter-relations between the examined decisions form the basis for multi-decision sorting. Specifically, the decision attributes correspond to four types of precautions that can be used to reduce the risk. They concern three main aspects of safety: respirator (D 1 ) represents personal protective equipment, fume hood (D 2 ) and fume hood with HEPA filter (D 3 ) stand for engineering controls, and HEPA vacuum cleaner (D 4 ) corresponds to work practices. Let us note that a respirator is a form of a mask with a filter that protects against dangerous substances in the air. A fume hood is a type of ventilation system protecting against harmful gases and toxins. Finally, a High Efficiency Particulate Air (HEPA) filter is a filter with a very high capacity for retaining particles, both in the range of several micrometers and above and at smaller sizes.
For each precaution type, we make decisions about its requirement during the nanomanufacturing process. The holistic preference on each decision attribute includes five preference-ordered classes corresponding to the levels of need for the specific precaution: required (C 1 ), might be required (C 2 ), optional (C 3 ), might be optional (C 4 ), and not required (C 5 ). The reasoning on the decision attributes is the following: if the exposure scenario is deemed as risky, then a given precaution will be indicated as required. If it is not, then the expert would indicate no need for the precaution. A non-risky scenario is preferred to a risky one.

Preference information
For forty exposure scenarios (a 1 − a 40 ), we consider input provided by the health and safety managers in the form of class assignments on four decision attributes [35]. The experts were asked in a survey what precautions should be taken and in what intensity they should be used given a set of production parameters and features of the nanomaterials based on those presented in Table 5. For the scenarios deemed as risky and dangerous by the specialists, a given precaution is required. In the case of safer scenarios, the requirements are lower, and some precaution types are not needed at all. For the answers of the experts, see Table 9. The numbers of reference alternatives assigned to each class for the four decisions are presented in Table 8. The most common decisions are ''required" (C 1 ), ''optional" (C 3 ) and ''not required" (C 5 ), whereas the least chosen classes in the survey are ''might be required" (C 2 ) and ''might be optional" (C 4 ).
Let us discuss the performances and assignments for the three example reference alternatives (a 26 , a 5 , and a 1 ). Alternative a 26 attains very favorable performances on four criteria of cost type g 2 , g 6 , g 9 and g 10 and the best performance on gain criterion g 4 . As a result, the assigned classes for respirator, fume hood with HEPA filter and HEPA vacuum cleaner are ''not required" and for fume hood -''required". Consequently, the most risky evaluation in terms of fume hood is linked to the performances on g 3 , g 5 , and the potentially non-monotonic criteria. Furthermore, a 5 performs poorly on g 2 , g 3 , g 4 , g 5 , g 9 and g 10 , which was an important reason to classify this scenario to C 1 (''precaution is required") for all precaution types. Finally, a 1 attains favorable performances on criteria g 2 , g 3 , g 4 and g 10 , while being less advantageous on criteria g 5 , g 6 and g 9 . Therefore, its classification for all decisions is between ''might be required" (C 2 ) and ''optional" (C 3 ).

Research questions
The research goal consists of understanding under which operational conditions and according to which characteristics of the nanomaterials different types of precautions are required, might be required, are optional, might be optional, or are not required. This contribution results in a sorting model capable of providing decision recommendations on multiple risk management measures, corresponding to various precautions, for the same exposure scenario.

Marginal value functions
The marginal value functions for the ten criteria and four decision attributes are presented in Figs. 6 and 7. They preserve the imposed monotonicity constraints. In particular, the marginal value function u 2 for the cost-type criterion toxicity is non-increasing, i.e., the value assigned to the ''moderate" performance is always greater than that assigned to the ''high" performance and equal to or less than (depending on the decision attribute) the value corresponding to the ''low" toxicity (see Fig. 6). Similarly, the marginal value function u 4 for the gain-type criterion detection limit is non-decreasing. It assigns a strictly greater value to ''poor" than to ''null" performance, and exhibits a stable or a slightly increasing trend from ''poor" through ''moderate" to ''good" detection limit (see Fig. 6). On the contrary, the marginal value functions for the criteria with unknown monotonicity exhibit a non-monotonic trend. For example, the least marginal value on u 1 does not correspond to either of the extreme performances (see Fig. 6). However, the corresponding non-decreasing and non-increasing components adhere to the monotonicity constraints.
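The decomposition just described can be illustrated with a small numerical sketch. The grid of characteristic points and all values below are hypothetical, chosen only to show how summing a non-decreasing and a non-increasing component yields a non-monotonic marginal function; they are not taken from the case study.

```python
# Characteristic points of the two monotone components for a hypothetical
# non-monotonic criterion (the x grid and all values are illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
u_nd = [0.00, 0.00, 0.05, 0.15, 0.20]  # non-decreasing component
u_ni = [0.25, 0.10, 0.10, 0.05, 0.00]  # non-increasing component

# The aggregated marginal value is the sum of the two components.
u = [nd + ni for nd, ni in zip(u_nd, u_ni)]
# u starts at 0.25, drops to 0.10, then rises again -> a "V"-like shape
```

Between characteristic points the functions are piecewise linear, so any local change of monotonicity can be captured by choosing where each component is flat and where it is steep.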
The impact of each criterion on the recommended decision can be estimated with the maximal share of each criterion in the comprehensive value (see Table 10). The greatest shares correspond to: for respirator - airborne capacity and detection limit; for fume hood - particle size and airborne capacity; for fume hood with HEPA filter - airborne capacity; and for HEPA vacuum cleaner - exposure limit and airborne capacity. The values of bias for all non-monotonic criteria are given in Table 11. They allowed normalizing the comprehensive value of an anti-ideal alternative to zero, as described in Section 2.
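One plausible way to compute such maximal shares is sketched below, under the assumption that the marginal value tables are normalized and that a criterion's share is its maximal marginal value divided by the comprehensive value of an ideal alternative; the criterion names and all numbers are hypothetical, not taken from Table 10.

```python
# Hypothetical normalized marginal values (one list of attainable values
# per criterion); these numbers are illustrative, not from the case study.
marginals = {
    "airborne capacity": [0.00, 0.05, 0.12, 0.20],
    "toxicity":          [0.00, 0.06, 0.10],
    "quantity":          [0.00, 0.08, 0.15],
}

# Comprehensive value of an ideal alternative: best performance everywhere.
ideal = sum(max(values) for values in marginals.values())
shares = {c: max(values) / ideal for c, values in marginals.items()}
# here, airborne capacity obtains the greatest maximal share
```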
For the marginal value function u 1 for particle size, the greatest value is assigned to sizes greater than 1000 nm for all decisions but fume hood with HEPA filter, for which the greatest value is attained for sizes of less than 2 nm. The function is of a ''W" shape for respirator, fume hood with HEPA filter and HEPA vacuum cleaner, with a significant peak corresponding to sizes of 10-100 nm or 100-500 nm. Such a shape is implied by the largest decrease of value for the non-increasing component observed between sizes 10-100 nm, 100-500 nm, and 500-1000 nm, and the largest increase of value for the non-decreasing component observed for sizes between 2-10 nm, 10-100 nm, and 100-500 nm. For fume hood, the shape of u 1 is similar to ''V", and the zero value is assigned to the intermediate size.
The value function u 2 for toxicity indicates a negligible difference between the low and moderate performances. Such a difference is slightly greater only for HEPA vacuum cleaner. Intuitively, the precautions are less required with low toxicity. This criterion has a very low impact on the comprehensive value when considering respirator and fume hood. This means that this precaution type is needed even with low toxicity.
Airborne capacity has a very significant impact on the alternatives' assignments. The ''null" performance vastly contributes to reducing the requirement of a given precaution type. In addition, for fume hood, the value differences between performances ''null" and ''low" or ''moderate" and ''high" are very marginal or non-existent. Thus, in this case, only the difference between ''low" and ''moderate" matters. For the remaining decision attributes, the differences between ''moderate" and ''high" and between ''low" and ''moderate" are substantial.
The shapes of marginal value functions for detection limit (u 4 ) reveal a high discrepancy between the decisions, even if they are similar in terms of a general trend. For fume hood, this criterion has almost no effect on the comprehensive value, whereas for respirator the impact of g 4 is significant. The main difference in terms of a trend is that for fume hood with HEPA filter, the difference between values assigned to ''poor" and ''moderate" or ''good" detection limits is negligible, whereas for the respirator and HEPA vacuum cleaner it is around 0.025.
Analogously, slight differences in the shapes of value functions for various decision attributes can be observed for the exposure limit (u 5 ). For respirator, fume hood with HEPA filter, and HEPA vacuum cleaner, the greatest value difference is between the performances of < 0.1 and 0.1-0.2, whereas for fume hood - the

Table 8
The number of reference alternatives assigned to a given class on four decision attributes.

The marginal value function for quantity (u 6 ) indicates that the production of small quantities of nanomaterial is considered less risky. In contrast, for the mass production above 1 kg, the marginal values are close to zero. The production of a small amount of nanomaterial reduces the chance of exposure. At higher manufacturing loads, the chances of accidental or undesired contact are higher. In the context of fume hood, g 6 has no significant impact on the comprehensive value, and for each quantity produced, this precaution is required.
The marginal value functions for the engineering controls (u 7 ) indicate that the closed systems are safer than open ones, particularly those with the negative pressure. The most desired configuration depends on the decision attribute. Fume hood is more required in the closed system, and for the remaining types of precautions, an open system with negative pressure is the most needed. The non-decreasing component (u nd 7 ) is prevailing, implying high marginal values for the closed systems, whereas the non-increasing component (u ni 7 ) assigns low values to all possible configurations of the engineering controls.
The marginal functions for the number of employees (u 8 ) reveal a slightly different shape for each decision attribute. We can observe two main peaks corresponding to 3-10 employees for respirator and HEPA vacuum cleaner or 51-100 employees for respirator, HEPA vacuum cleaner and fume hood with HEPA filter. The performances with the least assigned marginal value are 11-50 employees for respirator, HEPA vacuum cleaner, and fume hood with HEPA filter or 101-500 employees for respirator, fume hood, and fume hood with HEPA filter. For all decision attributes, the values assigned to 101-500 employees are smaller than those associated with 51-100 employees. The non-increasing (u ni 8 ) and non-decreasing (u nd 8 ) components explain why the resulting marginal functions differ across various decisions. For respirator, the non-decreasing component increases only between 3-10 and 11-50 employees, whereas the functions for the remaining decisions in this performance area are stable. In turn, they increase for the number of employees between 1-3 and 3-10 and between 11-50 and 51-100.

Table 9
Class assignments provided by the experts for reference alternatives and their comprehensive values for four decision attributes.
The duration of exposure (u 9 ) is particularly important in the context of respirator and fume hood. When the time exceeds one hour, there is a greater safety concern, and thus all precautions are more required. A short duration of exposure can motivate the reduction of safety requirements.
The analysis of marginal functions for the number of exposures (u 10 ) indicates that if there are no exposures, the marginal value is high, hence leading to the assignment to a less risky class for all decision attributes. In case there is at least one exposure, the respirator is more required. The precautions involving the HEPA filter are required when three exposures are exceeded. In turn, fume hood is equally necessary for all values when the number of exposures is known, and the marginal functions attain zero when the value of this criterion cannot be determined.
In general, the marginal value functions for the decision attributes concerning the use of the HEPA filter are similar for the airborne capacity, detection limit, quantity, engineering controls, duration of exposure, and number of exposures. In turn, the marginal functions corresponding to the two fume hoods differ on all criteria. This may suggest that these two precautions are complementary, and depending on the conditions, one should choose the fume hood either with the filter or without it. Finally, the functions for the respirator are more similar to those derived from the analysis for the HEPA vacuum cleaner and fume hood with HEPA filter than for the fume hood.

Class assignments for the reference alternatives
The comprehensive values and class assignments for the forty reference alternatives with respect to the four decision attributes are provided in Table 9. The constructed model reproduced the desired assignments for all reference exposure scenarios but a 1 , a 20 , and a 28 . These alternatives form the minimal subset that had to be removed to impose the consistency of the experts' judgments with an assumed preference model. The comparison of desired and resulting assignments for these three exposure scenarios is given in Table 12.

Table 12
Class assignments derived with the constructed preference model for the reference alternatives, not aligning with the ones desired by the experts.

        Desired (D 1 , D 2 , D 3 , D 4 )    Obtained (D 1 , D 2 , D 3 , D 4 )
a 1     3  2  2  3                          3  1  5  4
a 20    3  2  2  2                          5  5  5  5
a 28    4  4  4  4                          5  4  3  2

For example, the inferred model evaluated alternative a 20 as ''not required" (C 5 ) for all decision attributes, while the experts indicated that it should be ''optional" and ''might be required". This was implied by the most preferred or nearly the best performances on the monotonic criteria g 2 , g 3 , g 5 , g 6 and g 10 , as well as the performances on the non-monotonic criteria g 1 and g 7 that were assigned the greatest marginal values according to the constructed model. The thresholds separating the decision classes on a scale of a comprehensive value for all decisions are given in Table 13. Let us recall that the range delimited by these thresholds in which a comprehensive value of a given alternative falls determines its assignment to the respective class. For example, U D 1 (a 2 ) = 0.4039 is not less than b D 1 1 = 0.3480 and less than b D 1 2 = 0.4195, which allows reproducing the assignment of a 2 to C 2 provided by the experts. Although these thresholds have similar values for various decision attributes, we can observe that, e.g., on D 4 , they are lower by 0.04 − 0.07 than on D 2 .

Table 13
Class thresholds separating the five preference-ordered classes for four decision attributes.
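The threshold-based assignment rule used above can be sketched as follows. Only b 1 = 0.3480 and b 2 = 0.4195 for D 1 are reported in the text; the two upper thresholds in the example are hypothetical placeholders added so that the sketch covers all five classes.

```python
def assign_class(value, thresholds):
    """Map a comprehensive value to a class index, where 1 corresponds to
    "required" (most risky) and 5 to "not required": an alternative whose
    value lies in [b_{h-1}, b_h) is assigned to class C_h."""
    h = 1
    for b in sorted(thresholds):
        if value >= b:
            h += 1
    return h

# b1 and b2 are taken from the text for D1; b3 and b4 are hypothetical.
thresholds_d1 = [0.3480, 0.4195, 0.55, 0.70]
assert assign_class(0.4039, thresholds_d1) == 2  # reproduces a2 -> C2
```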

Inter-decision relationships
Let us focus on the inter-decision relationships implied by the specificity of the considered multi-decision sorting problem. The impact of the individual criteria on the comprehensive values as well as the relations between the latter ones for different decision attributes are demonstrated in Fig. 8 for the four reference alternatives (a 9 , a 3 , a 13 , and a 33 ).
For example, a 9 was assigned to C 5 for D 3 , to C 3 for D 1 and D 4 , and to C 1 for D 2 . This information can be interpreted in such a way that when considering different types of precautions in the context of a 9 , their ranking is the following: fume hood with HEPA filter (D 3 ), respirator (D 1 ) and HEPA vacuum cleaner (D 4 ), fume hood (D 2 ). Such a ranking is reflected in the comprehensive values on the respective decision attributes: U D 3 (a 9 ) > U D 1 (a 9 ), U D 4 (a 9 ) > U D 2 (a 9 ). The analysis of marginal values for a 9 indicates that, depending on the decision context, the same performance can yield very different contributions to the comprehensive values. This, in turn, may result in the extreme assignments for various decision attributes (e.g., the most preferred class on D 3 and the least preferred class on D 2 ). The comprehensive value of a 9 for fume hood with HEPA filter (D 3 ) is equal to 0.5707. Such a great value is implied mainly by high contributions from the individual criteria.
The desired assignments of a 3 were C 2 on D 3 and D 4 and C 1 on D 1 and D 2 . Consequently, its comprehensive values on all decision attributes are relatively low, while being slightly higher for the fume hood with HEPA filter and HEPA vacuum cleaner than for the respirator or fume hood. The assignments to classes representing more risky scenarios are mainly due to the low marginal values from the following criteria: airborne capacity (u 3 ), exposure limit (u 5 ), engineering controls (u 7 ), number of employees (u 8 ), and duration of exposure (u 9 ). The differences in the assignments on D 1 and D 3 can be explained, e.g., with respect to toxicity (u 2 = 0.1155), making the respirator ''required" with a comprehensive value of 0.3444 and fume hood with HEPA filter - ''might be required" with a greater comprehensive value of 0.4207. When it comes to fume hood, a 3 attained very low values on all criteria, making it ''required".
In the case of HEPA vacuum cleaner, a higher value justifying an assignment to C 2 is implied by the significant contributions from the following criteria: toxicity, airborne capacity, detection limit, exposure limit, number of employees, and multiple exposures.
The difference in marginal values assigned to the same performances for various decision attributes, as well as the inter-decision relationships between comprehensive values implied by the experts' assignments, can also be observed for a 13 and a 33 (see Fig. 8). On the one hand, the comprehensive values of a 13 range from 0.0783 for D 2 to 0.5659 for D 1 . On the other hand, the large differences in marginal values assigned to a 33 on various decision attributes for toxicity, detection limit, number of employees, and duration of exposure imply that it can be assigned to classes ranging from C 1 for D 2 to C 4 for D 1 . As a result, the ranking of precautions associated with a 33 in terms of safety requirements (starting from the least required) reproduced by the constructed model is as follows: respirator, fume hood with HEPA filter and HEPA vacuum cleaner, fume hood.

Intra-decision relationships
To justify the assignments of alternatives to the respective classes, in Fig. 9, we demonstrate the comprehensive values of selected exposure scenarios and class thresholds. For each decision attribute, we depicted a single alternative assigned to each class. The comprehensive values of exposure scenarios assigned to better classes are greater than for the alternatives assigned to the classes associated with greater risk. For example, the following relations between comprehensive values on D 1 : U D 1 (a 37 ) > U D 1 (a 33 ) > U D 1 (a 7 ) > U D 1 (a 2 ) > U D 1 (a 4 ) reflect the expert judgments. Let us explain a few example assignments in terms of marginal values attained on the respective criteria and the comparison of comprehensive values with the class thresholds.
When it comes to the evaluation of a 2 in terms of respirator (D 1 ), it attains the greatest marginal values for airborne capacity. When comparing the assignments of a 14 and a 40 in terms of fume hood (D 2 ), these alternatives perform similarly on g 1 , g 2 , and further criteria (e.g., u D 2 2 (a 40 ) = 0.0062). However, the more advantageous performances of a 40 on g 5 , g 7 , and g 10 imply that it is assigned to C 5 as compared to C 3 for a 14 . Even though a 2 attains comparable marginal values to a 40 on six criteria (g 2 , g 3 , g 4 , g 5 , g 6 , and g 7 ), it is significantly less preferred on g 1 and g 8 (e.g., u D 2 1 (a 2 ) = 0.0182 < u D 2 1 (a 40 ) = 0.1976). As a result, a 2 has a very low comprehensive value (U D 2 (a 2 ) = 0.3492), justifying the assignment to the least preferred class C 1 .

Classification of the non-reference alternatives
The model inferred from the analysis of reference alternatives can be used to classify other exposure scenarios. Thus, we first used expert knowledge to build a preference model. The latter is subsequently applied to evaluate other alternatives in a way that is consistent with the experts' value system and hence could be accepted by them. In this way, the proposed method can support nanomaterials' exposure management, suggesting the reasons for concern regarding some nanomanufacturing tasks performed by the workers [47,55].
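Under the additive model assumed in the paper, classifying a new scenario amounts to summing its marginal values and comparing the result with the class thresholds. The marginal value tables, criterion names, and the scenario below are hypothetical, intended only to show the mechanics:

```python
# Hypothetical marginal value tables (performance level -> marginal value);
# the criteria and numbers are illustrative, not from the inferred model.
marginals = {
    "toxicity":        {"low": 0.10, "moderate": 0.06, "high": 0.00},
    "airborne":        {"null": 0.20, "low": 0.12, "moderate": 0.05, "high": 0.00},
    "detection_limit": {"null": 0.00, "poor": 0.04, "moderate": 0.06, "good": 0.08},
}

def comprehensive_value(performances):
    # Additive aggregation: sum the marginal values of the scenario's performances.
    return sum(table[performances[criterion]]
               for criterion, table in marginals.items())

scenario = {"toxicity": "low", "airborne": "moderate", "detection_limit": "good"}
value = comprehensive_value(scenario)  # 0.10 + 0.05 + 0.08
```

The resulting value would then be passed through the threshold-based rule to obtain a class on each decision attribute.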
For this purpose, let us consider five non-reference alternatives presented in Table 7. Their comprehensive values and the respective class assignments are given in Table 14. Since the comprehensive values attained by these alternatives differ vastly from one decision attribute to another, the suggested classes differ too. For example, a 42 is assigned to C 1 on D 2 , to C 3 on D 3 and D 4 , and to C 4 on D 1 , whereas the classes for a 43 range from C 3 on D 2 and D 3 to C 5 on D 1 and D 4 . Note, however, that although the comprehensive values of a 44 differ with respect to various precaution types, they are all very low, justifying the assignment to C 1 on all decision attributes. For the five non-reference alternatives, the contribution of the individual criteria in the comprehensive values, as well as the assignments derived from the comparison of comprehensive values with class thresholds, are presented in Fig. 10. Let us justify the assignments obtained for two selected non-reference exposure scenarios (a 43 and a 45 ).

Table 14
Comprehensive values and class assignments for the non-reference alternatives for the four decision attributes.
When it comes to a 43 , it is assigned to C 3 on D 2 and D 3 and to C 5 on D 1 and D 4 . This alternative attains the extreme values on different criteria. However, it performs relatively well on the criteria with a significant impact on the classification, i.e., g 3 , g 4 and g 5 , which justifies its relatively high comprehensive values. They are slightly lower for fume hood (D 2 ) and fume hood with HEPA filter (D 3 ), mainly due to zero marginal values on some criteria. These criteria form an example of the direction in which the health managers should work to verify if any of them can be improved to increase a comprehensive value and justify the assignment to a less risky class for all decision attributes. As far as the evaluation of a 45 is concerned, it attains high or moderate marginal values on g 2 , g 3 , g 4 , g 7 , and g 8 when assessed in terms of D 1 , D 3 , and D 4 . This allows exceeding the lower threshold of class C 2 for these decision attributes. When it comes to D 2 , the significant contribution to the overall quality of a 45 is offered only by g 8 , implying the assignment to the least preferred class C 1 . A comprehensive evaluation of a 45 as ''optional" given HEPA vacuum cleaner (D 4 ) is justified mainly by the value added by quantity (g 6 ). Nevertheless, the indication of class at most C 3 for all decision attributes can be perceived as a ''safety warning flag", suggesting that this exposure scenario should be prioritized. In general, the greater the risk associated with the respective class, the greater the attention that should be paid to its further investigation. The marginal value functions for which the alternative attains very low values should be analyzed to identify the performance modifications offering a significant increase of the comprehensive value.
In the e-Appendix (supplementary material available online), we compare the method introduced in this paper with the one we proposed in [31]. We also collate the outcomes obtained for the case study with both methods. This required a suitable methodological extension of the approach presented in [31] to a multi-decision setting considered in this work.

Conclusions and future work
We considered and formalized a new problem of multi-decision sorting in Multiple Criteria Decision Analysis. In this problem, besides performances on multiple criteria, each alternative gets evaluated in terms of multiple decision attributes involving pre-defined preference-ordered classes. To solve such a problem, one needs to construct a set of individual sorting models, one for each decision attribute. They should reflect both intra-decision dependencies between the assignments of different alternatives and inter-decision dependencies between the classes desired for the same alternative on various decision attributes.
We have proposed a preference disaggregation method for dealing with multi-decision sorting. In this approach, the DM is expected to assign a subset of reference alternatives to a single class for each decision attribute by indicating the quality or risk level on the pre-defined scale for all decision attributes. Such indirect preference information is used to learn a set of interrelated models composed of an additive value function and class thresholds separating the decision classes on a comprehensive value scale. The preference modeling involves intra- and inter-decision constraints intending to reproduce the assignments of as many reference alternatives as possible.
The proposed framework has been extended with a novel proposal for dealing with criteria for which the preference direction cannot be specified a priori. Explicitly, a marginal value function for each non-monotonic criterion is represented as a sum of non-decreasing and non-increasing components. In this way, the resulting marginal function can take an arbitrary shape. Hence, it can represent local monotonicity relationships in different regions of the performance scale. The interpretability of the model is enhanced by the monotonicity of the non-decreasing and non-increasing components, as well as the normalization imposed by the subtraction of biases, guaranteeing that an anti-ideal alternative would attain a zero comprehensive value.
The introduction of a new type of multiple criteria problem and the dedicated methodology have been motivated by the peculiarity of nanomaterials' exposure management. In this context, each exposure scenario is described in terms of various characteristics of a given nanomaterial and working conditions related to its production. However, it is also evaluated in terms of different safety measures corresponding to various types of precautions. Each precaution can be modeled as a decision attribute capturing the potential level of concern related to a nanomanufacturing scenario. We have considered four inter-related precautions representing personal protective equipment, engineering controls, and work practices.
The analysis of desired assignments provided by the health and safety managers for forty exposure scenarios allowed us to construct four inter-related classification models. These models captured some patterns and regularities from experts' judgments for risk management in nanomanufacturing. In particular, the highest maximal share in the comprehensive values of alternatives was attributed to airborne capacity, detection limit, and exposure limit. Furthermore, the high variability of marginal values assigned to different performances on the same criterion indicated the directions for analysis of nanomanufacturing processes to reduce the risk level by vastly increasing the marginal value with a small modification of performance. For example, we can consider changing the toxicity from high to moderate, decreasing the airborne capacity from high to moderate or low, decreasing the exposure limit to less than 0.1 fiber/cc, reducing the quantity to less than 1 kg, or nullifying multiple exposures. Even though the shapes of marginal value functions related to various decision attributes were similar for most criteria, some differences revealed the peculiarities of risk management in the context of incorporating the respirator, fume hood with and without the HEPA filter, or HEPA vacuum cleaner. For example, the marginal functions for the precautions involving the HEPA filter were very similar, which is probably related to the notable reliance on this type of filter to reduce the potential concern during the production processes. On the contrary, the functions for fume hood with or without the HEPA filter were rather different, confirming their complementarity. We have also demonstrated that the constructed model can support decision-making by applying it to the classification of five non-reference exposure scenarios with unknown risk levels.
These sorting models could thus be used to provide decision recommendations on multiple risk management measures - corresponding to various types of precautions - for nanomanufacturing processes, especially those where there is still high uncertainty in the operational conditions as well as the physicochemical and toxicological characteristics of the nanomaterials.
The potential extensions of the proposed method are fourfold. The motivation for the first development comes from the large number of constraints imposed by the intra- and inter-decision relationships and the use of binary variables that are needed to find the largest subset of reference alternatives for which an assumed model can reproduce the expert judgments. These factors imply that the proposed method, requiring a mathematical programming solver, cannot be applied in problems with thousands of alternatives and hundreds of decision attributes. An adaptation to such a setting would require the development of heuristic algorithms incorporating machine learning techniques. As opposed to the proposed framework, they should not attempt to find an accurate, optimal solution, searching instead for a highly satisfactory model in an approximate way.
Secondly, the marginal functions for which the monotonicity direction cannot be pre-defined can be modeled differently, without incorporating the non-decreasing and non-increasing components. In particular, the proposed methodology remains valid when the functions for potentially non-monotonic criteria are inferred to minimize either the number of changes in monotonicity [31] (see e-Appendix) or the sum of changes in slopes [30,56]. Similarly, the framework remains valid with threshold-based sorting procedures incorporating preference models other than an additive value function to compute alternatives' scores (e.g., the Choquet integral [57,58]).
Thirdly, the proposed method can be extended with the robustness analysis framework [4,23,58]. In this approach, one should account for all compatible multi-decision sorting models instead of a single representative one. Then, one should capture the potential variability of assignments for the non-reference alternatives given the multiplicity of analyzed models consistent with the DM's judgments. Such an approach can be extended to analyze all maximal subsets of reference alternatives for which the provided assignments are consistent with an assumed model [59].
Finally, the idea of evaluating each alternative with a set of inter-related preference models can be adjusted to other problem types. For example, various value functions can be used to assess the suitability of a given alternative to be assigned to the groups of alternatives exhibiting different characteristics and being similar in terms of the DM's preferences. Then, it should be placed in a group for which the attained comprehensive value is the greatest. Such an approach could provide a novel way of dealing with multiple criteria nominal classification [60].

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.