DMsan: A Multi-Criteria Decision Analysis Framework and Package to Characterize Contextualized Sustainability of Sanitation and Resource Recovery Technologies

In resource-limited settings, conventional sanitation systems often fail to meet their goals—with system failures stemming from a mismatch among community needs, constraints, and deployed technologies. Although decision-making tools exist to help assess the appropriateness of conventional sanitation systems in a specific context, there is a lack of a holistic decision-making framework to guide sanitation research, development, and deployment (RD&D) of technologies. In this study, we introduce DMsan—an open-source multi-criteria decision analysis Python package that enables users to transparently compare sanitation and resource recovery alternatives and characterize the opportunity space for early-stage technologies. Informed by the methodological choices frequently used in literature, the core structure of DMsan includes five criteria (technical, resource recovery, economic, environmental, and social), 28 indicators, criteria weight scenarios, and indicator weight scenarios tailored to 250 countries/territories, all of which can be adapted by end-users. DMsan integrates with the open-source Python package QSDsan (quantitative sustainable design for sanitation and resource recovery systems) for system design and simulation to calculate quantitative economic (via techno-economic analysis), environmental (via life cycle assessment), and resource recovery indicators under uncertainty. Here, we illustrate the core capabilities of DMsan using an existing, conventional sanitation system and two proposed alternative systems for Bwaise, an informal settlement in Kampala, Uganda. The two example use cases are (i) use by implementation decision makers to enhance decision-making transparency and understand the robustness of sanitation choices given uncertain and/or varying stakeholder input and technology ability and (ii) use by technology developers seeking to identify and expand the opportunity space for their technologies. 
Through these examples, we demonstrate the utility of DMsan to evaluate sanitation and resource recovery systems tailored to individual contexts and increase transparency in technology evaluations, RD&D prioritization, and context-specific decision making.

Terminology used to inform and describe the DMsan package.

Term | Acronym | Description
Decision-making for sanitation and resource recovery systems | DMsan | A Python package for decision-making of sustainable sanitation and resource recovery systems in low-income settings.
Multi-criteria decision analysis | MCDA | A methodology that compares multiple conflicting criteria and preferences to measure the probability that an alternative will outrank other alternatives.
Analytical hierarchy process | AHP | A method of MCDA that uses pairwise comparisons to generate criteria weights and/or rankings of alternatives.
Technique for order of preference by similarity to ideal solution | TOPSIS | A method of MCDA that uses Euclidean distances to the best and worst options to generate rankings of alternatives.
Life cycle assessment | LCA | A technique to quantify the environmental impacts of a technology.
Techno-economic analysis | TEA | A technique to quantify the economic costs of a technology.
Alternative | - | Within MCDA, an option that is compared against other options (e.g., Alternative A, Alternative B, and Alternative C).
Criterion | - | The main or principal decision-making categories of the framework (i.e., technical, environmental, resource recovery, economic, and social).
Criterion weight | - | The weight given to a criterion to represent its significance in MCDA.
Sub-criterion | - | The set of decision-making categories that fall under a criterion.
Indicator | - | A specific aspect on which the technology alternative is assessed (e.g., number of jobs created).
Indicator contextual driver | - | Used to generate an indicator weight (e.g., extent of training, community preference, population growth).
Indicator weight | - | The contextual/location-specific importance of a specific indicator, generated from the indicator contextual drivers through AHP.
Indicator score | - | The value assigned to an indicator to measure the ability of an individual aspect of a technology before normalization and weighting.
Performance score | - | The total quantitative score generated for each alternative, used for ranking.
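To make the TOPSIS entry above concrete, the following is a minimal sketch of the generic textbook procedure (vector normalization, weighting, Euclidean distances to the ideal best and worst points, and a closeness score). It is not DMsan's implementation or API: the function name `topsis_scores`, the equal weights, and the choice of only three indicators are assumptions made for illustration. The indicator scores are the values reported for Alternatives A, B, and C in Section S2 (accessibility to parts, construction skills required, and annual cost per capita).

```python
import math

def topsis_scores(scores, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row of `scores`)."""
    n_ind = len(weights)
    # Vector-normalize each indicator column, then apply the indicator weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) for j in range(n_ind)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_ind)]
        for row in scores
    ]
    # Ideal best/worst per indicator: max is best for benefit indicators,
    # min is best for cost indicators.
    best = [(max if benefit[j] else min)(r[j] for r in weighted) for j in range(n_ind)]
    worst = [(min if benefit[j] else max)(r[j] for r in weighted) for j in range(n_ind)]
    closeness = []
    for r in weighted:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.5)
    return closeness

# Rows: Alternatives A, B, C.
# Columns: accessibility to parts (T3), construction skills required (T5),
# annual cost per capita in USD (Econ1).
scores = [[5, 5, 14.23], [4, 2, 7.34], [5, 5, 22.06]]
weights = [1 / 3, 1 / 3, 1 / 3]   # assumed equal weights, for illustration only
benefit = [True, True, False]     # a higher cost is worse

closeness = topsis_scores(scores, weights, benefit)
for name, c in zip("ABC", closeness):
    print(f"Alternative {name}: {c:.3f}")
```

Higher closeness values indicate alternatives nearer the ideal point. With only three indicators and equal weights, the ordering printed here is illustrative and should not be read as a result of the study.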
A literature review was conducted to identify the commonly used criteria and indicators in sanitation decision-making studies. Four main criteria were identified: environmental (32 out of 35 papers), economic (29 out of 35 papers), technical (28 out of 35 papers), and social (26 out of 35 papers). The most used indicators (appearing in 3 or more papers) are summarized below (Table S3), with all indicators used across published literature appearing in the later sections describing criteria indicators and scoring.

S2. Determination of indicator scores
Five criteria, consisting of technical, resource recovery, environmental, economic, and social, are used within DMsan to evaluate the performance of sanitation systems. This section provides details about the indicators and how to calculate or assign indicator scores.
Indicator scores are assigned for Alternatives A, B, and C in each indicator scoring table in this section. System details are described in the main text of the manuscript and in Trimmer et al. 43 In summary, Alternative A is the existing sanitation system incorporating pit latrines, vacuum collection trucks, centralized treatment (sedimentation, solids drying beds, and lagoons), and recovery of nutrients for fertilizer (dried solids and nutrient-rich liquid effluent). Alternative B replaces the existing centralized treatment with an anaerobic baffled reactor, solids drying beds, and an additional planted bed for liquid treatment, with solid and liquid nutrients recovered for land application and biogas recovered for cooking fuel. Alternative C replaces the existing pit latrines with container-based urine-diverting dry toilets that use urine handcart and solids truck transportation to bring resources to the centralized treatment facility described in Alternative A, excluding sedimentation, as liquids and solids are already separated. Alternative C increases the nutrient recovery potential (relative to Alternatives A and B) through source separation.

Table S5. Summary of criteria, sub-criteria, and indicators included in DMsan. Qualitative indicator scores are manually assigned using the predefined ranges described in this section (Section S2), and quantitative indicator scores are simulated using QSDsan (Section S5). Simulated indicator scores and calculations were informed using the methods and calculations described in Trimmer et al. 43 Indicators with an asterisk were excluded in the Uganda example highlighting the use of DMsan because property managers were not responsible for system disposal and cleaning efforts.

Criteria | Sub-criteria | Indicator | Type of input | Scoring table
Technical | Resiliency and robustness | User interface robustness | Manual score input | Table S7
Technical | Resiliency and robustness | Resiliency of treatment type | Manual score input | Table S8
Technical | Feasibility | Accessibility to parts | Manual score input | Table S9
Technical | Feasibility | Transportation feasibility | Manual score input | Table S10
Technical | Feasibility | Construction skills required | Manual score input | Table S11
Technical | Feasibility | Operation and maintenance skills required | Manual score input | Table S12
Technical | Flexibility | Population flexibility | Manual score input | Table S13
Technical | Flexibility | Power outage flexibility | Manual score input | Table S14
Technical | Flexibility | Drought flexibility | Manual score input | Table S15
Resource recovery | Resource recovery feasibility | Water recovery | Simulated | -
Resource recovery | Resource recovery feasibility | Nitrogen recovery | Simulated | Table S16
Resource recovery | Resource recovery feasibility | Phosphorus recovery | Simulated | Table S16
Resource recovery | Resource recovery feasibility | Potassium recovery | Simulated | Table S16
Resource recovery | Resource recovery feasibility | Energy recovery | Simulated | Table S17
Resource recovery | Resource recovery feasibility | Supply chain feasibility | Manual score input | Table S18
Environmental | Life cycle environmental impacts | Damage to ecosystems | Simulated | Table S20
Environmental | Life cycle environmental impacts | Damage to human health | Simulated | Table S20
Environmental | Life cycle environmental impacts | Damage to resource availability | Simulated | Table S20
Economic | Net costs | Annual cost per capita | Simulated | Table S22
Social | Job creation | Total jobs created | Simulated | Table S24
Social | Job creation | High-paying jobs created | Simulated | Table S24
Social | End-user acceptability | Disposal frequency | Simulated | Table S25
Social | End-user acceptability | Cleaning requirement | Manual score input | Table S26
Social | End-user acceptability | Privacy | Simulated | Table S27
Social | End-user acceptability | Odor and flies | Manual score input | Table S28
Social | End-user acceptability | Security | Manual score input | -
Social | Property manager acceptability | Disposal frequency* | Manual score input | -
Social | Property manager acceptability | Cleaning requirement* | Manual score input | -

S2.1.1. Resiliency and robustness sub-criterion indicators
The resiliency and robustness sub-criterion is focused on the ability of a system to achieve adequate performance under a variety of conditions for long periods of time. Some factors that will affect resiliency and robustness include adaptability to inputs at the user interface and the type of treatment (e.g., biological, chemical, etc.).

User interface robustness (T1).
The user interface describes the type of toilet, pedestal, pan, or urinal that the user interacts with to access the sanitation system. There are five categorized types of interfaces: dry toilet, pour-flush toilet, cistern flush toilet, urine-diverting dry toilet (UDDT), and urine-diverting flush toilet (UDFT). Successful user interface selection is influenced by the following factors: availability of water for flushing, habits and preferences of the users, special needs of user groups, local availability of materials, and compatibility with the collection and storage/treatment or conveyance of the system. 45 To identify the complexity in using these interfaces and on-site storage, a 1 to 5 scale was created where one is the most complex and most difficult to use and five is the least complex and simplest to use, according to former studies (Table S7). 45 The scale below was used to give a user interface robustness score to each alternative. The less complex the user interface, the less likely it will lead to user failure (if adequate training is not in place). Alternatives A and B have a pour-flush toilet and a score of 4, and Alternative C has a UDDT with a score of 2.

Indicator score | Description | Example user interface | A | B | C
1 | Highly complex user interface that has labor-intensive maintenance, requires training and acceptance to be used correctly, is prone to misuse and clogging, and requires a constant source of water. | Urine-diverting flush toilet (UDFT) | - | - | -
2 | Complex user interface that requires training and acceptance to be used correctly, is prone to misuse and clogging with feces, and the excreta pile is visible. | Urine-diverting dry toilet (UDDT) | - | - | X
3 | Moderately complex user interface that may not be repaired locally, requires a constant source of water, and operating costs depend on the price of water. | Cistern flush toilet | - | - | -
4 | Simple user interface that requires a constant source of water and coarse dry cleansing materials may clog the water seal. | Pour-flush toilet | X | X | -
5 | Simplest user interface that does not require training, is not prone to clogging, and does not require water use but attracts flies and odor. | Dry toilet | - | - | -
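The user interface robustness scale is effectively a lookup from interface type to score. As a minimal sketch in plain Python (not DMsan's API; the dictionary and function names are hypothetical, and placing the dry toilet at score 5 is an assumption inferred from the score-5 description), the T1 scores for the three alternatives could be assigned as follows:

```python
# Hypothetical lookup built from the user interface robustness scale
# (1 = most complex interface, 5 = simplest interface).
USER_INTERFACE_ROBUSTNESS = {
    "UDFT": 1,
    "UDDT": 2,
    "cistern flush toilet": 3,
    "pour-flush toilet": 4,
    "dry toilet": 5,  # assumption: inferred from the score-5 description
}

# User interface of each alternative, as described in the text.
ALTERNATIVE_INTERFACES = {
    "A": "pour-flush toilet",
    "B": "pour-flush toilet",
    "C": "UDDT",
}

def score_user_interface(interfaces, scale=USER_INTERFACE_ROBUSTNESS):
    """Return the robustness score of each alternative's user interface."""
    return {alt: scale[ui] for alt, ui in interfaces.items()}

t1_scores = score_user_interface(ALTERNATIVE_INTERFACES)
print(t1_scores)
```

Keeping the scale in one dictionary makes it straightforward to swap in a modified scale when evaluating interfaces outside the five listed types.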

Resiliency of treatment type (T2).
Treatment type resiliency is based on the type of treatment employed for wastewater treatment: anaerobic (e.g., anaerobic lagoons, anaerobic reactors, etc.), aerobic (e.g., activated sludge, fixed-bed and moving bed bioreactors, aerobic membrane bioreactors, biological trickling filters, etc.), and chemical/thermal (e.g., disinfection, ion-exchange, gasification, incineration, pyrolysis, etc.). We assumed that anaerobic systems have low resilience due to an increased risk of fouling and a need for more operation and maintenance, 46 aerobic systems have moderate resilience, and thermal or chemical systems have high resilience. Decision-makers and technology developers interested in evaluating systems outside of conventional wastewater treatment (e.g., novel technologies and resource recovery alternatives) can modify the scoring scale to evaluate the relative difference in resilience across systems and apply scores accordingly. An indicator scale of 1 to 3 was created, where one is low resilience and three is high resilience (Table S8). Each alternative in the illustration used anaerobic treatment and received a score of 1.

S2.1.2. Feasibility sub-criterion indicators
The feasibility sub-criterion is related to the availability of required resources for a system to operate. In the literature review, feasibility, adaptability, applicability, and complexity were used to assess if a community had the resources (e.g., land, human resources, parts, etc.) needed to maintain the system. The MCDA indicators to quantify the feasibility of the system are accessibility to parts in case a part needs to be replaced, transportation feasibility related to the complexity of conveyance, construction skills required to build the system, and operation and maintenance (O&M) expertise needed for the system to function.
Accessibility to parts (T3).
Sanitation systems often fail within a couple of years of implementation, frequently because they break down with no attention given to how they can be repaired. The accessibility to parts indicator focuses on how easily parts can be obtained during the construction and maintenance phases of an alternative's lifetime. A qualitative score of one to five is used, where one represents parts that have very low accessibility (e.g., custom electrodes) and five represents parts that have very high accessibility (e.g., concrete) (Table S9). In this example, Alternatives A and C are assumed to have very high accessibility (indicator score = 5), and Alternative B has high accessibility (indicator score = 4) because its biogas reactor may have some parts that need to be shipped.

Indicator score | Description | Examples | A | B | C
4 | High accessibility as most parts may be locally available or easily accessible but some parts may need to be shipped. | Pour-flush toilet, settler, biogas reactor | - | X | -
5 | Very high accessibility as parts only require plastic, concrete, and other easily accessible or local material. | Dry toilet, pit latrine, sedimentation tank | X | - | X

Transportation feasibility (T4).
The transportation feasibility indicator evaluates the complexity of the conveyance system required for each alternative. Transportation feasibility is scored on a scale of 1 to 7, where 7 is the lowest complexity and 1 is the highest complexity (Table S10). In this illustration, Alternatives A and B use tanker trucks (indicator score = 5), and Alternative C uses trucks and pushcart operators (indicator score = 6). For alternatives that have the necessary conveyance infrastructure already in place, DMsan users can select a score of 7 to represent no additional infrastructure required. For systems without conveyance of waste (e.g., scenarios in which a new pit latrine is dug rather than the existing one being pumped empty), DMsan users can set the weight of the indicator to 0 to indicate that it is not being included.

Indicator score | Description | Example
7 | Simplest transportation infrastructure required / appropriate infrastructure already in community. | Jerrycan or tank

Construction skills required (T5).
The construction skills required can be used to evaluate the complexity of implementation. The indicator is scored from 1 to 5, where one is very advanced skills required and five is very minimal skills required to build the system (Table S11). The Compendium of Sanitation Systems and Technologies 45 provides insight into the construction complexity required for conventional technologies, such as constructed wetlands, biogas reactors, and other components. The scores and descriptions were informed by the construction complexities described in the compendium; however, DMsan users can assign construction skill levels through conversations with technology developers or, if engineers/technology developers are not available for feedback during the decision-making phase, using their own judgement (following the examples in the table). Alternative A does not require any construction as the system already exists and receives a score of 5. Alternative B does not require any construction for storage but does require the construction of an anaerobic baffled reactor, unplanted and planted drying beds (solids), and a planted bed (liquids) and is given a score of 2. Alternative C does not require construction as the solids and liquids are transported to the site to be treated by the existing treatment system, so it is given a score of 5.

Indicator score | Description | Examples | A | B | C
5 | Minimal to no construction skills required. | On-site treatment shipped in a container, existing system | X | - | X

Operation and maintenance skills required (T6).
The final feasibility indicator is related to the operation and maintenance skills required for the system. This indicator is measured on a scale of 1 to 5, where a score of one requires very advanced operation and maintenance skills and five requires the simplest operation and maintenance skills (Table S12). The scores and descriptions were informed by the operation and maintenance complexities described in the compendium; however, DMsan users can assign operation and maintenance skill levels through conversations with technology developers or, if engineers/technology developers are not available for feedback during the decision-making phase, using their own judgement (following the examples in the table). For Alternative A, a score of 3 (moderate O&M) was assigned because the centralized treatment involves constructed wetlands and sedimentation. Alternative B uses an anaerobic baffled reactor, which resulted in a score of 2 (advanced O&M). Alternative C is the same as Alternative A when it comes to operation and maintenance required for treatment, so the score for this alternative is also 3 (moderate O&M).

S2.1.3. Flexibility sub-criterion indicators
The final sub-criterion of the technical criterion is related to the flexibility of a sanitation system to withstand societal and environmental changes while maintaining ability. The flexibility of a system is characterized by flexibility to population growth, electricity blackouts, and drought.

Population flexibility (T7).
Population flexibility relates to the ability of a sanitation system to withstand population changes, specifically population growth. It is quantified using a scale of 1 to 3, where 1 is assigned to a system with low flexibility (handles less than a 10% increase in population), 2 to a system with some flexibility (handles a 10-25% increase in population), and 3 to a system with the most flexibility (handles greater than a 25% increase in population) (Table S13). Alternatives A, B, and C can handle a 10-25% population increase and were assigned a score of 2. Population increases for Alternatives A, B, and C were assigned using the authors' expert judgement. DMsan users can work with technology developers or use their own expert judgement to evaluate the population flexibility of alternatives.

Table S13. Indicator score descriptions to assess population flexibility and their application to Alternatives A-C.

Indicator score | Description | A | B | C
1 | Can handle less than a 10% increase in population. | - | - | -
2 | Can handle a 10-25% increase in population. | X | X | X
3 | Can handle more than a 25% increase in population. | - | - | -
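The population flexibility scale is a simple threshold mapping. The sketch below is one way to express it in plain Python; the function name and the percentage inputs in the example are hypothetical, and the function is not part of DMsan's API.

```python
def population_flexibility_score(max_increase_pct: float) -> int:
    """Map the largest population increase a system can handle (in %) to the 1-3 scale."""
    if max_increase_pct < 0:
        raise ValueError("max_increase_pct must be non-negative")
    if max_increase_pct < 10:
        return 1   # low flexibility: less than a 10% increase
    if max_increase_pct <= 25:
        return 2   # some flexibility: a 10-25% increase
    return 3       # most flexibility: more than a 25% increase

# Hypothetical capacities (in %) chosen to exercise each band and both boundaries.
example_scores = [population_flexibility_score(p) for p in (5, 10, 25, 40)]
print(example_scores)
```

A system judged to handle any increase within the 10-25% band, as assumed for Alternatives A, B, and C, maps to a score of 2.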

Power outage flexibility (T8).
Power outage flexibility is related to a system's energy requirement and its ability to operate without a constant energy source. It is characterized on a scale of 1 to 3, where 1 always requires grid electricity to run as designed, 2 can handle blackouts of up to 24 hours, and 3 has no dependence on grid electricity (Table S14). The indicator score is assigned based on the least flexible unit process of the system (e.g., if one step of the sanitation service chain always requires electricity and the remaining steps do not require electricity, the system will receive a score of 1). All alternatives always require grid electricity to run as designed and do not have any onsite storage or generation of electricity, so all alternatives were assigned a score of 1.

Drought flexibility (T9).
The ability of a sanitation system to withstand drought, given its reliance on water, is measured on a scale of 1 to 3, where 1 is highly dependent on water (always requires water for cleaning/maintenance, treatment, and conveyance), 2 is some dependence on water (only sometimes requires water, for cleaning/maintenance only), and 3 is no dependence on water (never needs water, or produces enough water onsite through water recovery) (Table S15). Alternatives A and B both consider an existing sewer network that is fed by flush toilets (highly dependent on water) and receive scores of 1. Alternative C only requires water for cleaning of the urine-diverting dry toilets (some dependence on water) and receives a score of 2.
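The weakest-link rule described for power outage flexibility (T8), in which the system score is set by its least flexible unit process, can be sketched as a minimum over per-unit scores. The unit-process names and scores below are hypothetical and only illustrate the rule; this is not DMsan's API.

```python
def system_flexibility_score(unit_scores):
    """Return the system-level score as the minimum of the unit-process scores."""
    if not unit_scores:
        raise ValueError("at least one unit process is required")
    return min(unit_scores.values())

# Hypothetical service chain: only the treatment step always needs grid electricity.
unit_scores = {
    "user interface": 3,   # no dependence on grid electricity
    "conveyance": 3,       # no dependence on grid electricity
    "treatment": 1,        # always requires grid electricity
}

score = system_flexibility_score(unit_scores)
print(score)
```

Because one step always requires electricity, the whole system receives a score of 1, matching the example given in the text.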

S2.2. Description and scoring of resource recovery indicators
Resource recovery is important to consider in contexts with low resource access (e.g., water, energy, nutrients) and in regions shifting to circular economies with more sustainable resource acquisition. Six indicators were selected for this criterion to quantify water, energy, and nutrient recovery from the system and the complexity of the supply chain necessary to deliver recovered resources.

S2.2.1. Resource recovery feasibility sub-criterion indicators
Water recovery (RR1).
Water recovery is calculated as the volume (e.g., L per day) of water recovered to be reused within the system (e.g., flush water) or used for purposes outside of the system boundary (e.g., irrigation, household cleaning, etc.). This indicator was excluded from this analysis because none of the three systems included water recovery in its design; however, the indicator should be included in future analyses of systems that include water recovery.

Nitrogen, phosphorus, and potassium recovery (RR2, RR3, RR4).
The second, third, and fourth indicators quantify the nutrient recovery from the system. Recovered nitrogen, phosphorus, and potassium are calculated in QSDsan using an expected user's dietary intake (i.e., daily calories, vegetal protein, animal protein), expected food waste, and nutrient losses along the system (e.g., ammonia volatilization) (Table S16). 43 The calculations and simulations are further discussed in Section S5.

Energy recovery (RR5).
The fifth indicator quantifies the energy recovered from the system as a fraction of the COD input recovered as energy (Table S17). Alternatives A and C do not have energy recovery, and the energy recovered in biogas for Alternative B was simulated (detailed in Section S5).

Supply chain feasibility (RR6).
The sixth indicator to characterize the resource recovery criterion is related to the complexity and feasibility of the supply chain to deliver the resources. For inefficient supply chains (based on the contextual driver of the country), less complexity in the steps needed to distribute resources (e.g., on-site reuse) is preferred. A score of 1 to 4 is assigned based on the complexity of delivering the recovered resources to a consumer, where water recovered onsite for flush water scores 1, biogas recovered onsite for a wastewater treatment plant scores 2, nutrients recovered for agricultural purposes score 3, and biogas for offsite distribution scores 4 (Table S18). Alternatives A and C produce recovered liquid as nutrient-rich water for irrigation and recovered solids as fertilizer, which results in a score of 3. Alternative B produces biogas for cooking fuel and distributes recovered solids and liquids, which is more difficult, and is given a score of 4.

S2.3. Description and scoring of environmental indicators
The environmental criterion encompasses all aspects of how a technology impacts the environment. Articles in the literature review most often included energy and water consumption, air and water contamination, and sludge production as environmental indicators (Table S19).

Table S19. Environmental indicators identified in the literature review.

S2.3.1. Life cycle environmental impacts sub-criterion indicators
Life cycle assessment is used to calculate the environmental indicators. The ReCiPe life cycle assessment methodology was selected because it is used globally, and the endpoint indicators can be compared with universal units (points). 47 The indicators include damage to the ecosystem quality, damage to human health, and damage to resource availability and were calculated using the hierarchist cultural perspective (other perspectives include individualist and egalitarian) because it is considered to be the default model (Table S20). 47 Negative results indicate the damage offsets due to resource recovery are greater than the damage produced by the system itself (i.e., the lower the damage score, the better performance). The calculations and simulations are further discussed in Section S5.
Damage to ecosystem quality (Env1).
This endpoint indicator accounts for the local relative species loss in terrestrial, freshwater, and marine ecosystems over space and time. 47 Damage pathways for this indicator include damage to freshwater, terrestrial, and marine species as a result of global warming, water use, freshwater ecotoxicity, freshwater eutrophication, tropospheric ozone, terrestrial ecotoxicity, terrestrial acidification, land use or transformation, and marine ecotoxicity.

Damage to human health (Env2).
This endpoint indicator is quantified as disability-adjusted life years (DALYs) and represents the years lost for a person due to a disease or accident. 47 DALYs were converted to points to allow comparison across environmental indicators. Damage pathways for this indicator include increases in respiratory disease, in various types of cancer, in other diseases or causes, and in malnutrition due to particulate matter, tropospheric ozone formation, ionizing radiation, stratospheric ozone depletion, human toxicity (cancer and non-cancer), global warming, and water use.

Damage to resource availability (Env3).
This final endpoint indicator is quantified as the extra costs involved for future mineral and fossil resource extraction, converted to points. 47 The increased extraction costs are a consequence of the mineral and fossil resources used.

S2.4. Description and scoring of economic indicators
Economic feasibility is a critical factor in driving decision-making. Common indicators in literature include capital cost, operation and maintenance cost and recovered resource value (Table S21). Net annualized cost was selected as the primary indicator for the economic criterion to include the common cost indicators in literature.

Table S21. Economic indicators identified in the literature review.

S2.4.1. Net costs sub-criterion indicators
Techno-economic analysis was used to calculate the net annualized cost for each alternative. The calculation includes costs related to construction (capital), operation and maintenance, labor, transportation, consumables, electricity, and all other expenses required to construct and maintain each system. 43 The calculations and simulations are further discussed in Section S5, with complete details and modeling assumptions in Trimmer et al. 43

Annual cost per capita (Econ1).
The net annualized cost is normalized per capita and year to determine an annual cost per capita. The techno-economic analysis is conducted assuming Alternative A serves 456,667 users (40,000 on existing sewer and 416,667 on latrines) and has an 8-year lifetime, Alternative B serves 50,000 users and has a 10-year lifetime, and Alternative C serves 456,667 users (40,000 on existing sewer and 416,667 on latrines) and has an 8-year lifetime. Using these assumptions, Alternative A has an annual cost per capita of 14.23 USD·capita⁻¹·year⁻¹, Alternative B has an annual cost per capita of 7.34 USD·capita⁻¹·year⁻¹, and Alternative C has an annual cost per capita of 22.06 USD·capita⁻¹·year⁻¹ (Table S22).
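The per-capita normalization for Econ1 is a single division of the net annualized cost by the number of users served. The sketch below illustrates it for Alternative A. The function name is hypothetical, and the total net annualized cost is not reported in this section, so it is back-calculated here from the reported 14.23 USD·capita⁻¹·year⁻¹ purely for illustration (about 6.5 million USD per year); the actual TEA is performed in QSDsan.

```python
def annual_cost_per_capita(net_annualized_cost_usd: float, users_served: int) -> float:
    """Normalize a net annualized cost (USD per year) by the number of users served."""
    if users_served <= 0:
        raise ValueError("users_served must be positive")
    return net_annualized_cost_usd / users_served

# Users served by Alternative A: existing sewer plus latrines.
users_a = 40_000 + 416_667

# Assumption: total cost implied by the reported per-capita value,
# back-calculated for illustration rather than taken from the study.
implied_cost_a = 14.23 * users_a

per_capita_a = annual_cost_per_capita(implied_cost_a, users_a)
print(f"{users_a} users, {per_capita_a:.2f} USD per capita per year")
```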

S2.5. Description and scoring of social indicators
The final criterion selected for the package focuses on social drivers of decision-making. Often overlooked, the criterion is critical in understanding how a technology will be socially sustainable in a specific location. In the review, researchers considered job creation, sociocultural acceptability, compatibility with policy, visual impact, and additional indicators (Table S23). As a result, within the DMsan package, the social sub-criteria and indicators include job creation (total and high-paying jobs created), end-user acceptability (disposal frequency, cleaning requirement, privacy, odor and flies, and security), and property manager acceptability (disposal frequency and cleaning requirement). Compatibility with policy was assumed to be a constraint to system deployment and was not included as a social sub-criterion in the analysis. Any system that is not compatible with policy should automatically be excluded from the analysis before the MCDA is conducted.

Indicator | References
Rapidly accomplishable | 32
Socio-cultural/public acceptability | 3, 5, 8, 10, 15-19, 24, 25, 32, 35, 36
Consideration for poorest groups of society | 19, 32
Organizational capacity | 22
Visual impact | 13, 17, 21
Government support | 13, 19
Environmental impact perception | 15

S2.5.1. Job creation sub-criterion indicators
Two indicators are used to characterize job creation: total jobs created and high-paying jobs created (Table S24). For specific contexts it may be more influential to create high-paying jobs (skilled jobs), any jobs (unskilled and skilled), or both depending on the unemployment rate and the population below an income level. Since the total number of jobs created is dependent on the number of high-paying jobs created, adjustments to the number of high-paying jobs within the model will automatically adjust the total jobs created. Job requirements for each alternative were determined in Trimmer et al. 43

Total jobs created (S1).
The total jobs created is evaluated as the total skilled and unskilled jobs needed for an alternative. Alternatives A and C require 12 total employees and Alternative B requires 5 to 15 total employees. A uniform distribution was used in simulation of Alternative B's total jobs created.

High-paying jobs created (S2).
The high-paying jobs created is evaluated as the total skilled jobs needed for an alternative. In the illustration, Alternatives A and C do not require additional skilled employees and Alternative B requires an additional 5 skilled employees.
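The uniform distribution used for Alternative B's total jobs can be sketched with the Python standard library. The range of 5 to 15 comes from the text; the function name, the sample count, the seed, and the use of continuous rather than integer draws are assumptions for illustration, since the actual uncertainty analysis is run through QSDsan.

```python
import random

def sample_total_jobs(low: float, high: float, n_samples: int, seed=None):
    """Draw total-jobs samples from a uniform distribution on [low, high]."""
    rng = random.Random(seed)  # local generator so the global state is untouched
    return [rng.uniform(low, high) for _ in range(n_samples)]

# Alternative B: 5 to 15 total employees (range from the text).
samples_b = sample_total_jobs(5, 15, n_samples=1000, seed=0)
mean_b = sum(samples_b) / len(samples_b)
print(f"mean of sampled total jobs for Alternative B: {mean_b:.1f}")

# Alternatives A and C use a fixed value of 12 total employees.
fixed_jobs = {"A": 12, "C": 12}
```

Each sampled value would feed one Monte Carlo run, so the spread in Alternative B's job creation carries through to its performance score.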

S2.5.2. End-user acceptability sub-criterion indicators
The end-user acceptability sub-criterion is characterized with five indicators: disposal frequency, cleaning requirement, privacy, odor and flies production, and security. Indicator scores were determined by the community survey conducted in Trimmer et al. 43

Disposal frequency (S3).
This indicator represents the number of times per year that the end-user needs to dispose of the sludge in the storage container (Table S25). The more times a user must dispose of the waste, the higher the system maintenance is for the end-user. In this example, there are two storage containers: pit latrine and container-based sanitation. The pit latrine (Alternatives A and B) requires emptying every 0.

Cleaning requirement (S4).
High cleaning requirements are often reported as a source of dissatisfaction with a sanitation system. Within a sanitation system, users are often responsible for maintaining and cleaning the user interface. A scale was created that characterizes the cleaning requirements based on the reported maintenance for each user interface in the Compendium of Sanitation Systems and Technologies. 45 Urine-diverting dry (UDDT) and flush (UDFT) toilets are the most difficult to keep clean because users may have difficulty separating both streams perfectly, which may result in extra cleaning and maintenance, with water-based diversion systems slightly easier to maintain. Dry toilets and pit latrines do not require additional education and acceptance to be used correctly and are not prone to the clogging or misuse found in UDDT and UDFT systems, so they place a lower burden of cleaning and maintenance on the user. Finally, pour-flush toilets and cistern flush toilets have the easiest cleaning requirement, with pour-flush toilets being a little more difficult to keep clean and maintain because dry cleansing materials must be collected separately and not flushed down the toilet. As a result, a UDDT is given a score of 1 as it requires the most cleaning and the cistern flush toilet is given a score of 5 as it requires the least amount of cleaning (Table S26).

Privacy (S5). Privacy in sanitation helps vulnerable populations feel more secure when using the sanitation system and empowers users to better maintain and clean their toilets. Some communities value privacy higher than other communities. While privacy can make a system more acceptable for an end-user, it may come with tradeoffs, such as costs. Privacy was quantified using the number of households sharing a toilet (Table S27). In this illustration, all three alternatives assume 3 to 5 households are sharing a single pit latrine, flush toilet, or UDDT. Although the indicator scores were constant across all three alternatives in the illustration, decision-makers and technology developers could choose to evaluate sanitation value chains with varied number of households sharing a toilet (e.g., comparing an alternative in which each household gets their own toilet against another alternative with households sharing a single toilet).

Production of odors and flies (S6).
The production of odor and flies is influenced by the type of user interface and storage. 45 A scale was created to qualitatively score how a system might produce odors and flies. The compendium describes pit latrines/dry toilets as having noticeable odors even if equipped with a vent pipe and container-based options as having no real problems with odors and vectors (flies) if used and maintained correctly. As a result, pit latrines and dry toilets were assumed to score worse than container-based options with flush systems performing the best. A system with a high production of odors and flies was given a low score (e.g., a non-ventilated pit latrine with a dry toilet = 1) and a system that produced minimal odor and flies was given a high score (e.g., any flush toilet with proper ventilation = 5) (Table S28). DMsan users could also modify the scoring based on their applications (e.g., modify it to be a three-point scale with any non-ventilated toilets scoring 1, ventilated toilets scoring 2, and flush toilets scoring 3 if there is no difference between container-based and pit latrine odors and flies).

Security (S7).
Security is characterized as the distance a user must travel to use the sanitation system. In this illustration it is assumed the distance travelled does not change among the alternatives, making the indicator not applicable; however, decision-makers and technology developers can choose to vary the average distance between households and toilets across evaluation scenarios.

S2.5.3. Property manager acceptability sub-criterion indicators
Property manager acceptability sub-criterion indicators should be included in the analysis if the user interface is not owned by the user or community member. Disposal frequency and cleaning are the responsibility of the manager instead of the toilet user.

Disposal frequency (S8).
Management disposal frequency represents the number of times per year the property manager or owner of the system needs to empty the user interface. The more times a property manager disposes of waste, the higher the system maintenance. This indicator is characterized in the same manner as end-user disposal frequency, except the manager or owner is responsible instead of the toilet user. For this example, the user is responsible for maintaining the toilet, so this indicator was excluded.

Cleaning requirement (S9).
Cleaning requirement is characterized in the same manner as end-user cleaning requirement, except the manager or owner is responsible instead of the toilet user. The scores outlined in Table S26 can be used for this indicator if applicable (i.e., the toilet is not owned by the toilet user). For this example, the user is responsible for cleaning the toilet, so this indicator was excluded.

S3. Determination of indicator weights
For each indicator there is a corresponding indicator weight determined by the contextual driver (importance) of that indicator. The indicator weight for each indicator is calculated using analytical hierarchy process (AHP) pair-wise comparison criteria matrices. The contextual driver score for each indicator is normalized on a scale of 1 to 100 and compared against the other contextual driver scores within its criterion matrix. The contextual driver scores are built within DMsan (location.xlsx) and depend on context-specific information, such as the economic, ecological, and cultural landscape. Below is the description of each indicator contextual driver and the scores assigned for Uganda as well as a description of how the indicator weights are calculated using AHP. Environmental and economic indicators do not have contextual drivers because it is assumed that the environmental indicators are given equal, uniform weights (1/3 weight each) and the economic indicator is given a weight of 1.0 as it is the only indicator within the economic criterion.

Extent of training (T1).
The extent of training drives how important the indicator user interface robustness is for sanitation system selection. Failures can arise at the user interface when a community does not adequately invest in people to train end-users on how to properly use the system, including what can go in the toilet (e.g., feces, urine, paper products, water), how to clean the toilet, and other maintenance requirements. The World Economic Forum indicator 6.02 Extent of Staff Training is used for this contextual driver. 37 Extent of staff training is the extent of training that companies invest into their employees for each country and is scored on a scale of 1 to 7. The lower this score is, the higher the weight it will have in picking a simpler user interface that requires minimal training. For this example, Uganda has a score of 3.6 out of 7.

Population without at least basic sanitation (T2).
The population of a country without at least basic sanitation was selected as a contextual driver to represent the relative importance of resiliency of treatment type for sanitation system selection. Treatment type resiliency will be weighted higher in countries with high populations without at least basic sanitation because system failure in a community with widespread sanitation services is assumed to be less detrimental than in a community with low access to sanitation and no other options. The sanitation coverage data reported by World Health Organization (WHO) and the United Nations Children's Fund (UNICEF) is used to inform this indicator weight calculation. 38 At the time of the report (2020), 80% of the people in Uganda did not have access to at least basic sanitation facilities.

S3.1.2. Feasibility contextual drivers
Technology absorption (T3). Technology absorption was selected to describe the importance of the indicator accessibility to parts for sanitation system selection. A country's absorption of the latest technology could influence a community's ability to support a more advanced sanitation system that requires significant custom parts. Parts can be shipped to the community, but relying on shipped parts could limit the community's ability to replace them as the system is operated and maintained. Countries with higher levels of advanced technology were assumed to have better access to custom parts that may be required for novel and advanced systems: more access to custom parts resulted in lower importance of accessibility to parts because all parts are viewed as accessible. The World Economic Forum indicator 9.02 Firm-level Technology Absorption rates the extent to which businesses adopt the latest technologies on a scale of 1 to 7, with low scores representing low technology absorption rates. 37 For this example, Uganda has a score of 4.0 out of 7 in technology absorption.

Quality of roads (T4).
Quality of roads drives how important the indicator transportation feasibility is for sanitation system selection. The quality of roads can influence the type of conveyance system that can be implemented for a sanitation system. The World Economic Forum indicator 2.02 Quality of Roads rates the road qualities within a country on a scale of 1 to 7 with low scores representing extremely poor road conditions. 37 Uganda has a score of 3.4 out of 7 in quality of roads.

Construction skills available (T5).
Construction skills available drives how important the indicator construction skills required is for sanitation system selection. The score for this contextual driver is based on the fraction of the workforce employed in the construction field. 39 Uganda's construction workforce makes up 2.1% of the entire workforce. Across all countries in the database, the maximum fraction of the workforce employed in the construction field is 40.5%. To determine an overall score for this contextual driver, the country-specific fraction was divided by the maximum fraction (i.e., 2.1/40.5 * 100), thus the score for Uganda is 5.2 out of 100.

Professional skills available (T6).
Professional skills available drives how important the indicator operation and maintenance skills required is for sanitation system selection. The World Economic Forum indicator 12.06 Availability of Scientists and Engineers rates the extent to which scientists and engineers are available on a scale of 1 to 7, with low scores representing unavailable professionals. 37 Uganda has a score of 4.1 out of 7.

S3.1.3. Flexibility contextual drivers
Population growth rate (T7). The population growth rate drives how important the indicator population flexibility is for sanitation system selection. The population growth rate for Uganda is 3.6%. 40 Across all countries in the database, the maximum growth rate is 4.5% and the minimum growth rate is -1.8%. To determine an overall score for this contextual driver, the country-specific growth rate was divided by the difference between the maximum and minimum growth rates (i.e., 3.6/[4.5 -[-1.8]] * 100), thus the score for Uganda is 57.1 out of 100.

Electricity coverage (T8).
The electricity coverage drives how important the indicator power outage flexibility is for sanitation system selection. The contextual driver is based on the power outages in firms in a typical month reported by The World Bank. 40 Uganda has 6.3 blackouts per month. Across all countries in the database, the maximum number of blackouts per month is 75.2. To determine an overall score for this contextual driver, the country-specific number of blackouts was divided by the maximum number of blackouts (i.e., 6.3/75.2 * 100), thus the score for Uganda is 8.4 out of 100.

Baseline water stress (T9/RR1).
The baseline water stress drives how important the indicator drought flexibility is for sanitation system selection. The baseline water stress, reported by the World Resources Institute, measures the ratio of total water withdrawals to available renewable water supplies, where high values indicate more competition for water among users (i.e., more water stress). 41 The baseline water stress in Uganda is 0.26. Across all countries in the database, the maximum baseline water stress is 4.82. To determine an overall score for this contextual driver, the country-specific baseline water stress was divided by the maximum baseline water stress (i.e., 0.26/4.82 * 100), thus the score for Uganda is 5.4 out of 100.
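The scaled contextual driver scores reported above for T5, T7, T8, and T9/RR1 can be reproduced with a few lines of Python. This is a minimal sketch that uses only the values quoted in the text; the helper name `scale_to_100` is hypothetical and is not part of DMsan.

```python
def scale_to_100(value, denominator):
    """Scale a country-specific value to a score out of 100 (hypothetical helper)."""
    return round(value / denominator * 100, 1)

# Values for Uganda and database-wide extremes, as quoted in the text.
construction_skills = scale_to_100(2.1, 40.5)         # T5: % of workforce in construction
population_growth = scale_to_100(3.6, 4.5 - (-1.8))   # T7: divided by the (max - min) range
electricity = scale_to_100(6.3, 75.2)                 # T8: blackouts per month
water_stress = scale_to_100(0.26, 4.82)               # T9/RR1: baseline water stress

print(construction_skills, population_growth, electricity, water_stress)
```

Note that the population growth rate (T7) is the only driver here divided by a range rather than by the maximum alone, because growth rates can be negative.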

S3.2. Resource recovery indicator contextual drivers
Baseline water stress (T9/RR1). As described in the previous section, the baseline water stress drives how important the indicator water recovery is for sanitation system selection. This contextual driver is used to develop indicator weights for both drought flexibility (T9) and water recovery (RR1).

Nitrogen fertilizer fulfillment (RR2).
Nitrogen fertilizer fulfillment drives how important the indicator nitrogen recovery is for sanitation system selection. This contextual driver is calculated as the ratio between the nitrogen fertilizers used and the nitrogen fertilizers needed based on crop production in the country. The World Bank reports the fertilizer use by country as Fertilizers by Nutrient and the crop production within a country as Crops and Livestock Products. 40 Nitrogen fertilizer need is calculated using the crop production and the recommended fertilizer application by crop. 42 The calculated Uganda nitrogen fertilizer fulfillment is 1.4% out of 100%.

Phosphorus fertilizer fulfillment (RR3).
Like nitrogen fertilizer fulfillment, phosphorus fertilizer fulfillment drives how important the indicator phosphorus recovery is for sanitation system selection. The calculated Uganda phosphorus fertilizer fulfillment is 1.1% out of 100%.

Potassium fertilizer fulfillment (RR4).
Like nitrogen and phosphorus fertilizer fulfillment, potassium fertilizer fulfillment drives how important the indicator potassium recovery is for sanitation system selection. The calculated Uganda potassium fertilizer fulfillment is 0.5% out of 100%.

Renewable energy consumption (RR5).
Renewable energy consumption drives how important the indicator energy recovery is for sanitation system selection. This contextual driver is based on the renewable energy consumption (% of total final energy consumption) reported by The World Bank. 40 In Uganda, renewable energy accounts for 89% of total final energy consumption.

Infrastructure quality (RR6).
Infrastructure quality drives how important the indicator supply chain feasibility is for sanitation system selection. The World Economic Forum indicator 2.01 Quality of Overall Infrastructure rates the general state of infrastructure (e.g., transportation, communications, and energy) on a scale of 1 to 7 with low scores representing extremely underdeveloped infrastructure. 37 Uganda has a score of 3.3 out of 7 for overall infrastructure quality.

S3.3. Environmental and economic indicator contextual drivers
There are no contextual drivers for the environmental and economic indicators. It is assumed that the indicator weights are equal (1/3) for the three environmental indicators (Env1, Env2, and Env3), and no indicator weight is necessary for the single economic indicator (Econ1).

S3.4.1. Job creation contextual drivers
Unemployment rate (S1). Unemployment rate drives how important the indicator total jobs created is for sanitation system selection. A social benefit of sanitation infrastructure is the number of jobs that it can create, especially for communities who have high unemployment rates. The World Bank reports the unemployment total (% of total labor force), 40 and in Uganda, the unemployment rate is 2.4%.

International poverty line (S2).
The international poverty line drives how important the indicator high-paying jobs created is for sanitation system selection. Although job creation in any capacity can help communities facing unemployment challenges, high-paying jobs are especially important in communities with a high percentage of their employed population earning below the international poverty line ($1.90·day⁻¹). The International Labour Organization reports the percentage of employed individuals earning below $1.90·day⁻¹, 39 and in Uganda, the percentage of employed individuals living below $1.90·day⁻¹ is 35.1%.

S3.4.2. End-user acceptability contextual drivers
It can be difficult to determine community-specific indicator weights without surveying the individuals using the sanitation system. A survey of members of the Bwaise community evaluated which factors negatively affect their perception of sanitation systems. 43 Summarized results from the household survey in Bwaise and example survey questions can be found in the Trimmer et al. Supplementary Information. The community survey revealed that factors that cause dissatisfaction of the system include unclean facilities (66%), long waiting times (25%), fear for personal safety (24%), and limited privacy (16%). These values were used to calculate the related social indicator weights, and future research could develop community survey questions that explicitly allow toilet users to rank and input their preferences related to social indicators.

Disposal preference (S3).
In the survey, the community did not report any preference or dissatisfaction with the number of times to dispose of the waste in the storage container. As a result, no contextual driver score was assigned for this indicator.
Cleaning preference (S4). Two social indicator contextual drivers relate to unclean facilities: cleaning preference and odor and flies preference. Because it is unclear which is driving unclean facilities, it was assumed that 2/3 of the survey members report unclean facilities due to cleaning preference and 1/3 of the survey members report unclean facilities due to odor and flies present. The contextual driver score of 44 out of 100 (calculated as 2/3 * 66) was assigned to account for some of the respondents also being concerned with odors and flies.

Privacy preference (S5).
In the survey, a large portion of the community members reported that long waiting times, safety, and unclean facilities were a cause for dissatisfaction with their sanitation facility. More private toilets (shared by one or two households) could reduce the amount of waiting times, safety concerns, and unclean facilities. Furthermore, communities that feel ownership over their user interface and storage are more likely to maintain the system better. The community percentage of long waiting times and limited privacy were summed together (i.e., 25 + 16) to obtain a score of 41 out of 100 for this contextual driver.

Odor and flies preference (S6).
As discussed in the cleaning preference contextual driver, odor and flies could be a contributing factor to dissatisfaction in facility cleanliness. It was assumed 1/3 of survey respondents reporting unclean facilities are dissatisfied due to odor and flies. A score of 22 out of 100 (calculated as 1/3 * 66) was given for this contextual driver.
Security (S7). Security is the final contextual driver for end-user acceptability. It is a critical factor in sustainable and safe sanitation systems because insecure facilities can lead to psychological stress, threats of violence, and health issues. Since 24% of the population surveyed reported personal safety as a concern, a score of 24 out of 100 was used for this contextual driver.
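The end-user acceptability contextual driver scores above follow directly from the survey percentages. The short sketch below reproduces that arithmetic; the dictionary keys are hypothetical labels chosen for illustration, and the 2/3 versus 1/3 split of "unclean facilities" is the assumption stated in the text.

```python
# Share of surveyed Bwaise respondents (%) reporting each source of
# dissatisfaction, as quoted in the text.
survey = {"unclean": 66, "waiting": 25, "safety": 24, "privacy": 16}

driver_scores = {
    # Assumed 2/3 of "unclean facilities" responses relate to cleaning.
    "cleaning (S4)": survey["unclean"] * 2 / 3,
    # Long waiting times and limited privacy are summed.
    "privacy (S5)": survey["waiting"] + survey["privacy"],
    # Assumed remaining 1/3 of "unclean facilities" responses relate to odor and flies.
    "odor and flies (S6)": survey["unclean"] * 1 / 3,
    # Fear for personal safety is used directly.
    "security (S7)": survey["safety"],
}

for name, score in driver_scores.items():
    print(f"{name}: {score:g} out of 100")
```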

S3.4.3. Property manager acceptability contextual drivers
As with the end-user acceptability sub-criterion, it is difficult to weigh the landlord's perception of sanitation systems without information from the specific management. Users are encouraged to gather this information to adequately assess preferences for disposal and/or cleaning. Bwaise end-users are responsible for requesting disposal and cleaning the sanitation units; thus, management values were not applicable for this example.
Disposal preference (S8). For this example, this contextual driver was not applicable.
Cleaning preference (S9). For this example, this contextual driver was not applicable.

S3.5. The analytical hierarchy process (AHP) for calculating indicator weights
The following steps are used to calculate the indicator weights for technical, resource recovery, and social indicators. It is assumed that environmental indicators use uniform weights (i.e., 1/3 weight for each indicator) and economic uses an indicator weight of 1.0 for its single indicator (annual cost per capita).
To calculate the technical, resource recovery, and social indicator weights, tables are produced for each of the three criteria containing indicator weight scores. Then within each table, the contextual driver scores are normalized on a scale out of 100, and a pair-wise comparison matrix is formed by dividing each row value by the column value. The pair-wise comparison matrix is normalized to calculate the approximate eigenvector values (vi) based on the position values (aij) and number of array elements (n) representing the number of indicators (Equation S1). The criteria weight vectors (wi) are calculated by taking the average of each vector row in the normalized pair-wise comparison matrix (Equation S2).

Equation S2
Finally, to check for consistency, the weighted sum value is calculated by multiplying each row of the original pair-wise comparison matrix by the column indicator weight and dividing the product by the indicator weight for the specific row. Then, λmax is calculated by taking the average of the weighted sum values (Equation S3). The consistency index (CI) is estimated by subtracting the number of indicators from λmax and dividing the result by one less than the number of indicators (Equation S4). The random consistency index (RI) is assigned based on the size of the matrices. The consistency ratio (CR) is the ratio of CI to RI (Equation S5). Once the CR is calculated, the matrix is considered consistent if CR is less than or equal to 0.1.

Equation S5
The resulting indicator weights (Table S29) are used with the overall criteria weights and indicator scores to calculate the performance of each sanitation alternative. Each country included in DMsan will have its own set of indicator weights depending on the contextual driver scores for each context.
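The AHP steps described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not DMsan's implementation: the function name `ahp_weights` is hypothetical, the random consistency index values are the commonly tabulated Saaty values (an assumption, since the source does not list them), and the four example scores are the end-user contextual driver scores from Section S3.4.2 used on their own rather than the full social criterion matrix behind Table S29. All scores are assumed to be strictly positive.

```python
# Commonly tabulated Saaty random consistency index values (assumption).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(scores):
    """Return (weights, lambda_max, CI, CR) from positive contextual driver scores."""
    n = len(scores)
    # Pair-wise comparison matrix: row value divided by column value.
    a = [[si / sj for sj in scores] for si in scores]
    # Normalize each column, then average each row to get the weights.
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    normalized = [[a[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    weights = [sum(row) / n for row in normalized]
    # Weighted sum value per row, divided by that row's weight.
    ratios = [sum(a[i][j] * weights[j] for j in range(n)) / weights[i]
              for i in range(n)]
    lambda_max = sum(ratios) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX.get(n, 0.0)
    cr = ci / ri if ri > 0 else 0.0
    return weights, lambda_max, ci, cr

# Cleaning, privacy, odor and flies, and security scores quoted in the text.
weights, lambda_max, ci, cr = ahp_weights([44, 41, 22, 24])
print([round(w, 3) for w in weights], round(lambda_max, 6), cr <= 0.1)
```

Because every matrix entry is a ratio of two scores, the matrix built this way is perfectly consistent: the weights reduce to each score divided by the sum of scores, λmax equals the number of indicators, and CI is zero up to floating-point error.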

S4. Multi-criteria decision analysis methods to evaluate sanitation alternatives
There are four steps in MCDA: (1) select criteria and indicators, (2) assign criterion and indicator weights, (3) determine indicator scores, and (4) calculate performance scores of each alternative. The objective is to select the most appropriate sanitation system based on the local context. The criteria in DMsan include technical, resource recovery, environmental, economic, and social with sub-criteria and indicators based on the trends identified in the literature review (Section S1). Indicator weights were determined within criteria matrices by using the analytical hierarchy process (AHP). Additionally, for the Bwaise illustration, 1,000 criteria weight scenarios were simulated to assess sanitation system performance under the entire spectrum of stakeholder preferences (Section S4.1); however, users can simulate as many or as few scenarios as desired. Finally, alternatives can be ranked using technique for order of preference by similarity to ideal solution (TOPSIS) calculations that incorporate indicator scores (Section S2), indicator weights (Section S3), and criteria weights.

S4.1. Criteria weight scenarios
Because expert-informed weighting is often unavailable and is inherently subjective, criteria weight scenarios can be simulated within DMsan to evaluate the entire spectrum of weight options for each criterion (criterion weight ranges from 0 to 1). When a criterion has a weight of 0, none of its indicators are included in the decision. Likewise, when a criterion has a weight of 1, only its supporting indicators are included in the decision. If a project has stakeholder- or expert-informed criterion weights, technology developers and decision-makers can modify the code within DMsan to use this set of criteria weights instead of the simulated criteria weight scenarios. DMsan users can simulate as many criterion weight scenarios as desired. For the Bwaise illustration, 1,000 criterion weight scenarios were simulated to capture the full spectrum of the decision space (Figure S1).

Figure S1. Criteria weight scenarios simulated in illustrative example. Each line represents a single criteria weight scenario with y-values representing the individual criterion weight for each of the five criteria. Criteria weights range from 0 (i.e., the criterion is not included in the decision) to 1 (i.e., only the specific criterion is important for decision-making).
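One simple way to generate criteria weight scenarios is sketched below. The source does not specify the sampling scheme used in DMsan, so this sketch makes two assumptions: the five criterion weights in a scenario sum to 1, and scenarios are drawn uniformly from that simplex by normalizing exponential random draws. The function name `sample_weight_scenarios` is hypothetical.

```python
import random

CRITERIA = ["technical", "resource recovery", "economic", "environmental", "social"]

def sample_weight_scenarios(n_scenarios, seed=0):
    """Draw criteria weight scenarios whose weights are in [0, 1] and sum to 1.

    Assumption: uniform sampling over the weight simplex, obtained by
    normalizing independent exponential draws. This is not necessarily
    the scheme used in DMsan.
    """
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n_scenarios):
        raw = [rng.expovariate(1.0) for _ in CRITERIA]
        total = sum(raw)
        scenarios.append([value / total for value in raw])
    return scenarios

scenarios = sample_weight_scenarios(1000)
print(dict(zip(CRITERIA, (round(w, 3) for w in scenarios[0]))))
```

Random draws rarely land exactly on the extremes, so a user who wants to guarantee coverage of the corners (one criterion at 1, the rest at 0) could append those scenarios explicitly.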

S4.2. The technique for order of preference by similarity to ideal solution (TOPSIS) method for calculating performance score and rank of sanitation alternatives
The following steps are used to calculate performance scores for each sanitation system alternative. First, a decision matrix (Aij) is created where i is a particular alternative, m is the total number of alternatives (m = 3), j is a particular indicator, and n is the total number of indicators (n = 28). Each value in the decision matrix is an indicator score (Xij). The indicators are categorized as "beneficial", indicating a higher indicator score is best (e.g., total jobs created), or "nonbeneficial", indicating a lower indicator score is best (e.g., annual cost per capita). Each indicator score is normalized using vector normalization (Equation S6). Each value in the normalized decision matrix is a normalized indicator score. Next, the weighted normalized decision matrix is created by multiplying each normalized indicator score by the indicator weight (IWj) and the criteria weight (CWk where k is the criterion) (e.g., the energy recovery normalized indicator score is multiplied by the energy recovery indicator weight and the resource recovery criterion weight) (Equation S7).

Equation S7
Next, the ideal best (Vj+) and ideal worst (Vj−) are identified for each indicator. For beneficial indicators, the ideal best is the maximum weighted normalized indicator score among the alternatives, and the ideal worst is the minimum weighted normalized indicator score among the alternatives. Likewise, for non-beneficial indicators, the ideal best is the minimum weighted normalized indicator score, and the ideal worst is the maximum weighted normalized indicator score. Once the ideal best and ideal worst have been identified for each indicator, the Euclidean distance is calculated from the ideal best (Si+) and ideal worst (Si−) for each alternative (Equation S8, Equation S9). Finally, the performance score (Pi) of each alternative is calculated using the Euclidean distances from the ideal best and ideal worst (Equation S10). The best performing alternative is the alternative with the highest performance score.
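The TOPSIS steps described above can be sketched in plain Python as follows. This is an illustrative sketch rather than DMsan's code: the function name `topsis` and the example scores are hypothetical, and the performance score uses the standard TOPSIS closeness form Pi = Si− / (Si+ + Si−), which is assumed to correspond to Equation S10. Each indicator column is assumed to contain at least one non-zero score, and the alternatives are assumed not to be identical.

```python
import math

def topsis(scores, weights, beneficial):
    """Return one performance score per alternative.

    scores[i][j]: indicator score of alternative i for indicator j.
    weights[j]: indicator weight times the weight of its criterion.
    beneficial[j]: True if a higher score is better for indicator j.
    """
    m, n = len(scores), len(scores[0])
    # Vector normalization, then weighting.
    norms = [math.sqrt(sum(scores[i][j] ** 2 for i in range(m))) for j in range(n)]
    v = [[scores[i][j] / norms[j] * weights[j] for j in range(n)] for i in range(m)]
    # Ideal best and ideal worst per indicator.
    columns = list(zip(*v))
    best = [max(c) if b else min(c) for c, b in zip(columns, beneficial)]
    worst = [min(c) if b else max(c) for c, b in zip(columns, beneficial)]
    # Euclidean distances and performance scores.
    s_plus = [math.dist(row, best) for row in v]
    s_minus = [math.dist(row, worst) for row in v]
    return [sm / (sp + sm) for sp, sm in zip(s_plus, s_minus)]

# Hypothetical scores for three alternatives and two indicators:
# total jobs created (beneficial) and annual cost per capita (non-beneficial).
scores = [[10, 30], [20, 20], [30, 10]]
indicator_weights = [1.0, 1.0]    # each is the only indicator used in its criterion here
criterion_weights = [0.5, 0.5]    # hypothetical social and economic criterion weights
weights = [iw * cw for iw, cw in zip(indicator_weights, criterion_weights)]
performance = topsis(scores, weights, beneficial=[True, False])
print([round(p, 3) for p in performance])
```

In this toy case the third alternative is best on both indicators and the first is worst on both, so their scores sit at the two ends of the 0 to 1 range, with the second alternative halfway between.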

S5. System simulation with techno-economic analysis and life cycle assessment
Simulation, techno-economic analysis (TEA), and life cycle assessment (LCA) for each alternative were performed in Python using QSDsan. Design and TEA of the alternatives follow algorithms and assumptions used in Trimmer et al. 43 Although not used in this analysis, technology developers interested in converting costs from one country to another can incorporate the price level ratios by country found in the contextual parameter database for the package, or they can set specific contextual parameter values in QSDsan if local data are available. For LCA, the life cycle impact assessment method of ReCiPe (the H ["hierarchist"] perspective) 48 was used instead of TRACI in the original paper; however, DMsan users could change the method depending on the desired environmental impact indicator (e.g., using TRACI to include eutrophication potential). Any or all indicators for a given LCIA method can be used as individual indicators in DMsan. It should be noted that although total life cycle environmental impacts were assessed for each system, DMsan users could break down LCA results by sanitation unit process or life cycle stage within QSDsan, if desired, to identify problem areas for different types of damage across the sanitation service chain. Life cycle inventory data were obtained from the ecoinvent database (v3.7.1, at the point of substitution) 49 through BW2QSD, 50 EcoInventDownloader, 51 and BrightWay2. 52 For each impact item (e.g., cement), a keyword string (e.g., "market cement, unspecified") with constraints (e.g., location to be either "GLO" for global, or "RoW" for rest of the world) was used in BW2QSD to select impact items that satisfy the keyword and constraints from the ecoinvent database. The minimum, mean, and maximum of the impact characterization factors (CFs) of all selected items were then used as the lower bound, midpoint (baseline), and upper bound of a triangular distribution for this impact item in uncertainty analysis.
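The construction of a triangular distribution from the selected characterization factors can be sketched with the Python standard library. The CF values below are hypothetical placeholders (not ecoinvent data), the helper name `triangular_bounds` is hypothetical, and the sketch assumes that the mean (the baseline midpoint) is used as the mode of the distribution.

```python
import random
from statistics import mean

def triangular_bounds(cfs):
    """Return (lower bound, midpoint, upper bound) from the selected CFs."""
    return min(cfs), mean(cfs), max(cfs)

# Hypothetical CFs of the ecoinvent items matched for one impact item.
candidate_cfs = [0.72, 0.85, 0.91, 1.04]
low, mode, high = triangular_bounds(candidate_cfs)

# Draw uncertainty samples; random.triangular takes (low, high, mode).
rng = random.Random(42)
samples = [rng.triangular(low, high, mode) for _ in range(1000)]
print(low, round(mode, 2), high, round(min(samples), 3), round(max(samples), 3))
```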
The complete list of the keywords and constraints for the impact items can be found online (_lca_data.py in the bwaise module of EXPOsan). 53 Complete codes for all Python libraries used in this study, 50-54 the three alternatives, 53 and uncertainty analysis 53 can be found online. A total of 167, 165, and 152 uncertainty parameters were included for Alternatives A, B, and C, respectively (Tables S30-S35). Except for all characterization factors (CFs) used in LCA, all baseline values and uncertainty ranges of parameters were from Trimmer et al. 43,53

a U, T, and N represent uniform, triangular, and normal distributions, respectively. For a normal distribution, the lower value is μ and the upper value is σ.
a When two values are presented, the first value is for the existing plant and the second value is for the alternative plant.