Developing a community of practice around an open source energy modelling tool

Energy modelling is critical for addressing challenges such as integrating variable renewable energy and responding to climate impacts. This paper describes the updated code management structure and code updates, the revised community forum and the outreach activities that have built a vibrant community of practice around OSeMOSYS. The code management structure has allowed code improvements to be incorporated into the model, the community forum provides users with a place to ask and answer questions, and the outreach activities connect members of the community. Overall, these three pillars show how a community of practice can be built around an open source tool and provide an example for other developers and users of open source software wanting to build a community of practice.


Introduction
The importance of long-term energy planning and system modelling is receiving growing interest in the literature due to the increasing complexity of systems with variable renewable energy, as well as the increased focus on sustainable development. At the same time, there has been increased interest in ensuring these energy system analyses are open, transparent and accessible to effectively support policy and decision-making [1,2]. Oberle and Elsland [3] provide an overview of different open source energy models and conclude that the features and availability of such models are improving. Of the open models they assessed, four models had both high sectoral and geographic coverage (OSeMOSYS, GCAM, renpassG!S and REMIND). These four models have very different foci: GCAM is a partial equilibrium model to evaluate the impact of global policies on climate impacts [4,5], renpassG!S is a simulation model for evaluating the operational performance of a given system configuration [6], REMIND is an integrated energy-economy-climate model [7,8] while OSeMOSYS is a long-term energy planning model [9,10].
Each of these models has a different community structure. GCAM is maintained in the United States by the Joint Global Change Research Institute (JGCRI), a partnership between academia and the US Pacific Northwest National Laboratory [5]. renpassG!S is hosted by the Center for Sustainable Energy Systems (ZNES) but does not appear to have an active online presence other than the GitHub page [11]. REMIND is developed by the Potsdam Institute for Climate Impact Research [12] and most online references connect back to Potsdam-maintained pages and information. In contrast, over the last few years members of the OSeMOSYS modelling community have built a community of practice that supports users and is not dependent on any one institution or entity to survive.
OSeMOSYS has been used by national governments, the United Nations, university researchers and others for analyses covering energy security, short-term dispatch and long-term energy planning, and its use has expanded into Climate, Land, Energy and Water systems (CLEWS) modelling.
This paper outlines recent model enhancements, methodological developments and tool enhancements of the OSeMOSYS model. It provides an overview of some recent applications of the model, and explains the governance structures that have developed around the OSeMOSYS community of practice. We conclude with an update on some of the major outreach activities of the OSeMOSYS community. Overall, OSeMOSYS has developed a vibrant and active community of researchers and developers that supports a well-respected and extensively used modelling tool, and this online community is an example for other open source models to follow.
Final version published as: Niet, T., Shivakumar, A., Gardumi, F., Usher, W., Williams, E., & Howells, M. (2021). Developing a community of practice around an open source energy modelling tool. Energy Strategy Reviews, 35, 100650. https://doi.org/10.1016/j.esr.2021.100650

Tools and methodology developments
Several methodological advances to the modelling tools have been developed over the last year. These include model structure changes, scripts for improving model workflow, updates, and optional enhancements.

Alternate storage equations
OSeMOSYS comes with a set of equations for the modelling of storage for long-term energy planning developed by Welsch et al. [13]. These equations create a simplified version of a daily load profile within a series of long-term time slices to model the cyclic daily nature of storage usage. For short-term storage with consecutive short-term time slices, or for large storage hydro where the intra-day variations are negligible, these equations are computationally inefficient. Simplified storage equations that use consecutive time slices were developed for OSeMOSYS by Taco Niet of Simon Fraser University. These equations, which require the use of numerically consecutive time slices, track the storage level between time slices but do not attempt to re-create the intra-day variations as done by Welsch et al. [13].
The alternate storage equations are built around three main constraints. First, the starting level for each time slice is tracked based on the previous time slice and the activity in that time slice, as shown in Equation (1).
S is the starting level of storage for a given time slice, t, R_in and R_out are the rates at which the storage system is being charged or discharged, respectively, and Δt is the size of the time slice.
Second, the level of stored energy is checked at the start of each time slice to confirm it is neither below the minimum charge level nor above the total installed capacity of storage, as shown in Equations (2) and (3), with IC representing the installed capacity and S_min representing the minimum charge level.
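Written out from the symbol definitions above, the three constraints take roughly the following form. This is a sketch only: the index notation is an assumption, charging and discharging losses are ignored, and the GLPK implementation [14] may differ in detail.

```latex
% (1) starting level follows from the previous time slice and its activity
S_{t} = S_{t-1} + \left(R^{\mathrm{in}}_{t-1} - R^{\mathrm{out}}_{t-1}\right)\Delta t_{t-1}
% (2) starting level is never below the minimum charge level
S_{t} \ge S^{\min}
% (3) starting level is never above the installed storage capacity
S_{t} \le IC
```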
Two additional changes, namely a maximum storage installed capacity parameter and a storage refilling constraint, were also implemented as part of this alternate storage code and these two enhancements can be applied to the original storage code as well if needed.
The GLPK implementation of these storage equations is available on GitHub [14]. Kuling and Niet [15] compared the operation of the two storage models available for OSeMOSYS and concluded that the new equations were more computationally efficient in all cases but that more investigation into the impact of the storage structure was required as there were some operational differences that may impact the results.

Preprocessing script to reduce matrix size
One challenge with large OSeMOSYS models is the significant number of zero entries for both the InputActivityRatio and the OutputActivityRatio, since each technology generally only inputs and outputs one or two fuels. This is not an issue with smaller models, but when the number of technologies and fuels gets up into the hundreds all these extra zeros create a very sparse matrix that the solver needs to process. This is exacerbated when multiple modes of operation exist for different technologies as, again, most technologies operate only in one or two modes, adding to the sparsity of the matrix. This is inefficient, both in terms of memory use and in terms of computational time, often causing models to be unsolvable on even high-end computers.
The GNU MathProg language provides a solution to this issue by allowing one to specify sub-sets of technologies. This allows one to specify that connections to a given technology should only be created for specific combinations of InputActivityRatio and Mode or OutputActivityRatio and Mode. When this is set up correctly, the matrix generation time for the model, the corresponding memory usage, and the corresponding computational time to solve the model all fall drastically.
To provide OSeMOSYS users with a streamlined and faster solving model structure, a pre-processing script was developed by Abhishek Shivakumar. This script takes in an OSeMOSYS data file and creates a list of commodity-technology-mode combinations indicating to the matrix generator which connections need to be built. This is based on the user-entered InputActivityRatio and OutputActivityRatio. A similar process is used for storage technologies as these technologies also significantly increase memory use and computational time.
The preprocessing script performs the following steps:
Table 1 provides a comparison of matrix size, solve time and memory usage results from a few example model runs to illustrate the value of the pre-processing script. Three different models were tested across three versions of OSeMOSYS: Short code (v1); Short code with internal pre-processing (v2); Short code with external pre-processing (v3). v1 is the standard OSeMOSYS code available on GitHub. Internal pre-processing in v2 is carried out by the GNU MathProg code itself to create sets based on connections that exist between technologies, fuels and modes of operation. In v3, external pre-processing is carried out using a Python script as described earlier in this section.
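The core idea of the external pre-processing, keeping only the commodity-technology-mode combinations that have a non-zero activity ratio, can be illustrated with a minimal Python sketch. The data layout, technology names and function name here are hypothetical and chosen for illustration; this is not the published script.

```python
def nonzero_combinations(ratios):
    """Return the set of (technology, fuel, mode) keys with a non-zero ratio."""
    return {key for key, value in ratios.items() if value != 0}


# Hypothetical toy data: keys are (technology, fuel, mode_of_operation).
input_activity_ratio = {
    ("GAS_PLANT", "GAS", 1): 2.0,
    ("GAS_PLANT", "ELEC", 1): 0.0,
    ("WIND", "GAS", 1): 0.0,
}
output_activity_ratio = {
    ("GAS_PLANT", "ELEC", 1): 1.0,
    ("WIND", "ELEC", 1): 1.0,
    ("WIND", "GAS", 1): 0.0,
}

input_links = nonzero_combinations(input_activity_ratio)
output_links = nonzero_combinations(output_activity_ratio)

# Only these combinations would need variables and constraints in the matrix.
all_links = sorted(input_links | output_links)
dense_size = len(set(input_activity_ratio) | set(output_activity_ratio))
print(f"{len(all_links)} of {dense_size} combinations are needed")
```

In a model with hundreds of technologies and fuels, the ratio between the needed and the dense number of combinations becomes very small, which is where the memory and solve-time savings come from.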

Visualization scripts
A number of efforts have been made to provide better visualization options for model outputs. An example visualization script developed by Abhishek Shivakumar provides a method of visualizing an OSeMOSYS model that uses a standardized set of technology and fuel names. The script, written in Python and available as an interactive Jupyter Notebook on GitHub [16], allows users to visualize the outputs of the model quickly and easily by providing a set of predetermined graphs that include primary energy supply, fuel imports and exports, electricity installed capacity and electricity generation mix.
Quickly generating a 'standard set of figures' (figures that are commonly used in energy systems analysis) is particularly useful when comparing results across multiple years or multiple scenarios. Figs. 1 and 2 show results for a national energy model of Viet Nam as an example. The visualization script allows a user (a modeller, government analyst, or policy maker) to view model results interactively. The underlying dynamics in the power generation sector are quickly and easily made available for analysis, with the flexibility to freely incorporate additional result parameters where required for further analysis.

Otoole: OSeMOSYS tools for energy
otoole is a Python package which aims to consolidate the range of Python helper scripts into one installable, community-maintained toolset. Current tools include:
• a preprocessing script to create a versionable data package of csv files from a range of sources, including a GLPK datafile or an Excel workbook containing parameter and set data;
• the ability to convert a data package of metadata-described csv files into an OSeMOSYS data file;
• processing output files from different solvers into a common result format; and
• creating a Reference Energy System diagram of the model.
More scripts and tools are being developed as needed. Further details can be found on the website https://otoole.readthedocs.io.
In parallel, work has begun on establishing a standard data structure for OSeMOSYS models which are self-documenting, computer-readable and human-readable. This work was a logical outcome of the restructuring of the OSeMOSYS repositories (described later) and the need to test each of the model implementations. A requirement of testing is that the same input data can be presented across the different implementations, to check that the results are also the same. The first version of this data structure is now available (https://github.com/OSeMOSYS/simplicity) using the teaching example model "Simplicity". The data structure was implemented using a Tabular Data Package (http://frictionlessdata.io/specs/tabular-data-package/), and is essentially a folder of comma-separated value (CSV) files, the relation and structure of which are described in a JSON metadata file. The Python package 'otoole' (https://otoole.readthedocs.io/en/latest/index.html) can be used to convert the Tabular Data Package into a data file which can be read by the GNU MathProg implementation, or convert an Excel workbook, or GNU MathProg data file into a Tabular Data Package.
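To make the "folder of CSV files plus a JSON metadata file" idea concrete, the following minimal sketch writes and then reads a toy package using only the Python standard library. The file names, the single YEAR table and its VALUE column are hypothetical and are not taken from the Simplicity repository; the sketch also does not use otoole, which should be preferred for real models.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_toy_package(folder):
    """Create a toy package: one CSV resource plus its JSON descriptor."""
    data_dir = folder / "data"
    data_dir.mkdir()
    with open(data_dir / "YEAR.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["VALUE"])
        writer.writerows([[2020], [2021]])
    descriptor = {
        "name": "toy-model",
        "resources": [{"name": "YEAR", "path": "data/YEAR.csv"}],
    }
    (folder / "datapackage.json").write_text(json.dumps(descriptor, indent=2))


def read_package(folder):
    """Load every CSV resource listed in datapackage.json as a list of row dicts."""
    descriptor = json.loads((folder / "datapackage.json").read_text())
    tables = {}
    for resource in descriptor["resources"]:
        with open(folder / resource["path"], newline="") as handle:
            tables[resource["name"]] = list(csv.DictReader(handle))
    return tables


with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp)
    write_toy_package(package_dir)
    tables = read_package(package_dir)

print(tables["YEAR"])
```

Because the descriptor lists every resource by name and path, the same input data can be handed unchanged to each model implementation, which is the property the testing requirement above relies on.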
Together, otoole and the OSeMOSYS data standard facilitate the creation of powerful computational workflows through simplified processes for combining data, scripting for pre-processing, automation of model runs and post-processing of results. Using a workflow management tool, such as snakemake, and with the support of otoole, it is now possible to integrate OSeMOSYS into a fully reproducible scientific workflow.

HPC process scripts
Niet posted a Python script [18] that sets up a series of OSeMOSYS scenarios by varying input parameters and submits the resulting set of model runs to a compute cluster for processing. The script takes in a Python list of scenarios for each variable to be considered and, in a series of nested for loops, creates the parameter file for each scenario and a list of model run commands to be executed.
The second stage of Niet's script segregates the model runs into groups based on an estimate of how long each individual model run will take. For very fast runs it is often sensible to group large numbers of runs into a single group and submit them to Slurm as a single job. This reduces the scheduler overhead and usually results in significantly faster throughput.
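The two stages described above (building one run per combination of inputs, then grouping runs by estimated duration) can be sketched as follows. This is an illustration under stated assumptions, not the script from [18]: the scenario names, file names and function names are hypothetical, and itertools.product is used in place of explicit nested for loops.

```python
from itertools import product

# Hypothetical scenario dimensions; each list holds the values to sweep.
scenario_values = {
    "carbon_price": [0, 50, 100],
    "demand_growth": [0.01, 0.02],
}


def build_commands(values):
    """Return one glpsol command per combination of input values."""
    names = sorted(values)
    commands = []
    for combo in product(*(values[name] for name in names)):
        tag = "_".join(f"{n}{v}" for n, v in zip(names, combo))
        # A real script would also write the data file data_<tag>.txt here.
        commands.append(f"glpsol -m osemosys.txt -d data_{tag}.txt")
    return commands


def group_runs(commands, minutes_per_run, target_minutes):
    """Pack runs into groups whose estimated duration is about target_minutes."""
    runs_per_group = max(1, int(target_minutes // minutes_per_run))
    return [
        commands[i:i + runs_per_group]
        for i in range(0, len(commands), runs_per_group)
    ]


commands = build_commands(scenario_values)
groups = group_runs(commands, minutes_per_run=2, target_minutes=5)
print(len(commands), "runs in", len(groups), "groups")
```

Each group would then be submitted to the scheduler as a single job, so short runs share one submission instead of each paying the scheduling overhead.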
A final feature of this script is the ability to efficiently use whole nodes of the compute cluster. This allows many scenarios to be run using the GNU Parallel [19] program available on most high-performance clusters. Parallel takes in a file with a list of commands to execute and then runs them on the cores of a node as and when cores become available. As an example, a set of 300 scenarios that each take 15 min to solve can be run on a 32 core node in around 150 min within a single job using Parallel, instead of requiring 300 separate jobs. This makes the scheduler work much more efficiently and, in many cases, structuring jobs to use whole nodes of the HPC system allows them to run in parts of the cluster reserved for whole node operations, making it likely the job will be scheduled quickly.
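The 150 min figure follows from simple arithmetic, reproduced in the short sketch below. It assumes every run takes the same time and occupies exactly one core; the function name is hypothetical.

```python
import math


def wall_time_minutes(n_runs, minutes_per_run, cores):
    """Estimate wall time when runs are dispatched to cores as they free up.

    Assumes identical run times and one core per run.
    """
    waves = math.ceil(n_runs / cores)
    return waves * minutes_per_run


estimate = wall_time_minutes(n_runs=300, minutes_per_run=15, cores=32)
print(estimate)  # 150
```

With 300 runs on 32 cores, ten successive waves are needed (nine full waves and one partial), giving ten times 15 min.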
It should be noted that a small edit to the base OSeMOSYS code is needed to allow Niet's script to operate. The base OSeMOSYS model code assumes that all the output files from all model runs will be named the same. This does not work when multiple model runs are all trying to save data at the same time on a node of the cluster. To avoid this, the output file names are created as parameters and given unique names in each parameter file. This way each model run can create a unique file and not conflict with other runs. The code changes to enable this are available on GitHub [20]. Similar functionality was also developed by William Usher of KTH (see https://github.com/KTH-dESA/osemosys_workflow/tree/model_runs for details of this other method of achieving the same result).
The fact that multiple users of OSeMOSYS independently developed and uploaded HPC scripts to GitHub shows the complexity of analysis OSeMOSYS has been applied to.

Renewable capacity target vs. renewable energy target
A number of jurisdictions have been setting installed capacity targets for renewable energy rather than renewable energy generation targets. Although this is generally not considered the best approach to energy policy, modelling the impact of renewable energy capacity targets allows for the evaluation of these policies on system build out and operation. Eric Williams of King Abdullah Petroleum Studies and Research Center (KAPSARC) developed code in a GAMS version of OSeMOSYS that sets a renewable capacity target instead of a renewable energy target. The code that implements this in GAMS is available in an OSeMOSYS Enhancement Proposal on GitHub [21].
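The difference between the two kinds of target can be seen in a small numerical sketch. All numbers, technology names and capacity factors below are hypothetical and purely illustrative; this is not the GAMS code from [21].

```python
# Hypothetical two-technology system; all numbers are illustrative only.
capacity_gw = {"SOLAR": 6.0, "GAS": 4.0}
capacity_factor = {"SOLAR": 0.2, "GAS": 0.6}
renewable = {"SOLAR"}
HOURS_PER_YEAR = 8760

# Annual generation in GWh implied by capacity and capacity factor.
generation_gwh = {
    tech: capacity_gw[tech] * capacity_factor[tech] * HOURS_PER_YEAR
    for tech in capacity_gw
}

# Share of installed capacity that is renewable.
capacity_share = sum(capacity_gw[t] for t in renewable) / sum(capacity_gw.values())
# Share of generated energy that is renewable.
energy_share = sum(generation_gwh[t] for t in renewable) / sum(generation_gwh.values())

print(f"capacity share: {capacity_share:.2f}, energy share: {energy_share:.2f}")
```

In this toy system a 60% renewable capacity share corresponds to only about a third of generation, which is why the two policy formulations can lead to different build out and operation.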

Applications
OSeMOSYS has been applied to a variety of case studies over the previous years, building a vibrant community of practice of users and contributors working together to improve the sophistication and accessibility of the tool. In this section we first provide an overview of some of the latest published works using OSeMOSYS following the review by Gardumi et al. in 2018 [10]. We then describe some of the international capacity development activities that are ongoing, and finish with a description of the community of practice being built around OSeMOSYS.

Published works
A large number of works have been published after those reviewed by Gardumi et al., in 2018 [10], featuring new versions of OSeMOSYS, code modifications of existing versions and applications, from global scale, down to national, regional and local (cities).
A new version of the code, named OSeMOSYS-PuLP, is formulated in Python by Dreier et al. [22]. This version is different from the previously existing Python model and is available on the GitHub repository of the modelling framework [23]. Specifically, it adds a Monte-Carlo feature to the original formulation of the tool.
A further version was translated into GAMS in Ref. [24], using the master GNU MathProg model as a reference. The version is cross-checked with the one by Noble [25], but was developed independently. It was also developed independently of the GAMS formulation by Löffler et al. [26]. After translation, the code was further modified by: 1) reformulation into a two-level stochastic problem; 2) introduction of policy level as a discrete random variable; and 3) introduction of minor constraints, mostly related to the increase of installed capacity. This new version is applied to the assessment of the cost of policy uncertainty in the electric sector in the ERCOT area (covering 73% of Texas). Brozynski and Leibowicz further modified the above version in Ref. [27], adding constraints to improve the representation of power and transportation at a city scale. Additions target: 1) constraining private transportation to be non-dispatchable; 2) improving the charging and discharging process of electric vehicles; 3) inclusion of capacity growth constraints; 4) constraints to represent instant peak demand and the need for supply technologies to meet it; 5) demand response capability; and 6) integration of purchases of carbon offsets. The updated version is applied to the study of decarbonisation pathways for the city of Austin, Texas that will allow the city to comply with the Texas Community Climate Plan. Finally, another work by Leibowicz et al. extends the above to improve the representation of decarbonisation of urban residential building energy services [28]. Constraints are added to the previous GAMS formulation to: 1) separate residential building end-use technologies from the centrally coordinated dispatch of electric generation assets; and 2) constrain the market penetration rate of each technology. The improved formulation is again applied to the case of Austin, Texas.
Additional works present the addition of functionalities to existing versions of OSeMOSYS. That is, they perform minor modifications to the standard version of the code, by adding constraint equations and modifying the objective function. Gardumi et al. introduce three functionalities to improve the representation of balancing options for variable renewables in long-term electricity system optimization studies [29] [32]. They apply the modified formulation to the assessment of infrastructure investments which could decrease the curtailment cost, in Alberta, Canada, in a one-year model. Groissböck and Pickl introduce equations to account for power plant retirements and to constrain the yearly capacity additions of renewables [33]. While the latter is originally allowed by OSeMOSYS, it must be defined a priori by the user, through a specific parameter. The authors allow the capacity addition to be constrained endogenously, by introducing an exponential growth factor. They apply the enhanced formulation to assess the impacts on investments of several fuel-price reforms in Saudi Arabia. Finally, Palmer-Wilson et al. add equations to account for and constrain the land use by power supply technologies, thereby touching upon the Energy-Land use part of the Nexus methodology [34]. The modified formulation is applied to study the impact of land requirements on electricity system decarbonisation pathways in Alberta, Canada.
Besides the above works featuring new or modified versions of the tool, numerous publications present applications. On a global scale, Sarmiento et al. use the GENeSYS-MOD version of OSeMOSYS in GAMS (as from Ref. [26]) to analyze scenarios for the integration of renewable energy sources in Mexico [35].
One continental application uses the South America Model Base (OSeMOSYS-SAMBA, see Ref. [36]) to look into the dynamics of power systems integration in South America from a Brazilian point of view [37]. With a cooperative game theory approach, the scenario results are used to calculate the bargaining power of each country, through the Shapley value.
Most of the other publications focus on the national scale. Work by Burandt et al. based on GENeSYS-MOD assesses pathways to decarbonise electricity supply, transportation, heat supply and industry in China [38]. A paper by Anjo et al. assesses the long-term impacts of demand response in the energy investment planning of Portugal [39]. To account for demand response, the authors use code enhancements by Welsch et al. [40]. Another national application by Dhakouani et al. looks into the impact of the introduction of energy efficiency policies in Tunisia on the integration of renewable energy [41]. The authors build on a model presented by Dhakouani et al. [42], introducing an energy efficiency action, peak clipping (as scheduled outages) and assess the power system reliability factor. Rady et al. present an electricity system model of Egypt, to support planning towards the least-cost supply mix [43]. Keller et al. estimate costs and reduced emissions of coal-to-biomass retrofit in Alberta, Canada [44].
Finally, an application on a village scale in India soft-links a long-term investment optimization model in OSeMOSYS with 1) a bottom-up model for long-term projection of household electricity demand based on socio-economic parameters and 2) a stochastic load-profile generator [45]. This is the second published application on a village scale, after the one by Fuso Nerini et al. in 2015 [46].

Summer schools and capacity development
The OSeMOSYS model has been used for several years for two major international development activities. First, the ICTP Summer School on Modelling for Sustainable Development has been teaching OSeMOSYS for the last four years and second, the United Nations has been using the OSeMOSYS model to do modelling capacity development in emerging economies. Both these activities work to build capacity in modelling for sustainable development.
The latest ICTP summer school, held in June 2019, brought together 16 participants from a variety of countries including Jordan, Colombia, Brazil, Morocco, Ecuador, Indonesia, Mongolia, South Africa, Pakistan, Uganda, Ethiopia and Sierra Leone to learn about modelling and bring modelling skills back to their home countries. Representatives from government and academia spent three weeks learning about modelling with OSeMOSYS alongside other training sessions on the Open Source Spatial Electrification Tool (OnSSET) and on Spatial Analysis with QGIS.
The syllabus for the two and a half week course is listed in Table 2. In general, the philosophy is to take participants who have a basic understanding of energy systems and programming and introduce them to the key concepts of energy modelling and optimization with very simple models, before exploring more nuanced features of the OSeMOSYS model. To support participant learning, exercises are structured to ensure that basic modelling concepts are well developed and understood before moving on to more complex concepts. By the end of the three weeks the participants each have a model of a region they are interested in and present their findings during the poster session on the last day of the summer school. A recent conference paper on the teaching philosophy of the summer school was presented at the 2019 International Conference on Sustainable Development (ICSD) [47]. UN Capacity Development Activities have taken place in the last year in Ethiopia, Cameroon, Indonesia, Bolivia, Nicaragua and Costa Rica and are expected to continue as various countries see the value in using modelling for sustainable development.

Community of practice and outreach
The main developments in the community of practice and outreach are the expanded use of OSeMOSYS in higher educational institutions around the world and the development of example teaching materials for training in OSeMOSYS.

Use in higher education
OSeMOSYS is widely used as an educational tool. It introduces students to key dynamics that occur between the different parts of energy and resource systems as they develop to meet final demands. Compared to similar bottom-up energy/resource modelling tools, these dynamics appear clearer in OSeMOSYS for students and non-experts. This is due to the tool being open source and to the existence of a simple formulation of it in the GNU MathProg modelling language. The GNU MathProg equations are written in a form which is close to the algebraic formulation and therefore are relatively human-readable. Moreover, all steps of the calculation are clearly separated in different equations, to make the logical flow fully explicit. On the one hand, this increases the computational burden (in fact, more concise formulations of OSeMOSYS have been created to overcome this issue and they are used for large problems), but on the other hand it makes OSeMOSYS more accessible.
Such a setting explains the fast increase in the use of the tool in higher education in the last 5 years. Table 3 lists the Higher Education Institutions which communicated to the OSeMOSYS community that they are using the tool.

Teaching material
Several different sets of documentation and tools for learning energy and systems modelling and the use of OSeMOSYS have been generated through the years by Higher Education Institutions and International Organizations involved in capacity development practices. The documentation and tools include theory lectures (usually as .pptx files), tutorials, simple learning applications, user manuals and videos. Recent efforts are packaging such material into a teaching kit [47].
The kit consists of self-standing short modules, which can be combined to produce lectures and courses, tailored to different types of audiences (e.g. university classes, as well as trainees from Governments). Each module is open source licensed.
An example of a course using the teaching kit is the Joint Summer School on Modelling Tools for Sustainable Development, organised by the OpTIMUS Community every June at the United Nation's International Centre for Theoretical Physics (ICTP) in Trieste, Italy as mentioned earlier. The course is an intensive learning module of two and a half weeks, guiding the attendees through the basic theory of energy systems modelling and the creation of a simple but realistic and complete country application. For details of the teaching kit, the learning outcomes and related outputs please refer to the recent work by Kubulenso et al. [47] and to section 3.2 above.
Similar uses of the teaching kit (with similar course structures) are made at the Energy Modelling Platform events. These are regional energy modelling schools organised by the OpTIMUS Community yearly. Currently, events are regularly organised in Africa, Asia and North America. Finally, the kit is used within national capacity development activities.
For uses in Higher Education Institutions, the teaching content is changed and enriched to fit the scopes of the Bachelor's or Master's courses it is used in. Every case is different. With the intent of only providing an example, Fig. 3 shows the structure of a Master's level course held at KTH Royal Institute of Technology [48,49]. The boxes marked with bold borders in Fig. 3 are the course modules which build most on the original teaching kit. Table 4 reports the Intended Learning Outcomes of the course.
As can be inferred from the above, the potential uses of the teaching kit are numerous and they will differ across institutions, groups of recipients and scope of the teaching. A rapid and efficient exchange of teaching material across all these potential uses in the OSeMOSYS Community should be based on two pillars:
• Availability of teaching material as short, stand-alone and open-access licensed modifiable modules, which may be combined in numerous different ways to form lectures and courses;
• Database of Intended Learning Outcomes and course structures compiled by course from the OpTIMUS Community.

Open source governance
One of the prerequisites for a successful open source project is the concentration of effort on a few (well maintained) tools, rather than the diffusion of many different versions. There is an important balance to be struck between free and permissive use of the OSeMOSYS GNU MathProg implementation, enshrined in the Apache 2.0 license, and the concentrated effort required to improve and evolve the formulation itself.
In the case of OSeMOSYS, the success and growth of the project has led to multiple parallel implementations of the original model formulation. On one hand, this is a desirable outcome as it shows that people are adopting and using OSeMOSYS for its intended purpose: building capacity and lowering the barrier to entry to energy system modelling. It also illustrates one of the benefits of OSeMOSYS: the ease of extension and adaptation of the formulation. On the other hand, unchecked derivations of OSeMOSYS risk dilution of effort, which at worst undermines the community itself. At best, this situation is confusing, particularly to those who are new to OSeMOSYS and energy system modelling. This confusion is likely to increase the barrier to entry and so make OSeMOSYS a less attractive option.
A recent example is the independent development by three different research groups of GAMS implementations of OSeMOSYS. GAMS (General Algebraic Modeling System) is a popular closed-source language for developing optimization models and provides interfaces to many powerful solvers. Many institutions, including universities and research groups, have existing licenses for GAMS or have built up institutional knowledge and capacity of GAMS, and so would prefer to work in this language. The first potential problem with this is duplicated effort: the three groups have essentially performed the same task three times, translating the GNU MathProg implementation of OSeMOSYS into GAMS. The second is inconsistency: each of these parallel implementations differs from one another and from the original formulation, as the teams customise the code for their particular situations. Any improvements or adaptations made in the GAMS implementation of the OSeMOSYS formulation cannot be easily contributed back to the original. Over time, the divergence becomes more severe until the different implementations are essentially separate energy system model frameworks.
In response to the above situation, the OSeMOSYS steering committee (see section 3.4.1 below) has decided to implement a number of improvements to the OSeMOSYS structure and processes. The aim of this is to communicate to users, developers and the wider OSeMOSYS community the expectations, code of conduct and sequence of steps required to help use and maintain OSeMOSYS. This does not preclude continued parallel development of OSeMOSYS implementations. OSeMOSYS will continue to be licensed under a permissive open source Apache 2.0 license. Rather, the steering committee aims to lower the barrier to entry and make it easier for potential contributors to further improve and develop OSeMOSYS.
The strategy for delivering these improvements is described across four sections below. In the first section, we describe the OSeMOSYS governance structure. We then describe how we have restructured the OSeMOSYS code and tool management using Git and GitHub to manage versions, host documentation and automate testing across different implementations. In the third section, we provide an update on the avenues for OSeMOSYS community engagement including documentation, the forum and the tools provided on GitHub. And in the final section, we outline improved data management practices to support the efforts towards greater interoperability between OSeMOSYS implementations and with other energy system models.

Governance structure
The governance structure for managing OSeMOSYS consists of a small team of administrators of the OSeMOSYS code, at least one "champion" for each model implementation, and the Steering Committee. The Steering Committee is composed of volunteers from the OSeMOSYS user group, and meets at least once per year to discuss progress and issues and to make strategic decisions about the future of OSeMOSYS. The chair of the Steering Committee is currently Will Usher, who has responsibility for organising the meetings, circulating minutes and following up on the actions. The OSeMOSYS code administrators are responsible for triaging bug reports and enhancement ideas, reviewing contributions, planning developments and allocating tasks to open source contributors. The code champions are "power users" or primary developers, each responsible for one implementation of the OSeMOSYS formulation (e.g. in GAMS, Pyomo, PuLP or GNU MathProg). In addition to these formal roles, there exist a multitude of other important roles, such as "communications" (responsible for compiling regular newsletters), mentors, advisors to the administrators and so on. As the OSeMOSYS community continues to mature and grow, the community and the Steering Committee will aim to recognise and document these essential roles.

Code and tool management
The OSeMOSYS code base, documentation and implementations are stored in Git repositories (see footnote 1), as shown in Fig. 4. The repositories are hosted on GitHub in the OSeMOSYS organisation. The main OSeMOSYS repository provides several key features which act to bring together the different implementations to achieve the primary aims of consistency and clarity. One repository is used for each implementation of the OSeMOSYS formulation, and these are referenced within the main OSeMOSYS repository. Using separate repositories for each of the OSeMOSYS implementations allows them to develop in ways most appropriate for their community, and also passes responsibility for the code maintenance and management to the community. Thus the OSeMOSYS administrators can focus on enabling the wider community to better maintain and improve OSeMOSYS.
Footnote 1: Git is a software tool for distributed version control of code. It allows individual developers to make changes on separate feature branches of the code and commit their changes in parallel. Once a sequence of changes has been completed, the feature branch is merged into the master branch. Version control is widely used in software development, and increasingly in scientific and research communities.
Fig. 3. General structure of the OSeMOSYS Masters' level course.
The documentation common to all OSeMOSYS implementations is hosted in the main repository, with implementation-specific documentation included in the relevant implementation repository. Any changes to the OSeMOSYS implementation should be reflected in the formulation, and hosting the documentation in the same repository makes this easier. Documentation is automatically rendered and hosted on osemosys.readthedocs.io whenever changes are made in the GitHub repositories.
Table 4
OSeMOSYS Masters' course intended learning outcomes.
No. / Description / Grading criteria
1. Description: Describe common energy systems modelling and scenario analysis approaches and identify their key strengths and limitations.
   Grading criteria: Pass/Fail quiz.
   Pass: In a quiz published after the relevant classes, the student has identified the common energy system modelling and scenario analysis approaches presented in the class and has identified the key strengths and limitations discussed in the class.
   Fail: The student fails to take the quiz published after the relevant classes by the deadline or provides wrong or no answers to any of its questions.
2. Description: Write a basic linear energy system optimization problem in the GNU MathProg modelling language.
   Grading criteria: Individual Assignment 1 (Grade E to A).
   E: The student has written and successfully run the optimization function and demand-supply constraint of a basic energy system optimization problem, upon being provided a text description.
   C: The student has written and successfully run all the equations of a basic energy system optimization problem, upon being provided a text description.
   A: The student has written and successfully run all the equations of a basic energy system optimization problem, upon being provided a text description, has run suggested scenarios and has explained the differences in the results.
3. Description: Apply a selected energy systems modelling tool in the analysis of stylized long-term energy planning problems.
   Grading criteria: Individual Assignment 2 (Grade E to A).
   E: The student has created a simple energy system model, following given instructions, and has obtained results.
   C: The student has created a simple energy system model, following given instructions, has carried out all steps of the creation as indicated by the tutor and has obtained reasonable results, similar to the ones provided by the tutor.
   A: The student has created a simple energy system model, following given instructions, has carried out all steps of the creation as indicated by the tutor, has obtained reasonable results, similar to the ones provided by the tutor, and has critically commented on the results and their potential real-life implications.
4. Description: Analyze various sample energy system situations and appropriately distill insights, given limited and uncertain information.
   Grading criteria: Individual Assignment 3 (Grade E to A).
   E: The student has modelled and obtained results for a set of pre-defined scenarios and has commented on how reasonable each result is.
   C: The student has modelled and obtained results for a set of pre-defined scenarios, has commented on how reasonable each result is and has provided relevant comparisons between independently selected scenarios.
   A: The student has modelled and obtained results for a set of pre-defined scenarios, has commented on how reasonable each result is, has provided relevant comparisons between independently selected scenarios and has successfully created one additional (and relevant) scenario.
5. Description: Include a basic representation of the links between climate, water, land use and energy into an energy system model.
   Grading criteria: Individual Assignment 4 (Grade E to A).
   E: The student has modelled and obtained results on water and land use impacts of a sample energy system for situations already conceptualised externally.
   C: The student has modelled and obtained results on water and land use impacts of a sample energy system for situations already conceptualised externally and has commented on them.
   A: The student has modelled and obtained results on water and land use impacts of a sample energy system for situations already conceptualised externally, has commented on them, has compared them where relevant and has critically analysed the interlinkages (including trade-offs and synergies) between water, land use and energy systems.
6. Description: Undertake a thorough and detailed analysis of a selected national energy system, including independent data gathering, problem definition, model choice, generation of solutions and interpretation.
   Grading criteria: Group Assignment 5 (Grade E to A).
   E: The student has carried out a well-balanced share of a group project, contributing to the successful creation of a country energy system model.
   C: The student has carried out a well-balanced share of a group project, contributing to the successful creation of a country energy system model and actively contributing to the critical analysis of the results and the extraction of policy insights.
   A: The student has carried out a well-balanced share of a group project, contributing to the successful creation of a country energy system model from the start, actively contributing to the critical analysis of the results and the extraction of policy insights and creatively adding novel elements to the model.

The main repository also contains automated tests which aim to ensure that the OSeMOSYS formulations across the OSeMOSYS implementations are identical. These automated tests consist of a series of simple model data-sets which are compiled and run in each of the implementations whenever a change is made in one of the model implementations. This facilitates the process of ensuring consistency between the different model implementations. If a change in one of the implementations influences the results, the tests for that implementation will fail and a warning will be sent to the OSeMOSYS maintainers notifying them that a change has been initiated. Finally, releases of OSeMOSYS will be made periodically and will follow semantic versioning (semver.org) using tags in each of the Git repositories. Tags provide a permanent reference to a particular point in the history of a repository. Releases will be made independently for each of the OSeMOSYS implementations, but there will be a clear mapping between the versions. This mapping will be formalised in the main OSeMOSYS repository, where the releases will be composed.
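The cross-implementation consistency check described above can be illustrated with a minimal sketch. It is not the test harness used in the OSeMOSYS repositories: the function name `find_divergent`, the data-set names, the objective values and the choice of GNU MathProg as the reference are all hypothetical, and comparing only objective values within a relative tolerance is an assumption made for brevity.

```python
import math

# Assumption: the original GNU MathProg implementation is treated as the reference.
REFERENCE = "GNU MathProg"


def find_divergent(results, reference=REFERENCE, rel_tol=1e-6):
    """Return sorted (data-set, implementation) pairs whose objective value
    differs from the reference implementation beyond the tolerance."""
    failures = []
    for dataset, by_impl in results.items():
        expected = by_impl[reference]
        for impl, value in by_impl.items():
            if impl == reference:
                continue
            if not math.isclose(value, expected, rel_tol=rel_tol):
                failures.append((dataset, impl))
    return sorted(failures)


# Hypothetical objective values for two small test data-sets,
# one value per implementation (illustrative numbers only).
results = {
    "simple_one_tech": {"GNU MathProg": 1250.0, "GAMS": 1250.0, "Pyomo": 1250.0},
    "storage_case": {"GNU MathProg": 4821.5, "GAMS": 4821.5, "Pyomo": 4903.2},
}

for dataset, impl in find_divergent(results):
    print(f"WARNING: {impl} diverges from {REFERENCE} on {dataset}")
```

In a continuous integration setting, a non-empty list of divergent pairs would mark the run as failed and trigger the notification to maintainers; a fuller check might also compare selected result variables rather than the objective value alone.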
However, there are many alternative ways to contribute to OSeMOSYS. These include improving the documentation, identifying and reporting bugs, and proposing new ideas for extending or improving the OSeMOSYS formulation.

Community support forum
A Google group has been actively used for community support since September 2018, and the forum has become a home base for an active and vibrant community. This forum replaced an earlier Reddit support forum, which was less effective because questions expired after a given time, preventing older questions from being answered by the community. Over 175 questions have been posed to the new Google group and, for the most part, questions are answered by community members within a day of being posted. The forum both supports fellow users and captures support information for future reference. Of the 175 questions, most had not been asked previously on the forum, suggesting that most users search the forum before posting their question. This archiving of support activity will help the community build strength moving forward and ensure that new users can find help for common errors while allowing more experienced users to answer new and more unique questions. Fig. 5 shows a word cloud of the subjects of the different items posted to the community support forum. As would be expected, many of the subjects mention OSeMOSYS and modelling (or modeling). Interestingly, MoManI is a common subject as well, since it is a common OSeMOSYS interface. Other common subjects include "water", likely related to the use of OSeMOSYS for CLEWs analysis; "Python", indicating the popularity of Python for running OSeMOSYS and analysing results; and "storage", indicating the importance of storage in system analysis with variable renewable energy.
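A word cloud such as Fig. 5 is built from word frequencies in the post subjects. The sketch below shows one simple way such counts could be derived; it is not the procedure used to produce Fig. 5, and the subject lines, the stop-word list and the function name `subject_word_counts` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a small hand-picked stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def subject_word_counts(subjects):
    """Count how often each non-stop word appears across forum subject lines."""
    counts = Counter()
    for subject in subjects:
        for word in re.findall(r"[a-z]+", subject.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


# Hypothetical subject lines, for illustration only.
subjects = [
    "Storage constraints in OSeMOSYS",
    "Running OSeMOSYS with Python",
    "MoManI results for water demand",
    "Storage and water in CLEWs modelling",
]

counts = subject_word_counts(subjects)
for word, n in counts.most_common(5):
    print(word, n)
```

The resulting counts can then be passed to any word cloud renderer, with font size scaled by frequency.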

Data management best practices
The OSeMOSYS community has been active in developing standards for data management best practices as part of the roundtable discussions on strategic energy planning [50]. These roundtable discussions, led by Oxford Policy Management and UK Aid, have developed best practices for both data management and energy systems modelling, and the OSeMOSYS community has been an active participant in these discussions.
One major outcome of these roundtable discussions is the u4RIA (universally Retrievable, Repeatable, Reconstructible, Reproducible, Interoperable and Auditable) standard for data management best practices [40]. This standard provides best practices for those doing strategic energy planning to ensure that results and assumptions are reliable and that others can replicate the work. In addition, it provides guidance to those doing sustainable development activities in how they interact and work with emerging economies on data management.

Impact and conclusions
This paper has described recent enhancements and uses of the OSeMOSYS open energy systems model and described how a community of practice is growing around this open source tool. This growing community of practice, built around the pillars of the redesigned OSeMOSYS GitHub repository and code management processes, the OSeMOSYS community support forum and a number of practice and outreach activities, shows how a community can be built around an open source modelling tool.
Effective code management practices and a well organized GitHub repository have allowed a number of enhancements to be published that would otherwise not have been widely available. Prior to the recent reorganization, the structure of the GitHub repository was not well documented and contributing code was challenging. Since the reorganization, including the publishing of guidelines for OSeMOSYS Enhancement Proposals, many code additions and scripts for processing data and running models have been contributed. These contributions would not have been publicly available without the GitHub reorganization and the clear instructions for contributors. Building a community of practice around an open source tool requires good code management practices, as illustrated by the revamped code and tool management structure.
The OSeMOSYS support forum, and the archiving of the questions in the Google group for future reference, has allowed for many online discussions that help users solve the modelling challenges they were experiencing and succeed in their modelling efforts. The move from a Reddit based forum to a Google group enhanced the ability of this forum to support the community of practice: questions are now always available and follow-up questions can be asked at any time, the group is more visible than the previous Reddit forum, and many more questions are being asked on the forum than before the changeover. This forum has complemented the changes to the GitHub repository, and together they effectively support the community of practice.
Finally, the community practice and outreach activities have strengthened the community in numerous ways illustrating the importance of such activities to build a strong community of practice. For example, those who answer questions on the support forum often know each other from having attended a summer school, or are colleagues who attended UN capacity development workshops, and sometimes they are both. In this way the outreach activities strengthen the support forum and vice versa. Outreach activities that occur without a support forum in place to help the community help each other will likely be less effective.
With a large number of code developments and applications, the community has created a wide body of knowledge. This knowledge is used by academics in Higher Education teaching and research, as well as by experts of international organizations in capacity development activities. It is also jointly shared by them in spaces where they all contribute, from the forum to the summer schools. Thus, the community-based approach to model development and application greatly narrows the science-policy gap, or rather dissolves it. It also ensures that a long-standing and independent research base, as carried out in academia, supports uninterrupted national energy and resource planning independent of the duration of government cycles. One of many examples is the current prominence of OSeMOSYS as a tool for informing RES policies in Tunisia and the MENA region, built upon initial academic modelling efforts such as those of Dhakouani et al. [41,42]. Those modelling efforts were successful especially thanks to: 1) the presence of a community supporting the uptake of the tool in Tunisia, 2) the fast learning curve (i.e. readily usable as an integral part of a PhD project), 3) the open source nature (which allowed the PhD student to contribute to code development) and 4) the accessibility of the GNU MathProg formulation (which unlocked the creativity of the student without requiring them to first develop high-level programming skills).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.