Mapping the spatial distribution of global mariculture production

Information on the location of mariculture production is sparse. Identifying where mariculture production occurs remains a major challenge for understanding its environmental impacts and the sustainability of individual farms and the sector as a whole. We compiled known mariculture locations and applied a simple production-allocation approach to map remaining global mariculture locations across 73 countries using the key determinants of distance to shore and ports, and average productivity (tonnage) of known farms. Our map represents 96% of reported fish and invertebrate mariculture production for 2017, but excludes algae, which constitute half of global mariculture production. We provide, for the first time, a publicly available spatial database of known and estimated mariculture locations. We discuss the utility and limitations of the existing data and our modeling approach, and highlight the key data gaps and future challenges for mapping aquaculture. Our results provide a vital resource for mariculture and environmental researchers, but we emphasize the need for a standardized, ground-truthed global spatial database of aquaculture locations and farm-level attributes (e


Introduction
Feeding the world's increasingly numerous and affluent human population whilst maintaining the integrity of the Earth's natural systems is one of the greatest challenges facing humanity (Rockström et al., 2009; Springmann et al., 2018). Of particular importance is the growing demand for animal protein, as these foods tend to have high environmental impacts per calorie or gram of protein (Poore and Nemecek, 2018), and their consumption rapidly increases with growing per capita GDP (Tilman and Clark, 2014). Marine aquatic fish and invertebrate aquaculture is now a substantial source of animal protein and has been one of the fastest growing food sectors in the world over the past 20 years (Edwards et al., 2019; FAO, 2020; Naylor et al., 2021), with estimated first-sale value of US$106 billion (FAO, 2020).
Emerging opportunities to increase mariculture production (e.g., offshore aquaculture; Costello et al., 2020; Gentry et al., 2017a, 2017b), and increasing recognition of the potentially lower environmental impacts of mariculture compared with land-based animal products (Hall et al., 2011; Hilborn et al., 2018a, 2018b; Poore and Nemecek, 2018; Tilman and Clark, 2014), suggest that mariculture holds significant promise for providing sustainable and nutritious food sources to help meet growing protein demand.
However, to realise this potential, policy makers, investors, and regulators need accurate and reliable information on the location of mariculture, and the amount and type of species being produced. Without this information it is impossible to understand the impacts of mariculture on the environment (e.g., water pollution, land-use change, disease outbreaks, escapes) and people (e.g., nutrition, economics, competition with other sectors and resources) (Kuempel et al., 2020), or to invest in and plan for the sustainable growth of the sector (Ottinger et al., 2018b). Yet, unlike other major food production systems, like marine industrial fisheries (Kroodsma et al., 2018; Watson and Tidd, 2018), crops (Theobald et al., 2020a, 2020b), and livestock (Gilbert et al., 2018; Robinson et al., 2014), we do not have a detailed understanding of the spatial footprint of mariculture globally (Campbell and Pauly, 2013). This is particularly critical given climate change, which has the potential to affect mariculture in many different ways, including loss or reduction of suitable area due to sea level rise, more frequent extreme weather events, changing productivity, ocean acidification, and increases in sea surface temperature (FAO, 2020; Reid et al., 2019a). Without an understanding of the current distribution of mariculture, adapting and planning for future growth in a changing climate is impossible (Froehlich et al., 2018; Reid et al., 2019b).
Several approaches have been used to help fill this information gap, including compiling aquaculture locations through expert elicitation and governmental databases (e.g., Aquaculture Stewardship Council Data Mapping Study, 2020; FAO, 2019a; Fiskeridirektoratet, 2019), remote sensing techniques (e.g., Fu et al., 2019; Ottinger et al., 2016; Ottinger et al., 2018a), and generalized suitability mapping approaches (Gentry et al., 2017a; Vörösmarty et al., 2010; Wang et al., 2019). While these efforts hold significant potential, they do not fully account for current reported production levels across multiple taxa.
Here, we take a different approach to determining where mariculture occurs by combining previously disparate datasets to better estimate the number and location of mariculture farms for data-limited taxa and/or countries, which describes the majority (88%) of production. We combine available mariculture location information with reported marine and brackishwater aquaculture production (FAO, 2019b) to estimate the number of mariculture farms within each reporting country, and spatially allocate those farms into suitable mariculture areas based on distance to shore and distance to port. In doing so, we create a database of known and estimated mariculture locations for six different major categories of taxa (Table 1), providing the most comprehensive global information to date. This collation of data provides a powerful overview of mariculture production and distribution to support planning in the sector, exploring questions around the potential impacts of climate change on the sector, and mapping the environmental impacts from mariculture. Furthermore, these data highlight knowledge gaps and will hopefully spur future improvements to mariculture mapping efforts.

Methods
We categorised mariculture production into six broad categories (salmonidae fish, unfed or algae fed bivalve molluscs, shrimps and prawns, bluefin tuna, general marine fish, and non-shrimp crustaceans), based on functional and taxonomic groupings consistent with existing literature (e.g., MacLeod et al., 2020; Tacon and Metian, 2008; Tacon and Metian, 2015) and comprising 96% of global fish and invertebrate mariculture production for 2017. Our map does not include seaweed mariculture due to a lack of information on farming locations that would be needed to inform our modeling. We included all countries (73) that reported more than 500 t production of any single mariculture species in 2017 to the FAO (FAO, 2019b). We excluded countries reporting those species with lower production levels because they tended to report highly variable production levels and produced a tiny fraction of global mariculture production in 2017 (<0.1% combined), implying established production systems for those species do not yet exist in these places.
We first compiled the most comprehensive database to date of known locations of mariculture farms, then estimated the number of farms within each country, and finally mapped suitable mariculture locations within each country. Mapping and analysis were performed using R version 4.0.2 (R Core Team, 2020). All data sources are provided in Table S1 and data are publicly available on The Knowledge Network for Biocomplexity website.

Locating mariculture farm information
We sourced spatial aquaculture data from national data portals, peer reviewed articles and grey literature (e.g., FAO National Aquaculture Sector Overview (NASO) Fact Sheets), data inquiries to governmental and non-governmental agencies, and the Aquaculture Stewardship Council (ASC), among others. A full list of mariculture data, their sources, and a summary of the methods used for every country and species grouping can be found in Table S1.
Geographic mariculture location data (latitude, longitude) came in a variety of forms (points, polygons) and with varying levels of metadata and farm level attributes (e.g., lease vs. active sites, farm size, dates of operation, etc.). If information regarding current activity was not readily available for a given dataset, we necessarily included all reported locations, which means we may include farms that are not currently operational. The vast majority of locations lacked information on farm size and so we only provided point locations of the farms and did not attempt to quantify the area covered by the farms. For the few instances when polygon data were provided, the point location was estimated as the centroid of the polygon.
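The polygon-to-point step can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was run in R); the function name `polygon_centroid` is hypothetical, and the sketch assumes a simple (non-self-intersecting) polygon small enough that longitude and latitude can be treated as planar coordinates.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def polygon_centroid(ring: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula).

    Assumption: coordinates are treated as planar, which is only
    reasonable for small farm-scale polygons.
    """
    # Drop a repeated closing vertex if the ring is explicitly closed.
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]
    n = len(ring)
    if n < 3:
        raise ValueError("a polygon needs at least three distinct vertices")

    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for k in range(n):
        x0, y0 = ring[k]
        x1, y1 = ring[(k + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area) = sum / (3 * twice_area)
    return (cx / (3 * twice_area), cy / (3 * twice_area))
```

For a lease outlined as a 2 x 2 square with one corner at the origin, the function returns the point (1.0, 1.0), which would then be stored as the farm's point location.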
We used six "data types" in our workflow (Fig. 1, Figs. S1-6):
A) The total number of farms in a country was reported and all farm locations were provided;
B) The total number of farms in a country was reported, but no location details were provided for those farms;
C) The total number of farms in a country was reported, but location details were only provided for a subset of those farms;
D) The total number of farms in a country was not reported, rather a subset of farms within the country were reported and had locations provided;
E) The total number of farms was not reported, rather a subset of the farms were reported, but no location information was provided;
F) No information on the number of farms or their locations was reported.
We acknowledge that some countries we classed as comprehensively reporting the total number of farms may not be fully accurate due to a lack of metadata regarding the activeness of the site or dates of operation. Any country that had governmental data, or governmental derived data (e.g., EMODnet, FAO national aquaculture sector overviews, HELCOM), was considered to have comprehensive farm count data unless literature suggested otherwise.
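The six data types can be read as a small decision rule. The Python sketch below is one way to encode that rule for a single country and species group; the function name and its three inputs (`total_farms`, `n_listed`, `n_located`) are hypothetical and are not taken from the published workflow.

```python
from typing import Optional


def classify_data_type(total_farms: Optional[int],
                       n_listed: int,
                       n_located: int) -> str:
    """Return the data type letter (A-F) for one country/species group.

    total_farms: reported national farm count, or None if not reported.
    n_listed:    number of individual farms mentioned in any source.
    n_located:   number of farms with coordinates.
    """
    if n_listed < 0 or n_located < 0:
        raise ValueError("counts must be non-negative")

    if total_farms is not None:
        if n_located >= total_farms:
            return "A"   # total reported, every farm located
        if n_located == 0:
            return "B"   # total reported, no locations
        return "C"       # total reported, only a subset located

    if n_located > 0:
        return "D"       # total unknown, a located subset exists
    if n_listed > 0:
        return "E"       # total unknown, subset known but not located
    return "F"           # nothing reported
```

Under this encoding, types A to C carry a usable national farm count, whereas types D to F require the farm number itself to be estimated.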

Estimating farm numbers
We estimated the total number of farms within a country and species grouping for data types D, E, and F (Fig. 1, Table S2). Using the country/species grouping combinations with presumed comprehensive farm count data (N = 74, 17% of production included, data type A/B/C, see Table S3), we calculated average global tonnes of production per farm, $P_i$, for each species group, $i$ (Table 2), as:
$$P_i = \frac{\sum_{c} P_{i,c}}{\sum_{c} N_{i,c}}$$
where $P_{i,c}$ is the amount of production of species group, $i$, in country, $c$, and $N_{i,c}$ is the number of comprehensively known farms producing species group, $i$, in country, $c$.
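As a hedged illustration of the tonnes-per-farm calculation, the Python sketch below assumes the average is pooled across countries (total production divided by total farms). That reading is an assumption: a mean of per-country ratios is also conceivable and would generally give a different value. The function name and the numbers are hypothetical and are not values from Table 2.

```python
from typing import Iterable, Tuple


def pooled_tonnes_per_farm(records: Iterable[Tuple[float, int]]) -> float:
    """Pooled average production per farm for one species group.

    records: (production_tonnes, n_farms) pairs, one per country with
             comprehensive farm counts (data types A, B, or C).
    Assumption: pooled ratio, i.e. sum of tonnes / sum of farms.
    """
    total_tonnes = 0.0
    total_farms = 0
    for tonnes, farms in records:
        total_tonnes += tonnes
        total_farms += farms
    if total_farms == 0:
        raise ValueError("no comprehensively counted farms for this group")
    return total_tonnes / total_farms


# Hypothetical inputs: two countries with comprehensive counts.
example = [(1000.0, 4), (500.0, 1)]
pooled = pooled_tonnes_per_farm(example)                      # 1500 / 5
mean_of_ratios = sum(t / n for t, n in example) / len(example)  # (250 + 500) / 2
```

With these made-up inputs the pooled value is 300 t per farm while the mean of ratios is 375 t per farm, which shows why the averaging choice matters when only a few countries report comprehensive counts.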
For each data-limited country, $d$, and species grouping, $i$ (data types D and E), we then calculated the number of farms, $F_{i,d}$, required to produce the level of reported production, $P_{i,d}$, as:
$$F_{i,d} = \frac{P_{i,d}}{P_i} - N_{i,d}$$
where $N_{i,d}$ is the number of known farms in data-limited country, $d$, that produce species group, $i$.
We used the global shrimp tonnes per farm value to estimate the number of farms for crustaceans because no countries reported comprehensive farm data for (non-shrimp) crustacean production. There were five instances where a country and species group were classed into data type D, but our modeling predicted no more farms, beyond those reported, were needed to support reported national production (Table S3). Additionally, only production of bivalve molluscs in India was classed as data type E, and our modeling predicted no more farms were needed, beyond those reported, to support reported national production (Table S3, Fig. S5). In these cases, we assumed that the reported farms represented all of the production in the country.

Estimating farm numbers example
Here we provide an example of how we estimated the number of farms for unfed or algae fed bivalve mollusc mariculture in the data-limited country, Mexico. There is one known bivalve mollusc farm location in Mexico ($N_{biv,mex} = 1$). However, based on the level of reported biomass ($P_{biv,mex} = 15400$ MT) and the global average production per farm for bivalve molluscs ($P_{biv} = 283.4$ tonnes), we assume that many more bivalve mollusc farms would be needed to support national production. We therefore estimate the number of remaining farms to place, $F_{biv,mex}$, and round up to the nearest whole number:
$$F_{biv,mex} = \left\lceil \frac{15400}{283.4} - 1 \right\rceil = \lceil 53.34 \rceil = 54$$
This resulted in a total of 55 farms (1 known, 54 estimated) for unfed or algae fed bivalve molluscs in Mexico on the final map.
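The arithmetic of the Mexico example can be checked with a few lines of Python. This is a sketch rather than the authors' R code: the function name `farms_to_place` is hypothetical, and clamping negative results to zero is our reading of the cases where no farms beyond those reported were needed.

```python
import math


def farms_to_place(reported_tonnes: float,
                   tonnes_per_farm: float,
                   known_farms: int) -> int:
    """Number of additional farms needed to account for reported production.

    Divides national production by the global average tonnes per farm,
    subtracts farms already known, and rounds up. Results below zero are
    clamped to zero (assumption: the known farms then account for all
    reported production).
    """
    if tonnes_per_farm <= 0:
        raise ValueError("tonnes_per_farm must be positive")
    remaining = reported_tonnes / tonnes_per_farm - known_farms
    return max(0, math.ceil(remaining))


# Values quoted in the text for bivalve molluscs in Mexico.
estimated = farms_to_place(15400, 283.4, 1)   # 54 estimated farms
total_on_map = estimated + 1                  # 55 farms including the known one
```

The same function returns zero when the known farms already exceed the number implied by reported production.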

Mapping suitable mariculture locations
To locate farms for countries and taxa with incomplete spatial data (data types B, C, D, E, and F), we identified potential mariculture areas based on distance from ports and distance from the coast, and randomly placed farms within resulting suitable areas. We assumed mariculture farms could be located anywhere from 1 km to 40 km from a port. The maximum distance value of 40 km was calculated by finding the distance to the closest port from all known mariculture locations (16,138 farms), and taking a global mean. This was done under the hypothesis that more aquaculture production would occur closer to ports due to increased access and reduced travel distances, leading to a reduction in operational costs for both platform and non-platform based mariculture (Kaiser et al., 2011). The minimum distance to the nearest port (1 km) assumes that farms need to be located a safe distance away from shipping traffic (Yoo and Jeong, 2020). All ports, excluding those classified as "large ports" in the World Port Index, were considered (World Port Index, 2019). We placed all farms, except shrimp and prawn farms, exactly 200 m offshore due to economic and technological challenges of siting production farther offshore (Froehlich et al., 2017; Holmer, 2010). Shrimp and prawn farms were placed exactly onshore because these farms largely operate in coastal ponds (Kungvankij and Chua, 1986). Additionally, any area located within the Arctic Circle was excluded as very little to no production occurs that far north because extreme cold temperatures slow growth necessary for profitable aquaculture operations (Froehlich et al., 2018). Each farm was randomly sampled and placed within the suitable areas, with no restriction on the number of farms placed within proximity to one another.
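The port-distance filter and random placement described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the published R workflow: candidate sites are assumed to be pre-generated coastal points (the 200 m offshore and onshore rules are not modelled here), distances use a spherical haversine approximation, and all function names, the Arctic Circle latitude constant, and the seed are hypothetical choices.

```python
import math
import random
from typing import List, Sequence, Tuple

LonLat = Tuple[float, float]   # (longitude, latitude) in decimal degrees

EARTH_RADIUS_KM = 6371.0       # mean Earth radius, spherical approximation
ARCTIC_CIRCLE_LAT = 66.56      # approximate latitude of the Arctic Circle


def haversine_km(a: LonLat, b: LonLat) -> float:
    """Great-circle distance between two points in kilometres."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_suitable(site: LonLat, ports: Sequence[LonLat],
                min_km: float = 1.0, max_km: float = 40.0) -> bool:
    """True if the site is 1-40 km from its nearest port and not in the Arctic."""
    if site[1] > ARCTIC_CIRCLE_LAT:
        return False
    nearest = min(haversine_km(site, port) for port in ports)
    return min_km <= nearest <= max_km


def place_farms(candidates: Sequence[LonLat], ports: Sequence[LonLat],
                n_farms: int, seed: int = 0) -> List[LonLat]:
    """Randomly assign n_farms to suitable candidate sites.

    Sampling is with replacement, mirroring the lack of any restriction
    on how many farms may be placed close to one another.
    """
    suitable = [c for c in candidates if is_suitable(c, ports)]
    if not suitable:
        raise ValueError("no suitable candidate sites")
    rng = random.Random(seed)
    return [rng.choice(suitable) for _ in range(n_farms)]
```

A production version would work on a raster or vector coastline rather than a list of points, but the filtering logic would be the same.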

Farm placement accuracy assessment
To assess the overall accuracy of our placement method, we rasterized known farm point locations for data type A farms (8552 farms) to a resolution of 0.083 degrees (~9 km of coastline at the equator, 5 arc minutes). Each grid cell described whether at least one farm was in that location (i.e., each cell value was 0 or 1). We then used our global suitability map of the same resolution and compared the observed farm raster to the suitability raster to determine whether we were accurately predicting suitable mariculture areas (i.e., how many farms in type A countries are in areas that we deem suitable for mariculture). In addition, we created the same rasters at finer resolutions (equal-area grid sizes of 25 km² and 36 km²) to see if our approach remained accurate at finer resolutions. We assessed the accuracy of our model using a confusion matrix and ROC (receiver operating characteristic) and AUC (area under the ROC curve) performance metrics. The AUC is a measure of the omission and commission rates, where the omission rate measures false negatives (places where there are farms that were deemed unsuitable), and the commission rate measures false positives (places where there are not farms that were considered suitable). True negatives vastly outnumbered the other categories, since the known distribution of mariculture farms is largely concentrated, especially in countries with large coastlines.
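The accuracy assessment can be sketched in Python for the simple case where both rasters are flattened to lists of 0/1 cell values. The sketch assumes the suitability layer is binary (suitable or unsuitable); under that assumption the ROC curve has a single interior point and the AUC reduces to the mean of the true positive rate and the true negative rate. Function names are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(observed: Sequence[int],
                     predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for paired 0/1 cell values.

    observed:  1 if at least one known farm falls in the cell.
    predicted: 1 if the cell was deemed suitable.
    """
    if len(observed) != len(predicted):
        raise ValueError("rasters must have the same number of cells")
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1      # farm present, cell suitable
        elif pred:
            fp += 1      # commission: suitable but no farm
        elif obs:
            fn += 1      # omission: farm present but cell unsuitable
        else:
            tn += 1      # no farm, cell unsuitable
    return tp, fp, fn, tn


def binary_auc(tp: int, fp: int, fn: int, tn: int) -> float:
    """AUC for a binary predictor: (TPR + TNR) / 2."""
    tpr = tp / (tp + fn)          # 1 - omission rate
    fpr = fp / (fp + tn)          # commission rate
    return (tpr + (1.0 - fpr)) / 2.0


# Toy example with six cells.
counts = confusion_counts([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0])
auc = binary_auc(*counts)
```

Because the vast majority of cells are true negatives, the true negative rate is typically close to 1, so the AUC is driven mainly by how many observed farms fall in suitable cells.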

Results
In total, we map 95,443 mariculture farms (Table 1, Fig. 2). Of these, 16,138 (17%, from 60 countries) have known locations, and 9951 (10%) are known to exist but lack specific location information and were placed into suitable locations. The remaining 69,354 (73%) farm locations were estimated based on country-level production statistics and suitable mariculture areas. Fig. 2 illustrates the global distribution of estimated and known mariculture farms. Fig. 3 illustrates the spatial detail within the map.
China represented 61% of all real and modeled farms on our map (N = 58,166), but only 5739 were real locations, a very low number in comparison to the estimated number and its associated production. Chile, Norway, the United States, China, and Ireland identified locations for a substantial number of farms (>1000). Only five countries (Cambodia, Mauritius, Myanmar, North Korea, and Russia), accounting for 0.5% of mariculture production included in our study, had farm locations that were completely estimated across all taxa (i.e., no locations were known). However, 17 countries (representing 10% of fish and invertebrate mariculture production) had comprehensive farm locations across all taxa (i.e., no estimation or interpolation required, Table S3). The remaining 56 countries produced at least one species group that required interpolation or estimation. Thirty-six countries have comprehensive "number of farm" data across all types of mariculture they produce (data types A, B, or C), accounting for 12% of the global fish and invertebrate mariculture production.
Fig. 2. Global mariculture farm map and distribution of farms in longitude and latitude.
Information on site status (active vs. non-active) and levels of production (e.g., tonnes) were noticeably lacking from available datasets, as were farm area and culture type (intensive, semi-intensive, extensive). Salmonidae fish were the most data rich taxon (98% of farms had location information, 1.1% of farms were known with no location information, 0.9% estimated, Fig. S7), followed by bluefin tuna (34% of farms had location information, 66% of farms were known with no location information, 0% estimated, Fig. S8). Non-shrimp crustacean mariculture was the least data rich (Fig. S9).
Assessments of model accuracy at 0.083-degree resolution indicate that our approach is reasonably good at predicting suitable and unsuitable areas for mariculture farms (0.81 AUC). The vast majority of the ocean is unsuitable for mariculture (using current technologies) and does not have aquaculture farms (98% of the ocean). Despite this imbalance (under which chance alone would place nearly all farms in "unsuitable" areas), our model fairly accurately identified suitable mariculture locations, with 65% of observed farms occurring in cells deemed suitable. One limitation of our approach was the frequent occurrence of "suitable" areas without any observed aquaculture. This was due to several factors: farms often occur in concentrated areas (vs. random dispersion); mariculture farms have not reached their biological potential and, consequently, there are far fewer farms than what is possible; and our model does not include additional variables that may further limit aquaculture placement. As a consequence, our randomized placement was often too dispersed. Inclusion of more environmental variables to restrict suitable areas, like depth, temperature, or salinity, could possibly help to increase the accuracy. Using equal-area grid sizes of 25 km² and 36 km² to assess the accuracy of our suitability map, the AUC dropped to 0.78 and 0.79, respectively. Given the uncertainty associated with an AUC of 0.81, we suggest using our modeled data at a resolution of 0.083 degrees or coarser.

Discussion
We compiled known mariculture locations and applied a simple and repeatable production-allocation approach to map global mariculture based on distance to shore, distance to ports, and number of known farms. By combining previously disparate datasets, our map provides a novel and important new resource for aquaculture research, which can be used to assess potential impacts of mariculture on people and the environment, as well as understand trade-offs of future growth. We also provide an estimate of the number of farms at a national level, which can be refined at regional scales with improved data and/or locally developed models to more accurately map farm locations and production. This global map provides two distinct benefits. First, it gives us a starting point for understanding the distribution of mariculture, which allows us to highlight major data gaps and spur future improved mapping efforts. Second, it allows for the mapping of environmental pressures and impacts to inform policy decisions and plan for climate change impacts.

Caveats and limitations
We acknowledge that our dataset may exclude some available data that were not uncovered in our search. Several major producing countries, like China, have a massive amount of production with relatively little location data. By revealing this lack of explicit farm location data, our results highlight how countries' abilities to evaluate the impacts of mariculture are hindered. However, even revealing these data gaps is an important contribution, highlighting where future efforts for mapping mariculture should be directed.
While a seemingly simple problem to fix, collecting and collating detailed location data on mariculture can be resource intensive, data can quickly become obsolete as the sector rapidly evolves and relocates in response to environmental and societal pressures (Reid et al., 2019b), and small-scale or traditional mariculture operations can be hard to identify and track (FAO, 2020). However, many countries have the capacity to collect and maintain databases of commercial mariculture location information and can do more to report these data in a standardized and accessible format. Countries with permitting processes likely already collect these data, but in most cases do not make them readily available or report varying levels of metadata. For example, the United States reports the total number of farms within the country, but lacks specific locations in many instances. This is likely due to confidentiality, especially for small-scale farmers, which is protected by law in some places (FAO, 2020).
These data gaps prevent reliable comparisons with other food production types, in turn limiting the development of environmentally and socially responsible food, agriculture, and development policies, as well as assessments of the cumulative impacts of food (e.g., Kuempel et al., 2020). These shortfalls are particularly detrimental given mariculture's rapid growth and increasing importance, both economically and as a vital component of food production and security.
Another key limitation is the lack of data on particular taxa. For non-shrimp crustaceans we had to estimate farm-level productivity using shrimp and prawn data, because no comprehensive farm level production or location information exists. Nearly 64% of non-shrimp crustacean production included in this assessment is mud crab/swamp crab production, which has relatively lower stocking densities when compared to penaeid shrimp (which make up 100% of global shrimp and prawn production included). Our approach therefore likely underestimates the number of farms for non-shrimp crustaceans (Shelley and Lovatelli, 2011). Additionally, our "general marine fish" species group is composed of 38 different species of fish (13% of global production). Because of the variability of species within the group, there is likely to be a wide range of error associated with the distribution of farms, and potential future understanding of the spatial footprint of these taxa.
It is clear that greater availability of high-quality, standardized data will be essential for improving our understanding of the location of aquaculture farms, their intensity and efficiency of production, and the likely social, economic, and environmental impacts of aquaculture farming. Given the wide distribution of different species farmed across the world, this would be a major undertaking (see Ottinger et al., 2018b for details). Prioritizing global assessments for specific species or production types is necessary, but very time and resource intensive.
We applaud efforts by countries such as Norway, Canada, Chile, and Ireland that have robust aquaculture licensing and reporting guidelines, and that make data publicly accessible. Even where spatial data are available, important details for achieving a full picture of aquaculture production can be missing. For example, data rarely include details on the area farmed, species-specific production, culture type, or activeness of the farm, preventing a complete understanding of both the probable intensity of localised impacts and the distribution of production across producing regions. In addition, while identifying very data-poor countries is easy, even for better-reported countries, the completeness of their datasets is unclear, limiting our ability to build a robust picture of national or global mariculture.

Moving forward
Overcoming these limitations requires better data availability and reporting across countries, with a standardized, complete global spatial mariculture database being the gold standard to aspire to. We suggest that farms should report, at a minimum, information about what types of species are grown, production environment, farm size and stocking rates at each location at annual intervals. Whilst new remote sensing technologies could help determine aquaculture locations (Ottinger et al., 2018b), these data will still need to be supplemented with farm level data to fully assess aquaculture production. At the country or province/state level, policies requiring more precise and detailed mariculture data, as is the case for many commercial fisheries in nations like the United States (e.g., Hilborn et al., 2020), could help improve the quality of the information. Improved provincial production reporting will help to remedy the mismatches that often happen between what countries report to the FAO and what they publish in regional or national reports (Metian et al., 2014). Importantly, better data can also improve governance and sustainability of the sector itself by increasing the ability to accurately assess environmental and economic impacts, improve the predictability of farming practices over time, track volatility of the system long-term, and support adaptive responses under the threat of increasing pressures to the system (e.g., climate change, COVID-19) (Froehlich et al., 2021; Hishamunda et al., 2016).
Major producers with the capacity to collect and provide data to international organisations, such as the FAO, should be incentivized to do so. For example, the USA has a paucity of spatial data on general finfish and shrimp aquaculture locations, while China, the world's largest mariculture producer, severely lacks spatial data across all species groups. The spatial data that do exist for China are derived from satellite data and neural network modeling, which is not ground truthed and lacks specifics on species type (Fu et al., 2021). China has greatly improved the quality of statistical reporting in recent years (Cao et al., 2015; Wang et al., 2015); however, sights should now be set on spatially explicit information. Both countries have extensive agricultural data services and infrastructure, and the ability to mobilise vast resources to improve understanding of the aquaculture sector and how to improve its management (e.g., Cui et al., 2018; U.S. Department of Agriculture, 2021). Globally, the FAO is likely the best place to coordinate national efforts to ensure data reporting, collation, and comparability, based on its expertise in collating and reporting production and trade information that have greatly improved our understanding of food systems. Currently, efforts to do this exist within the FAO (FAO, 2019b), but further resources are urgently needed to create a more comprehensive data set and to provide clearer guidelines for the minimum reporting standards.
Extending this level of support for the fastest growing food sector in the world, aquaculture, should, we believe, be a priority over the coming years. Elsewhere, funding bodies could prioritise the collection and collation of high-quality, spatially explicit data on aquaculture production systems, incorporating species, production type, yields, and spatial extent where feasible (Halpern et al., 2019). Examining how countries such as Norway achieve excellent data coverage, including farm-level information, could provide guidance as to how to improve matters elsewhere. We also renew calls to support the free and open sharing of such data through organisations such as the FAO (FAO, 2019b), as well as national and subnational equivalents (Halpern et al., 2019).
Mariculture is a prominent and rapidly growing sector that has the potential to supply relatively low-impact, high quality nutrition to millions more people, but can also have lasting impacts on marine and coastal environments and socio-ecological systems. To systematically and sustainably guide mariculture production, reduce competition with other human activities for land and water resources (Ottinger et al., 2016), and plan for the potential effects of climate change, fine-scale, spatially-explicit production data are critical. Our generalized modeling approach provides a look at the global distribution of mariculture farms, something that has never been available before, and could be further improved through more sophisticated approaches based on farm level attributes (e.g., production type, regional production statistics). Coordinated efforts should focus on filling key data gaps and providing support and guidance for creating a global, standardized mariculture location database to better inform policies and practices across scales.

Declaration of Competing Interest
HEF is a member of the Technical Advisory Group of the Aquaculture Stewardship Council. All other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.

Fig. 1. Schematic of workflow required for estimation and mapping of each data type.

Fig. 3. Example mariculture maps zoomed in on smaller areas, using the Open Street Map (https://www.openstreetmap.org) data as background mapping. (A) Chile (part of Aysén region); (B) Ecuador (part of Guayas province); (C) Australia (part of the Spencer Gulf); and (D) China (part of Zhejiang province).

Table 2
Average tonnes per farm per species class estimates based on comprehensive farm information.*
* Non-shrimp crustaceans assume the same value as shrimps and prawns.
G. Clawson et al.