An overview of richness and distribution of mosses in Brazil

Background and aims – Mosses comprise avascular terrestrial plants whose relationship with other plant lineages is not yet fully understood. These plants have a worldwide distribution, but gaps in their known distribution have not yet been clarified for Brazil. Based on a large database compiled from different sources, we present an overview of moss distribution in Brazil in order to assess species richness in different areas, as well as the factors that influence this distribution. Material and methods – The study area corresponds to the whole Brazilian territory. We collected data on moss occurrences from different online databases and bibliographies. The data were refined, keeping only records identified to species level, with valid names and correct geographical coordinates. We subsequently plotted the records on a map with a 1° × 1° grid. To ascertain the representativeness of the grid cells, an analysis of estimated richness was carried out. Key results – A total of 969 moss species were recorded from the 26 690 records obtained. The number of species per cell ranged from 1 to 242, and 394 of approximately 1 300 cells were occupied. Moss richness in Brazil is subject to varied sampling effort. The Atlantic Forest showed the greatest richness, as a result of both favourable environmental conditions and greater sampling intensity. With the exception of a few localities, the Amazon domain was poorly sampled and, consequently, showed low richness. Conclusions – The results show that the highest richness is observed in the southern and central parts of Brazil, owing to the occurrence of areas that have some type of protection (conservation units), environmental conditions related to high humidity, high elevations, and greater sampling effort.


INTRODUCTION
Mosses are avascular, cryptogamic plants that, together with liverworts and hornworts, form the bryophytes, a group whose phylogenetic relationships are still not agreed upon (Goffinet et al. 2009; Morris et al. 2018; Puttick et al. 2018). Several studies have been carried out since the beginning of the 1980s with the aim of improving knowledge about the bryoflora of Brazil (Yano 1981, 1984, 1989, 1995, 1996, 2004; Costa et al. 2011). According to some authors (Costa et al. 2011; Costa & Peralta 2015; BFG 2018), about 1500 species of bryophytes are reported for Brazil, and about 880 of them are mosses. The southeastern and southern regions are reported as the richest in the country (BFG 2018). Costa & Peralta (2015) have stressed the importance of taxonomic studies and better floristic surveys to improve the current knowledge about species occurrences.
Regarding the richness in the five Brazilian phytogeographical domains (Fiaschi & Pirani 2009), the Atlantic Forest appears to be the richest domain, including almost 90% of the mosses known for the country, followed by the Cerrado (a heterogeneous domain, formed mainly by tropical savannas) and Amazon domains, and lastly there are the Campos sulinos (Brazilian southern grasslands) and the Caatinga (a domain formed mainly by seasonally dry tropical forests) (BFG 2018).
The insufficient availability of biodiversity data is a recurrent problem in a scenario of accelerated species loss and degraded habitats (Bisby 2000; Costello & Wieczorek 2014). Prance (1977) stated that the diversity in the tropics was already being reduced before a basic inventory was made. Biodiversity shortfalls exist because of poor knowledge of species with respect to taxonomy (Linnean), distribution (Wallacean), abundance (Prestonian), evolutionary patterns (Darwinian), abiotic tolerance (Hutchinsonian), species traits (Raunkiaeran), and biotic interactions (Eltonian) (Hortal et al. 2015). We highlight herein the Linnean and Wallacean shortfalls, which constitute the approach of our study. The Linnean shortfall is the gap between the number of known species and the number that actually exists. It reflects species that have not yet been sampled, mainly in regions not explored by researchers, as well as the shortage of specialists in a group and the low funding for studies with a taxonomic approach (Diniz-Filho et al. 2013). The Wallacean shortfall is caused by geographic biases that are closely related to the concentration of sampling effort in one or a few areas, causing distortions in the knowledge of the actual distribution of the species (Hortal et al. 2008). Sampling effort is linked to several factors such as the proximity to research centres (causing the "museum effect"), the presence of a specialist or group of specialists, the proximity to roads, and an insistence on sampling areas of high richness or with rare species (Moerman & Estabrook 2006; Hortal et al. 2008; Sastre & Lobo 2009; Boakes et al. 2010; Oliveira et al. 2016). In this way, repeated sampling in the same areas can lead to biases in understanding the real distribution of species (Oliveira et al. 2016). Thus, understanding and locating gaps in biodiversity knowledge can provide guidelines for conservation (Bini et al. 2006).
Accordingly, the small-scale efforts to increase knowledge of the flora through floristic and systematic surveys, herbaria, and species lists are of great importance as they are information sources for large-scale projects (Thomas et al. 2012). Institutions and/or groups of researchers conducting such research can fill knowledge gaps and increase our understanding of the local biodiversity (Ponder et al. 2001; Moerman & Estabrook 2006).
Online databases with biodiversity data allow us to group species occurrence data from various sources such as herbaria and museum collections, as well as data from periodicals (Yesson et al. 2007). In this context, such databases have successfully been used in plant ecology, providing information for the analysis of species distribution (Werneck et al. 2011; Barros et al. 2012; Alvez-Valles et al. 2018), for the investigation of endemic areas (Echternacht et al. 2011; Werneck et al. 2011; Menini Neto et al. 2016; Alvez-Valles et al. 2018), and for our understanding of the effects of climate change on plants (Feeley & Silman 2011; Patiño et al. 2016).
The main aim of this study is to understand the distribution of moss species in Brazil based on online databases in order to detect areas with the highest richness values, to identify sampling bias, and to evaluate the quality of online databases for large-scale studies on this plant group.

MATERIAL AND METHODS
The study area corresponds to the whole Brazilian territory. The shapefile was obtained from the website of the Ministry of Environment (MMA 2014) (http://mapas.mma.gov.br/i3geo/datadownload.htm). The classification of the five phytogeographic domains was based on Fiaschi & Pirani (2009).

Obtaining and refining data
Data on moss occurrences were mainly obtained from the botanical collections database, available online at SpeciesLink (http://www.splink.org.br/), and complemented with the virtual databases of the Rio de Janeiro Botanical Garden herbarium (http://jabot.jbrj.gov.br/) and the New York Botanical Garden herbarium (New York Botanical Garden 2014) (http://sweetgum.nybg.org/science/). Different references (data not yet available online) were also used up until 2015 (Amorim 2013;Souza et al. 2015;Weber et al. 2015). This initial database consisted of 126 863 records for Brazil.
The refinement of the data consisted of several stages. First, we only retained those specimens identified to species level and for which the determination was made by specialists, to ensure the reliability of the information. We subsequently excluded records without any precise indication of the sample origin, as well as duplicate records. The geographical coordinates were checked for each record and, when absent, were obtained through Google Earth. When the locality could not be found, the coordinates of the indicated municipality were taken as the collection site. The names were checked for synonyms by consulting the Tropicos.org database (Missouri Botanical Garden 2016), and only valid names were retained.
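The refinement stages described above can be summarized in code. The following Python sketch is illustrative only and is not the procedure used in this study: the field names (`species`, `determiner`, `coords`, `municipality`, `collector`, `number`), the duplicate criterion, and the lookup tables `synonyms` and `municipality_coords` are hypothetical stand-ins for the manual checks against Tropicos.org and Google Earth.

```python
def refine(records, synonyms, municipality_coords):
    """Return the records that pass every refinement stage.

    records: list of dicts using the hypothetical field names listed above.
    synonyms: dict mapping a synonym to its valid name (assumed lookup table).
    municipality_coords: dict mapping a municipality to (lat, lon) (assumed).
    """
    seen = set()
    kept = []
    for rec in records:
        name = rec.get("species") or ""
        # Stage 1: keep only species-level names (genus + epithet) that carry
        # a determiner. A non-empty determiner is only a proxy for
        # "determined by a specialist".
        if len(name.split()) < 2 or not rec.get("determiner"):
            continue
        # Stage 2: use the record's own coordinates, else fall back to the
        # municipality; records with no usable origin are discarded.
        coords = rec.get("coords") or municipality_coords.get(rec.get("municipality"))
        if coords is None:
            continue
        # Stage 3: replace synonyms by the valid name.
        accepted = synonyms.get(name, name)
        # Stage 4: drop duplicates (same name, place, collector and number).
        key = (accepted, coords, rec.get("collector"), rec.get("number"))
        if key in seen:
            continue
        seen.add(key)
        kept.append({**rec, "species": accepted, "coords": coords})
    return kept
```

Keeping each stage as a separate check makes it easy to count how many records are lost at each step, which is useful when reporting the reduction from the initial to the final database.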

Quantitative richness analysis
For this study, the records were plotted on a map, and a grid of equally sized 1° × 1° cells was then overlaid. We subsequently analyzed the richness and the number of records per cell in order to determine the number of species across the country. An estimated richness analysis was performed using the Jackknife 2 estimator to verify the representativeness of the cells; this estimator takes into account the species that occur in only one or two samples (i.e. grid cells) (Magurran 2011). A Spearman's correlation test between the number of collections (records) and the estimated richness was performed to compare the sampling and the potential species richness. The analyses were performed using Diva-GIS v.7.5 (Hijmans et al. 2012) and Past v.3.19 (Hammer et al. 2001).
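As an illustration of the gridding and of the estimator, the Python sketch below assigns records to 1° × 1° cells and applies the standard incidence-based second-order jackknife, S = S_obs + Q1(2m − 3)/m − Q2(m − 2)²/(m(m − 1)), where m is the number of samples and Q1 and Q2 are the numbers of species found in exactly one and exactly two samples. It treats grid cells as samples and is an assumption-laden sketch, not a reproduction of how Diva-GIS or Past compute their values; the function names and the example records are hypothetical.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, size=1.0):
    """Index of the size-degree grid cell that contains the point."""
    return (math.floor(lat / size), math.floor(lon / size))


def species_per_cell(records, size=1.0):
    """records: iterable of (species, lat, lon). Returns {cell: set of species}."""
    cells = defaultdict(set)
    for species, lat, lon in records:
        cells[cell_of(lat, lon, size)].add(species)
    return dict(cells)


def jackknife2(cells):
    """Second-order jackknife richness estimate, with grid cells as samples."""
    m = len(cells)
    if m < 2:
        raise ValueError("at least two samples are needed")
    # Number of cells in which each species occurs.
    incidence = defaultdict(int)
    for species_set in cells.values():
        for species in species_set:
            incidence[species] += 1
    s_obs = len(incidence)
    q1 = sum(1 for n in incidence.values() if n == 1)  # species in one cell only
    q2 = sum(1 for n in incidence.values() if n == 2)  # species in exactly two cells
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


# Made-up records (species, latitude, longitude), for illustration only.
example = [
    ("A", -22.5, -44.5), ("B", -22.4, -44.6),
    ("A", -15.8, -47.9), ("C", -15.7, -47.8),
    ("D", -3.1, -60.0),
]
cells = species_per_cell(example)
observed = {cell: len(species) for cell, species in cells.items()}
estimate = jackknife2(cells)
```

Observed richness per cell is simply the size of each species set, whereas the estimate increases with the proportion of species that are known from very few cells.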

RESULTS
With this study, we provide a more precise approach to investigate moss richness in Brazil. In total, 969 species of mosses were recorded from 26 690 records; 394 of approximately 1 300 grid cells were occupied, with the number of species per cell ranging from 1 to 242 (fig. 1). The records are listed by occurrence in the Brazilian states and phytogeographic domains in supplementary file 1.
Moss richness was divided into seven classes of equal richness intervals, and cells with only a single species, record, or estimate of occurrence were treated separately. The classes are: maximum (242 to 211), very high (210 to 176), high (175 to 141), mean (140 to 106), regular (105 to 71), low (70 to 36), and very low (35 to 2) (fig. 2A-B). Only two cells presented "maximum richness", which coincide with the regions where the Federal District and the coast of the state of São Paulo are situated (in a stretch of the Serra do Mar State Park). Three other cells were found in areas that also have high richness values, such as Serra do Cipó National Park (Minas Gerais state), Itatiaia National Park, and part of the Serra do Mar State Park (on the boundary between the states of Minas Gerais, Rio de Janeiro, and São Paulo); one further cell was found in the southwestern region of the state of Paraná. These areas are generally connected or close to areas with regular richness (> 70 species). We highlight the northeastern regions of the state of Rio Grande do Sul, with cells with high richness ranging from 175 to 106 species from the coast of Paraná to the southern coast of Rio de Janeiro. In the state of Minas Gerais, we also highlight the southeastern region in the Serra da Mantiqueira, and the Caparaó National Park region closer to the border with Espírito Santo state; these areas also present very high richness values. Some regions deserve special mention since they presented moss richness between 100 and 80, considered herein as regular, namely the Chapada Diamantina National Park region in the centre of Bahia, the centre of the state of Espírito Santo with the Vale do Rio Doce Natural Reserve and the Pedra Azul State Park, the centre-west of Brazil in the state of Goiás with the Chapada dos Veadeiros National Park, and the state of Amazonas near the cities of Manaus and São Gabriel da Cachoeira.
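For readers who wish to reproduce the mapping, the class limits listed above can be written as a simple lookup. The Python sketch below uses only the limits given in the text; the function name and the label for single-species cells are hypothetical.

```python
from bisect import bisect_left

# Upper limit of each richness class, from "very low" to "maximum".
UPPER_LIMITS = [35, 70, 105, 140, 175, 210, 242]
LABELS = ["very low", "low", "regular", "mean", "high", "very high", "maximum"]


def richness_class(n_species):
    """Richness class of a grid cell with n_species recorded species."""
    if not 1 <= n_species <= UPPER_LIMITS[-1]:
        raise ValueError("richness outside the range observed in this study")
    if n_species == 1:
        return "single species"  # cells with one species are mapped separately
    # bisect_left returns the first class whose upper limit is >= n_species.
    return LABELS[bisect_left(UPPER_LIMITS, n_species)]
```

For example, a cell with 90 species falls in the "regular" class and a cell with 211 species in the "maximum" class.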
Generally, most of the grid cells presented richness values between 35 and 2. The state of Pará has few sampled areas, and most of the state did not present collection records for mosses. Other areas that stand out as places with low richness and sampling are Amapá, Acre, and Mato Grosso (central and western regions), Ceará (central and southern regions), Amazonas, and Rondônia (central and southwestern region).
The number of records was classified into nine arbitrary classes because of the greater variability of the data and with the purpose of refining the observations about sampling in the Brazilian territory (fig. 2C-D). The analysis showed that most of the cells had between 200 and 2 collection records, and only three cells had more than 1000 records. The Jackknife 2 estimator showed very high potential richness in relation to the number of species already recorded (fig. 2E-F). As for the richness values, and for the purpose of comparison, the estimated values were divided into seven classes. The cells with maximum richness had an estimated potential of 425 to 372 species. This corresponds to an increase of 80% in registered richness. The estimated potential for very low richness was 60 to 2 species, corresponding to an increase of 58%. Although richness estimators only provide a potential estimate, the values reveal that the areas in Brazil have a large sampling deficiency, which makes the interpretation of the actual richness difficult at this moment.
From the perspective of phytogeographic domains (fig. 2B, D, and F), the Atlantic Forest contains cells with maximum richness (242 to 211). With the exception of a few cells, practically all cells in this domain had at least one record. The same holds for the areas that correspond to Campos sulinos, in which practically all the cells had some data, but with very low richness (35 to 2). The Cerrado domain also showed maximum richness; however, several areas in the centre and north of the domain do not have records or collections. In the Caatinga domain, most of the areas are sampled, but the richness is very low (35 to 2), with the exception of the Chapada Diamantina National Park in the centre of Bahia. The Amazon showed very low richness (35 to 2) in practically all areas that correspond to this domain. This was the domain that presented the biggest knowledge gap, with extensive areas that did not contain a single record.
The correlation between the estimated richness and the number of records was high and statistically significant (ρ = 0.94, p < 0.001), showing that the majority of the cells that presented high richness were those with the highest numbers of records. Thus, in a contextualized way, the greatest richness is tied to the highest number of collections, although this is not the only richness regulator within an area.
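To make the test explicit, Spearman's ρ is the Pearson correlation computed on the ranks of the two variables, with tied values receiving their average rank. The Python sketch below is a minimal implementation of that definition, assuming one pair of values (number of records, estimated richness) per grid cell; the function names are hypothetical, the numbers are made up for illustration and are not the data of this study, and the p-value is not computed.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Made-up values for four cells: number of records and estimated richness.
n_records = [5, 40, 200, 1200]
estimated_richness = [4, 30, 90, 400]
rho = spearman_rho(n_records, estimated_richness)
```

Because only ranks are used, the coefficient measures whether cells with more records also tend to have higher estimated richness, regardless of whether the relationship is linear.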

Moss richness in Brazil
The 969 species listed in this study represent a landmark in the knowledge of mosses and are important for further floristic studies in Brazil. We found about 8% more species than the number currently listed in the Brazilian Flora (BFG 2018). These data demonstrate the value of records available in databases and recent publications (SpeciesLink 2002; JBRJ 2014; Amorim 2013; Souza et al. 2015; Weber et al. 2015). According to estimates from the last decade, the number of moss species in Brazil varies between 880 and 892 accepted taxa (Costa & Luizi-Ponzo 2010; Costa et al. 2011; Yano 2011; Costa & Peralta 2015). This shows mainly that knowledge about the number of moss species in Brazil remains scarce.
We emphasize that we only retained the data for the species with collection records for which extraction of geographical coordinates was possible. By doing so, we noticed that information on the collection site is not always available for all species known for Brazil. The collection of associated species is common for bryophytes in general (Rydin 2009); however, a limitation occurs when these data are entered into BRAHMS (a software program for research and curation management of biological collections commonly used by Brazilian herbaria). It is not possible to enter more than one specific name in BRAHMS; therefore, when there is more than one species per sample and only one species is registered, the information about associated species is lost in the database. However, this does not mean that the species does not occur in a particular area; it only means that the record is not available in an online database. The data available in the SpeciesLink database show that 160 of the 883 moss species listed there do not have records on the platform, 282 have one to five records, 194 have six to 20 records, and only 247 have more than 20 records (http://lacunas.inct.florabrasil.net/2013/index). These data show that about 73% of the species cited for the country have less than six records. This is in agreement with the record analysis, suggesting that mosses in Brazil are undersampled. Based on this, it is crucial to increase the size of the collections and of the databases. Glime & Wagner (2017) showed that curators adopt different methodologies to catalogue the plants in the herbarium when there are mixtures of bryophytes. We suggest adopting one of the proposals for cataloguing species that grow mixed, such as the one used by Genevieve Lewis-Gentry and co-workers (Glime & Wagner 2017): place a separate barcode for each different plant on the single packet and note all the other species in a remarks field or, if possible, separate the plants into different packets. In this way, the database has only one entry for each record, minimizing the problem of undersampling.
The Atlantic Forest domain had the highest richness in our study, mainly in the mountainous regions, as also mentioned in the literature (Costa & Luizi-Ponzo 2010; Costa & Peralta 2015). This can be explained by several factors facilitating the establishment of bryophyte species, namely the variety of microhabitats, the topographic mosaic (elevational gradient), and the high water availability (high levels of rainfall and humidity) (Costa & Peralta 2015; Batista & Santos 2016). This domain presents the greatest richness not only of mosses and of bryophytes in general, but also of angiosperms, ferns, and some animal groups (Myers et al. 2000; Stehmann et al. 2009; Werneck et al. 2011; Both et al. 2014; Costa & Peralta 2015; Oliveira et al. 2017), and is considered one of the conservation hotspots in the world (Mittermeier et al. 2004).
We found that phytogeographic domains like the Caatinga are areas with a lower richness, even though they have a low sampling. The conditions that favour high richness, such as high humidity (Glime 2007), are not present in the Caatinga, which presents a vegetation that varies from an open thorny scrubland to low dry forests, conditioned by a prevailing semiarid climate (Fiaschi & Pirani 2009). Likewise, the predominance of open grassy formations (Fiaschi & Pirani 2009) in Campos sulinos does not contribute to a greater number of moss species, since mosses usually prefer substrates such as tree trunks and more elevated areas (Rydin 2009). The Amazon domain has all the necessary conditions for a greater species richness than what is observed (Fiaschi & Pirani 2009); however, the richness found there was not high for most cells. Costa & Peralta (2015) observed that, in Brazil, the Amazon forest has less than 50% of the number of species found in the Atlantic Forest, which we also observed in well-documented areas (e.g. northwest of the Amazon domain, or in the areas surrounding Manaus). Considering the microhabitats and the availability of water, we can say that the Amazon presents conditions similar to those of the Atlantic Forest, but it does not have the same latitudinal (3° to 30°) and elevational gradients, the latter due to the Serra do Mar, Serra Geral, and Serra da Mantiqueira. Thus, it contains less habitat heterogeneity and, consequently, less richness (Rydin 2009). In the Amazon domain, the variation in moss richness in different phytophysiognomies has already been reported (Moraes & Lisboa 2006) and it has been shown that the greatest richness occurs in areas of terra firme forest or with less anthropic impact (Lisboa et al. 1999; Moraes & Lisboa 2006). Even in areas where species diversity was considered high, endemism was low. Therefore, cells with no or a low number of occurrences are very likely to be undersampled. In addition, according to BFG (2018), this domain has many collection gaps, which makes it difficult to ascertain the true richness of species in the area.
The mosaic of phytophysiognomies present within the Cerrado domain (Batalha 2011; Fiaschi & Pirani 2009) shows a great potential for a high richness of mosses, as it includes several fragments of seasonal forests, formations in which larger trees predominate, whose crowns form a canopy and whose main growth patterns are also associated with wet seasons (Batalha 2011). The Cerrado seems to have a great potential for study, as it is the only domain other than the Atlantic Forest that presented maximum richness. In 2009, the addition of collected specimens and taxonomic studies corroborated these data, which culminated in the publication of Flora do Distrito Federal (Cavalcante & Amaral-Lopes 2017). However, areas with greater richness are also those with the highest numbers of records. These areas also generally correspond to the focus areas of the herbaria that house the largest bryophyte collections and the largest research centres, such as the Herbarium of the University of Brasília (UB), the Botany Institute of São Paulo (SP), and the Rio de Janeiro Botanical Garden (RB), together with the presence of researchers specialized in bryophyte taxonomy. There is strong evidence for sampling bias, the so-called "museum effect" (Ponder et al. 2001), which implies greater numbers of species in areas close to specialized institutions. This sampling bias occurs for different groups and in different parts of the world: for birds on the European and Asian continents (Boakes et al. 2010), for terrestrial invertebrates in Australia (Ponder et al. 2001), for Coleoptera in Spain (Hortal et al. 2008), and for spiders and angiosperms in the Atlantic Forest (Werneck et al. 2011; Oliveira et al. 2017).

The museum effect constitutes a problem for studies on biodiversity distribution, but an opportunity for knowledge
Taxonomic surveys are uncoordinated, given the absence of consensus on floristic research guidelines, and tend to repeatedly examine ("taxonomic insistence") some localities and landscape types that have previously been recognized as areas with high species richness, creating distortions in data (Soberón & Peterson 2004; Moerman & Estabrook 2006; Sastre & Lobo 2009). The oversampling near institutions occurs for reasons of efficiency, logistics, and convenience (Moerman & Estabrook 2006; Sastre & Lobo 2009). These distortions cause precision problems in the geographical representations of species richness generated from the taxonomic and distributional information compiled individually by taxonomists (Sastre & Lobo 2009; Oliveira et al. 2016, 2018). Studies on bryophytes tend to be directed to areas with high humidity, which allow for the establishment of a greater number of species (Glime 2007), as well as to mountain areas, which present different environmental gradients and therefore allow for the establishment of different microclimates and a greater richness of bryophytes (Batista & Santos 2016; Amorim et al. 2017; Santos et al. 2017). Higher richness values are additionally related to areas that have some type of protection, such as conservation units of different categories. This is justified by the already mentioned factors of high humidity and different environmental gradients, and mainly because these are areas with a greater degree of conservation, presenting less anthropic disturbance and the possibility of finding a more intact flora, even with low efficiency (Silva et al. 2014).
Our results allowed us to locate the areas with the greatest sampling deficiencies, thus showing how to reduce the Linnean and Wallacean shortfalls. The strong influence of the Linnean shortfall makes the actual distribution of moss species in the country unclear, especially for those that are endemic or undersampled, thereby showing evidence of the Wallacean shortfall. Thus, moss richness in Brazil is also influenced by sampling bias, which may lead to potential problems for ecological studies. The Linnean shortfall for mosses occurs for two reasons: there are areas that were not sampled or do not have their herbarium records in an online database, and/or the number of researchers in relation to the area of Brazil is low, suggesting that some species may not have been studied or even collected yet (Hortal et al. 2015). Riddle et al. (2011) found that the body size of the individuals is often correlated with the rate of new species discovery, and mosses usually do not exceed 10 cm (Glime 2007), showing that smaller organisms tend to be more susceptible to the Linnean shortfall (Hortal et al. 2015). Moss sampling was low in most of Brazil, which can be solved by reducing and/or correcting the effects mentioned. With the recent budget cuts for research in Brazil (Angelo 2016, 2017), the pursuit of biodiversity knowledge, including that of mosses, may become difficult due to the low number of researchers and the lack of funding from other sources. In addition to the changes in the Forest Code (Meira et al. 2016), deforestation and mining (Roriz et al. 2017) can generate irreparable losses. In this way, sampling deficiency can be rectified by enhancing the sampling effort in the "very low richness" areas. In the study conducted with five Metzgeria Raddi species (liverworts), only 49% of the potential distribution range was covered by forests (Barros et al. 2012), demonstrating the high sensitivity and fragility of bryophyte habitats. Of this percentage, not all fragments with high suitability have records, and they are considered forests with the potential to harbour still unexplored taxa (Barros et al. 2012). Regarding angiosperms in the Atlantic Forest, the most representative inventories should be expanded to poorly sampled areas (Werneck et al. 2011).
Areas with low data availability, such as the southern centre of the state of Pará and southeastern Amazon, both in the Amazon domain, are regions that require the installation of research centres. Fieldwork in these areas is relatively costly due to their great extent and the difficulty of accessing sampling points. Thus, the installation of research centres in these areas would increase the number of records, thereby increasing our knowledge of the local flora (Moerman & Estabrook 2006). The lower richness for mosses in certain areas, such as the transition zone between the Amazon and the Cerrado in northern Mato Grosso and Tocantins or the Caatinga domain, is not because of low sampling effort, but mainly because of the lack of ideal environmental conditions for these species, such as high humidity (Glime 2007; Fiaschi & Pirani 2009).
According to Costa & Peralta (2015), with the exception of the state of Piauí, all Brazilian states showed an increase in the number of bryophyte species when compared to the richness observed five years earlier (Costa & Luizi-Ponzo 2010). A recent publication of the BFG (2018) summarized the projects that have substantially increased the knowledge of the Brazilian flora in recent years, mainly highlighting the financed projects "Brazilian species list", "Reflora", and "Flora do Brasil 2020". That work also highlights the programs created for biodiversity knowledge: the Biodiversity Research Program, the Taxonomy Training Program, the National Institutes of Science and Technology, the National Biodiversity Research System, the Brazilian Biodiversity Information System, and the Taxonomic Catalogue of Brazilian Fauna. Three years after the publication of the "Brazilian species list", there was a significant increase in knowledge for some important groups of terrestrial plants, such as the addition of 470 species of angiosperms, 77 species of ferns and lycophytes, and 44 species of bryophytes (BFG 2018). This shows that an increase in the number of trained specialists, scientific studies, and funding for such projects improves biodiversity knowledge. In other words, this increase is the beginning of the knowledge about the true richness of the Brazilian flora. We believe that floristic studies can be optimized with coordinated taxonomic research (Sastre & Lobo 2009), and that our knowledge of the actual distribution of moss species will then significantly increase. Thus, the proposals mentioned above are important to reduce the Wallacean shortfall.
Moss richness in Brazil reflects varying sampling intensities in different regions, with areas of the Atlantic Forest presenting greater richness values and also greater sampling effort, as does the southeastern region of the country. Moss richness is strongly influenced by sampling bias, the so-called "museum effect", in which areas that present greater richness values are also those with higher numbers of records. In this sense, the establishment of research centres and/or researchers in undersampled areas would increase the number of records, thereby increasing our knowledge of the local flora. More moss data, with greater availability and improved quality, are required in online databases. If taxonomic research is coordinated, floristic studies will tend to be optimized and to approximate the actual distribution of moss species.

SUPPLEMENTARY FILE
Supplementary file 1 – Database with the 26 690 records of 969 moss species in Brazil with indication of occurrence in the Brazilian states and the phytogeographic domains. https://doi.org/10.5091/plecevo.2021.1635.2427