Data Mining for Global Trends in Mountain Biodiversity

Mountains have always been a favorite environment for biodiversity research because their steep environmental gradients and isolation have turned them into evolutionary laboratories. As such, they provide perfect locations to study environmental controls of elevational and latitudinal biodiversity gradients if suitable data sets are available. Recent advances in bioinformatics, including the development of powerful data mining techniques and large digital biodiversity and other environmental data sets, have triggered a new wave of biodiversity research at large spatial and temporal scales.
In brief, data mining techniques are used to uncover patterns in data, for example with the aim of testing ecological and evolutionary hypotheses. Although this is a rapidly expanding research field, few books deal specifically with the application of data mining techniques in biodiversity research, and up to now there has been no such book focusing exclusively on mountain environments. This book, the third publication in the Global Mountain Biodiversity Assessment (GMBA) series with CRC Press, seeks to fill this gap and stimulate the creative use of georeferenced biodiversity data “to answer old questions with new tools” by “highlighting the scientific power of nine biological databases for furthering ecological and evolutionary theory related to mountain biota” (p vii).
Over 70 authors have contributed to the book, which is largely an outcome of two GMBA workshops held in 2006 and 2007. The book opens with a brief preface by the editors, which sheds light on the context and scope of the book. With the exception of Chapter 16, the synthesis of the book, the following chapters are essentially individual papers with few, if any, cross-references. As there is no introduction that gives guidance on the content of the book, we recommend starting with Chapter 16, which provides some of the structure and synthesis that a reader might otherwise be missing. Chapter 1 then demonstrates well the role and power of geophysical information systems for exploring and explaining mountain biodiversity. Chapters 2-4 discuss general issues such as the availability and limitations of primary biodiversity data, the need to check the completeness and quality of such data before they can be used in analyses, and the paramount importance of high-quality metadata.
Chapters 5-15 are essentially individual case studies: five from Europe, two each from China and North America, and one from South America. Together, these case studies cover all levels of biodiversity, from genes to species and ecosystems; most analyze biodiversity patterns and processes along elevational gradients. The geographically most comprehensive case study (Chapter 8), for example, compares elevational gradients of regional and local species richness based on data sets from four continents; one of its key conclusions is that data mining of regional archive data provides different insights into biodiversity patterns and processes than local field data, and that both approaches are thus complementary and cannot replace each other.
Highly relevant from a practical perspective are Chapters 12 and 14, which show how georeferenced databases can act as management tools for biodiversity conservation and protected area management, and Chapters 13-15, which demonstrate how data mining techniques can be used to study climate change effects on mountain biota. Chapter 17, previously published in Mountain Research and Development (vol 27, no. 3, pp 276-281), is somewhat strangely placed at the end of the book, as the GMBA's research agenda contained therein would have provided an excellent framework early on in the book, for example, for the selection and sequence of individual case studies.
Unfortunately, given the price of the book, there are some shortcomings. The book is entirely illustrated in black and white, and while this is sufficient for most figures, it renders some illegible (eg Figure 6.1). The book includes neither a list of tables and figures nor a consolidated reference list, and the index is not very useful or well structured. It seems to largely replicate the structure of individual chapters, so that for example the terms “data analysis” or “data collection” do not have their own entries as one would expect; instead they are listed several times under the shortened titles of individual chapters.
Overall, the book provides a rich resource of valuable information and stimulation for those who are willing to dig into the detail of the individual chapters. As a whole, it demonstrates well how data mining techniques can complement, but not necessarily replace, expensive experiments, thus furthering ecological and evolutionary theory. However, it is not an easily accessible reference book for data mining novices because too much of the valuable information on approaches, limitations, and lessons learned is scattered across the individual chapters. We suspect that this book might not attract as broad an audience as the first publication in the GMBA series (Körner and Spehn 2002), for example, but it will certainly appeal to a select audience including mountain researchers and macroecologists.
With a view to the future, the book points the reader to the Mountain Biodiversity Portal (http://www.mountainbiodiversity.org) that has just been launched by the GMBA and the Global Biodiversity Information Facility (GBIF). This tool has the potential to greatly facilitate access to mountain biodiversity data because it allows users to find GBIF data for specific elevational and thermal belts within their region of interest. A very similar tool already allows users of the World Database on Protected Areas (http://www.wdpa.org) to find GBIF data for a protected area of interest. Thanks to these collaborative efforts, researchers will increasingly get the data they require without the need to carry out time-consuming overlays of species and other data sets for their region of interest. The GMBA/GBIF Mountain Biodiversity Portal is a fine example of the technical possibilities of our time and will certainly help to further stimulate the creative use of georeferenced biodiversity data promoted by this book.
MountainMedia. Mountain Research and Development (MRD): an international, peer-reviewed open access journal published by the International Mountain Society (IMS), www.mrd-journal.org