Literature records in MZNA-LIT: primary biodiversity records in environmental assessments in Spain

Occurrence Observation
Latest version published by University of Navarra – Department of Environmental Biology on Nov 19, 2024

Download the latest version of this resource data as a Darwin Core Archive (DwC-A) or the resource metadata as EML or RTF:

Data as a DwC-A file download 1,263 records in English (89 KB) - Update frequency: as needed
Metadata as an EML file download in English (37 KB)
Metadata as an RTF file download in English (19 KB)

Description

The data set comprises primary biodiversity records (PBR) on protected species identified during environmental assessments (EA) in Spain from 2013 to 2023. The data were extracted from Records of Decision (RODs) published in the Spanish Official State Gazette, focusing on species listed under the Spanish Catalogue of Threatened Species and the List of Wild Species under Special Protection Regime. These EA-related data are dark data: they remain locked in Records of Decision and are rarely accessible, which limits their use for other conservation purposes. Through automated data extraction and manual verification, this data set offers standardized and georeferenced EA-related dark data for future conservation planning and decision-making.

Data Records

The data in this occurrence resource has been published as a Darwin Core Archive (DwC-A), which is a standardized format for sharing biodiversity data as a set of one or more data tables. The core data table contains 1,263 records.

One extension data table also exists. An extension record supplies extra information about a core record. The number of records in each data table is shown below.

Occurrence (core): 1,263
Reference: 1,263
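
As a rough illustration of the archive layout described above, the sketch below reads the core table of a Darwin Core Archive using only the Python standard library. It assumes the usual DwC-A layout (a `meta.xml` descriptor naming the core file), a single header row, and unquoted fields; the function name `read_core_rows` is hypothetical and not part of any GBIF tool.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the Darwin Core text descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core_rows(archive):
    """Return the core table of a Darwin Core Archive as a list of dicts.

    `archive` is a path or binary file-like object for the DwC-A zip.
    Assumption: the core file has one header row and no field quoting.
    """
    with zipfile.ZipFile(archive) as zf:
        # meta.xml says which file holds the core table and how it is delimited.
        meta = ET.fromstring(zf.read("meta.xml"))
        core = meta.find(f"{DWC_TEXT_NS}core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        # The delimiter is stored as an escaped literal such as "\t".
        raw_delimiter = core.get("fieldsTerminatedBy", ",")
        delimiter = raw_delimiter.encode().decode("unicode_escape")
        encoding = core.get("encoding", "UTF-8")
        with zf.open(location) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            reader = csv.DictReader(
                text, delimiter=delimiter, quoting=csv.QUOTE_NONE
            )
            return list(reader)
```

Calling `read_core_rows` on the downloaded archive should yield one dictionary per occurrence record, keyed by the column names in the header row. The extension table could be read the same way by looking up the `extension` element instead of `core`.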

This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.

Versions

The table below shows only published versions of the resource that are publicly accessible.

How to cite

Researchers should cite this work as follows:

MZNA (2024). Literature records in MZNA-LIT: primary biodiversity records in environmental assessments in Spain. v1.2. University of Navarra, Museum of Zoology. Occurrence dataset. https://doi.org/10.15470/bvznpy

Rights

Researchers should respect the following rights statement:

The publisher and rights holder of this work is University of Navarra – Department of Environmental Biology. This work is licensed under a Creative Commons Attribution (CC-BY 4.0) License.

GBIF Registration

This resource has been registered with GBIF, and assigned the following GBIF UUID: 3d6fbb6d-8699-4f93-9adc-c1fd6bc03f4e. University of Navarra – Department of Environmental Biology publishes this resource, and is itself registered in GBIF as a data publisher endorsed by GBIF Spain.
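
The UUID above can be used to look up the dataset programmatically. The following minimal sketch only validates the key and builds a URL; it assumes the public GBIF API exposes dataset registry entries under `https://api.gbif.org/v1/dataset/{key}`, and the helper name `gbif_dataset_url` is hypothetical.

```python
import uuid

# Dataset key as stated in the registration text above.
GBIF_DATASET_KEY = "3d6fbb6d-8699-4f93-9adc-c1fd6bc03f4e"


def gbif_dataset_url(dataset_key):
    """Build the (assumed) GBIF API URL for a dataset's registry entry."""
    # uuid.UUID raises ValueError if the key is not a well-formed UUID.
    key = str(uuid.UUID(dataset_key))
    return f"https://api.gbif.org/v1/dataset/{key}"


print(gbif_dataset_url(GBIF_DATASET_KEY))
```

Validating the key first means a truncated or mistyped identifier fails locally instead of producing a request for a nonexistent dataset.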

Keywords

Occurrence; Observation; Environmental Assessment; Protected species; Public archives; Dark Data

Contacts

Maite Telletxea
  • Metadata Provider
  • Originator
  • Point Of Contact
  • PhD student
University of Navarra
31008 Pamplona
Navarra
ES

MZNA Museum of Zoology
  • Originator
  • Institution
University of Navarra
31008 Pamplona
Navarra
ES

David Galicia
  • Curator
University of Navarra
31008 Pamplona
Navarra
ES

Rafael Miranda
  • Author
University of Navarra
31008 Pamplona
Navarra
ES

Arturo H. Ariño
  • Custodian Steward
University of Navarra
31008 Pamplona
Navarra
ES

Ángel Chaves
  • Curator
University of Navarra
31008 Pamplona
Navarra
ES

Ana Amézcua
  • Curator
University of Navarra
31008 Pamplona
Navarra
ES

María Imas
  • Curator
University of Navarra
31008 Pamplona
Navarra
ES

Geographic Coverage

The data set primarily comprises occurrence records from Peninsular Spain (99.84%). The remaining two records come from the Balearic and Canary Islands.

Bounding Coordinates South West [28.951, -13.61], North East [43.659, 2.729]
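
The bounding coordinates can serve as a quick sanity check on record coordinates. The sketch below assumes the pairs are given in [latitude, longitude] order in decimal degrees; the function name and the sample points are illustrative only.

```python
# Bounding coordinates as given above, in (latitude, longitude) order.
SOUTH_WEST = (28.951, -13.61)
NORTH_EAST = (43.659, 2.729)


def in_bounding_box(lat, lon, sw=SOUTH_WEST, ne=NORTH_EAST):
    """Return True if the point lies inside, or on the edge of, the box."""
    return sw[0] <= lat <= ne[0] and sw[1] <= lon <= ne[1]


# Illustrative points: one in northern Spain, one well north of the box.
print(in_bounding_box(42.8, -1.6))
print(in_bounding_box(48.85, 2.35))
```

A bounding box is a coarse filter: it also covers sea areas and parts of neighbouring countries, so passing this check does not by itself confirm that a record lies in Spain.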

Taxonomic Coverage

The data set comprises records of 59 species corresponding to five classes, 16 orders, and 23 families. The species correspond to 31 non-Chiroptera threatened species listed in the Spanish Catalogue of Threatened Species (11 endangered and 20 vulnerable) and 28 Chiroptera species (one endangered, 11 vulnerable, and 16 listed in the List of Wild Species under Special Protection Regime).

Species Aegypius monachus (Buitre negro), Aphanius iberus (Fartet), Aquila adalberti (Águila imperial ibérica), Aquila fasciata (Águila perdicera), Ardeola ralloides (Garcilla cangrejera), Aythya nyroca (Porrón pardo), Barbastella barbastellus (Murciélago de bosque), Botaurus stellaris (Avetoro común), Charadrius alexandrinus (Chorlitejo patinegro), Charadrius morinellus (Chorlito carambolo), Chersophilus duponti (Alondra de Dupont o Ricotí), Chioglossa lusitanica (Salamandra rabilarga), Ciconia nigra (Cigüeña negra), Circus pygargus (Aguilucho cenizo), Emys orbicularis (Galápago europeo), Eptesicus isabellinus (Murciélago hortelano mediterráneo), Eptesicus serotinus (Murciélago hortelano), Erythropygia galactotes (Alzacola), Fulica cristata (Focha moruna), Gypaetus barbatus (Quebrantahuesos), Hypsugo savii (Murciélago montañero), Larus audouinii (Gaviota de Audouin), Marmaronetta angustirostris (Cerceta pardilla), Microtus cabrerae (Topillo de Cabrera), Milvus milvus (Milano real), Miniopterus schreibersii (Murciélago de cueva), Myotis alcathoe (Murciélago ratonero bigotudo pequeño), Myotis bechsteinii (Murciélago ratonero forestal), Myotis blythii (Murciélago ratonero mediano), Myotis capaccinii (Murciélago ratonero patudo), Myotis daubentonii (Murciélago ratonero ribereño), Myotis emarginatus (Murciélago ratonero pardo), Myotis myotis (Murciélago ratonero grande), Myotis mystacinus (Murciélago ratonero bigotudo), Myotis nattereri (Murciélago de Natterer), Nyctalus lasiopterus (Nóctulo grande), Nyctalus leisleri (Nóctulo pequeño), Nyctalus noctula (Nóctulo mediano), Oxyura leucocephala (Malvasía cabeciblanca), Pandion haliaetus (Águila pescadora), Phalacrocorax aristotelis (Cormorán moñudo), Phoenicurus phoenicurus (Colirrojo real), Pipistrellus kuhlii (Murciélago de borde claro), Pipistrellus nathusii (Murciélago de Nathusius), Pipistrellus pipistrellus (Murciélago enano), Pipistrellus pygmaeus (Murciélago de Cabrera), Plecotus auritus (Murciélago orejudo dorado), Plecotus austriacus (Murciélago orejudo gris), Pterocles alchata (Ganga común), Pterocles orientalis (Ganga ortega), Rana pyrenaica (Rana pirenaica), Rhinolophus euryale (Murciélago mediterráneo de herradura), Rhinolophus ferrumequinum (Murciélago grande de herradura), Rhinolophus hipposideros (Murciélago pequeño de herradura), Rhinolophus mehelyi (Murciélago mediano de herradura), Testudo graeca (Tortuga mora), Tetrax tetrax (Sisón común)

Temporal Coverage

Formation Period 07/2012-01/2023

Project Data

This thesis aims to enhance the efficiency of biodiversity data management to improve conservation efforts. It examines the dark data generated from environmental management-related activities, which often remain underused due to accessibility challenges. By identifying barriers to data flow, assessing the impact of data mobilization on national biodiversity understanding, and developing improved data protocols, the project seeks to make critical biodiversity information more accessible and usable. The ultimate goal is to ensure that high-quality biodiversity data are available for informed decision-making and effective conservation planning, following FAIR data principles (Findable, Accessible, Interoperable, Reusable).

Title DATA for BiodivERsity Governance: looking for the efficiency of biodiversity data management for conservation (DATABerG).

The personnel involved in the project:

Maite Telletxea Martínez
Rafael Miranda Ferreiro
Arturo H. Ariño Plana
David Galicia Paredes

Sampling Methods

We searched environmental Records of Decision (RODs) in the Official State Gazette (https://www.boe.es/) to identify pronouncements containing biodiversity data. We processed these reports and automatically detected species citations. The fieldwork-based records among them, known as Primary Biodiversity Records (PBRs), constitute this published data set.

Study Extent The data set contains species records from 232 Spanish localities or municipalities where environmental assessments have been conducted. In 90% of cases, these are locations suitable for installing a photovoltaic solar plant or a wind farm. Spain is located in southwestern Europe and includes most of the Iberian Peninsula, the Balearic Islands, the Canary Islands, and five small areas in North Africa. Owing to its geographical position, varied topography, and the influence of different climates, Spain spans four biogeographical regions: Mediterranean, Atlantic, Alpine, and Macaronesian.

Quality Control The performance of automatic biodiversity data detection was assessed by calculating precision and recall (Kohavi & Provost, 1998; Fahmy Amin, 2022) based on correctly detected, incorrectly detected, and undetected records. Precision, the proportion of detections that were correct, was 0.937, while recall, the proportion of all records that were detected, was 0.948. False positives, such as species names within organization titles, lowered precision, while misspelled or incomplete species names lowered recall. Despite these issues, the detection system performed well overall. Georeferencing uncertainty was evaluated following Marcer et al. (2020).
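
For reference, precision and recall follow from the three counts named above: correctly detected records (true positives), incorrectly detected records (false positives), and undetected records (false negatives). The sketch below uses the standard definitions; the counts in the usage line are made up for illustration and are not the counts behind the reported 0.937 and 0.948.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of detections that were correct.
    recall    = TP / (TP + FN): share of real records that were detected.
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    if detected == 0 or actual == 0:
        raise ValueError("precision or recall is undefined for these counts")
    return true_positives / detected, true_positives / actual


# Illustrative counts only (not taken from the data set).
precision, recall = precision_recall(8, 2, 2)
print(precision, recall)
```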

Method step description:

  1. Environmental Records of Decision (RODs) were collected from the Official State Gazette by searching for “evaluación ambiental” (“environmental assessment”) in the “Other provisions” database. Using the Octoparse data scraper (Octoparse, n.d.), we extracted the content of these pronouncements as text strings for further analysis.
  2. Species citations were automatically detected using RStudio (R Core Team, 2022), considering the scientific names, common names, and possible synonyms of the species included in the Spanish Catalogue of Threatened Species (CEEA) and the List of Wild Species under Special Protection Regime (LESRPE).
  3. These species records were manually reviewed and categorized into three types: Primary Biodiversity Record (PBR, based on fieldwork), absence (species not recorded despite fieldwork), and literature-based.
  4. PBRs were georeferenced a posteriori using Google Maps (https://www.google.com/maps), and their uncertainty was calculated following best practice guidelines (Chapman & Wieczorek, 2020; Marcer et al., 2020).
  5. The data were incorporated into the MZNA database (Zootron v4.5; Ariño, 1991) and standardized following the Darwin Core Standard (Darwin Core Maintenance Group, 2023), resulting in a database with 32 fields.
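
The automatic detection of species citations can be pictured as dictionary matching over the extracted text. The sketch below is not the authors' implementation (the source reports that detection was done in R); it is a minimal Python illustration in which the lookup table `NAME_INDEX` and the function `detect_species` are hypothetical, and the table holds only two species instead of the full CEEA and LESRPE lists.

```python
import re

# Hypothetical lookup: each scientific name, common name, or synonym maps to
# the accepted scientific name. The real lists are far longer.
NAME_INDEX = {
    "Milvus milvus": "Milvus milvus",
    "milano real": "Milvus milvus",
    "Tetrax tetrax": "Tetrax tetrax",
    "sisón común": "Tetrax tetrax",
}


def detect_species(text, name_index=NAME_INDEX):
    """Return (accepted name, matched string, offset) for each citation found."""
    # Try longer names first so a long name wins over a shorter one it contains.
    names = sorted(name_index, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(n) for n in names) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {name.casefold(): accepted for name, accepted in name_index.items()}
    return [
        (lookup[m.group(1).casefold()], m.group(1), m.start())
        for m in pattern.finditer(text)
    ]


sample = "Se detectó milano real (Milvus milvus) y sisón común en la zona."
for accepted, matched, offset in detect_species(sample):
    print(accepted, "|", matched, "|", offset)
```

Exact matching of this kind illustrates both error sources reported under Quality Control: a species name inside an organization title still matches (a false positive), and a misspelled or incomplete name does not match at all (a missed record). That is why each detection was reviewed manually before classification.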

Bibliographic Citations

  1. Kohavi, R. & Provost, F. (1998). Glossary of terms. Machine Learning, 30: 271–274. https://doi.org/10.1023/a:1017181826899
  2. Fahmy Amin, M. (2022). Confusion Matrix in Binary Classification Problems: A Step-by-Step Tutorial. Journal of Engineering Research, 6(5). https://doi.org/10.21608/erjeng.2022.274526
  3. Marcer, A., Haston, E., Groom, Q., Ariño, A., Chapman, A., Bakken, T., Braun, P., Dillen, M., Ernst, M., Escobar, A., Fichtmüller, D., Livermore, L., Nicolson, N., Paragamian, K., Paul, D., Pettersson, L., Phillips, S., Plummer, J., Rainer, H., Rey, I., Robertson, T., Röpert, D., Santos, J., Uribe, F., Waller, J., Wieczorek, J. (2020). Quality issues in georeferencing: From physical collections to digital data repositories for ecological research. Diversity and Distributions, 27(3): 564–567. https://doi.org/10.1111/ddi.13208
  4. Octoparse. (n.d.). Web scraping tool & free web crawlers. https://www.octoparse.com/
  5. R Core Team. (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org
  6. Chapman, A. & Wieczorek, J. (2020). Georeferencing Best Practices. GBIF Secretariat, Copenhagen. https://doi.org/10.15468/doc-gg7h-s853
  7. Ariño, A. H. (1991). Bibliography of Iberian Polychaetes: a data base. Ophelia, suppl. 5: 647–652. https://doi.org/10.1163/9789004629745_068
  8. Darwin Core Maintenance Group. (2023). Darwin Core List of Terms. Biodiversity Information Standards (TDWG). http://rs.tdwg.org/dwc/doc/list/2023-09-18

Additional Metadata

Purpose The aim of the present data set is to make the dark data generated during environmental assessments FAIR (Findable, Accessible, Interoperable, and Reusable). Publishing these data in a publicly accessible platform creates an opportunity for their potential reuse in future conservation decisions, ensuring that these decisions are based on the best available evidence.
Alternative Identifiers 10.15470/bvznpy
3d6fbb6d-8699-4f93-9adc-c1fd6bc03f4e
https://ipt.gbif.es/resource?r=mzna-lit