Description
Biodiversity data is data about species recorded in space and time. Such data is useful for, among other things, modelling future species distributions and the impact of climate change. Many natural science studies collect biodiversity data incidentally and in varying shapes and forms, so it is often challenging to reshape the data into a standardised form.
Type of data/experiments/methods
All publishable biodiversity data should follow the Darwin Core data standard. Data in spreadsheets or databases can be converted to the Darwin Core standard using the IPT (Integrated Publishing Toolkit). The standard supports the following data structures:
- Occurrence data
- Sampling event data
- Checklist data
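As a rough illustration of what reshaping into an occurrence-style table can look like, the sketch below renames the columns of a field sheet to Darwin Core terms and serialises the result as tab-separated text. The source column names (`record_id`, `species`, `date`, `lat`, `lon`) and the example row are hypothetical; only the Darwin Core term names on the right-hand side of the mapping come from the standard. In practice the IPT handles this mapping through its own interface, so this is a sketch under those assumptions, not the IPT workflow itself.

```python
import csv
import io

# Hypothetical field-sheet column names (keys) mapped to Darwin Core terms (values).
COLUMN_MAP = {
    "record_id": "occurrenceID",
    "species": "scientificName",
    "date": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}


def to_darwin_core(rows, basis_of_record="HumanObservation"):
    """Rename columns to Darwin Core terms and add a constant basisOfRecord."""
    converted = []
    for row in rows:
        record = {dwc: row[src] for src, dwc in COLUMN_MAP.items()}
        record["basisOfRecord"] = basis_of_record
        converted.append(record)
    return converted


def write_tsv(records):
    """Serialise the records as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up example row, for illustration only.
field_rows = [
    {"record_id": "obs-001", "species": "Vulpes lagopus",
     "date": "2021-06-14", "lat": "69.65", "lon": "18.96"},
]
occurrences = to_darwin_core(field_rows)
tsv_text = write_tsv(occurrences)
```

Keeping the mapping in a single dictionary makes it easy to review which source column feeds which Darwin Core term before the data is published.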
Metadata Standards
Ecological Metadata Language (EML)
The metadata standard used is EML. To create metadata for your dataset, you fill in a form during the IPT dataset publication process.
Ontologies
EML accepts various vocabularies. Some examples are:
- ECSO (Ecosystem Ontology)
- EnvO (Environment Ontology)
- NCBI Taxonomy
- OBOE (The Extensible Observation Ontology)
- ROR (Research Organization Registry)
Sources for Reusable Data
GBIF
- All published biodiversity data, from different sources, is available through the GBIF portal.
- Data user guidelines
- Identifiers: filtered datasets are provided with unique DOIs for tracking data use.
- It is also possible to use the GBIF API with R or Python to retrieve data.
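As a minimal sketch of retrieving data through the GBIF API from Python, the example below builds a query against the public occurrence search endpoint (`https://api.gbif.org/v1/occurrence/search`) using only the standard library. The helper names `occurrence_search_url` and `fetch_occurrences` are made up for this illustration, and the species and country values are arbitrary examples. The sketch assumes the endpoint's `scientificName`, `country` and `limit` query parameters and a JSON response with a `results` list.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_API = "https://api.gbif.org/v1"


def occurrence_search_url(scientific_name, country=None, limit=20):
    """Build an occurrence search URL (helper name is illustrative)."""
    params = {"scientificName": scientific_name, "limit": limit}
    if country:
        # Two-letter ISO country code, e.g. "NO" for Norway.
        params["country"] = country
    return f"{GBIF_API}/occurrence/search?{urlencode(params)}"


def fetch_occurrences(scientific_name, country=None, limit=20):
    """Fetch one page of occurrence records; requires network access."""
    url = occurrence_search_url(scientific_name, country, limit)
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    # Assumption: matching records are listed under the "results" key.
    return payload["results"]


# Building the URL needs no network access; the values are arbitrary examples.
example_url = occurrence_search_url("Vulpes lagopus", country="NO", limit=5)
```

Calling `fetch_occurrences("Vulpes lagopus", country="NO", limit=5)` would then return a list of record dictionaries. A search like this is suited to small exploratory queries, whereas the DOI-tracked filtered datasets mentioned above are obtained as downloads.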
Storage and Computing
Storage is provided by individual Integrated Publishing Toolkit (IPT) providers.
Data Deposition Repository
The IPT supports the following file formats:
- Plain text formats
- Tab-separated values (TSV)
- Comma-separated values (CSV)
- Open formats
- XLSX and XLS
- SQL databases (e.g. MariaDB, PostgreSQL)
GBIF Norway IPT
- GBIF Norway’s IPT Homepage
- Contact GBIF to get user credentials
- For general questions about data publication, post on the GBIF Norway GitHub issue list, where you can also see questions others have posted.
Useful Links
- In addition to the GBIF Norway HelpDesk on GitHub, there is a global GBIF GitHub issue list for questions about data publication and IPT usage.
- Darwin Core extensions
- Earth BioGenome Project (EBP) Norway
- European Reference Genome Atlas (ERGA)