Type: Data Repository

Finnish Biodiversity Information Facility

Use Case Status Before Joining EOSC Data Commons

FinBIF hosts over 500 datasets, but what counts as a dataset depends on the data owner: some divide their data into small units, while others group large amounts into a single dataset. Although all datasets relate to biodiversity, they differ significantly in collection methods, scope, reliability, and accompanying information. Metadata quality also varies.
Researchers who did not collect the data often struggle to identify which datasets suit their needs or how to find relevant ones. They face several challenges in understanding and using available data. For example:

  • Dataset structure varies widely, including how a “record” is defined. FinBIF’s largest dataset has 12 million records, while others may store all their data in a single entry.
  • International users may expect different data collection methods.
  • FinBIF primarily shares raw occurrence data. Many users, however, expect cleaned and validated datasets, with duplicates removed and spatial accuracy improved.

Objectives in the Project

  • Help biodiversity researchers discover relevant datasets, understand them, and use them effectively;
  • Assess and improve the FAIRness of FinBIF metadata;
  • Use an AI-based search tool that lets users find and browse open and restricted datasets without specialist knowledge;
  • Provide better and more comprehensive machine-readable metadata.

Integration with EOSC Data Commons Services and Components

Expected Results

We will supply our metadata through modern APIs and enrich it with contextual information, enabling the EOSC Matchmaker to handle heterogeneous metadata and answer natural-language queries.
We will use the FAIR Assessment Toolkit to evaluate our datasets systematically. This requires interoperability with our APIs and transparent standards for assessing the quality of identifiers and metadata.
