The Finnish Biodiversity Information Facility (FinBIF) is an open-access data repository for researchers, government, and the public. It brings together Finland’s biodiversity collections and datasets into a single source. FinBIF currently holds more than 50 million species observations, 3 million images, 40,000 taxa, and 500 datasets. Our online portal allows you to browse, search, and download information about all forms of biological life, and to record and share your own observations. FinBIF is dedicated to making biodiversity data openly accessible and widely usable.
Use Case Status Before Joining EOSC Data Commons
FinBIF hosts over 500 datasets, but what counts as a dataset depends on the data owner: some divide their data into small units, while others group large amounts into one. Although all datasets relate to biodiversity, they differ significantly in collection methods, scope, reliability, and accompanying information. Metadata quality also varies.
Researchers who did not collect the data often struggle to identify which datasets suit their needs or how to find relevant ones. They face several challenges in understanding and using available data. For example:
- Dataset structure varies widely, including how a “record” is defined. FinBIF’s largest dataset has 12 million records, while other datasets may hold all their data in a single entry.
- International users may expect data collection methods that differ from those actually used.
- FinBIF primarily shares raw occurrence data. Many users, however, expect cleaned and validated datasets, with duplicates removed and spatial accuracy improved.
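The kind of cleaning that such users expect can be sketched as follows. This is a minimal illustration in Python, not FinBIF's own processing: the record fields (`taxon`, `lat`, `lon`, `date`, `uncertainty_m`), the sample values, and the 1000 m threshold are all assumptions made for the example, and filtering out imprecise coordinates is only a simple stand-in for improving spatial accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    # Hypothetical fields; real occurrence records carry many more attributes.
    taxon: str
    lat: float
    lon: float
    date: str             # ISO 8601 date, e.g. "2023-06-01"
    uncertainty_m: float  # coordinate uncertainty in metres


def clean(records, max_uncertainty_m=1000.0):
    """Drop spatially imprecise records and duplicates, preserving order."""
    seen = set()
    kept = []
    for rec in records:
        if rec.uncertainty_m > max_uncertainty_m:
            continue  # too imprecise for the assumed threshold
        # Treat same taxon, same rounded location, and same day as a duplicate.
        key = (rec.taxon, round(rec.lat, 4), round(rec.lon, 4), rec.date)
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Invented sample data for illustration only.
raw = [
    Occurrence("Parus major", 60.1699, 24.9384, "2023-06-01", 50.0),
    Occurrence("Parus major", 60.1699, 24.9384, "2023-06-01", 50.0),  # duplicate
    Occurrence("Pica pica", 61.4978, 23.7610, "2023-06-02", 5000.0),  # imprecise
    Occurrence("Pica pica", 61.4978, 23.7610, "2023-06-03", 10.0),
]
cleaned = clean(raw)
```

Because raw data is shared as-is, a step like this would typically be run by the user, who can choose a duplicate rule and precision threshold that suit their own analysis.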
Objectives in the Project
- Help biodiversity researchers discover relevant datasets and understand and use them effectively;
- Assess and improve the level of FAIRness of FinBIF metadata;
- Use an AI-based search tool that lets users find and browse open and restricted datasets without specialist knowledge;
- Provide better and more comprehensive machine-readable metadata.
Integration with EOSC Data Commons Services and Components
We will supply our metadata through modern APIs and enrich it with contextual information, enabling the EOSC Matchmaker to handle heterogeneous metadata and answer natural-language queries.
We will use the FAIR Assessment Toolkit to evaluate our datasets systematically. This requires interoperability with our APIs and transparent standards for assessing the quality of identifiers and metadata.
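A systematic evaluation of this kind amounts to running the same checks against every dataset's metadata. The sketch below is not the FAIR Assessment Toolkit and does not reflect its metrics; it is a hypothetical illustration in Python, and the metadata field names, the checks, and the example record are all assumptions.

```python
import re

# Hypothetical pattern: accept an HTTP(S) URI or a "doi:" identifier.
URI_OR_DOI = re.compile(r"^(https?://\S+|doi:10\.\d{4,9}/\S+)$")

# Hypothetical checks, loosely inspired by the FAIR principles.
# A real assessment tool defines its own metrics and scoring.
CHECKS = {
    "has_uri_identifier": lambda m: bool(URI_OR_DOI.match(m.get("identifier") or "")),
    "has_title_and_description": lambda m: bool(m.get("title")) and bool(m.get("description")),
    "has_licence": lambda m: bool(m.get("licence")),
    "has_collection_method": lambda m: bool(m.get("collection_method")),
}


def assess(metadata):
    """Run every check and return (per-check results, fraction passed)."""
    results = {name: check(metadata) for name, check in CHECKS.items()}
    score = sum(results.values()) / len(results)
    return results, score


# Invented metadata record for illustration only.
example = {
    "identifier": "https://example.org/dataset/123",
    "title": "Example bird survey",
    "description": "Point counts from an invented survey.",
    "licence": "",
    "collection_method": "point count",
}
results, score = assess(example)
```

Keeping the checks in a single table makes the assessment criteria explicit and easy to review, which is the kind of transparency the paragraph above calls for.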
Technical Integration Plan
Since FinBIF does not provide an OAI-PMH endpoint, during the PoC it will closely follow the technical discussions and developments so that it is ready for integration in the next release. FinBIF will also make initial FAIR assessments of selected datasets.
The project will publish the first release of the EOSC Matchmaker, and FinBIF will start integrating its APIs.
FinBIF is integrated into the EOSC Matchmaker. By that point, FAIRness and metadata harmonisation are achieved across all datasets.

