IMIS


LifeWatch observatory data: phytoplankton annotated trainingset by FlowCam imaging in the Belgian Part of the North Sea
Citation
Decrop, W., Lagaisse, R., Jonas, M., Muyle, J., Amadei Martínez, L., & Deneudt, K. (2024). LifeWatch observatory data: phytoplankton annotated trainingset by FlowCam imaging in the Belgian Part of the North Sea (Version v1). Zenodo. https://doi.org/10.5281/zenodo.10554845. https://marineinfo.org/id/dataset/8645
Contact:

Availability: This dataset is licensed under a Creative Commons Attribution 4.0 International License.

Description

The images were collected in the framework of the Belgian LifeWatch Research Infrastructure. During multidisciplinary campaigns, a number of fixed stations in the Belgian Part of the North Sea (BPNS) are visited monthly (onshore stations) or seasonally (offshore stations). Samples are taken using an Apstein net with a 55 µm mesh size and fixed in Lugol's iodine solution. In the lab, the samples are processed using a FlowCam VS-4 at 4X magnification, targeting a particle size range of 55-300 µm. The images are identified by a convolutional neural network (CNN), followed by a manual validation step. Since May 2017, this dataset has provided microplankton and phytoplankton observations for the BPNS, mainly covering diatoms, dinoflagellates and ciliates.

 

This dataset comprises a training data split of 337,613 images distributed across 95 classes, with each class containing between 100 and 10,000 images. To facilitate model training, the data are organized into a standard split: 80% for training, 10% for validation and 10% for testing. This structure is intended to ensure a balanced representation and to support scientific rigor in subsequent analyses.


Technical details 

Data preprocessing 

Raw FlowCam output data are fully processed using in-house data pipelines; the VisualSpreadsheet software is used only for data acquisition during the lab run of the sample. Raw images and binary images are never saved during the FlowCam run; only the image collages saved at the end of the run are used. Single images are cut from these collages using each image's coordinates, width and height, which are read from the .lst file by in-house Python code. The background of the images is not removed. These images are then classified by the model and annotated in-house at VLIZ.
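The cropping step can be illustrated with a minimal sketch. It is not the in-house VLIZ code: the layout of the .lst file is not described here, so the record keys (x, y, width, height) are hypothetical, and the collage is represented as a plain nested list of grey values so that the example has no dependencies.

```python
from typing import Dict, List

Image = List[List[int]]  # rows of grey-scale pixel values


def crop_particle(collage: Image, x: int, y: int, width: int, height: int) -> Image:
    """Return the width x height sub-image whose top-left corner is (x, y)."""
    if not collage or not collage[0]:
        raise ValueError("collage is empty")
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("coordinates must be non-negative and sizes positive")
    if y + height > len(collage) or x + width > len(collage[0]):
        raise ValueError("crop rectangle falls outside the collage")
    # Slice the rows first, then the columns of each selected row.
    return [row[x:x + width] for row in collage[y:y + height]]


def crop_all(collage: Image, records: List[Dict[str, int]]) -> List[Image]:
    """Crop one image per metadata record (hypothetical keys: x, y, width, height)."""
    return [
        crop_particle(collage, r["x"], r["y"], r["width"], r["height"])
        for r in records
    ]
```

Because nothing is masked or subtracted, each crop keeps its original background, consistent with the description above.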

Data splitting 

The dataset is split into 80% for training, 10% for validation and 10% for testing.
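An 80/10/10 split of this kind could be produced as sketched below. The record does not say how the split was drawn; splitting each class separately (stratification), the fixed random seed and the function names are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def split_class(paths: List[str], seed: int = 0) -> Dict[str, List[str]]:
    """Shuffle one class's image paths and cut them 80/10/10."""
    shuffled = sorted(paths)               # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)  # seeded shuffle (the seed is an assumption)
    n_train = len(shuffled) * 8 // 10      # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder, roughly 10%
    }


def split_dataset(
    images_by_class: Dict[str, List[str]], seed: int = 0
) -> Dict[str, Dict[str, List[str]]]:
    """Apply the 80/10/10 split to every class independently."""
    return {
        label: split_class(paths, seed)
        for label, paths in images_by_class.items()
    }
```

Splitting per class keeps the class proportions the same in all three subsets, which matters when class sizes range from 100 to 10,000 images.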

Classes, labels and annotations

The dataset comprises 337,613 images distributed across 95 classes, with each class containing between 100 and 10,000 images. The taxonomic coverage consists mainly of diatoms, dinoflagellates and ciliates and, to a lesser extent, zooplankton and other protists.
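A user who downloads the dataset may want to check these class-size bounds against their own copy. The following is a minimal sketch of such a check, assuming one label string per image; the label values in the usage note are hypothetical and are not class names from the dataset.

```python
from collections import Counter
from typing import Dict, Iterable

MIN_PER_CLASS = 100     # lower bound stated in the description
MAX_PER_CLASS = 10_000  # upper bound stated in the description


def class_counts(labels: Iterable[str]) -> Counter:
    """Count how many images carry each class label."""
    return Counter(labels)


def out_of_range(counts: Dict[str, int]) -> Dict[str, int]:
    """Return the classes whose image count falls outside the stated bounds."""
    return {
        label: n
        for label, n in counts.items()
        if not MIN_PER_CLASS <= n <= MAX_PER_CLASS
    }
```

For example, out_of_range(class_counts(labels)) returns an empty dictionary when every class holds between 100 and 10,000 images.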

Parameters 

The images are read using cv2.imread, and the resulting pixel values are used as the input parameters.

Data sources 

Images are collected with a FlowCam VS-4 benchtop model (Fluid Imaging Technologies, Yarmouth, Maine, U.S.A.) during the monthly monitoring of phytoplankton communities in the Belgian Part of the North Sea, as part of the LifeWatch multidisciplinary campaigns.

Data quality 

All images are first classified by the model and then manually validated to ensure the quality of the training set.

Image resolution 

The size range imaged is 55-300 µm. Images are acquired using a Sony XCD SC90 digital gray-scale camera. During CNN training, the images are resized to 100 px by 100 px.
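The resizing step can be sketched as follows. The record does not state which interpolation method or library is used, nor whether the aspect ratio is preserved; this dependency-free version uses nearest-neighbour sampling and simply stretches the image, both of which are assumptions. In practice a library routine such as OpenCV's cv2.resize would be the more usual choice.

```python
from typing import List

Image = List[List[int]]  # rows of grey-scale pixel values


def resize_nearest(image: Image, out_w: int = 100, out_h: int = 100) -> Image:
    """Resize a grey-scale image to out_w x out_h by nearest-neighbour sampling."""
    if not image or not image[0]:
        raise ValueError("image is empty")
    in_h, in_w = len(image), len(image[0])
    # Each output pixel copies the source pixel at the proportional position.
    return [
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

Since the aspect ratio is not preserved here, elongated particles are distorted when stretched to a square.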

Spatial coverage 

The data comes from a number of fixed stations in the Belgian Part of the North Sea (BPNS). 

Nine stations onshore are visited monthly:

Station  Longitude  Latitude
130      2.90535    51.27055
780      3.057283   51.471367
330      2.809083   51.434117
230      2.85035    51.308683
710      3.138283   51.441217
215      2.61075    51.274867
ZG02     2.500717   51.33515
120      2.702483   51.186083
700      3.221017   51.377

Eight additional offshore stations are visited seasonally:

Station  Longitude  Latitude
LW01     2.256      51.568667
LW02     2.556      51.8
435      2.790333   51.580667
W07bis   3.012517   51.588033
W08      2.35       51.458333
W09      2.7        51.75
W10      2.416667   51.683333
421      2.45       51.4805
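For readers who want to use the station positions programmatically, the sketch below stores the seventeen stations listed above as (longitude, latitude) pairs and derives their bounding box. The values are assumed to be decimal degrees east and north; the variable and function names are illustrative only.

```python
from typing import Dict, Tuple

# (longitude, latitude) per station, assumed to be decimal degrees.
STATIONS: Dict[str, Tuple[float, float]] = {
    # Onshore stations, visited monthly
    "130": (2.90535, 51.27055),
    "780": (3.057283, 51.471367),
    "330": (2.809083, 51.434117),
    "230": (2.85035, 51.308683),
    "710": (3.138283, 51.441217),
    "215": (2.61075, 51.274867),
    "ZG02": (2.500717, 51.33515),
    "120": (2.702483, 51.186083),
    "700": (3.221017, 51.377),
    # Offshore stations, visited seasonally
    "LW01": (2.256, 51.568667),
    "LW02": (2.556, 51.8),
    "435": (2.790333, 51.580667),
    "W07bis": (3.012517, 51.588033),
    "W08": (2.35, 51.458333),
    "W09": (2.7, 51.75),
    "W10": (2.416667, 51.683333),
    "421": (2.45, 51.4805),
}


def bounding_box(
    stations: Dict[str, Tuple[float, float]]
) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) over all stations."""
    lons = [lon for lon, _ in stations.values()]
    lats = [lat for _, lat in stations.values()]
    return min(lons), min(lats), max(lons), max(lats)
```

Calling bounding_box(STATIONS) gives the extent of the sampled area, which can be used to set the limits of a map of the stations.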

 

Temporal coverage 

The monitoring was initiated in May 2017 and has been running continuously every month.


Scope
Themes:
Biology > Plankton > Phytoplankton
Keywords:
Phytoplankton, Neural networks, ANE, Belgium, Bacillariophyceae, Ciliophora, Dictyochophyceae, Dinophyceae, Prymnesiophyceae

Geographical coverage
ANE, Belgium [Marine Regions]

Temporal coverage
1 May 2017 - 23 January 2024

Taxonomic coverage
Bacillariophyceae [WoRMS]
Ciliophora [WoRMS]
Dictyochophyceae [WoRMS]
Dinophyceae [WoRMS]
Prymnesiophyceae [WoRMS]

Parameter

Contributed by
Vlaams Instituut voor de Zee (VLIZ), data provider

Related datasets
Part of:
LifeWatch observatory data: phytoplankton annotated image library by FlowCam imaging for the Belgian part of the North Sea.
LifeWatch observatory data: phytoplankton observations by FlowCam imaging in the Belgian Part of the North Sea

Project
iMagine: Imaging data and services for aquatic science
LifeWatch: Flemish contribution to LifeWatch.eu


Dataset status: Started
Data type: Data
Data origin: Research
Release date: 2024-01-22
Metadata record created: 2024-09-23
Information last modified: 2024-09-24