Newly accepted paper: Efficient quality control for Ocean Data

This article proposes using active learning (AL) to assist quality control (QC) experts: the model proactively selects the most informative data points for labeling, which reduces the experts' workload. To address the skewed data distribution, in which erroneous points are rare, AL is coupled with imbalance-resilient classifiers, improving the model's ability to recognize erroneous data points. To mitigate the cold-start problem in AL, we propose using outlier detection to initialize the classifiers, which significantly reduces annotation costs. The approach is tested on data from five Argo floats and shows that it can reduce the labeling workload for experts while handling significant data imbalance.
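To make the workflow concrete, here is a minimal sketch of the general idea, not the paper's implementation. Everything in it is an assumption chosen for illustration: the data are synthetic 2-D points rather than Argo measurements, a robust z-score stands in for the outlier detector, a class-weighted logistic regression stands in for the imbalance-resilient classifier, and the function names (`make_data`, `outlier_scores`, `active_learning_qc`) are hypothetical. The seed set is built from the most outlying and most central points, after which the loop queries the point the classifier is least certain about.

```python
import math
import random
from statistics import median

def make_data(n=400, error_rate=0.05, seed=0):
    """Synthetic stand-in for QC data: 2-D points, label 1 = erroneous (rare)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        bad = rng.random() < error_rate
        shift = 4.0 if bad else 0.0  # assumed: errors are shifted away from good data
        X.append((rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)))
        y.append(1 if bad else 0)
    return X, y

def outlier_scores(X):
    """Sum over features of squared deviation from the median, scaled by the MAD."""
    scores = [0.0] * len(X)
    for j in range(len(X[0])):
        col = [x[j] for x in X]
        med = median(col)
        mad = median(abs(v - med) for v in col) or 1.0
        for i, v in enumerate(col):
            scores[i] += ((v - med) / mad) ** 2
    return scores

def predict_proba(model, x):
    """Probability that x is erroneous under a logistic model (w, b)."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def fit_weighted_logreg(X, y, epochs=200, lr=0.1):
    """Logistic regression with inverse-frequency class weights (full-batch GD)."""
    n, d = len(X), len(X[0])
    weight = {c: n / (2.0 * max(1, y.count(c))) for c in (0, 1)}
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = weight[yi] * (predict_proba((w, b), xi) - yi)
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
    return w, b

def active_learning_qc(X, oracle, n_seed=5, budget=20):
    """Seed with outlier detection, then query the most uncertain point each round."""
    scores = outlier_scores(X)
    order = sorted(range(len(X)), key=scores.__getitem__)
    seed_idx = order[:n_seed] + order[-n_seed:]  # most central + most outlying
    labels = {i: oracle(i) for i in seed_idx}    # the expert labels the seed set
    for _ in range(budget):
        idx = list(labels)
        model = fit_weighted_logreg([X[i] for i in idx], [labels[i] for i in idx])
        pool = [i for i in range(len(X)) if i not in labels]
        query = min(pool, key=lambda i: abs(predict_proba(model, X[i]) - 0.5))
        labels[query] = oracle(query)            # one expert annotation per round
    idx = list(labels)
    model = fit_weighted_logreg([X[i] for i in idx], [labels[i] for i in idx])
    return model, labels

X, y = make_data()
model, labels = active_learning_qc(X, oracle=lambda i: y[i])
pred = [1 if predict_proba(model, x) >= 0.5 else 0 for x in X]
recall = sum(1 for p, t in zip(pred, y) if p == 1 and t == 1) / max(1, sum(y))
print(f"labeled {len(labels)} of {len(X)} points, recall on errors = {recall:.2f}")
```

In this sketch the expert is simulated by an `oracle` callback that returns the true label for an index. In practice that call would be replaced by a human annotation step, and the seed size and query budget would be tuned to the available expert time.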


This work has been partially funded by the European Union’s Horizon research and innovation program through the CLARIFY (860627), BLUECLOUD 2026 (101094227), ENVRI-HUB next (101131141), EVERSE (101129744) and OSCARS (101129751) projects, by LifeWatch ERIC, and by the NWO LTER-LIFE project.
