New conference article for CIIS
A conference paper was accepted at the International Conference on Computational Intelligence and Intelligent Systems (CIIS) 2023. The paper, titled "Leveraging Repeated Unlabeled Noisy Measurements to Augment Supervised Learning", was authored by Birk Martin Magnussen, Claudius Stern, and Bernhard Sick. The abstract is as follows:
Often, producing large labelled datasets for supervised machine learning is difficult and expensive. In cases where the expensive part is due to labelling and obtaining ground truth, it is often comparably easy to acquire large datasets containing unlabelled data points. For reproducible measurements, it is possible to record information on multiple data points being from the same reproducible measurement series, which should thus have an equal but unknown ground truth. In this article, we propose a method to incorporate a dataset of such unlabelled data points for which some data points are known to be equal in end-to-end training of otherwise labelled data. We show that, with the example of predicting the carotenoid concentration in human skin from optical multiple spatially resolved reflection spectroscopy data, the proposed method is capable of reducing the required number of labelled data points to achieve the same prediction accuracy for different model architectures. In addition, we show that the proposed method is capable of reducing the negative impact of noisy data when performing a repeated measurement of the same sample.
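The abstract does not spell out the training objective. To make the idea concrete, the following is a minimal sketch of one possible way to use "known to be equal" groups during training. It is not necessarily the authors' method: it assumes a mean squared error term on the labelled points plus a penalty on the variance of predictions within each repeated measurement series. The function names, the `group_ids` array, and the `weight` parameter are all hypothetical and chosen only for illustration.

```python
import numpy as np


def supervised_loss(pred, target):
    """Mean squared error on labelled data points."""
    return float(np.mean((pred - target) ** 2))


def consistency_loss(pred, group_ids):
    """Average within-group variance of predictions.

    Points sharing a group id are assumed to come from the same
    reproducible measurement series, so their (unknown) ground truth
    is equal and their predictions should agree.
    """
    groups = np.unique(group_ids)
    total = 0.0
    for g in groups:
        p = pred[group_ids == g]
        total += float(np.mean((p - p.mean()) ** 2))
    return total / len(groups)


def combined_loss(pred_labelled, target, pred_unlabelled, group_ids, weight=1.0):
    """Illustrative objective: supervised term plus weighted consistency term.

    `weight` is an assumed hyperparameter, not a value from the paper.
    """
    return supervised_loss(pred_labelled, target) + weight * consistency_loss(
        pred_unlabelled, group_ids
    )


if __name__ == "__main__":
    # Two labelled predictions and four unlabelled ones in two series.
    pred_l = np.array([1.0, 2.0])
    target = np.array([1.0, 4.0])
    pred_u = np.array([1.0, 3.0, 5.0, 5.0])
    series = np.array([0, 0, 1, 1])
    print(combined_loss(pred_l, target, pred_u, series, weight=2.0))
```

In this sketch, the consistency term is zero when all predictions within a series agree, so it pushes a model towards outputs that are stable under measurement noise without requiring any labels for those points.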