Conference proceedings paper, 2023, ENG, 10.1109/CVPR52729.2023.00335

Leveraging inter-rater agreement for classification in the presence of noisy labels

Bucarelli M.S.; Cassano L.; Siciliano F.; Mantrach A.; Silvestri F.

Amazon, Luxembourg and Sapienza University of Rome, Rome, Italy; Amazon, Luxembourg; Sapienza University of Rome, Rome, Italy; Amazon, Luxembourg; Sapienza University of Rome, Rome and CNR-ISTI, Pisa, Italy

In practical settings, classification datasets are obtained through a labelling process that is usually done by humans. Labels can be noisy as they are obtained by aggregating the different individual labels assigned to the same sample by multiple, and possibly disagreeing, annotators. The inter-rater agreement on these datasets can be measured while the underlying noise distribution to which the labels are subject is assumed to be unknown. In this work, we: (i) show how to leverage the inter-annotator statistics to estimate the noise distribution to which labels are subject; (ii) introduce methods that use the estimate of the noise distribution to learn from the noisy dataset; and (iii) establish generalization bounds in the empirical risk minimization framework that depend on the estimated quantities. We conclude the paper by providing experiments that illustrate our findings.

CVPR - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3439–3448, Vancouver, Canada, 17-24/06/2023

Keywords

Machine learning

CNR authors

Silvestri Fabrizio

CNR institutes

ISTI – Istituto di scienza e tecnologie dell'informazione "Alessandro Faedo"

ID: 488368

Year: 2023

Type: Conference proceedings paper

Creation: 2023-11-10 16:32:33.000

Last update: 2023-11-10 16:32:33.000

External IDs

CNR OAI-PMH: oai:it.cnr:prodotti:488368

DOI: 10.1109/CVPR52729.2023.00335

ISI Web of Science (WOS): 001058542603070

Scopus: 2-s2.0-85167949498