Supervised learning relies on the existence of training data, typically labeled by human experts. The size and quality of this data has a critical impact on the quality of the learned inferred function. But human experts are expensive. We consider the particular case of a classification problem. At each question, an expert is given two elements and tells whether they belong to the same class, with a small probability of error. We propose a scheme ensuring the quality of the final classification while minimizing the number of questions asked to the experts.