Thesis defence

Learning With Unlabeled Data




Mr Renaud Vandeghen will publicly defend his thesis, entitled "Learning With Unlabeled Data".

Summary

In the first part, we explore semi-supervised learning techniques for object detection, using the pseudo-labeling paradigm.
The first contribution explores how to account for the uncertainty of pseudo-labels during training. In particular, we propose to scale the loss contribution of each pseudo-labeled example by a factor related to its confidence score. We show that linearly scaling the loss by the confidence score is the most effective strategy and improves on the baseline model, especially in very low-label regimes.
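As a rough illustration of linear confidence scaling, the sketch below weights a per-example cross-entropy term by the confidence of its pseudo-label. The function name, the use of classification cross-entropy alone (a detector would also have a box regression term), and the normalization by the number of examples are assumptions made for this example, not details taken from the thesis.

```python
import math


def weighted_pseudo_label_loss(probs, pseudo_labels, confidences):
    """Mean cross-entropy over pseudo-labeled examples, each scaled by its confidence.

    probs:         list of per-class probability lists predicted by the student
    pseudo_labels: list of class indices produced by the teacher
    confidences:   list of teacher confidence scores in [0, 1]
    """
    total = 0.0
    for p, label, conf in zip(probs, pseudo_labels, confidences):
        cross_entropy = -math.log(max(p[label], 1e-12))  # clamp to avoid log(0)
        total += conf * cross_entropy  # linear scaling by the confidence score
    # Assumption: normalize by the number of pseudo-labeled examples.
    return total / max(len(pseudo_labels), 1)


student_probs = [[0.5, 0.5], [0.25, 0.75]]
loss = weighted_pseudo_label_loss(student_probs, [0, 1], [1.0, 0.0])
```

With a confidence of 0, the second example contributes nothing to the loss, so uncertain pseudo-labels are down-weighted rather than discarded by a hard cut-off.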
Then we explore how to obtain a robust threshold for selecting pseudo-labels, without the need for a costly hyperparameter search.
We propose to use an adaptive thresholding strategy, where the threshold is determined by the distribution of confidence scores.
We show that this heuristic can be applied to each class independently, and that it matches the performance of the greedy threshold at no additional computational cost.
This new thresholding strategy is therefore particularly useful since it can be applied in any data domain.
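The summary does not state which statistic of the score distribution defines the threshold. The sketch below therefore substitutes a simple stand-in, the per-class mean of the confidence scores, only to show how a data-driven, per-class threshold can replace a hand-tuned global one. The function names and the choice of the mean are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def per_class_thresholds(detections):
    """Compute one threshold per class from that class's confidence scores.

    detections: list of (class_name, confidence) pairs predicted by the teacher.
    Assumption: the threshold is the per-class mean score (a stand-in heuristic).
    """
    scores_by_class = defaultdict(list)
    for cls, score in detections:
        scores_by_class[cls].append(score)
    return {cls: mean(scores) for cls, scores in scores_by_class.items()}


def select_pseudo_labels(detections, thresholds):
    """Keep only detections whose score reaches the threshold of their class."""
    return [(cls, score) for cls, score in detections if score >= thresholds[cls]]


teacher_detections = [("car", 0.75), ("car", 0.25), ("dog", 0.5), ("dog", 0.125)]
thresholds = per_class_thresholds(teacher_detections)
kept = select_pseudo_labels(teacher_detections, thresholds)
```

Because each threshold comes from the scores themselves, no search over candidate values is needed, and a class with systematically lower scores is not filtered out entirely.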
We also add a refinement stage to the teacher-student framework, where the student model is fine-tuned on the labeled data only before being used as a new teacher.

In the second part, we explore self-supervised learning, in the context of masked image modeling.
In the first work, we show that reconstructing a highly masked, randomly resized crop of an image is an effective pretraining task for object-centric representation learning. In particular, we show that this new pretraining strategy based on crops yields better performance than learning on video frames, while also being more computationally efficient.
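To make the pretraining input concrete, the following sketch samples a random resized crop box and then a high-ratio random mask over the patches of that crop. The scale range, aspect-ratio range, 14 x 14 patch grid, and 75% mask ratio are illustrative assumptions; the summary only says that the crop is randomly resized and highly masked.

```python
import math
import random


def sample_random_resized_crop(height, width, rng, scale=(0.2, 1.0), ratio=(3 / 4, 4 / 3)):
    """Return (top, left, crop_h, crop_w) for a crop with random area and aspect ratio."""
    for _ in range(10):  # retry a few times if the sampled box does not fit
        area = height * width * rng.uniform(*scale)
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        crop_w = round(math.sqrt(area * aspect))
        crop_h = round(math.sqrt(area / aspect))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    return 0, 0, height, width  # fallback: use the whole image


def sample_patch_mask(num_patches, mask_ratio, rng):
    """Return a list of booleans, True where a patch is hidden from the encoder."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [index in masked for index in range(num_patches)]


rng = random.Random(0)
top, left, crop_h, crop_w = sample_random_resized_crop(240, 320, rng)
mask = sample_patch_mask(14 * 14, 0.75, rng)  # crop resized to a 14 x 14 patch grid
```

The model would then be trained to reconstruct the hidden patches of the resized crop from the visible ones.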
In our final contribution, we improve the representation learning of video-based models by combining masked video modeling on pixel and trajectory signals.
This dual reconstruction task encourages the model to learn both spatial and temporal information.
During pretraining, the learning objective of the model is to reconstruct both the masked spatial information, in either pixel or latent space, and the masked trajectories, obtained from an off-the-shelf point tracker.
The trajectory information is also used to build a motion-aware masking strategy, which further improves the learned representations.
We show that both signals are complementary, and that their combination leads to state-of-the-art results, especially in motion-centric tasks.

Practical information

The defence is open to all and will take place on Tuesday 7 April at 14:00 in auditorium R7 of the Institut Montefiore, Building B28, Sart Tilman, or via PhD Channel.
