To Learn Feb 28, 2021 To learn train self supervvised distilation: big model teacher is fully trained on a dataset, then the teacher teaches, together with the dataset, to the student