This Week in Machine Learning – 14 May 2021

Hello! Hope you have a great week. I came across an interesting machine learning paper, one course, and one open source project this week:

Emerging Properties in Self-supervised Vision Transformers [paper][github][blog post]

Self-supervised Vision Transformer with no supervision. Figure taken from (Caron et al., 2021).

The recent Vision Transformer (ViT) model, which adapts the Transformer architecture from NLP, has shown promising results toward generic and scalable architectures for computer vision tasks. This paper studies self-supervised ViT models and discusses two emerging properties:

  1. Self-supervised ViT features contain explicit information about semantic segmentation of an image.
  2. Self-supervised ViT features are also excellent k-NN classifiers.

The Vision Transformer treats an image as a sequence of patches, analogous to a series of word embeddings in NLP Transformer models. Figure taken from Nabil’s blog post.
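
To make the "image as a sequence of patches" idea concrete, here is a minimal sketch in plain Python. It assumes a single-channel image stored as a nested list, and the function name `image_to_patches` is just an illustrative choice, not something from the paper or its code. A real ViT would additionally project each flattened patch through a linear layer and add position embeddings.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order, which is the order in which
    a ViT-style model would read them as a token sequence.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size block into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel values are 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patches(image, patch_size=2)
print(len(patches))  # 4 patches, each with 4 values
print(patches[0])    # [0, 1, 4, 5]
```

Each inner list plays the role that a word embedding plays in an NLP Transformer: one token in the input sequence.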

Based on these findings, the authors develop a self-supervised learning framework called DINO (self-distillation with no labels). As the name indicates, the framework trains the model with a knowledge distillation strategy. But instead of using a pre-trained model as the teacher and running knowledge distillation as a post-processing step after self-supervised pre-training, DINO builds the teacher dynamically during training and uses distillation directly as the self-supervised objective. In other words, the student and teacher networks are trained together in a way that resembles codistillation, except that the teacher is updated with a moving average of the student weights rather than by distilling from the student.
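
The core of this training signal can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation: it handles a single pair of output vectors, leaves out the multi-crop augmentation and the networks themselves, and the temperature and momentum values are illustrative defaults. The helper names (`dino_loss`, `ema_update`) are my own. In the paper, the teacher output is centered and sharpened, and the teacher weights follow an exponential moving average of the student weights.

```python
import math


def softmax(logits, temperature):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits, temperature):
    """Log of the temperature-scaled softmax, computed stably."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    log_total = math.log(sum(math.exp(x - top) for x in scaled))
    return [x - top - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the teacher and student output distributions.

    The teacher output is centered, then sharpened with a low temperature.
    It acts as a fixed target: only the student would receive gradients.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = softmax(centered, teacher_temp)
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))


def ema_update(old_values, new_values, momentum):
    """Exponential moving average, used here for teacher weights and the center."""
    return [momentum * old + (1.0 - momentum) * new
            for old, new in zip(old_values, new_values)]


teacher_logits = [2.0, 0.5, -1.0]
center = [0.0, 0.0, 0.0]

# A student that agrees with the teacher gets a much lower loss
# than a student that ranks the outputs in the opposite order.
loss_aligned = dino_loss([2.0, 0.5, -1.0], teacher_logits, center)
loss_opposed = dino_loss([-1.0, 0.5, 2.0], teacher_logits, center)
print(loss_aligned < loss_opposed)  # True

# After each student step, the teacher drifts slowly toward the student.
teacher_weights = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

Because the teacher is only a slowly moving copy of the student, no labels and no pre-trained teacher are needed at any point.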

Nabil Madali has a great blog post that discusses this paper in more detail.

Machine Learning Engineering for Production (MLOps) Specialization [url]

Coursera just launched a new course on building end-to-end production ML systems. Bringing machine learning models to production involves many tasks, such as discovering data issues and data drift, conducting error analysis, managing computation, and scaling. The MLOps course discusses how to conceptualize, build, and maintain integrated machine learning systems that operate continuously in production. You will become familiar with the capabilities, challenges, and consequences of machine learning in production.

Course website:

Opyrator: Quickly Turn Machine Learning Code into Microservices [github]

Figure taken from Opyrator Github Repo.

This open source project combines FastAPI, Streamlit, and pydantic to quickly turn your Python functions into production-ready microservices. It uses FastAPI to automatically generate an HTTP API, and Streamlit to automatically generate a web UI. It is a very useful tool for quickly showcasing your machine learning models.
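
As a rough sketch of what such a function looks like, the example below follows the pattern described in the Opyrator README: a plain Python function whose `input` parameter and return value are typed with pydantic models. The model and function names (`TextInput`, `WordCountOutput`, `count_words`) are made up for illustration, and the word counter is only a stand-in for a real model prediction. The example assumes pydantic is installed.

```python
from pydantic import BaseModel


class TextInput(BaseModel):
    # Fields declared here describe the request payload and the UI form.
    text: str


class WordCountOutput(BaseModel):
    word_count: int


def count_words(input: TextInput) -> WordCountOutput:
    """Toy stand-in for a model prediction: counts whitespace-separated words."""
    return WordCountOutput(word_count=len(input.text.split()))


# The function is still ordinary Python and can be called directly.
result = count_words(TextInput(text="machine learning in production"))
print(result.word_count)  # 4

# Assuming this code lives in a file named my_opyrator.py, the README
# describes launching it with commands along the lines of:
#   opyrator launch-ui my_opyrator:count_words
#   opyrator launch-api my_opyrator:count_words
```

The typed input and output models are what allow the HTTP API schema and the web form to be generated without extra code.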

Opyrator demo website:

Figure taken from Opyrator Github Repo.

Stay safe, and see you next week!

Author: Philips Kokoh

Philips Kokoh Prasetyo is a Principal Research Engineer at the Living Analytics Research Centre (LARC) in the Singapore Management University. He enjoys analyzing data from many different perspectives. His current interests include machine learning, natural language processing, text mining, and deep learning.
