Nowadays, building a large-scale dataset is often treated as a prerequisite for solving the task at hand. Some tasks are niche, though, and building a dataset large enough to train an entire model from scratch would be too expensive or even impossible. Do we really need to train a model from scratch in every case?
Imagine we would like to detect a certain animal, let’s say an otter, in images. We first need to collect many otter images and construct a training dataset. Then, we need to train a model with those images. Now, imagine we want our model to learn how to detect koalas. What do we do now? Again, we collect many koala images and construct our dataset with them. Do we need to train our model from scratch again with the combined dataset of otter and koala images? We already had a model trained on otter images. Why are we wasting it? It learned some features to detect animals which could also come in handy for detecting koalas. Can we utilize this pre-trained model to make things faster and simpler?
Yes, we can, and that is called transfer learning. It is a machine learning technique that uses a model trained on one task as the starting point for another, related task. Compared with training from scratch, this leads to faster, more efficient training and, in most cases, better performance on the new task.
So all we need to do is find an existing model and use it as a starting point for our new training. Is it that simple, though? What if we change the problem to a more complicated one, like segmenting the objects on the road for autonomous driving? We cannot just take a pre-trained model and use it as it is. If the model was pre-trained on city roads, it might not perform well when applied to rural roads, which can look very different.
One of the biggest, if not the biggest, challenges in transfer learning is adapting the model to the difference between the source and the target dataset. We use the term domain gap to refer to a significant difference between the distributions of features in the source and target datasets. This difference makes it hard for the pre-trained model to transfer its knowledge from the source domain to the target domain. Therefore, identifying and reducing domain gaps is crucial when we plan to do transfer learning. These gaps can occur in any field, but they matter most in safety-critical fields, where the cost of an error is very high.
However, identifying domain gaps is not a straightforward task. We need to run several evaluations to identify and address the domain gap between datasets:
- Analyze statistical properties, such as the class and feature distributions, to identify any significant differences.
- Visualize the data in a low-dimensional space, preferably in the latent space, to see if they form distinct clusters and compare their distribution.
- Evaluate the pre-trained model on the target dataset to assess its initial performance. If the model performs poorly, it might indicate a domain gap.
- Run ablation studies by removing certain components of the pre-trained model. This way, we can learn which components are transferable and which are not.
- Apply domain adaptation techniques, such as domain-adversarial training or fine-tuning, and check whether they close the gap.
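The first check in the list, comparing statistical properties, can be sketched roughly as follows. This is a minimal illustration rather than a prescribed method. It assumes each image has already been reduced to one scalar feature (mean brightness here, with made-up values), and it uses the Jensen-Shannon divergence as one possible way to compare the two histograms. The function names are hypothetical.

```python
import math

def histogram(values, bins, lo, hi):
    """Normalized histogram of values over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2): 0 for identical, 1 for disjoint."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical per-image mean brightness in [0, 1] (illustrative values only)
source_brightness = [0.42, 0.45, 0.41, 0.48, 0.44, 0.43]
target_brightness = [0.61, 0.66, 0.58, 0.70, 0.63, 0.65]

p = histogram(source_brightness, bins=5, lo=0.0, hi=1.0)
q = histogram(target_brightness, bins=5, lo=0.0, hi=1.0)

gap_score = js_divergence(p, q)   # large when the distributions differ
same_score = js_divergence(p, p)  # zero when they are identical
print(f"source vs target: {gap_score:.3f}, source vs source: {same_score:.3f}")
```

A score close to zero suggests the feature is distributed similarly in both datasets; a score close to one suggests the feature is a candidate domain gap worth inspecting. In practice the same comparison would be repeated for several features, not just one.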
These steps all sound fine, but they require intense manual labor and consume a lot of time. Let us walk through a concrete example to make things clear.
Assume we have an image segmentation model, DeepLabV3Plus, trained on the Cityscapes dataset, which contains data from fifty European cities. For simplicity, let’s say we work with a subset of the Cityscapes dataset covering two cities, Aachen and Zurich. We then want to train our model on the KITTI dataset, which was captured while driving in a mid-size city, in rural areas, and on highways. We must identify the domain gap between those datasets to adapt our model properly and eliminate potential errors. How can we do it?
First, we need to find out whether we have a domain gap. To do that, we can take the pre-trained model and run it on both datasets. Of course, we first need to prepare both datasets for evaluation; then we measure the model's error on each and compare the results. If the average error on the target dataset is much higher than on the source dataset, that indicates we have a domain gap to fix.
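The comparison described above could look roughly like the sketch below. It assumes mean intersection-over-union (mIoU) as the quality metric, which is a common choice for segmentation but is not specified in this article, and it uses tiny hand-written label lists in place of real model predictions. The helper names and the toy data are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """mIoU for one image, averaged over classes present in pred or target.

    pred and target are flat lists of integer class labels, one per pixel.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(inter / union)
    return sum(ious) / len(ious)

def dataset_miou(samples, num_classes):
    """Average per-image mIoU over a list of (pred, target) pairs."""
    scores = [mean_iou(p, t, num_classes) for p, t in samples]
    return sum(scores) / len(scores)

# Toy (prediction, ground truth) pairs standing in for real model outputs
source_samples = [
    ([0, 0, 1, 1, 2, 2, 0, 0], [0, 0, 1, 1, 2, 2, 2, 0]),
    ([0, 1, 2, 2], [0, 1, 2, 2]),
]
target_samples = [
    ([0, 0, 1, 2, 2, 0, 0, 2], [0, 1, 1, 1, 2, 2, 0, 0]),
    ([1, 0, 1, 1], [0, 0, 1, 2]),
]

source_miou = dataset_miou(source_samples, num_classes=3)
target_miou = dataset_miou(target_samples, num_classes=3)
performance_drop = source_miou - target_miou
print(f"source mIoU={source_miou:.3f}, target mIoU={target_miou:.3f}, "
      f"drop={performance_drop:.3f}")
```

How large a drop counts as "too high" depends on the application; any threshold would have to be chosen for the task at hand rather than taken from this sketch.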
Now that we know we have a domain gap, how can we identify its root cause? We can start by finding the samples with the highest loss and comparing them to find their common characteristics. These could be the color variation, roadside object variation, car variation, the area that the sky covers, and so on. We must then try fixing each of these differences in turn, normalizing it so that it fits the source dataset’s characteristics, and reevaluate our model to see whether the “root” cause we found was actually the root cause of the domain gap.
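As a rough sketch of this manual step, one could rank samples by loss and compare simple attributes of the worst samples against the rest. The records, attribute names (`sky_fraction`, `brightness`), and values below are all hypothetical; in a real workflow they would come from the model's per-sample loss and from metadata extracted for each image.

```python
from statistics import mean

# Hypothetical per-sample records: model loss plus hand-picked image attributes
samples = [
    {"id": "a", "loss": 0.21, "sky_fraction": 0.10, "brightness": 0.45},
    {"id": "b", "loss": 0.95, "sky_fraction": 0.42, "brightness": 0.47},
    {"id": "c", "loss": 0.18, "sky_fraction": 0.12, "brightness": 0.44},
    {"id": "d", "loss": 0.88, "sky_fraction": 0.40, "brightness": 0.46},
    {"id": "e", "loss": 0.25, "sky_fraction": 0.08, "brightness": 0.48},
    {"id": "f", "loss": 0.30, "sky_fraction": 0.14, "brightness": 0.43},
]

def top_loss_split(records, k):
    """Split records into the k highest-loss samples and the remainder."""
    ranked = sorted(records, key=lambda r: r["loss"], reverse=True)
    return ranked[:k], ranked[k:]

def attribute_shift(worst, rest, attributes):
    """Difference in the mean of each attribute: worst samples minus the rest."""
    return {
        a: mean(r[a] for r in worst) - mean(r[a] for r in rest)
        for a in attributes
    }

worst, rest = top_loss_split(samples, k=2)
shift = attribute_shift(worst, rest, ["sky_fraction", "brightness"])

# The attribute with the largest absolute shift is the first suspect to test
suspect = max(shift, key=lambda a: abs(shift[a]))
print([r["id"] for r in worst], shift, suspect)
```

A large shift only marks an attribute as a suspect. As the paragraph above notes, the suspicion still has to be confirmed by normalizing that attribute and reevaluating the model.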
What if we had a tool that could do all of this for us automatically, so we could focus on what really matters: solving the problem at hand? Thankfully, somebody thought about it and came up with Tensorleap.
Tensorleap is a platform for improving the development of deep neural network-based solutions. It offers an advanced suite of tools to help data scientists refine and explain their models. It provides valuable insights into the models and identifies their strengths and weaknesses. On top of that, the included tools for error analysis, unit testing, and dataset architecture are extremely helpful in finding the root cause of a problem and making the final model effective and reliable.
You can read this blog post to learn how it can be used to solve the domain gap problem between the Cityscapes and KITTI datasets. In this example, Tensorleap’s automatic preparation of an optimal latent space, together with its analytic tools, dashboards, and insights, helped quickly spot and reduce three domain gaps, significantly improving the model’s performance. Identifying and fixing those domain gaps would have taken months of manual work, but with Tensorleap, it can be done in a matter of hours.
Note: Thanks to the Tensorleap team for the thought leadership/educational article above. Tensorleap has supported this content.
Ekrem Çetinkaya received his B.Sc. in 2018, and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He received his Ph.D. degree in 2023 from the University of Klagenfurt, Austria, with his dissertation titled "Video Coding Enhancements for HTTP Adaptive Streaming Using Machine Learning." His research interests include deep learning, computer vision, video encoding, and multimedia networking.