Abstract
Real-world images often suffer from variations in weather conditions such as rain, fog, snow, and
high temperature. These atmospheric variations adversely affect the performance of computer vision
models in real-world scenarios. The problem can be mitigated by collecting and annotating images for
each weather condition; however, doing so is an extremely tedious, time-consuming, and expensive task.
In this work, we address the aforementioned problems. Among all weather conditions, we focus on the
distortions in an image caused by high temperature, also known as atmospheric turbulence. These
distortions introduce geometric deformations around object boundaries, which cause vision algorithms
to perform poorly and pose a major concern. Hence, in this thesis, we address the problems of
artificially generating atmospheric turbulence and restoring images degraded by it.
In the first part of our work, we attempt to model atmospheric turbulence, since such models are
critical for extending computer vision solutions developed in the laboratory to real-world use cases.
Simulating atmospheric turbulence with statistical models or computer graphics is often
computationally expensive. To overcome this problem, we train a generative adversarial network (GAN)
that outputs an atmospherically turbulent image while using fewer computational resources than traditional
methods. We propose a novel loss function to efficiently learn atmospheric turbulence at a finer
level. Experiments show that, with the proposed loss function, our network outperforms the existing
state-of-the-art image-to-image translation network in turbulent image generation.
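As a concrete illustration, the sketch below shows one training step of a paired clean-to-turbulent image translation GAN of the kind described above (pix2pix-style adversarial plus L1 terms). The generator, discriminator, and loss weighting are placeholders; the thesis's actual architecture and its proposed turbulence-specific loss are not reproduced here.

```python
# Minimal sketch of one training step for a clean-to-turbulent image
# translation GAN (pix2pix-style). G, D, and lambda_l1 are illustrative
# placeholders, not the thesis's exact design or novel loss.
import torch
import torch.nn as nn

def train_step(G, D, opt_G, opt_D, clean, turbulent, lambda_l1=100.0):
    bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

    # --- discriminator update: real pairs vs. generated pairs ---
    fake = G(clean)
    d_real = D(torch.cat([clean, turbulent], dim=1))
    d_fake = D(torch.cat([clean, fake.detach()], dim=1))
    loss_D = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # --- generator update: fool D and stay close to the target turbulent image ---
    d_fake = D(torch.cat([clean, fake], dim=1))
    loss_G = bce(d_fake, torch.ones_like(d_fake)) + lambda_l1 * l1(fake, turbulent)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_G.item(), loss_D.item()
```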
In the second part of the thesis, we address the ill-posed problem of restoring images degraded by
atmospheric turbulence. We propose a deep adversarial network to recover images distorted by
atmospheric turbulence and show the applicability of the restored images in several tasks.
Unlike previous methods, our approach neither uses prior knowledge about atmospheric turbulence
conditions at inference time nor requires the fusion of multiple images to obtain a single restored image.
To train our models, we synthesize turbulent images through a series of efficient 2D operations.
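The sketch below illustrates one common family of such 2D operations, a smoothed random pixel-displacement field applied by grid sampling followed by a mild blur; the kernel sizes and distortion strengths are assumptions for illustration, not the exact synthesis recipe used in the thesis.

```python
# Illustrative turbulence synthesis with cheap 2D operations: a smoothed
# random displacement field warps the image (geometric distortion), then a
# mild Gaussian blur is applied. Parameters are illustrative only.
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def synthesize_turbulence(img, strength=0.01, blur_sigma=1.0):
    # img: (B, C, H, W) tensor in [0, 1]
    b, c, h, w = img.shape
    # smooth random displacement field in normalized coordinates
    disp = torch.randn(b, 2, h, w) * strength
    disp = TF.gaussian_blur(disp, kernel_size=21, sigma=5.0)
    # identity sampling grid in [-1, 1]
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    grid = grid + disp.permute(0, 2, 3, 1)            # geometric distortion
    warped = F.grid_sample(img, grid, align_corners=True)
    return TF.gaussian_blur(warped, kernel_size=7, sigma=blur_sigma)  # mild blur
```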
We then run inference with the trained models on real and synthesized turbulent images. Our
final restoration models, DT-GAN+ and DTD-GAN+, qualitatively and quantitatively outperform
general state-of-the-art image-to-image translation models. The improved performance of our models
stems from optimized residual structures combined with channel attention and a sub-pixel mechanism,
which exploit the information across channels and remove atmospheric turbulence at a finer level.
We also perform extensive experiments that utilize the restored images for downstream
tasks such as classification, pose estimation, semantic keypoint estimation, and depth estimation.
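The following minimal sketch shows the kind of building blocks referred to above: a residual block with squeeze-and-excitation style channel attention and a sub-pixel (PixelShuffle) upsampling layer. Channel widths and layer counts are illustrative and do not reproduce the exact DT-GAN+/DTD-GAN+ architecture.

```python
# Minimal sketch of a residual block with channel attention and a sub-pixel
# upsampling layer. Sizes are illustrative, not the thesis's exact network.
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, ch, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                 # squeeze: global channel statistics
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)                        # re-weight channels

class ResidualCABlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), ChannelAttention(ch),
        )

    def forward(self, x):
        return x + self.body(x)                      # residual connection

def subpixel_upsample(ch, scale=2):
    # sub-pixel convolution: expand channels, then rearrange them into space
    return nn.Sequential(nn.Conv2d(ch, ch * scale ** 2, 3, padding=1), nn.PixelShuffle(scale))
```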
In the third part of our work, we study the problem of adapting semantic segmentation models
to hot-climate cities. This issue can be circumvented by collecting and annotating images in such
weather conditions and training segmentation models on them, but semantically annotating images
for every environment is painstaking and expensive. Hence, we propose a framework
that improves the performance of semantic segmentation models without explicitly creating an annotated
dataset for such adverse weather variations. Our framework consists of two parts: a restoration network
that removes the geometric distortions caused by hot weather, and an adaptive segmentation network
trained with an additional loss to adapt to the statistics of the ground-truth segmentation map.
Trained on the Cityscapes dataset, our framework shows a total IoU gain of 12.707 over standard
segmentation models.
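A rough sketch of how the two parts compose during training is given below; the adaptation term is left as a placeholder, since its exact form (beyond adapting to ground-truth segmentation statistics) is not specified in this summary.

```python
# Rough sketch of the two-part framework: restoration undoes the heat-induced
# distortion, and segmentation is trained with cross-entropy plus an
# additional adaptation term. `adaptation_loss` is a placeholder.
import torch.nn.functional as F

def framework_loss(restoration_net, segmentation_net, turbulent_img, gt_mask,
                   adaptation_loss, lambda_adapt=1.0):
    restored = restoration_net(turbulent_img)        # remove geometric distortion
    logits = segmentation_net(restored)              # per-pixel class scores
    loss = F.cross_entropy(logits, gt_mask)          # standard segmentation loss
    loss = loss + lambda_adapt * adaptation_loss(logits, gt_mask)  # adaptation term
    return loss
```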
In the last part of our work, we improve the performance of our joint restoration and segmentation
network via a feedback mechanism. In the previous approach, the restoration network does not learn
directly from the errors of the segmentation network; in other words, the restoration network is not
task-aware. Hence, we propose a semantic feedback learning approach, which improves semantic
segmentation by passing a feedback response from the segmentation network into the restoration network.
This response acts as an attend-and-fix mechanism, focusing on those areas of an image where restoration
needs improvement. We also propose two loss functions, Iterative Focal Loss (iFL) and Class-Balanced
Iterative Focal Loss (CB-iFL), which are specifically designed to improve the performance of the feedback
network. These losses place more weight on samples that are continuously misclassified over successive
iterations. Our approach gives a gain of 17.41 mIoU over the standard segmentation model.
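The sketch below gives one plausible reading of iFL: a focal-style loss whose modulating exponent grows for pixels that remain misclassified across feedback iterations. The exact formulations of iFL and CB-iFL (including class balancing) are not reproduced here.

```python
# Hedged sketch of an iterative-focal-style loss: the focal down-weighting of
# easy pixels is sharpened for pixels that keep being misclassified across
# feedback iterations. Illustrative only; not the thesis's exact iFL/CB-iFL.
import torch
import torch.nn.functional as F

def iterative_focal_loss(logits_per_iter, target, gamma=2.0):
    # logits_per_iter: list of (B, C, H, W) logits, one per feedback iteration
    # target:          (B, H, W) ground-truth class indices
    miss_count = torch.zeros_like(target, dtype=torch.float)
    total = 0.0
    for logits in logits_per_iter:
        log_p = F.log_softmax(logits, dim=1)
        pt = log_p.gather(1, target.unsqueeze(1)).squeeze(1).exp()       # prob. of true class
        miss_count = miss_count + (logits.argmax(1) != target).float()   # persistent errors
        focal_weight = (1.0 - pt) ** (gamma + miss_count)                # harder focus on repeat errors
        total = total + (focal_weight * F.nll_loss(log_p, target, reduction="none")).mean()
    return total / len(logits_per_iter)
```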