Ren, Long; Pan, Zhibin; Cao, Jianzhong; Liao, Jiawen
Owing to their high sensitivity to fine detail, visible imaging devices capture images rich in the textures and contours that matter most to human visual perception. Infrared imaging devices, by contrast, sense differences in thermal radiation and can therefore detect targets that are invisible in visible images. The goal of image fusion is thus to merge as much meaningful feature information as possible from the infrared and visible images into the fused image, such as the contours and textures of the visible image and the thermal targets of the infrared image. In this paper, we propose an image fusion network based on a variational auto-encoder (VAE), which performs the fusion in deep hidden layers. The proposed network is divided into an image fusion network and an infrared feature compensation network. First, the encoder of the image fusion network generates latent vectors in the hidden layers from the input visible and infrared images. Second, the two latent vectors are merged into one based on the product of their Gaussian probability densities, and the decoder then reconstructs the fused image as the loss function value decreases. Residual blocks and symmetric skip connections are added to the network to improve training efficiency. Finally, to address a limitation of the loss function setting in the fusion network, an infrared feature compensation network is designed to compensate for critical radiation features of the infrared image. Experimental results on publicly available datasets demonstrate that the proposed method is superior to other traditional and deep learning methods in both objective metrics and subjective visual perception.
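The latent fusion step described above admits a closed form: the product of two diagonal-Gaussian densities is again a Gaussian whose precision is the sum of the individual precisions. Below is a minimal sketch of that fusion, not the authors' code; the tensor names (mu_vis, logvar_vis, mu_ir, logvar_ir) and the 128-dimensional latent size are illustrative assumptions.

```python
# Sketch of product-of-Gaussians fusion of two VAE latent posteriors (assumed
# diagonal Gaussians); names and dimensions are placeholders, not the paper's.
import torch

def fuse_latents_pog(mu_vis, logvar_vis, mu_ir, logvar_ir):
    """Fuse two diagonal Gaussian latents via the product of their densities:
    fused precision = sum of precisions, fused mean = precision-weighted mean."""
    prec_vis = torch.exp(-logvar_vis)        # 1 / sigma_vis^2
    prec_ir = torch.exp(-logvar_ir)          # 1 / sigma_ir^2
    var_fused = 1.0 / (prec_vis + prec_ir)   # fused variance
    mu_fused = var_fused * (mu_vis * prec_vis + mu_ir * prec_ir)
    return mu_fused, torch.log(var_fused)

def reparameterize(mu, logvar):
    """Reparameterization trick so the fused latent stays differentiable."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

# Example: fuse 128-dimensional latent codes for a batch of 4 image pairs.
mu_v, lv_v = torch.randn(4, 128), torch.randn(4, 128)
mu_i, lv_i = torch.randn(4, 128), torch.randn(4, 128)
mu_f, lv_f = fuse_latents_pog(mu_v, lv_v, mu_i, lv_i)
z = reparameterize(mu_f, lv_f)               # fed to the fusion decoder
```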
The result was published in Infrared Physics & Technology. DOI: 10.1016/j.infrared.2021.103839