//No Comment - Semantic Compression & Face Synthesis
Written by David Conrad
Thursday, 02 February 2017
• Semantic Perceptual Image Compression using Deep Convolution Networks
• Face Synthesis from Facial Identity Features

Sometimes the news is reported well enough elsewhere and we have little to add other than to bring it to your attention. No Comment is a format where we present original source information, lightly edited, so that you can decide if you want to follow it up.
Semantic Perceptual Image Compression using Deep Convolution Networks

We know that image compression should be working better than the current efforts. JPEG is good, but most images are redundant to the point where you could adequately describe them in a few words! AI could probably be used to create a semantic model of the image and generate a good likeness from a few bytes, but for the moment this is too difficult. What you can do is use standard JPEG compression, fine-tuned with the help of a neural network. The neural network finds the most important areas of the image, where JPEG compression is applied lightly, while areas that the network thinks the viewer doesn't care about are compressed more aggressively:

It has long been considered a significant problem to improve the visual quality of lossy image and video compression. Recent advances in computing power together with the availability of large training data sets have increased interest in the application of deep learning CNNs to address image recognition and image processing tasks. Here, we present a powerful CNN tailored to the specific task of semantic image understanding to achieve higher visual quality in lossy compression. A modest increase in complexity is incorporated to the encoder which allows a standard, off-the-shelf JPEG decoder to be used. While JPEG encoding may be optimized for generic images, the process is ultimately unaware of the specific content of the image to be compressed. Our technique makes JPEG content-aware by designing and training a model to identify multiple semantic regions in a given image. Unlike object detection techniques, our model does not require labeling of object positions and is able to identify objects in a single pass. We present a new CNN architecture directed specifically to image compression which, by adding a complete set of features for every class and then taking a threshold over the sum of all feature activations, generates a map that highlights semantically-salient regions so that they can be encoded at higher quality as compared to background regions. Experiments are presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset, in which our algorithm achieves higher visual quality for the same compressed size.
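To make the idea concrete, here is a minimal Python sketch of saliency-guided JPEG, not the authors' encoder: it assumes you already have a saliency map in [0, 1] (for example, the thresholded sum of feature activations described in the abstract), encodes the image once at high and once at low quality, and composites the two using the mask. The function name, quality settings and 0.5 threshold are placeholder assumptions.

# Sketch only: approximate content-aware JPEG by blending two quality levels.
# Assumes an RGB PIL image and a saliency map of shape (H, W) with values in [0, 1].
import io
import numpy as np
from PIL import Image

def saliency_guided_jpeg(img, saliency, q_high=90, q_low=25):
    def encode_decode(im, quality):
        # Round-trip through a JPEG encode/decode at the given quality.
        buf = io.BytesIO()
        im.save(buf, format="JPEG", quality=quality)
        buf.seek(0)
        return np.asarray(Image.open(buf).convert("RGB"), dtype=np.float32)

    hi = encode_decode(img, q_high)   # lightly compressed version
    lo = encode_decode(img, q_low)    # heavily compressed version
    mask = (saliency > 0.5).astype(np.float32)[..., None]  # thresholded saliency map
    # Keep salient pixels from the high-quality pass, the rest from the low-quality pass.
    blended = mask * hi + (1.0 - mask) * lo
    return Image.fromarray(blended.astype(np.uint8))

A real content-aware encoder, as in the paper, instead varies the quantization within a single JPEG stream so that a standard, off-the-shelf decoder still works; the composite above only mimics the visual effect of spending more bits on the salient regions.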
Face Synthesis from Facial Identity Features

Google Research, the University of Massachusetts and MIT have implemented a way of creating a full frontal mug shot using photos taken at any angle. The trick is to create a 3D model of the face and then render it. Presumably this means you could create a photo taken from any angle:

We present a method for synthesizing a frontal, neutral-expression image of a person's face given an input face photograph. This is achieved by learning to generate facial landmarks and textures from features extracted from a facial-recognition network. Unlike previous approaches, our encoding feature vector is largely invariant to lighting, pose, and facial expression. Exploiting this invariance, we train our decoder network using only frontal, neutral-expression photographs. Since these photographs are well aligned, we can decompose them into a sparse set of landmark points and aligned texture maps. The decoder then predicts landmarks and textures independently and combines them using a differentiable image warping operation. The resulting images can be used for a number of applications, such as analyzing facial attributes, exposure and white balance adjustment, or creating a 3-D avatar.

It is also fairly clear that this technique could be used to provide standard ID photos from a collection of "random" images. And the technique is robust enough to work on degraded photos and illustrations. If you have always wanted to know what Rosie the Riveter looked like, then:
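The decoder described in the abstract has two independent heads, one predicting landmarks and one predicting a texture map, joined by a differentiable warp. As a very rough structural sketch in PyTorch, not Google's model, it might look something like the following; the layer sizes, the identity warp grid standing in for the landmark-driven warp, and the FaceDecoder name are all invented for illustration.

# Structural sketch of a two-headed decoder with a differentiable warp.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceDecoder(nn.Module):
    def __init__(self, feat_dim=128, n_landmarks=68, tex_size=64):
        super().__init__()
        self.tex_size = tex_size
        # Head 1: identity features -> sparse landmark points (x, y).
        self.landmark_head = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, n_landmarks * 2))
        # Head 2: identity features -> aligned RGB texture map.
        self.texture_head = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * tex_size * tex_size))

    def forward(self, feat):
        b = feat.size(0)
        landmarks = self.landmark_head(feat).view(b, -1, 2).tanh()
        texture = self.texture_head(feat).view(b, 3, self.tex_size, self.tex_size).sigmoid()
        # Differentiable warp: an identity affine grid stands in here for the
        # landmark-driven warp field the paper uses to place the texture.
        theta = torch.eye(2, 3).unsqueeze(0).repeat(b, 1, 1)
        grid = F.affine_grid(theta, texture.size(), align_corners=False)
        image = F.grid_sample(texture, grid, align_corners=False)
        return landmarks, texture, image

In the paper the warp field is driven by the predicted landmarks rather than a fixed transform, which is what lets the independently predicted landmarks and textures combine into a coherent frontal face.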