No Comment - Fighting Bots, Open Source Image Captioning & An Open Source Deep Face Recognition SDK
Written by Mike James   
Monday, 26 September 2016

• Even Good Bots Fight

• Show and Tell: image captioning open sourced in TensorFlow

• VIPLFaceNet: An Open Source Deep Face Recognition SDK


Sometimes the news is reported well enough elsewhere and we have little to add other than to bring it to your attention.

No Comment is a format where we present original source information, lightly edited, so that you can decide if you want to follow it up. 

 

Even Good Bots Fight

In recent years, there has been a huge increase in the number of bots online, varying from Web crawlers for search engines, to chatbots for online customer service, spambots on social media, and content-editing bots in online collaboration communities. The online world has turned into an ecosystem of bots. However, our knowledge of how these automated agents are interacting with each other is rather poor. In this article, we analyze collaborative bots by studying the interactions between bots that edit articles on Wikipedia.

We find that, although Wikipedia bots are intended to support the encyclopedia, they often undo each other's edits and these sterile "fights" may sometimes continue for years. Further, just like humans, Wikipedia bots exhibit cultural differences.

Our research suggests that even relatively "dumb" bots may give rise to complex interactions, and this provides a warning to the Artificial Intelligence research community.
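The study works from Wikipedia's public edit histories. Purely to illustrate the idea, here is a minimal Python sketch, not the paper's actual pipeline, that flags pairs of bots which each undid the other on the same article; the log format and bot names are invented for the example.

from collections import Counter

# Each record: (article, editor, reverted_editor), where reverted_editor
# is the account whose edit this revision undid (None if not a revert).
# These records and bot names are illustrative, not real Wikipedia data.
edit_log = [
    ("Article_A", "BotX", "BotY"),
    ("Article_A", "BotY", "BotX"),
    ("Article_B", "BotX", None),
]
bots = {"BotX", "BotY"}  # assumed register of known bot accounts

# Count bot-on-bot reverts per (article, reverter, reverted) triple.
reverts = Counter((a, e, r) for a, e, r in edit_log
                  if e in bots and r in bots)

# A sterile "fight": a pair of bots that each reverted the other
# on the same article.
fights = {(a, *sorted((x, y)))
          for (a, x, y) in reverts if (a, y, x) in reverts}
print(fights)  # {('Article_A', 'BotX', 'BotY')}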

Show and Tell: image captioning open sourced in TensorFlow

Today, we’re making the latest version of our image captioning system available as an open source model in TensorFlow. This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system. These improvements are outlined and analyzed in the paper Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, published in IEEE Transactions on Pattern Analysis and Machine Intelligence.


Today’s code release initializes the image encoder using the Inception V3 model, which achieves 93.9% accuracy on the ImageNet classification task. Initializing the image encoder with a better vision model gives the image captioning system a better ability to recognize different objects in the images, allowing it to generate more detailed and accurate descriptions. This gives an additional 2 points of improvement in the BLEU-4 metric over the system used in the captioning challenge.

Another key improvement to the vision component comes from fine-tuning the image model. This step addresses the problem that the image encoder is initialized by a model trained to classify objects in images, whereas the goal of the captioning system is to describe the objects in images using the encodings produced by the image model. For example, an image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee.
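To make the two ideas concrete, here is a minimal tf.keras sketch of initializing the image encoder from pretrained Inception V3 weights and then unfreezing it for fine-tuning. This is not the released TensorFlow code; the 512-dimensional projection, the learning rate and the two-phase schedule are illustrative assumptions.

import tensorflow as tf

# Image encoder initialized from ImageNet-pretrained Inception V3,
# with the classification head removed and global average pooling on top.
encoder = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg")

# Phase 1: freeze the encoder and train only the caption decoder
# (the real system feeds this embedding into an LSTM language model).
encoder.trainable = False
images = tf.keras.Input(shape=(299, 299, 3))
features = encoder(images)                   # 2048-d image embedding
seed = tf.keras.layers.Dense(512)(features)  # projected decoder input (assumed size)
model = tf.keras.Model(images, seed)

# Phase 2: fine-tune. Unfreeze the encoder so its weights shift from
# "classify the objects" toward "describe the objects", and continue
# training with a much smaller learning rate.
encoder.trainable = True
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)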


We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun. To get started training your own image captioning system, and for more details on the neural network architecture, navigate to the model’s home page. While our system uses the Inception V3 image classification model, you could even try training our system with the recently released Inception-ResNet-v2 model to see if it can do even better!


VIPLFaceNet: An Open Source Deep Face Recognition SDK

Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation, named VIPLFaceNet, which is a 10-layer deep convolutional neural network with 7 convolutional layers and 3 fully-connected layers.
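As a rough picture of that topology, here is a purely illustrative Keras sketch: 7 convolutional layers plus 3 fully-connected layers, trained as an identity classifier. The filter counts, kernel sizes, input resolution and FC widths are assumptions not given in the excerpt, and the actual SDK is implemented in C++.

import tensorflow as tf
from tensorflow.keras import layers

# 7 conv layers + 3 FC layers; at test time the penultimate FC
# activation would serve as the face representation.
def viplfacenet_like(num_identities: int) -> tf.keras.Model:
    return tf.keras.Sequential([
        layers.Input(shape=(144, 144, 3)),  # aligned face crop (assumed size)
        layers.Conv2D(48, 9, strides=4, padding="same", activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.Conv2D(192, 3, padding="same", activation="relu"),
        layers.Conv2D(192, 3, padding="same", activation="relu"),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),              # FC1
        layers.Dense(2048, activation="relu"),              # FC2: representation
        layers.Dense(num_identities, activation="softmax")  # FC3: identities
    ])

model = viplfacenet_like(num_identities=10000)
model.summary()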

Compared with the well-known AlexNet, our VIPLFaceNet takes only 20% of the training time and 60% of the testing time, but achieves a 40% drop in error rate on the real-world face recognition benchmark LFW. Our VIPLFaceNet achieves 98.60% mean accuracy on LFW using one single network.

An open-source C++ SDK based on VIPLFaceNet is released under the BSD license. The SDK takes about 150ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art starting point for both academic and industrial face recognition applications.

 


 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook, Google+ or Linkedin.






 

Comments




or email your comment to: comments@i-programmer.info

 

 

Last Updated ( Monday, 26 September 2016 )