//No Comment - Fighting Bots,Open Source Image Captioning & An Open Source Deep Face Recognition SDK
Written by Mike James   
Monday, 26 September 2016

• Even Good Bots Fight

• Show and Tell: image captioning open sourced in TensorFlow

• VIPLFaceNet: An Open Source Deep Face Recognition SDK

nocomment

Sometimes the news is reported well enough elsewhere and we have little to add other than to bring it to your attention.

No Comment is a format where we present original source information, lightly edited, so that you can decide if you want to follow it up. 

 

Even Good Bots Fight

In recent years, there has been a huge increase in the number of bots online, varying from Web crawlers for search engines, to chatbots for online customer service, spambots on social media, and content-editing bots in online collaboration communities. The online world has turned into an ecosystem of bots. However, our knowledge of how these automated agents are interacting with each other is rather poor. In this article, we analyze collaborative bots by studying the interactions between bots that edit articles on Wikipedia.

We find that, although Wikipedia bots are intended to support the encyclopedia, they often undo each other's edits and these sterile "fights" may sometimes continue for years. Further, just like humans, Wikipedia bots exhibit cultural differences.

Our research suggests that even relatively "dumb" bots may give rise to complex interactions, and this provides a warning to the Artificial Intelligence research community.

googleresearchbanner

Today, we’re making the latest version of our image captioning system available as an open source model in TensorFlow. This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system. These improvements are outlined and analyzed in the paper Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, published in IEEE Transactions on Pattern Analysis and Machine Intelligence.


Today’s code release initializes the image encoder using the Inception V3 model, which achieves 93.9% accuracy on the ImageNet classification task. Initializing the image encoder with a better vision model gives the image captioning system a better ability to recognize different objects in the images, allowing it to generate more detailed and accurate descriptions. This gives an additional 2 points of improvement in the BLEU-4 metric over the system used in the captioning challenge.

Another key improvement to the vision component comes from fine-tuning the image model. This step addresses the problem that the image encoder is initialized by a model trained to classifyobjects in images, whereas the goal of the captioning system is to describe the objects in images using the encodings produced by the image model. For example, an image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee.

googlecaption

We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun. To get started training your own image captioning system, and for more details on the neural network architecture, navigate to the model’s home-page here. While our system uses the Inception V3 image classification model, you could even try training our system with the recently released Inception-ResNet-v2 model to see if it can do even better! 

face

VIPLFaceNet: An Open Source Deep Face Recognition SDK

Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation named as VIPLFaceNet, which is a 10-layer deep convolutional neural network with 7 convolutional layers and 3 fully-connected layers.

Compared with the well-known AlexNet, our VIPLFaceNet takes only 20% training time and 60% testing time, but achieves 40\% drop in error rate on the real-world face recognition benchmark LFW. Our VIPLFaceNet achieves 98.60% mean accuracy on LFW using one single network.

An open-source C++ SDK based on VIPLFaceNet is released under BSD license. The SDK takes about 150ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art start point for both academic and industrial face recognition applications.

 

nocomment

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter,subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin.

Banner

nocommentAI


O'Reilly Data Reveals Surge In AI Learning
08/01/2025

The O'Reilly Technology Trends for 2025 Report is based on annual usage data from O’Reilly’s online learning platform data. It reveals a "dynamic landscape of developer learning", with AI tec [ ... ]



Google Releases Gemini 2 And Jules Code Agent
18/12/2024

Google has announced an updated version of Gemini, saying that Gemini 2.0 Flash Experimental will "enable even more immersive and interactive applications", along with new coding agents that can take  [ ... ]


More News

espbook

 

Comments




or email your comment to: comments@i-programmer.info

 

 

Last Updated ( Monday, 26 September 2016 )