Anonymouth Hides Identity
Written by Sue Gee   
Sunday, 04 August 2013

An open source project to combat "stylometry", the study of attributing authorship to documents based only on the linguistic style they exhibit, is proving that it is possible to change writing style so as to evade detection.

Artificial Intelligence techniques are routinely used to detect plagiarism and recently were employed to reveal that Harry Potter author J K Rowling is indeed the author of The Cuckoo's Calling published under the byline of Robert Galbraith. Now software is tackling the opposite problem - anonymizing writing style to protect the identity of the originator.

Students from the Privacy, Security and Automation Lab (PSAL) at Drexel University recently won the Andreas Pfitzmann Best Student Paper Award at the 12th Privacy Enhancing Technologies Symposium for their paper “Use Fewer Instances of the Letter ‘i’: Toward Writing Style Anonymization”, which explains this new framework for anonymizing writing style.

 

The idea behind Anonymouth is that stylometry can be a threat in situations where individuals want to protect their privacy while continuing to interact with others over the Internet. A presentation about the program cites two hypothetical scenarios:

  • Alice the Anonymous Blogger vs. Bob the Abusive Employer
  • Anonymous Forum vs. Oppressive Government

and one anecdotal one from Daniel Domscheit-Berg's book Inside WikiLeaks:

“I nudged Julian with my foot. We exchanged glances and started giggling. If someone had run WikiLeaks documents through such a program, he would have discovered that the same two people were behind all the various press releases, document summaries, and correspondence issued by the project.”

The JStylo-Anonymouth (JSAN) framework is a work in progress at PSAL under the supervision of Dr. Rachel Greenstadt, assistant professor of computer science. It consists of two parts:

 

  • JStylo - authorship attribution framework, used as the underlying feature extraction employing a set of linguistic features
  • Anonymouth - authorship evasion (anonymization) framework, which suggests changes that need to be made
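The division of labor between the two parts can be illustrated with a minimal Python sketch. This is not the JSAN API: the function names, the tiny feature set and the relative-gap ranking are all assumptions made for illustration. The first function extracts a few simple style features from a text, and the second ranks the features on which an author differs most from a target profile, which is roughly the kind of hint the paper's title alludes to.

```python
import re
from collections import Counter

# Hypothetical, tiny feature set chosen only for illustration.
FUNCTION_WORDS = ("the", "of", "and", "i", "to")


def extract_features(text):
    """Return a small dict of style features for one document."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    letters = [c for c in lowered if c.isalpha()]
    if not words or not letters:
        raise ValueError("text contains no words")
    word_counts = Counter(words)
    features = {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "letter_i_rate": letters.count("i") / len(letters),
    }
    for fw in FUNCTION_WORDS:
        features["word_rate_" + fw] = word_counts[fw] / len(words)
    return features


def suggest_changes(author_features, target_features, top_n=3):
    """Rank features by relative distance from the target profile.

    Returns (feature_name, direction) pairs, where direction says
    whether the author should decrease or increase that feature.
    """
    gaps = []
    for name, value in author_features.items():
        target = target_features[name]
        diff = value - target
        relative = abs(diff) / max(abs(target), 1e-9)
        gaps.append((relative, name, "decrease" if diff > 0 else "increase"))
    gaps.sort(reverse=True)
    return [(name, direction) for _, name, direction in gaps[:top_n]]
```

In use, `extract_features` would be run on the author's draft and on a pool of other writers' texts, and `suggest_changes` would then point at the features to adjust. A real attribution system uses far larger feature sets and a trained classifier rather than a simple gap ranking.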

In the small-scale user study (10 participants) reported in the award-winning paper, 80% were able to anonymize their documents to a limited extent. Modifying pre-written documents was found to be difficult, and the anonymization did not hold up against more extensive feature sets. However, the students point out:

“It is important to note that Anonymouth is only the first step toward a tool to achieve stylometric anonymity with respect to state-of-the-art authorship attribution techniques. The topic needs further exploration in order to accomplish significant anonymity.”

The JSAN framework is available on GitHub under a GNU AGPLv3 license for any developer who wants to combat the threat of stylometry.

 

More Information

Privacy, Security and Automation Lab

Use Fewer Instances of the Letter “i”: Toward Writing Style Anonymization

Anonymouth on GitHub


 


 

Last Updated ( Sunday, 04 August 2013 )