Google Open Sources Privacy Library
Written by Kay Ewbank   
Friday, 08 November 2024

Google is making a new differential privacy library available as open source. PipelineDP4j is a Java-based library that can be used to analyse data sets while preserving privacy.

Google has made other privacy-related libraries available as open source in recent years, including its Differential Privacy Library and its image blurring technology. These "privacy-enhancing technologies", aka PETs, are used to keep users' information anonymous and protected while still letting Google provide features such as autocorrect suggestions. Google has committed to making its PETs freely available via open source projects.

Google says differential privacy is a mathematical framework that allows datasets to be analysed in a privacy-preserving way. Google has used the technique in what it says is the largest application of differential privacy in the world, spanning close to three billion devices over the past year, in products such as Google Home, Google Search on Android, and Messages.

The technology is based on over six years of research on a "shuffler" model, which effectively shuffles data between "local" and "central" models to achieve more accurate analysis on larger data sets while still maintaining the strongest privacy guarantees.

The latest release is a version of PipelineDP for the Java Virtual Machine (JVM) called PipelineDP4j. PipelineDP is an OpenMined framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark and Apache Beam.

PipelineDP4j lets developers execute highly parallelizable computations using Java as the baseline language. Google says the JVM release means the library covers some of the most popular languages including Python, Java, Go, and C++, and developers can also use PipelineDP4j in Kotlin or Scala.

To achieve differential privacy, datasets need to meet a minimum threshold to ensure individuals' data isn't revealed. Internally, PipelineDP4j uses elements from the differential privacy library and combines them into an "out-of-the-box" solution that takes care of all the steps that are essential to differential privacy, including noise addition, partition selection, and contribution bounding. Google says this makes it preferable to using the lower-level differential privacy library directly, as PipelineDP4j can reduce implementation mistakes.
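To make the three steps concrete, here is a minimal sketch in plain Python. It is not PipelineDP4j's API: the function names, the `max_partitions` and `threshold` parameters, and the choice of Laplace noise are all assumptions made for illustration, and the threshold is not calibrated to any particular privacy guarantee.

```python
import random
from collections import defaultdict


def bound_contributions(records, max_partitions, rng):
    """Contribution bounding: keep at most max_partitions distinct
    partitions for each user (hypothetical helper, not a library call)."""
    per_user = defaultdict(set)
    for user_id, partition in records:
        per_user[user_id].add(partition)
    bounded = {}
    for user_id, partitions in per_user.items():
        ordered = sorted(partitions)
        if len(ordered) > max_partitions:
            ordered = rng.sample(ordered, max_partitions)
        bounded[user_id] = ordered
    return bounded


def laplace_noise(scale, rng):
    """The difference of two exponential draws is Laplace(0, scale)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_user_counts(records, epsilon, max_partitions, threshold, rng):
    """Count distinct users per partition, add noise, drop small partitions."""
    bounded = bound_contributions(records, max_partitions, rng)
    counts = defaultdict(int)
    for partitions in bounded.values():
        for partition in partitions:
            counts[partition] += 1
    # Noise addition: one user can change at most max_partitions counts
    # by 1 each, so the noise scale grows with that bound.
    scale = max_partitions / epsilon
    released = {}
    for partition in sorted(counts):
        noisy = counts[partition] + laplace_noise(scale, rng)
        # Partition selection: only release partitions whose noisy count
        # clears the (assumed, uncalibrated) threshold.
        if noisy >= threshold:
            released[partition] = noisy
    return released
```

Each record here is a `(user_id, partition)` pair, for example `("u1", "London")`. Getting the order of these steps or the noise scale wrong is exactly the kind of implementation mistake a higher-level library is meant to prevent.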

Alongside the library, Google is also releasing a library called DP-Auditorium that can be used to test whether a given mechanism violates a differential privacy guarantee. DP-Auditorium uses only samples from the mechanism itself, without requiring access to any internal properties of the application. Testing involves two steps: evaluating the privacy guarantee over a fixed dataset, and exploring datasets to find the worst-case guarantee. DP-Auditorium introduces interfaces for both, giving developers the means to check that the privacy guarantee is being met.
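The idea of auditing from samples alone can be illustrated with a rough sketch. This is not DP-Auditorium's interface or its testing method: it is a simple frequency comparison written for this article, and `noisy_count`, `estimated_epsilon`, the thresholds, and `min_count` are all hypothetical names and parameters. It runs a mechanism many times on two datasets that differ by one record and checks how far apart the output frequencies are. A test like this can flag a likely violation, but it cannot prove that a guarantee holds.

```python
import math
import random


def noisy_count(scale):
    """Build a mechanism that returns len(data) plus Laplace(0, scale) noise."""
    def mechanism(data, rng):
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        return len(data) + noise
    return mechanism


def estimated_epsilon(mechanism, data_a, data_b, thresholds, trials, rng,
                      min_count=50):
    """Largest observed |log ratio| between the two output distributions,
    over events of the form 'output <= t' and 'output > t'."""
    out_a = [mechanism(data_a, rng) for _ in range(trials)]
    out_b = [mechanism(data_b, rng) for _ in range(trials)]
    worst = 0.0
    for t in thresholds:
        below_a = sum(1 for x in out_a if x <= t)
        below_b = sum(1 for x in out_b if x <= t)
        events = ((below_a, below_b), (trials - below_a, trials - below_b))
        for count_a, count_b in events:
            # Skip rare events, where the ratio estimate is too noisy.
            if count_a >= min_count and count_b >= min_count:
                worst = max(worst, abs(math.log(count_a / count_b)))
    return worst


rng = random.Random(1)
data_a = [1] * 10          # neighbouring datasets: one record apart
data_b = [1] * 11
thresholds = [9, 10, 11, 12]

# Noise scale 1.0 matches a claimed epsilon of 1 for a count.
honest = estimated_epsilon(noisy_count(1.0), data_a, data_b, thresholds, 50000, rng)
# Noise scale 0.2 is too small for that claim, so the estimate overshoots.
broken = estimated_epsilon(noisy_count(0.2), data_a, data_b, thresholds, 50000, rng)
```

If the claimed guarantee is an epsilon of 1, an estimate well above 1, as `broken` gives here, is evidence that the mechanism adds too little noise.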

More Information

PipelineDP4j On GitHub

DP-Auditorium On GitHub

Related Articles

Google Open Sources Image Blurring

Google Open Sources Differential Privacy Library

Google Releases Open Source Cryptographic Tool

Chrome Cryptocode Generator Revealed 
