GitHub For Data Under Development
Written by Kay Ewbank   
Monday, 24 February 2020

Gretel, which will enable developers to work collaboratively with data, comes from a team made up of engineers and developers who previously worked for the National Security Agency, Google and Amazon Web Services.

Their project addresses the problem of needing realistic user data to work with and provides a way for developers to share sensitive data in real time while maintaining data privacy.


The developers of Gretel, Alex Watson, John Myers, Ali Golshan and Laszlo Bock, say it works in real time to enable safe sharing and collaboration between developers and applications, and has tools that are "open, intelligent, and integrated".

The team highlights the importance of developers being able to safely learn and experiment with data in order to support rapid innovation on behalf of customers:

"As developers, we don’t always need full access to sensitive customer data. We know that it’s often best to only select the data that you need for developing new features or exploring insights — especially if you can use your developer identity to access data in seconds, instead of spending weeks or months of waiting for compliance approval."

Gretel's solution is to meet this need using a combination of machine learning, synthetic data, and formal reasoning to offer provable privacy guarantees for data. By building privacy into developer workflows, Gretel can enable safe access to data within seconds of the time it is created, unlocking siloed data and opening the door for new ideas.

The synthetic data is fake data that follows the same patterns as real user data, but is presumably more realistic than the old "A. Person, 3, High Street, Sometown" variety that programmers usually resort to. Gretel uses machine learning to work out the categories of the data, and labels it with as many tags as it can find. Those tags are then used to apply "differential privacy" to make the data anonymous so it doesn't match customer information. This results in an entirely fake data set generated by machine learning.
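For readers unfamiliar with the term, the idea behind differential privacy can be sketched in a few lines of Python. This is not Gretel's implementation, which the team has not described in detail; it is a minimal illustration of the Laplace mechanism, a standard textbook way to answer a counting query with a privacy guarantee. The function names, the sample records and the epsilon value are all made up for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-centred Laplace distribution.

    The difference of two independent exponential samples with
    mean `scale` is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records, invented purely for illustration.
records = [{"age": 34}, {"age": 51}, {"age": 29}, {"age": 62}]

# Smaller epsilon means more noise and stronger privacy.
noisy = private_count(records, lambda r: r["age"] > 40, epsilon=0.5)
print(f"Noisy count of people over 40: {noisy:.2f}")
```

The true answer here is 2, but each run prints a slightly different value, so the result gives little away about whether any one individual is in the data set. Generating a whole synthetic data set with such guarantees is a considerably harder problem than this single query.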

Alongside the data privacy aspects, the team is developing machine learning models to help developers make sense of their data, and to automate joining data with complementary open source datasets, private datasets, or anything in between. They say all Gretel services are available via simple APIs that integrate with developers’ existing workflows and tools. 


More Information

Gretel Homepage


 


Last Updated ( Tuesday, 25 February 2020 )