GitHub For Data Under Development
Written by Kay Ewbank
Monday, 24 February 2020
Gretel, which will enable developers to work collaboratively with data, comes from a team of engineers and developers who previously worked for the National Security Agency, Google and Amazon Web Services. The project addresses the need for realistic user data to work with, and provides a way for developers to share sensitive data in real time while maintaining data privacy.
The developers of Gretel, Alex Watson, John Myers, Ali Golshan and Laszlo Bock, say it works in real time to enable safe sharing and collaboration between developers and applications, and has tools that are "open, intelligent, and integrated". The team highlights the importance of developers being able to safely learn and experiment with data in order to support rapid innovation on behalf of customers:
"As developers, we don’t always need full access to sensitive customer data. We know that it’s often best to only select the data that you need for developing new features or exploring insights — especially if you can use your developer identity to access data in seconds, instead of spending weeks or months of waiting for compliance approval."
Gretel aims to meet this need by combining machine learning, synthetic data, and formal reasoning to offer provable privacy guarantees for data. By building privacy into developer workflows, the team says, Gretel can enable safe access to data within seconds of the time it is created, unlocking siloed data and opening the door for new ideas.
The synthetic data is fake data that follows the same patterns as real user data, but presumably is more realistic than the old "A. Person, 3, High Street, Sometown" variety that programmers usually resort to. Gretel uses machine learning to work out the categories of the data, and attaches as many classification tags to it as it can find. Those tags are then used to apply "differential privacy", which makes the data anonymous so that it doesn't match customer information.
This results in an entirely fake data set generated by machine learning. Alongside the data privacy aspects, the team is developing machine learning models to help developers make sense of their data, and to automate joining data with complementary open source datasets, private datasets, or anything in between. They say all Gretel services are available via simple APIs that integrate with developers’ existing workflows and tools.
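For readers unfamiliar with the term, the idea behind differential privacy can be illustrated with the textbook Laplace mechanism. This is only a minimal sketch of the general technique and not Gretel's implementation, which the article does not describe. The function names, the sample records and the epsilon value are all hypothetical. A count over the data is released with random noise scaled to 1/epsilon, so that adding or removing any one person's record changes the distribution of the published answer only slightly.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponential variables with
    # rate 1/scale follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy predicate.

    Hypothetical helper for illustration only. A counting query has
    sensitivity 1 (one person changes the count by at most 1), so
    Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Made-up records standing in for sensitive customer data.
    customers = [
        {"city": "Leeds", "age": 34},
        {"city": "York", "age": 51},
        {"city": "Leeds", "age": 29},
        {"city": "Leeds", "age": 62},
    ]
    noisy = private_count(customers, lambda c: c["city"] == "Leeds", epsilon=0.5)
    print(f"Noisy count of Leeds customers: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy, while a larger epsilon gives answers closer to the true count. Systems that generate synthetic data typically apply this kind of guarantee while training the generating model rather than to individual queries, but the trade-off is the same.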
Last Updated ( Tuesday, 25 February 2020 )