LinkedIn Open Sources Feathr Machine Learning Feature Store
Written by Kay Ewbank   
Friday, 22 April 2022

LinkedIn has made Feathr open source. Feathr is the feature store LinkedIn built to simplify machine learning feature management and improve developer productivity.

The developers say that at LinkedIn dozens of applications use Feathr to define features, compute them for training, deploy them in production, and share them across teams.

Feathr was developed to address a problem LinkedIn faced: preparing and managing features derived from raw data sources for use by machine learning models. LinkedIn has hundreds of ML models running in applications like Search, Feed, and Ads, and those models are powered by thousands of features about entities.

Preparing and managing features for use by those ML applications is difficult and takes time. Feature preparation pipelines are made up of the systems and workflows that transform raw data into features for model training and inference. The pipelines bring together time-sensitive data, potentially from multiple sources. The resulting features are then joined to training labels, stored, and used by the ML applications.

Feathr makes feature preparation pipelines easier to create. It is an abstraction layer that provides a common feature namespace for defining features and a common platform for computing, serving, and accessing them “by name” from within ML workflows.

Feathr can be used to define features based on raw data sources, including time-series data, using simple APIs. Once the features have been defined, they can be accessed by name during model training and model inference. Features can also be shared across teams.
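To give a feel for what defining a feature over time-series data can mean, here is a minimal sketch in plain Python. It does not use Feathr's API; the function name `trailing_sum`, the event layout, and the window convention (start exclusive, end inclusive) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical raw events for one entity: (timestamp, value).
Event = Tuple[int, float]


def trailing_sum(events: List[Event], as_of: int, window: int) -> float:
    """Sum the values of events with as_of - window < timestamp <= as_of.

    This is one common shape of time-series feature: an aggregate over a
    trailing window that ends at the moment the feature is requested.
    """
    return sum(value for ts, value in events if as_of - window < ts <= as_of)


# Illustrative data only.
events = [(1, 2.0), (3, 1.0), (8, 4.0), (10, 3.0)]

# Feature value as of time 10, looking back 7 time units.
feature_value = trailing_sum(events, as_of=10, window=7)
```

With this data only the events at timestamps 8 and 10 fall inside the window, so the event at timestamp 3 does not contribute.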

Feathr automatically computes feature values and joins them to training data, using point-in-time-correct semantics to avoid data leakage. It also supports deploying features for use online in production.
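The idea behind a point-in-time-correct join can be sketched in a few lines of plain Python. This is not Feathr's implementation or API; the helper `value_as_of`, the sample data, and the choice to treat a value stamped exactly at the label time as visible are assumptions for illustration.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical feature history for one entity: (timestamp, value),
# sorted by timestamp.
feature_history: List[Tuple[int, float]] = [(1, 10.0), (5, 12.0), (9, 20.0)]


def value_as_of(history: List[Tuple[int, float]], ts: int) -> Optional[float]:
    """Return the latest feature value recorded at or before ts, else None.

    Values recorded after ts are never returned, which is what keeps
    future information from leaking into a training row.
    """
    timestamps = [t for t, _ in history]
    idx = bisect_right(timestamps, ts)
    return history[idx - 1][1] if idx > 0 else None


# Hypothetical training labels: (label timestamp, label).
labels = [(0, 0), (4, 1), (5, 0), (10, 1)]

# Join each label to the feature value that was known at label time.
training_rows = [
    (ts, value_as_of(feature_history, ts), label) for ts, label in labels
]
```

A naive join that always took the most recent feature value (20.0 here) would attach it to every label, including those observed at times 0, 4, and 5, when that value did not yet exist. That is the data leakage the point-in-time semantics are meant to prevent.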

Feathr’s abstraction creates producer and consumer personas for features. Producers define features and register them in Feathr, and consumers import groups of features into their ML model workflows.

For the consumer, Feathr acts like a software package management tool for ML features. Feathr lets feature-consumers list the names of the features they want to “import” in their model, abstracting the nontrivial details about how they are sourced and computed.
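The producer and consumer split can be illustrated with a toy registry in plain Python. This is only a sketch of the idea of accessing features by name; the class `ToyFeatureRegistry`, its methods, and the feature names are hypothetical and are not Feathr's actual API.

```python
from typing import Callable, Dict, List


class ToyFeatureRegistry:
    """A toy shared namespace mapping feature names to their definitions."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[dict], float]] = {}

    def register(self, name: str, compute: Callable[[dict], float]) -> None:
        """Producer side: publish a feature definition under a unique name."""
        if name in self._features:
            raise ValueError(f"feature already registered: {name}")
        self._features[name] = compute

    def get_features(self, names: List[str], raw_record: dict) -> Dict[str, float]:
        """Consumer side: request features by name, without knowing how
        each one is sourced or computed."""
        return {name: self._features[name](raw_record) for name in names}


registry = ToyFeatureRegistry()

# Producer: define and register features (hypothetical names and logic).
registry.register("member_connection_count", lambda r: float(len(r["connections"])))
registry.register("member_has_photo", lambda r: 1.0 if r.get("photo_url") else 0.0)

# Consumer: list the feature names to "import" into a model workflow.
raw_record = {"connections": ["a", "b", "c"], "photo_url": None}
feature_vector = registry.get_features(
    ["member_connection_count", "member_has_photo"], raw_record
)
```

The consumer code mentions only feature names, so a producer could change how a feature is computed without the consumer's request changing.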

Feathr is available on GitHub now.

More Information

Feathr On GitHub

Last Updated ( Friday, 22 April 2022 )