GitHub Updates Issues Picker
Written by Alex Denham   
Friday, 28 February 2020

GitHub has updated a tool that identifies relatively easy tasks on open source projects, the kind that make a good starting point for new contributors. The tool combines a machine learning model trained to identify easy issues with an associated list put together by project maintainers.

The suggestions are listed as beginner-friendly issues in the 'contribute' section for projects on GitHub. The facility first became available last year, when recommendations were based on labels applied to issues by project maintainers. The GitHub team analyzed its data and came up with a list of about 300 label names used by popular open source repositories that described either "good first issues" or "documentation". This search found suitably labelled issues in around 40 percent of repositories.
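To make the label-based approach concrete, here is a minimal sketch of matching an issue's labels against a curated list. The three label names, the normalization rule and the function names are illustrative assumptions; GitHub's actual list of about 300 names and its matching logic are not reproduced here.

```python
# Hypothetical subset of beginner-friendly label names (assumption for
# illustration only; the real curated list has about 300 entries).
BEGINNER_LABELS = {"good first issue", "documentation", "beginner friendly"}


def normalize(label: str) -> str:
    """Lowercase a label and treat hyphens/underscores as spaces."""
    cleaned = label.lower().replace("-", " ").replace("_", " ")
    return " ".join(cleaned.split())


def is_beginner_friendly(issue_labels) -> bool:
    """Return True if any label on the issue matches the curated list."""
    return any(normalize(label) in BEGINNER_LABELS for label in issue_labels)


if __name__ == "__main__":
    print(is_beginner_friendly(["Good-First-Issue", "bug"]))
    print(is_beginner_friendly(["bug", "performance"]))
```

A scheme like this can only surface issues that maintainers have already labelled, which is consistent with the limited coverage reported for the first version.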


The updated version identifies issues suitable for beginners in about 70 percent of repositories. This greater coverage has been achieved using a machine learning model that automatically infers labels for hundreds of thousands of candidate samples. Discussing the updated version, GitHub's Tiferet Gazit said:

"There is a tradeoff between coverage and accuracy, which is the typical precision and recall tradeoff found in any ML product. To prevent the feed from being swamped with false positive detections, we aim for extremely high precision at the cost of recall. This is necessary because only a tiny minority of all issues are good first issues."
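The tradeoff described in the quote can be illustrated with a small sketch that measures precision and recall at two score thresholds. The scores and labels below are invented toy data, and the function name is hypothetical; nothing here reflects GitHub's real classifier or its thresholds.

```python
def precision_recall(scored, threshold):
    """Compute (precision, recall) for a list of (score, is_positive) pairs.

    An issue is recommended when its score is at or above the threshold.
    """
    tp = sum(1 for score, positive in scored if score >= threshold and positive)
    fp = sum(1 for score, positive in scored if score >= threshold and not positive)
    fn = sum(1 for score, positive in scored if score < threshold and positive)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (an assumption for illustration): classifier score, and whether
# the issue really is a good first issue.
SAMPLES = [
    (0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False),
    (0.40, True), (0.30, False), (0.20, False), (0.10, False), (0.05, False),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.85):
        precision, recall = precision_recall(SAMPLES, threshold)
        print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.5 to 0.85 lifts precision from 0.60 to 1.00 while recall drops from 0.75 to 0.50: fewer false positives reach the feed, at the price of missing some genuinely suitable issues.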

Going forward, the aim is to improve the surfaced issue recommendations by iterating on the training data, training pipeline, and classifier models. The team is also adding better signals to repository recommendations to help users find and get involved with the best projects related to their interests. They also plan to add a mechanism for maintainers and triagers to approve or remove ML-based recommendations in their repositories.


More Information

GitHub

Related Articles

GitHub Adds New Code Security Features

GitHub Acquires Pull Panda

Counting Vulnerabilities In Open Source Projects and Programming Languages

Don't Neglect Open Source Security

GitHub Sponsors - Money For Open Source

GitHub Bug Bounty Program Expanded In Scope and Reward 

Microsoft GitHub - What's Different 

 
