A Kaggle contest sponsored by the ATLAS Experiment at CERN is underway to explore the potential of advanced machine learning methods in detecting the Higgs boson.
The HiggsML challenge started on May 12th and runs until September 15th. There are already 658 teams (701 people in total) taking part but there's plenty of time left to join in this interesting and topical contest where HEP (High Energy Physics) meets ML (Machine Learning).
To make an effective contribution no knowledge of particle physics is required; instead you need a good grasp of classification and statistical methods.
ATLAS is the particle physics experiment taking place at the Large Hadron Collider at CERN that searches for new particles and processes using head-on collisions of protons of extraordinarily high energy. It has recently observed a signal of the Higgs boson decaying into two tau particles, but this decay is a small signal buried in background noise. The challenge aims "to improve the discovery significance of the experiment" by using advanced machine learning methods to classify events in simulated data as either "tau tau decay of a Higgs boson" or "background".
The supplied data consists of:
training.csv - Training set of 250000 events, with an ID column, 30 feature columns, a weight column and a label column
test.csv - Test set of 550000 events with an ID column and 30 feature columns
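To make the file layout concrete, here is a minimal sketch of reading a file shaped like training.csv by column position. It assumes only what is stated above (an ID column, then 30 feature columns, then a weight column and a label column, in that order). The column order, the header names used in the synthetic sample, and the "s"/"b" label values are assumptions made for illustration, so check them against the real file before relying on this.

```python
import csv
import io

N_FEATURES = 30  # number of feature columns, per the data description


def split_training_rows(handle):
    """Split each CSV row into (event_id, features, weight, label) by position.

    Assumes the column order: ID, 30 features, weight, label.
    """
    reader = csv.reader(handle)
    header = next(reader)
    expected = 1 + N_FEATURES + 2
    if len(header) != expected:
        raise ValueError(f"expected {expected} columns, found {len(header)}")
    events = []
    for row in reader:
        event_id = row[0]
        features = [float(value) for value in row[1:1 + N_FEATURES]]
        weight = float(row[1 + N_FEATURES])
        label = row[2 + N_FEATURES]
        events.append((event_id, features, weight, label))
    return header, events


def make_synthetic_csv(n_rows=3):
    """Build a tiny in-memory CSV with hypothetical column names and values."""
    header = ["id"] + [f"f{i}" for i in range(N_FEATURES)] + ["weight", "label"]
    lines = [",".join(header)]
    for r in range(n_rows):
        features = [str(0.1 * (r + i)) for i in range(N_FEATURES)]
        label = "s" if r % 2 == 0 else "b"  # placeholder label values
        lines.append(",".join([str(100000 + r)] + features + ["1.5", label]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    header, events = split_training_rows(io.StringIO(make_synthetic_csv()))
    print(len(header), "columns,", len(events), "events")
```

For the real file, the same function could be called with an open file handle, for example `open("training.csv", newline="")`. The test file has no weight or label columns, so it would need a separate reader.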
Participants are also provided with a sample submission file in the required format and a Python script to calculate the competition evaluation metric. Other software that is available includes a MultiBoost benchmark script and a Go starting kit.
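The supplied evaluation script is not reproduced here. As a rough sketch only, the code below assumes that the metric is the approximate median significance (AMS) described in the challenge documentation, computed from the weighted counts of true signal events (s) and background events (b) that a submission classifies as signal, with a regularization term of 10. The function names, the "s"/"b" label values and the default regularization value are assumptions for illustration, not the organizers' code.

```python
import math


def ams(s, b, b_reg=10.0):
    """Approximate median significance (assumed form of the metric).

    s: weighted sum of true signal events selected as signal
    b: weighted sum of background events selected as signal
    b_reg: regularization term (assumed to be 10)
    """
    if s < 0 or b < 0:
        raise ValueError("s and b must be non-negative")
    inner = (s + b + b_reg) * math.log(1.0 + s / (b + b_reg)) - s
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(2.0 * max(inner, 0.0))


def weighted_counts(predictions, labels, weights):
    """Sum the weights of events predicted as signal, split by true label."""
    s = 0.0
    b = 0.0
    for predicted, actual, weight in zip(predictions, labels, weights):
        if predicted != "s":
            continue  # only events selected as signal contribute
        if actual == "s":
            s += weight
        else:
            b += weight
    return s, b


if __name__ == "__main__":
    preds = ["s", "s", "b", "s"]
    truth = ["s", "b", "s", "s"]
    weights = [2.0, 5.0, 1.0, 3.0]
    s, b = weighted_counts(preds, truth, weights)
    print(s, b, ams(s, b))
```

The weights matter because each simulated event stands in for a different expected number of real events, so a plain count of correct classifications would not reflect the significance the experiment would observe.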
The competition rules are on Kaggle. They include a limit on team size of 4 members and only being able to submit from a single Kaggle account.
Contestants can make up to 5 submissions a day and a public leader-board is maintained. This is calculated on around 18% of the test data. The final result will be based on the other 82% and contestants can select up to 2 final submissions for judging. Cash prizes of $7,000; $4,000; and $2,000 will be awarded to the top three on the private leader-board.
There is also a HEP meets ML Award that will be given to the team that creates a model that is most useful for the ATLAS experiment, as judged by the ATLAS collaboration members on the organizing committee.
Such a model won't necessarily be the top performer on the leader-board but will be one that optimizes accuracy, shows simplicity and straightforwardness of approach, has reasonable performance requirements (CPU and memory demands), and is robust with respect to a lack of training statistics. The winning team will be invited to meet the ATLAS collaboration physicists at CERN, with up to $2,700 to cover their travel expenses.
There may also be the opportunity for strong performers in the contest to contribute to a NIPS Workshop in Montreal. To be eligible for consideration teams must submit their model (the code, with reasonably detailed information on the principles behind it) by the competition deadline.