Kaggle Contest To Detect Chatbot Essays

Written by Sue Gee

Friday, 03 November 2023

As LLMs like ChatGPT rapidly improve their ability to generate text similar to human-written content, educators have very real concerns about how to distinguish between students' own work and work generated with undue help from artificial intelligence. A Kaggle contest has just launched to detect whether an essay was written by a student or an LLM.

With its community of over 15 million members, Kaggle is the obvious place to turn to for a machine-learning approach to authenticating the work undertaken by conscientious students and deterring this new method of cheating. Kagglers seem enthusiastic to tackle the problem: there are already 320 teams, mostly individuals, making submissions. With almost 3 months to go before the Final Submission Deadline, there's plenty of time to join in.

The contest comes from Vanderbilt University and the Learning Agency Lab, with financial support from the Bill & Melinda Gates Foundation, Schmidt Futures, and the Chan Zuckerberg Initiative.

The challenge is to develop a machine learning model that can accurately detect whether an essay was written by a student or an LLM.

The competition dataset comprises about 10,000 essays. All of the essays were written in response to one of seven essay prompts. In each prompt, the students were instructed to read one or more source texts and then write a response. This same information may or may not have been provided as input to an LLM when generating an essay. The competition blurb states:

Essays from two of the prompts compose the training set; the remaining essays compose the hidden test set. Nearly all of the training set essays were written by students, with only a few generated essays given as examples. You may wish to generate more essays to use as training data.

In fact, one of the participants has already made additional AI-generated essays available.
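To give a flavor of the task, here is a minimal sketch of a baseline detector: a small naive Bayes classifier over word counts, written in plain Python. It is not the approach of the organizers or of any participant, and serious entries would likely rely on far stronger models. The four toy essays, their labels and the `NaiveBayesDetector` name are all invented for illustration and are not taken from the competition data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesDetector:
    """Tiny multinomial naive Bayes over word counts (illustrative only)."""

    def fit(self, essays, labels):
        # Label convention (an assumption): 0 = student-written, 1 = LLM-generated.
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(essays, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def _log_score(self, tokens, label):
        # Log prior plus summed log likelihoods, with add-one smoothing.
        total = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def predict_proba(self, text):
        """Return the estimated probability that the essay is LLM-generated."""
        tokens = tokenize(text)
        s0 = self._log_score(tokens, 0)
        s1 = self._log_score(tokens, 1)
        # Normalize in log space to avoid underflow on long essays.
        m = max(s0, s1)
        e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
        return e1 / (e0 + e1)


# Invented toy data, standing in for real student and generated essays.
toy_essays = [
    "i think cars are bad cuz they make smog and stuff",
    "my opinion is that we should of kept the electoral college",
    "In conclusion, it is important to note that limiting car usage offers numerous benefits",
    "Furthermore, it is important to note that the electoral college plays a crucial role",
]
toy_labels = [0, 0, 1, 1]

detector = NaiveBayesDetector().fit(toy_essays, toy_labels)
print(detector.predict_proba("it is important to note that voting offers numerous benefits"))
```

With so few generated examples in the real training set, a model like this would have little to learn from, which is why adding extra generated essays to the training data is suggested.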

This is a Code Competition and submissions must be made through either a CPU or a GPU Notebook and require no more than 9 hours of runtime.

The prize pool of $110,000 will be divided between Leaderboard Prizes, awarded for predictive performance, and Efficiency Prizes, where the runtime required for a submission is also evaluated and submissions are restricted to CPU only. Winning a Leaderboard Prize does not preclude you from winning an Efficiency Prize. For both prizes, 1st Place wins $20,000.

While the immediate concern of the competition is to identify essays written using LLMs in a middle-school or high-school context, in a broader context the models participants devise will help identify telltale LLM artifacts and advance the state of the art in LLM text detection overall.

More Information

LLM - Detect AI Generated Text

Vesuvius Challenge - Progress and Prizes

AI Village Capture The Flag

Kaggle Enveloped By Google Cloud

Last Updated ( Friday, 03 November 2023 )
