Kaggle Contest To Detect Chatbot Essays
Written by Sue Gee   
Friday, 03 November 2023

As LLMs like ChatGPT rapidly improve their ability to generate text similar to human-written content, educators have very real concerns about how to distinguish between students' own work and that generated with undue help from artificial intelligence. A Kaggle contest has just launched to detect whether an essay was written by a student or an LLM.


With its community of over 15 million members, Kaggle is the obvious place to turn to for a machine-learning approach to authenticating the work undertaken by conscientious students and to deterring this new method of cheating. Kagglers seem enthusiastic to tackle the problem: there are already 320 teams, mostly individuals, making submissions. With almost 3 months to go before the Final Submission Deadline, there's plenty of time to join in.

The contest comes from Vanderbilt University and the Learning Agency Lab with financial support from the Bill & Melinda Gates Foundation, Schmidt Futures, and the Chan Zuckerberg Initiative.

The challenge is to develop a machine learning model that can accurately detect whether an essay was written by a student or an LLM.
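To give a flavour of what such a detector involves, here is a minimal sketch of a word-count naive Bayes classifier in plain Python. It is not the competition's baseline or any participant's method. The class name `NaiveBayesDetector`, the label convention (0 for student, 1 for LLM) and the four toy training sentences are all invented for illustration, and a serious entry would need far more data and a stronger model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesDetector:
    """Hypothetical multinomial naive Bayes over word counts (add-one smoothing)."""

    def fit(self, texts, labels):
        # Assumed label convention: 0 = student-written, 1 = LLM-generated
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def _log_score(self, tokens, label):
        # Log prior for the class plus smoothed log likelihood of each known word
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are skipped
                score += math.log((self.word_counts[label][tok] + 1) / denom)
        return score

    def predict_proba(self, text):
        """Return the estimated probability that the text is LLM-generated."""
        tokens = tokenize(text)
        s0 = self._log_score(tokens, 0)
        s1 = self._log_score(tokens, 1)
        # Normalise the two log scores, subtracting the max for stability
        m = max(s0, s1)
        e0, e1 = math.exp(s0 - m), math.exp(s1 - m)
        return e1 / (e0 + e1)


# Toy examples made up for this sketch; they are not from the competition data
train_texts = [
    "i think cars are bad cuz they make alot of smog",
    "my mom says we should walk more and i agree",
    "in conclusion, limiting car usage offers numerous benefits",
    "furthermore, it is important to note the numerous advantages",
]
train_labels = [0, 0, 1, 1]

detector = NaiveBayesDetector().fit(train_texts, train_labels)
p_llm_like = detector.predict_proba("it is important to note the numerous benefits")
p_student_like = detector.predict_proba("i think we should walk more")
print(f"LLM-like text:     {p_llm_like:.3f}")
print(f"student-like text: {p_student_like:.3f}")
```

On this toy data the first score comes out above 0.5 and the second below it, but only because the test phrases reuse the training vocabulary. The interesting part of the contest is doing this reliably on essays and prompts the model has never seen.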

The competition dataset comprises about 10,000 essays. All of the essays were written in response to one of seven essay prompts. In each prompt, the students were instructed to read one or more source texts and then write a response. This same information may or may not have been provided as input to an LLM when generating an essay. The competition blurb states: 

Essays from two of the prompts compose the training set; the remaining essays compose the hidden test set. Nearly all of the training set essays were written by students, with only a few generated essays given as examples. You may wish to generate more essays to use as training data.

In fact, one of the participants has already made additional AI-generated essays available.

This is a Code Competition and submissions must be made through either a CPU or a GPU Notebook and require no more than 9 hours of runtime.

The prize pool of $110,000 will be divided between Leaderboard Prizes, awarded for predictive performance, and Efficiency Prizes, where the runtime required for a submission is also evaluated and submissions are restricted to CPU only. Winning a Leaderboard Prize does not preclude you from winning an Efficiency Prize. For both prizes, 1st Place wins $20,000.

While the immediate concern of the competition is to identify essays written using LLMs in a middle-school or high-school context, more broadly the models participants devise will help identify telltale LLM artifacts and advance the state of the art in LLM text detection overall.

 


 

More Information

LLM - Detect AI Generated Text

Related Articles

Vesuvius Challenge - Progress and Prizes

AI Village Capture The Flag

Kaggle Enveloped By Google Cloud


Last Updated ( Friday, 03 November 2023 )