COVID Results Skewed By Faulty Data Import
Written by Alex Denham
Monday, 05 October 2020
The official number of coronavirus cases in the UK was under-reported by nearly 16,000 in recent days because of a data import error. As well as skewing the figures, the error meant that people who had tested positive weren't notified, so their contacts weren't traced either.
Public Health England (PHE), a UK government agency, said that 15,841 cases between 25 September and 2 October were left out of the UK daily case figures. The missing cases were added back at the weekend, causing an apparent spike in case numbers. The problem has now been resolved, according to PHE. Its interim chief executive, Michael Brodie, said that a "technical issue" was identified overnight on Friday, 2 October in the process that transfers Covid-19 positive lab results into reporting dashboards. It was caused by some of the data files that report positive test results exceeding the maximum file size.
News outlets and social media have reported that the problem arose when an Excel spreadsheet reached its maximum size, meaning no further rows could be added. In this scenario, labs carrying out Covid tests automatically enter their results into files, which are then sent to a central PHE facility to be collated. Because an Excel worksheet is limited in the number of rows it can hold, while a CSV file isn't, opening a large CSV file in Excel cuts off any rows beyond the Excel maximum.
If that was the case, it would be quite shocking that a government body was trying to run a major data analysis on a spreadsheet. I'm not saying it wouldn't happen and doesn't happen, but for something of this magnitude? A (hopefully more likely) view is that what actually happened was that a script importing CSV data into something other than Excel timed out. The sources reporting this say the fix was simply to set the timeout parameter to something suitably massive.
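
To put numbers on the row-limit scenario described above, here is a minimal Python sketch. The two limits are Excel's documented worksheet maximums; the function names and the sample data are invented for illustration and say nothing about how PHE's actual pipeline works.

```python
import csv
import io

# Documented worksheet row limits for Excel's two file formats.
XLSX_MAX_ROWS = 1_048_576  # .xlsx (Excel 2007 and later)
XLS_MAX_ROWS = 65_536      # legacy .xls


def count_csv_rows(stream):
    """Count the records in a CSV stream, including any header row."""
    return sum(1 for _ in csv.reader(stream))


def rows_lost_in_excel(row_count, max_rows=XLSX_MAX_ROWS):
    """Return how many rows would not fit on a single worksheet."""
    return max(0, row_count - max_rows)


# Hypothetical data: a header plus 70,000 made-up result rows.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["test_id", "result"])
for i in range(70_000):
    writer.writerow([i, "positive"])
sample.seek(0)

total = count_csv_rows(sample)
print("rows in CSV:", total)
print("rows that would not fit in .xls:", rows_lost_in_excel(total, XLS_MAX_ROWS))
print("rows that would not fit in .xlsx:", rows_lost_in_excel(total))
```

A check like this before the file is handed to a spreadsheet turns a quiet loss of rows into a number that someone can see and act on.
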
The Press Association reports that the data files have been split into several smaller subfiles to overcome the problem. Whichever version is correct, the problem shouldn't recur. Either way, it's a reminder to developers everywhere: error trapping and reporting can make the difference between a private 'aargh, let's run that again' and far-too-public reproaches.
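
As an illustration of both ideas, splitting a file into subfiles and trapping errors loudly, here is a minimal Python sketch. It is an assumption about what such a safeguard could look like, not a description of PHE's fix; `split_csv`, `verify_split` and `RowCountMismatch` are hypothetical names, and the data is made up.

```python
import csv
import io


class RowCountMismatch(Exception):
    """Raised when the subfiles do not account for every source row."""


def count_data_rows(text):
    """Count the rows in CSV text, excluding the header row."""
    return sum(1 for _ in csv.reader(io.StringIO(text))) - 1


def split_csv(text, chunk_size):
    """Split CSV text into smaller CSV texts, repeating the header in each."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = list(reader)
    parts = []
    for start in range(0, len(rows), chunk_size):
        out = io.StringIO()
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(rows[start:start + chunk_size])
        parts.append(out.getvalue())
    return parts


def verify_split(source_text, parts):
    """Fail loudly if the subfiles hold a different number of rows."""
    expected = count_data_rows(source_text)
    actual = sum(count_data_rows(part) for part in parts)
    if actual != expected:
        raise RowCountMismatch(f"source has {expected} rows, subfiles have {actual}")


# Hypothetical data: a header plus ten made-up result rows.
source = "id,result\n" + "".join(f"{i},positive\n" for i in range(10))
parts = split_csv(source, chunk_size=4)
verify_split(source, parts)  # every row accounted for, so no exception

try:
    verify_split(source, parts[:-1])  # simulate a subfile going missing
except RowCountMismatch as err:
    print("import halted:", err)
```

The point of the separate `verify_split` step is that it recounts the rows independently of the code that wrote them, so a lost or truncated subfile stops the import instead of passing unnoticed.
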
Last Updated (Monday, 05 October 2020)