Human Genes Renamed To Please Excel
Written by Janet Swift
Friday, 07 August 2020
More than two dozen human genes have been renamed so that they can be typed into a spreadsheet without being formatted as dates. New guidelines for standardized gene naming explicitly allow for renaming genes to avoid problems with data handling.
The human genome has tens of thousands of unique genes. It was originally assumed to have more than 100,000, but this number has since been revised downwards. Giving each individual gene a meaningful name is seen as important for effective communication, and the fact that some genes have had to be renamed on account of Excel has attracted a great deal of attention. It was The Verge that initially carried this story, alerted by a tweet that drew attention to an extract from the newly published Guidelines for human gene nomenclature.
The Verge outlined the problem as follows:
"when a user inputs a gene's alphanumeric symbol into a spreadsheet, like MARCH1 -- short for "Membrane Associated Ring-CH-Type Finger 1" -- Excel converts that into a date: 1-Mar. This is extremely frustrating, even dangerous, corrupting data that scientists have to sort through by hand to restore. It's also surprisingly widespread and affects even peer-reviewed scientific work. One study from 2016 examined genetic data shared alongside 3,597 published papers and found that roughly one-fifth had been affected by Excel errors."
Elspeth Bruford, coordinator of the HUGO Gene Nomenclature Committee (HGNC), revealed to The Verge that so far the names of some 27 genes have been changed. She noted that while there has been some dissent about the decision, it was easier to rename human genes than it was to change how Excel works. In fact, HGNC had initially tried to change the way that geneticists used Excel, and last year posted a YouTube video that showed how to enter data in Excel so that it does not convert gene names to dates.
So, by changing gene names, are the geneticists now caving in when they should be asking Microsoft to fix the date formatting issues, which annoy other groups of users as well? The consensus, both among those commenting on The Verge's article and on Hacker News, which linked to it, is that eliminating names that look like dates is a sensible move. Excel is a useful tool for scientists across all disciplines to work with data, and while it is possible to "tame" Excel's autoformatting, this isn't foolproof, especially if you want to share spreadsheets with other users who have their own formatting options.
To us, it seems that this is the biggest case of the tail wagging the dog we have encountered in some time. It makes you wonder what would have happened if Excel had wielded such power in former times. Perhaps e=mc2 would have been E1=M1*C1*C1, or quark might have been autocorrected to quart.
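To make the date-conversion problem described above concrete, here is a minimal Python sketch that flags gene symbols a spreadsheet might read as dates before they are pasted into one. The month list and the pattern are assumptions chosen for illustration; they are a rough heuristic and not a reproduction of Excel's actual parsing rules. The function names `looks_like_date` and `flag_date_like` are hypothetical helpers, not part of any HGNC or Microsoft tool.

```python
import re

# Month names and abbreviations that a spreadsheet might treat as the start
# of a date. This list is an assumption for illustration only; it does not
# claim to match Excel's real date-recognition logic.
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# Longest alternatives first, so "MARCH" is tried before "MAR".
_MONTH_ALTERNATION = "|".join(sorted(_MONTHS, key=len, reverse=True))

# A month token, an optional separator, then a one- or two-digit number.
_DATE_LIKE = re.compile(
    rf"^(?:{_MONTH_ALTERNATION})[-/ ]?(\d{{1,2}})$",
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol resembles a month followed by a day (1-31)."""
    match = _DATE_LIKE.match(symbol.strip())
    if match is None:
        return False
    day = int(match.group(1))
    return 1 <= day <= 31


def flag_date_like(symbols: list[str]) -> list[str]:
    """Return the symbols that the heuristic considers date-like, in order."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    # MARCH1 is the example from the article; the others are sample inputs.
    candidates = ["MARCH1", "SEPT2", "DEC1", "TP53", "BRCA1"]
    for symbol in flag_date_like(candidates):
        print(f"{symbol} may be converted to a date by a spreadsheet")
```

A check like this only warns about risky symbols. It does not stop a spreadsheet from converting them, which is why formatting cells as text, or renaming the genes as HGNC has done, is still needed.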
More Information: Guidelines for human gene nomenclature
Related Articles: Calculating with Dates in Excel
Last Updated ( Friday, 07 August 2020 )