Latest Vulnerability Suggests Compilers Should Learn Unicode
Written by Mike James
Wednesday, 10 November 2021
There is a fuss at the moment about a security problem that could allow a Trojan to enter the code of any language. It is the "any language" part that seems to be the scary bit. What is going on?

In fact, it's all very simple. Unicode is great, with the possible exception of the existence of homoglyphs - that is, characters which look the same to a human but are encoded differently, for example where one is a single character and the other a composite of different partial characters. This causes no end of problems for simple tasks such as comparing two strings. It is also a potential security problem, but more on this in a moment.

The current security concern isn't about homoglyphs, but about being able to change the order of text using directionality overrides. We tend to think that text runs left to right, but of course some scripts run right to left. Unicode supports setting the direction of text, but it also allows you to embed control codes which modify the direction within a run of text. For example, LRI and RLI set a local Left to Right or Right to Left direction for the following characters until a PDI - Pop Directional Isolate - is encountered. So, for example:

RLI LRI a b c PDI LRI d e f PDI PDI

displays

d e f a b c

You can also use LRO and RLO to override the direction of all the text following. You can see that careful use of the direction codes could allow you to swap the order of words. If you are clever enough, or have the time to think deeply about how to use this to good effect, you could invent the following:

/*RLO } LRIif (isAdmin)PDI LRI begin admins only */

which, when you take the directions into account, displays as:

/* begin admins only */ if (isAdmin) {

This looks perfectly good for restricting access to just admins and, if this was in a pull request, you might well let it go.
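The direction codes used in this example are ordinary Unicode code points, so a review tool can make them visible rather than leaving them to the renderer. What follows is only a minimal sketch of that idea, not something taken from the paper or from any compiler: the names BIDI_CONTROLS, findBidiControls and revealBidiControls are hypothetical, and the sample string is the C example above written with escape sequences so that nothing in it is hidden.

```javascript
// Bidirectional control characters: the ones named in the article
// (LRI, RLI, PDI, LRO, RLO) plus the related embedding codes.
const BIDI_CONTROLS = new Map([
  [0x202a, "LRE"], // Left-to-Right Embedding
  [0x202b, "RLE"], // Right-to-Left Embedding
  [0x202c, "PDF"], // Pop Directional Formatting
  [0x202d, "LRO"], // Left-to-Right Override
  [0x202e, "RLO"], // Right-to-Left Override
  [0x2066, "LRI"], // Left-to-Right Isolate
  [0x2067, "RLI"], // Right-to-Left Isolate
  [0x2068, "FSI"], // First Strong Isolate
  [0x2069, "PDI"], // Pop Directional Isolate
]);

// Hypothetical helper: report every bidi control in the source text
// with its 1-based line and column (columns count code points).
function findBidiControls(source) {
  const hits = [];
  source.split("\n").forEach((line, lineIndex) => {
    let column = 0;
    for (const ch of line) {
      column += 1;
      const name = BIDI_CONTROLS.get(ch.codePointAt(0));
      if (name !== undefined) {
        hits.push({ line: lineIndex + 1, column, name });
      }
    }
  });
  return hits;
}

// Hypothetical helper: replace each control with a visible tag such as <RLO>,
// which shows the text in the order the compiler receives it.
function revealBidiControls(source) {
  let out = "";
  for (const ch of source) {
    const name = BIDI_CONTROLS.get(ch.codePointAt(0));
    out += name === undefined ? ch : `<${name}>`;
  }
  return out;
}

// The article's C example, with the control codes written as escapes.
const sample =
  "/*\u202E } \u2066if (isAdmin)\u2069 \u2066 begin admins only */";

console.log(revealBidiControls(sample));
console.log(findBidiControls(sample));
```

Run on the sample, revealBidiControls gives /*<RLO> } <LRI>if (isAdmin)<PDI> <LRI> begin admins only */, which is the logical order of the characters rather than the order a human sees on screen. A check of this kind could sit in a pre-commit hook or a pull request bot, and whether to reject such characters outright or merely flag them is a policy choice that the sketch leaves open.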
However, most compilers simply ignore control codes and the code that the compiler sees is:

/* } if (isAdmin) begin admins only */

What the compiler sees is code that has no if statement at all, because it is entirely inside the comment, and so it admits everyone to the admin section of the program.

Once you have seen this sort of thing it is relatively easy to think up other uses of direction codes and, while the example is in C, you can easily find similar mechanisms in other languages. Here is one in JavaScript:

if(accesslevel != "userRLO LRI// check if adminPDI LRI"){

which the user reads as:

if(accesslevel != "user"){// check if admin

but the compiler reads as:

if(accesslevel != "user// check if admin"){

i.e. what looks like a comment is part of the string, so accesslevel can never match it, the test is always true and every user is treated as an admin.

In fact, there probably isn't a language which isn't vulnerable to such tricks. Or is it the maintainer who is vulnerable? Or is it the compiler?

For my money it's the compiler that is the problem. Since when did compilers decide not to enter the 21st century and recognize that Unicode not only exists but can modify the meaning of a program? Compilers and other language tools either have to reject dangerous Unicode or they have to read it like a human would.

To quote from the paper by Nicholas Boucher and Ross Anderson from the University of Cambridge disclosing the problem:

"About half of the compiler maintainers we contacted during the disclosure period are working on patches or have committed to do so. As the others are dragging their feet, it is prudent to deploy other controls in the meantime where this is quick and cheap, or relevant and needful."

Not so much an exploit, more an oversight.

More Information
Trojan Source: Invisible Vulnerabilities - Nicholas Boucher, Ross Anderson

Related Articles
Open Source Insights Into The Software Supply Chain
New Spectre-Like Vulnerability - Is The Era Of Fast Clever Computers Over?
Rowhammer - Changing Memory Without Accessing It
ShellShock - Yet Another Code Injection Vulnerability
Heartbleed - The Programmer's View
Last Updated ( Wednesday, 10 November 2021 )