How Fast Does This Code Run?
Written by Mike James   
Tuesday, 14 January 2020

MIT researchers have trained a neural network to tell you how fast any code you present it with will run. Sounds fun, but why do we need it?

The Intel 4004 chip - you really knew what it was doing!

Back in the day it was possible to look at the output of a compiler, or some hand-crafted assembler, and have a reasonable idea how fast the code would run. You could do the job simply by counting clock cycles for each instruction in the particular version you were using. Then things got more complicated.

In an effort to speed things up, processors went in for out-of-order execution, branch prediction, caches, speculative execution, execution barriers, multiple cores and hyperthreading. Not only does that make it difficult to work out how many clock cycles any particular chunk of code will take to execute, the count isn't even a fixed and unchanging number - it varies according to whether branches are taken, what is in the cache, and so on.

So what to do about it?

The only reasonable solution is to run the code, gather statistically valid timing data over a large number of runs - and hope. This doesn't make optimizing code particularly easy, as each measurement cycle takes a long time. Now a team from MIT has trained a neural network to estimate the time that "basic blocks" of code, i.e. common snippets, take to execute on different architectures. The size of the sample data is impressive - 300,000 blocks taken from a range of different types of application. This is now available as BHive, an open source dataset.

What is surprising is that the resulting program, Ithemal, managed to predict running times on the latest Intel processors more accurately than hand-crafted models created by Intel itself - and you might suppose Intel knows its own processors. Typically Ithemal's error rate is 10%, while the hand-crafted models have an error rate of 20%.

The program might also have applications in creating optimizations for "black box" processors, i.e. devices whose exact design isn't known. The neural network has no idea what the structure of the processor is; instead it uses the code blocks to discover what the processor runs fast and what it runs slowly. It is an entirely empirical approach, and as such it is also immune to errors in the documentation, since it learns from real implementations.

Of course, being a black box in its own right, the neural network gives no clue as to why some code is faster than other code. There are no useful insights generated by the tool. The researchers see interpreting the neural network's output as an important next step. Perhaps it will be possible to distill the knowledge into human-understandable rules - such as "always put the most probable outcome on the non-branch path".

You can see that this is going to be useful for compiler designers as a way of working out how to optimize the generated code. I'm wondering if it captures any of the structure of the processor and whether this could be used to validate or improve the hardware. AI is certainly changing what software is all about.

simulation

More Information

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks Charith Mendis, Alex Renda, Saman Amarasinghe and Michael Carbin

Compiler Auto-Vectorization with Imitation Learning Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe and Michael Carbin

Related Articles

Bayou - AI To Help You Code

The AI In The IDE - IntelliCode In Visual Studio

Do AI, Automation and the No-Code Movement Threaten Our Jobs?

Ubisoft Applies AI To Code

DeepMind's Differentiable Neural Network Thinks Deeply

Neural Turing Machines Learn Their Algorithms

Learning To Be A Computer

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.




Last Updated ( Tuesday, 14 January 2020 )