Knuth's 21st Not Christmas Tree Lecture
Thursday, 24 December 2015

This year's annual lecture by Donald Knuth lecture is a keeper. It is on a topic that will be understandable even if you don't know a lot about it - commafree codes. 

Donald Knuth should be known to you but just in case - he invented the TeX computer typesetting system, is the author of the monumental work The Art of Computer Programming and of many algorithms. He also has a strange sense of humour so look out for it when you watch the video.  

The tradition for the Christmas Tree Lecture that is given at Stanford University in December is that is related to something new about trees learned during the year. For the first time, in the 21st of the series, the talk is nothing to do with trees and instead is about commafree codes:

"A commafree code is a set of codewords that can be read easily without spaces or other delimiters between words.

In 1965, Willard Eastman discovered a beautiful but underappreciated way to construct commafree block codes of all odd lengths, over an infinite alphabet. Professor Knuth will explain his construction and its interesting connection to questions of iteration versus recursion."

Sounds easy or obvious even but a few attempts at creating such a thing will convince you that it is tricky. A more modern term is self-synchronizing code. Basically the idea is that you don't have to send markers within a code to divide up the code words because they can be separated without such markers. That is if you take a set of words and run them together without spaces or punctuation then it should be possible to separate them again if they form a comma free code. The key idea is that it shouldn't be possible to find one of the words as the result of putting two of the words together or within another word. 

For example, bear,ring, earring, ear is a very non-comma free code. Putting all the words together 

bearringearringear 

makes it very clear that you cannot uniquely separate them.

For example 

bearringearringear 

picks out a code word that shouldn't be picked out.

You can also add the condition that code words are all of the same length. A binary example of this is the two bit code 00,11. In any sequence of these code words you never get a spurious code word due to the concatenation of 00 or 11 e.g.  00111100000011 can be broken into code words in only one way no matter where you start from.

 

 

 If you are interested in this topic then the good news is that there is a pdf of the next pre-publication chunk of The Art of Computer Programming on Backtracking and it covers algorithms that apply to comma free codes:

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking

 

 

stanfordlogoforknuth

More Information

Computer Musings 

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking (pdf)

Related Articles

Donald Knuth's Christmas Tree Lecture 

Donald Knuth and the Art of Programming     

Another Chunk of The Art of Computer Programming 

How not to shuffle - the Knuth Fisher-Yates algorithm 

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin

 

Banner


Is PHP in Trouble?
10/04/2024

The April 2024 headline for the TIOBE Index, which ranks programming languages in terms of their popularity, reads, "Is PHP losing its mojo" asking this question because this month PHP has dropped out [ ... ]



Supersimple - Deep Insights From Data
02/04/2024

Announcing $2.2 Million in pre-seed funding, the Estonian startup Supersimple has launched an AI-native data analytics platform which combines a semantic data modeling layer with the ability to answer [ ... ]


More News

 

raspberry pi books

 

Comments




or email your comment to: comments@i-programmer.info

 

<ASIN:0201558025>

<ASIN:0321751043>

Last Updated ( Thursday, 24 December 2015 )