# Information Theory and the Number of Unique Tweets

By Deane Barker on March 15, 2013

I gotta say, I was pretty amazed to find this article over at XKCD that attempts to answer the question “How many unique English tweets are possible?”

This is interesting, primarily because I’m just coming off a very brief but intense fling with information and communication theory.  In the span of three weeks, I read the following:

In all three, I got introduced to Claude Shannon, a mathematician who invented the field of information theory: the science, essentially, of what comprises information and how it gets communicated.  He wrote a paper in 1948 which more or less defined the field.

This field necessarily got all mathematical once electrical communication like the telegraph showed up, because the goal was to cram the most information into the least bandwidth: communicate the most with the least.

This necessarily leads to very philosophical questions about what information actually is and how it goes from random data to coherence.  Think about it: random data is communicating…random data.  Not meaning.  Not information, really.  From the article:

> For example, “Hi, I’m Mxyztplk” is a grammatically valid sentence if your name happens to be Mxyztplk. (Come to think of it, it’s just as grammatically valid if you’re lying.) Clearly, it doesn’t make sense to count every string that starts with “Hi, I’m …” as a separate sentence. To a normal English speaker, “Hi, I’m Mxyztplk” is basically indistinguishable from “Hi, I’m Mxzkqklt”, and shouldn’t both count. But “Hi, I’m xPoKeFaNx” is definitely recognizably different from the first two, even though “xPoKeFaNx” isn’t an English word by any stretch of the imagination.

It also depends greatly on the context of the receiver – how much do they know about what you’re going to say?  What can they extrapolate from the data to form meaning?

> information is fundamentally tied to the recipient’s uncertainty about the message’s content and their ability to predict it in advance.
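To make the link between uncertainty and information a bit more concrete, here is a minimal Python sketch of Shannon's entropy formula, H = sum of p * log2(1/p), applied to the characters of a string. The function name `entropy_bits_per_char` and the sample strings are my own for illustration. It only looks at single-character frequencies and ignores context entirely, so treat it as a toy and not as the method the XKCD article uses.

```python
import math
from collections import Counter


def entropy_bits_per_char(text: str) -> float:
    """Empirical Shannon entropy of the character frequencies, in bits per character.

    Illustrative helper (hypothetical name): it treats every character as
    independent, so it ignores context such as spelling and grammar.
    """
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = sum over symbols of p * log2(1 / p), where p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A fully predictable message carries no information per character.
print(entropy_bits_per_char("aaaaaaaa"))  # 0.0
# Two equally likely symbols: one bit of uncertainty per character.
print(entropy_bits_per_char("abababab"))  # 1.0
# Eight equally likely symbols: three bits per character.
print(entropy_bits_per_char("abcdefgh"))  # 3.0
```

The more predictable the next character is, the fewer bits it carries, which is the same idea as the recipient's uncertainty in the quote above.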

Anyway, the XKCD article is interesting, and information theory is worth looking into.  I wish I could recommend the books I mentioned above, but none of them really got at the core philosophy of information in the way I wanted.