My friend, Dr. AD Smith, reminded me this morning that today is Pi Day – 3/14. To me it is just A Pi day rather than THE Pi Day, which occurred on 3/14/1592. Encoding is a means of using one thing to represent another – using a calendar date to represent a transcendental number, in this case. We use simple numeric encodings to represent letters and special symbols as well as numbers in computers. One such system, ASCII, uses seven binary digits (bits) to represent the 26 letters of English, the digits 0-9, and special symbols like ampersands, parentheses, and the like. Other schemes like hexadecimal, BCD, and Unicode use different representations to do similar jobs, and Unicode covers far more symbols.
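For anyone who wants to see those seven bits, here is a minimal Python sketch. The function names `to_ascii_bits` and `from_ascii_bits` are my own, made up for illustration; the only real machinery is Python's built-in `ord`, `chr`, and `format`.

```python
def to_ascii_bits(text):
    """Return the 7-bit ASCII pattern for each character in text."""
    bits = []
    for ch in text:
        code = ord(ch)  # the character's numeric code
        if code > 127:
            # Anything above 127 does not fit in seven bits.
            raise ValueError(f"{ch!r} is not an ASCII character")
        bits.append(format(code, "07b"))  # zero-padded to 7 binary digits
    return bits


def from_ascii_bits(bits):
    """Turn a list of 7-bit patterns back into text."""
    return "".join(chr(int(b, 2)) for b in bits)


print(to_ascii_bits("Pi"))  # ['1010000', '1101001']
print(from_ascii_bits(to_ascii_bits("3/14")))  # 3/14
```

The round trip at the end is the whole point of an encoding: whatever goes in as letters comes back out as the same letters.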
AD’s reminder of Pi Day got me to thinking about the use of sounds to represent numbers and/or letters. Remember the movie Close Encounters of the Third Kind? The aliens in that story communicated with music. People have been encoding numbers and letters as music for quite some time – at least a couple of thousand years. I decided to look to see whether anyone has done that for Pi. It turns out that several people have – as a compositional exercise or as a mnemonic device to help one remember the first hundred or more digits of Pi (more on this later). FWIW, I have only memorized Pi to nine digits, and I have never needed more.
Others have encoded Pi in ways other than its Greek symbol – as visual art. Folks who achieve such feats are very imaginative and probably have a lot of extra time on their hands.
And this brings me to Codons. My favorite encoding technique involves the use of DNA to store information. I recall listening to a conversation between Eric Topol, MD and a famous molecular biologist who laid claim to having produced the most copies of a (his) textbook. He had encoded the book as a string of DNA codons. He had used PCR (polymerase chain reaction, the DNA amplification technology that is also used to detect COVID in nasal swabs) to make millions of copies of his textbook in a test tube. Take that, Simon & Schuster!
Unless you have some basic knowledge of how DNA works, you are probably wondering, “What the hell is a codon, and how do you use it to encode a textbook?” Both are fair questions. If you like SciFi dramas, you may have seen the movie GATTACA at some point. The name of the movie is a combination of the four letters in the DNA alphabet (G, A, T, C) – Guanine, Adenine, Thymine, and Cytosine. Each letter (a base) is attached to a special sugar and a phosphate group to create a nucleotide (of which there are four, of course). Because the DNA molecule is designed to be redundant and thus capable of being repaired, its so-called double helix matches each A on one strand to a T on the other strand, and each G on one strand to a C on the other. Now, a codon consists of three such nucleotides. Each position in the triplet can be filled by any of the four letters (A, G, T, C). Four times four times four is 64 – way more combinations than we need to represent the 26 letters of our alphabet.
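To make the arithmetic concrete, here is a small Python sketch that lists all 64 codons and uses the first 26 of them to spell words. The letter-to-codon table is entirely my own invention for illustration (A gets the first codon in alphabetical order, B the second, and so on). It is not the scheme the biologist used for his textbook, and it is not the genetic code that cells use to build proteins. The `complement` helper pairs bases one for one (A with T, G with C) and ignores the fact that the two real strands run in opposite directions.

```python
from itertools import product
import string

BASES = "ACGT"

# Every ordered triplet of the four bases: 4 * 4 * 4 = 64 codons.
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Hypothetical table: A-Z mapped to the first 26 codons in alphabetical order.
LETTER_TO_CODON = dict(zip(string.ascii_uppercase, CODONS))
CODON_TO_LETTER = {codon: letter for letter, codon in LETTER_TO_CODON.items()}

# Base pairing across the double helix: A with T, G with C.
PAIRING = str.maketrans("ACGT", "TGCA")


def encode(text):
    """Spell a word (letters A-Z only) as a string of codons."""
    return "".join(LETTER_TO_CODON[ch] for ch in text.upper())


def decode(dna):
    """Read a DNA string three bases at a time and recover the letters."""
    return "".join(CODON_TO_LETTER[dna[i:i + 3]] for i in range(0, len(dna), 3))


def complement(strand):
    """Return the base that pairs with each base on the given strand."""
    return strand.translate(PAIRING)


print(len(CODONS))            # 64
print(encode("PI"))           # ATTAGA
print(decode(encode("PI")))   # PI
print(complement("GATTACA"))  # CTAATGT
```

With 26 codons spoken for, 38 are still free for digits, punctuation, or the extra letters of other alphabets.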
Just in case you are wondering, the German alphabet has 26+ letters (three vowels can get the umlaut, and there is a special letter called the Eszett). The Arabic alphabet has 28 letters. In each case, there are plenty of 3-nucleotide codons to cover the entire alphabet.
Cryptography and Encryption, techniques that are used to pass information securely, are additional forms of encoding. I will stop here before I ruin Pi Day for you. Happy 3.14159265.