Friday, 1 February 2013

How to crack cryptograms




Cryptograms are great little puzzles that pack a lot of punch for their size! This post shows you the basic steps for tackling one of these conundrums. I'm rather partial to cryptograms, and have even written a whole book of the little beasts.

Either I'm extremely tiny, or these book covers are extremely big!

A cryptogram is a substitution cipher, which means that each letter of the alphabet has been replaced, or substituted, with another symbol. This symbol can be another letter, a number, or a little icon. (In a code, whole words or phrases are replaced by a single symbol, so 'shoot the zombies' might be encoded as '16', and codes are very difficult to crack. So don't call cryptograms codes. You'd be wrong, and we can't have that!)

To crack a cryptogram, you need to use letter frequency analysis. Sounds scary, doesn't it? But don't worry, I'll take you through the basic steps in this process below.

Here's the cryptogram for this exercise :

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 

Here's what you'll need:
  • A cryptogram or two
  • A pencil
  • An eraser
  • Some scrap paper
  • A brain
Here's the bare bones list of what to do ... I explain these steps below:


  1. Look for single letter words
  2. Count the symbols, and find the most common ones
  3. Look for THE and AND
  4. Look for THAT
  5. Look for double letters
  6. Look for apostrophes
  7. Look for 2-letter words

1) Look for Singles


A single letter word will almost always be A or I. Pencil one of these choices in over any single letter word in the cryptogram.

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 

In this example, there are two single-letter words, O and P. So O = A or I, and P = I or A!

It's too early to tell which of these two options is most likely, so we need to carry on to the next step, keeping these options in mind ...

Use a pencil to lightly write your guesses above or below each letter in the cryptogram. The best cryptogram books will give you plenty of room for writing above or below the lines of the cipher.
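
If you'd rather let a computer do the scanning, here is a minimal Python sketch of Step 1. The function names are my own inventions for illustration, and it assumes the cipher uses capital letters with ordinary spaces and punctuation, as in this example.

```python
import re

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def cipher_words(text):
    """Split the cipher into words, keeping apostrophes inside words."""
    return re.findall(r"[A-Z']+", text)

def single_letter_words(text):
    """Return the distinct one-letter cipher words, in order of appearance."""
    seen = []
    for word in cipher_words(text):
        if len(word) == 1 and word.isalpha() and word not in seen:
            seen.append(word)
    return seen

print(single_letter_words(CIPHER))  # ['O', 'P']
```

Each letter it reports is a candidate for A or I, exactly as in the pencil-and-paper version.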

2) Counting Letters


In English text, E is the most common letter, followed by T, A, O, N, R, I, S and H.

So it's always helpful to do a tally of all the symbols used in a cipher, and see which ones are the most abundant.

 In our cryptogram, the most frequently seen cipher symbols are K, V and Y (11 times each), and P (9 times). So it's a fair bet that these four symbols may decrypt into some of E, T, A, O or N.

Cryptograms are generally short pieces of text, and in short texts the standard letter frequencies are often a bit skewed. E may not be the most common letter, for example, or there may be an unusual number of Fs for some reason.
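
The tally itself is easy to automate. This is just a sketch using Python's standard `collections.Counter`; `letter_tally` is a name I made up for the example.

```python
from collections import Counter

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def letter_tally(text):
    """Count how often each cipher letter appears, ignoring spaces and punctuation."""
    return Counter(ch for ch in text if ch.isalpha())

tally = letter_tally(CIPHER)
for letter, count in tally.most_common(4):
    print(letter, count)
# K, V and Y come out at 11 each, and P at 9
```

Remember that the counts only suggest which symbols might be E, T, A, O or N. In a text this short they can't prove anything on their own.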

3) Look for THE and AND


THE and AND are the most common 3-letter words in English. 

In our quote, there are two 3-letter words, and they're both the same : KZV

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 

So it's pretty safe to guess that KZV is either THE or AND. It's unlikely that a quotation will start with the word AND, so let's give KZV = THE a try.

This also matches with the information we've learnt above, about K, V, Y or P possibly = E (our 'THE' guess would give us V = E) so it's looking good!

Write in your guess letters lightly in pencil above each K, Z and V in the cipher.

It's really easy to miss letters in a cipher, when pencilling in your decrypted letters over the cryptogram — always read through the cipher at least twice, and don't be surprised if you discover you've missed one now and then.
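
Hunting for the 3-letter words can also be sketched in a few lines of Python. `words_of_length` is a hypothetical helper of mine, and it counts letters only, so an apostrophe doesn't add to a word's length.

```python
import re
from collections import Counter

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def words_of_length(text, n):
    """Count the cipher words that contain exactly n letters."""
    words = re.findall(r"[A-Z']+", text)
    return Counter(w for w in words if sum(ch.isalpha() for ch in w) == n)

three = words_of_length(CIPHER, 3)
print(three)  # Counter({'KZV': 2})
```

A 3-letter word that turns up more than once, like KZV here, is the first place to try THE or AND.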

4) Look for THAT


THAT is a special word - it's the most common 4-letter word, and it's a "pattern word". A pattern word is vital when cracking ciphers. It is simply any word which has repeated letters - not necessarily doubles (more on those shortly), but any letter which appears more than once in a word.

THAT has the letter pattern 1-2-3-1. So when you see a 4-letter word that is in this pattern, where the first and last letters are the same, it's a good bet that it's THAT. There are other options, of course (DEAD is another one), but THAT is the most common one.

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 

In our cryptogram, we have a word that fits this pattern : KZPK. This fits very well with our theory about KZV = THE. Here's K=T again, in KZPK = THAT.

This also decides for us the "A or I" conundrum from Step 1 - it's clear now that P = A, so that also means O = I.
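
Pattern words are a natural fit for a small program. The sketch below numbers each new letter in a word as it first appears, so THAT becomes 1-2-3-1, and then looks for cipher words with the same pattern. Both function names are my own, chosen for illustration.

```python
import re

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def word_pattern(word):
    """Number each distinct letter by first appearance, e.g. THAT -> '1-2-3-1'."""
    numbers = {}
    parts = []
    for ch in word:
        if ch.isalpha():
            numbers.setdefault(ch, len(numbers) + 1)
            parts.append(str(numbers[ch]))
    return "-".join(parts)

def words_with_pattern(text, pattern):
    """Return the cipher words whose letter pattern equals the given one."""
    words = re.findall(r"[A-Z']+", text)
    return [w for w in words if word_pattern(w) == pattern]

print(word_pattern("THAT"))                    # 1-2-3-1
print(words_with_pattern(CIPHER, "1-2-3-1"))   # ['KZPK']
```

The same pattern also fits DEAD and other words, so a match is a strong hint rather than a certainty.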

5) Look for Double Letters


Double letters are very helpful in ciphers. By knowing which ones are most frequently seen in English, you can help narrow down your options.

In this cryptogram, we have a bunch of doubles :

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX

The most commonly seen double letters in English are, in order of frequency : LL (most common), EE, SS, OO, TT and RR. There are plenty of others, of course, but they're not seen so often.

We've already figured out that K = T, and V probably = E, so that narrows down our options. So from this information, it's fair to say that U, F, Y and L could = L, S, O or R.

You can also eliminate letters when you have doubles - many letters rarely or never appear as doubles : A, H, I, J, K, Q, U, V, W, X, Y, Z
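
Here's a short Python sketch for spotting the doubles. It uses a regular-expression backreference (`\1`) to match any capital letter followed immediately by itself. `doubled_letters` is a made-up name for this example.

```python
import re
from collections import Counter

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def doubled_letters(text):
    """Count each cipher letter that appears twice in a row inside a word."""
    return Counter(re.findall(r"([A-Z])\1", text))

doubles = doubled_letters(CIPHER)
print(sorted(doubles))  # ['F', 'L', 'U', 'Y']
print(doubles["F"])     # 2
```

F is doubled in two different words, which fits nicely with LL being the most common double in English, though it's still only a guess at this stage.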

6) Check your Punctuation


Punctuation is a great help when cracking cryptograms. Keep an eye out for things like question marks, as the words CAN, WHY, COULD, MAY, and other interrogative terms might appear in that sentence.

Apostrophes are very handy. The letter or letters following an apostrophe can be :

S (IT'S, WHAT'S ...)
T (CAN'T, DON'T ...)
D (HE'D, SHE'D ...)
M (I'M ...)
LL (I'LL ...)
RE (WE'RE, YOU'RE ...)

In our cryptogram, there's one apostrophe, and it makes sense with our guess about K = T :

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 
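
The apostrophe check can be sketched in Python too. This is only an illustration under my own assumptions: `ENDINGS` holds just the endings listed above, `key` is a dictionary of the guesses so far (cipher letter to plain letter), and the function names are hypothetical.

```python
import re

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

# Only the endings listed in this post; a fuller list would be longer.
ENDINGS = ["S", "T", "D", "M", "LL", "RE"]

def shape(word):
    """Repeat structure of a word, e.g. 'LL' -> [0, 0] and 'RE' -> [0, 1]."""
    return [word.index(ch) for ch in word]

def apostrophe_options(text, key):
    """Map each cipher ending after an apostrophe to the endings it could be."""
    options = {}
    for ending in re.findall(r"'([A-Z]+)", text):
        options[ending] = [
            plain for plain in ENDINGS
            if shape(plain) == shape(ending)
            and all(key.get(c, p) == p for c, p in zip(ending, plain))
        ]
    return options

print(apostrophe_options(CIPHER, {}))          # {'K': ['S', 'T', 'D', 'M']}
print(apostrophe_options(CIPHER, {"K": "T"}))  # {'K': ['T']}
```

With no guesses at all, the K after the apostrophe could be S, T, D or M. Once K = T is in the key, only T survives, which agrees with the THE and THAT guesses.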

7) Look for 2-Letter Words


There are a host of commonly used 2-letter words in English : AN, AS, AT, BE, DO, GO, HE, IF, IN, IS, IT, ME, MY, NO, OF, OK, ON, OR, SO, TO, UP, and WE for starters.

Check through the cryptogram for these two-letter combinations. In our cipher, there are several :

KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX 

In any 2-letter word, one of the two letters will be a vowel (unless it's a chemical symbol, an abbreviation, or something tricky like that!). 

So we can say that from the cipher words UT, OY and UL, it's likely that two of U, T, O, Y or L are vowels.
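
To see which cipher letters the 2-letter words share, a rough Python sketch like this one can help. `letters_in_short_words` is a name I've invented for the example.

```python
import re
from collections import Counter

CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def letters_in_short_words(text, n=2):
    """Return the n-letter cipher words and a count of the letters they use."""
    words = re.findall(r"[A-Z']+", text)
    short = [w for w in words if len(w) == n and w.isalpha()]
    return short, Counter("".join(short))

short, counts = letters_in_short_words(CIPHER)
print(short)                  # ['UT', 'OY', 'UL']
print(counts.most_common(1))  # [('U', 2)]
```

U turns up in two of the three words, which makes it a tempting candidate for one of the vowels. It's only a hint, but it's worth keeping in mind.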

8) Putting it all Together


So, let's put all our guesses so far into the cryptogram. I've used lowercase letters for the decrypted message (ignoring punctuation and grammatical conventions for the moment) :

the  UDFX  SiDH  UT  YeaTUUH  i  tLRYt  iY  the  TiYh  YtiES, a  tUtaFFX  TeatRLeFeYY  TiYh  that  HUeYD't  haAe  eXeCaFFY  UL  TiDY.  HaAe CaLLX 

It still looks a bit confusing, but the worst is over! At this point you may be able to make some educated guesses about some of the words. haAe could be HAKE, HALE, HARE, HATE, or HAVE. However, HAVE is the most likely solution, which would give us A = V.
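
Pencilling the guesses into the cipher is the step where it's easiest to miss a letter, so it's a good one to hand to a program. This is a minimal sketch, assuming the guesses are kept in a dictionary I've called `key` (cipher letter to plain letter). Solved letters come out in lowercase, and everything else is left as it was.

```python
CIPHER = ("KZV UDFX SODH UT YVPTUUH O KLRYK OY KZV TOYZ YKOES, P KUKPFFX "
          "TVPKRLVFVYY TOYZ KZPK HUVYD'K ZPAV VXVCPFFY UL TODY. HPAV CPLLX")

def apply_key(cipher, key):
    """Replace every guessed cipher letter with its lowercase plain letter."""
    return "".join(key[ch].lower() if ch in key else ch for ch in cipher)

# Guesses from Steps 1 to 4: KZV = THE, KZPK = THAT, and O = I.
key = {"K": "T", "Z": "H", "V": "E", "P": "A", "O": "I"}

print(apply_key(CIPHER, key))
# the UDFX SiDH UT YeaTUUH i tLRYt iY the TiYh YtiES, a tUtaFFX ...
```

Adding a new guess is then just a matter of adding one entry to `key` (for example `"A": "V"`) and printing again.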

9) More Patterns and Guesses


Now's the time to start making guesses with the remaining letters from your alphabet (we've used up A, E, I, H, T and V). Don't forget, in the text below, the CAPITAL letters are the cipher, and the lower case ones are the 'plain text' (decrypted message).

the  UDFX  SiDH  UT  YeaTUUH  i  tLRYt  iY  the  TiYh  YtiES, a  tUtaFFX  TeatRLeFeYY  TiYh  that  HUeYD't  have  eXeCaFFY  UL  TiDY.  Have CaLLX 

The final two words in the cryptogram, after the full stop, are probably a name (the author of the quote). So Have is probably a first name. Run through the options : Cave, Dave, Fave, Gave, Nave, Pave, Rave, Save, Tave, Wave ... it's fair to guess that the answer is DAVE.  So now we have H = D.

the  UDFX  SiDd  UT  YeaTUUd  i  tLRYt  iY  the  TiYh  YtiES, a  tUtaFFX  TeatRLeFeYY  TiYh  that  dUeYD't  have  eXeCaFFY  UL  TiDY.  dave CaLLX 

Let's look at the 2-letter words more seriously now : UT and UL in particular. We've used up most of the vowels; only O, U, and Y are left. No letter is ever encrypted as itself, so we can rule out U = U. Given the letters left over, it's a fair bet that U = O, giving us OF, ON, and OR to choose from for UT and UL. U is also one of the double letters, which works well if it decodes as O. 

the  oDFX  SiDd  oT  YeaTood  i  tLRYt  iY  the  TiYh  YtiES, a  totaFFX  TeatRLeFeYY  TiYh  that  doeYD't  have  eXeCaFFY  oL  TiDY.  dave CaLLX 

Now things should start to make sense more quickly - what could YeaTood be? Only one option : SEAFOOD. So Y = S, and T = F - nearly there!
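
Testing a guess like SEAFOOD against everything decided so far can be sketched as a small consistency check. The function `fits` and the tiny word list are my own illustrations, not a real dictionary. The check uses three rules: known letters must agree, two cipher letters can't share a plain letter, and no letter stands for itself.

```python
def fits(cipher_word, candidate, key):
    """True if candidate is a consistent decryption of cipher_word under key."""
    if len(cipher_word) != len(candidate):
        return False
    mapping = dict(key)            # cipher letter -> plain letter, known so far
    used = set(mapping.values())   # plain letters already taken
    for c, p in zip(cipher_word, candidate):
        if c in mapping:
            if mapping[c] != p:    # clashes with an earlier guess
                return False
        else:
            if p in used or p == c:  # must be one-to-one, never itself
                return False
            mapping[c] = p
            used.add(p)
    return True

# Guesses made so far in this walkthrough.
key = {"K": "T", "Z": "H", "V": "E", "P": "A", "O": "I",
       "A": "V", "H": "D", "U": "O"}

# A hypothetical mini word list, just for the demonstration.
words = ["SEAFOOD", "SEAWEED", "REDWOOD", "SEABIRD"]

print([w for w in words if fits("YVPTUUH", w, key)])  # ['SEAFOOD']
```

With a proper word list, the same check could be run on any half-solved word in the cryptogram.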

10) All Done


the  oDFX  SiDd  of  seafood  i  tLRst  is  the  fish  stiES, a  totaFFX  featRLeFess  fish  that  doesD't  have  eXeCaFFs  oL  fiDs.  dave CaLLX 

It's practically done! totaFFX is easy to decipher now = TOTALLY (so F = L and X = Y), and  doesD't can only be DOESN'T (so D = N). 

the  only  Sind  of  seafood  i  tLRst  is  the  fish  stiES, a  totally featRLeless  fish  that  doesn't have  eyeCalls  oL  fins.  dave CaLLy 

Sind, tLRst, stiES, featRLeless, eyeCalls, oL, and CaLLy won't tax you for long, I'm sure ... only S, L, R, E and C remain to decrypt. No, I'm not going to write it out here; I know you can figure it out for yourself!

For a whole book of cryptograms and ciphers, with a lot of information on how to solve them, check out Cracking Codes and Cryptograms for Dummies, which I wrote with Mark Koltko-Rivera a few years ago. There are also many free cryptograms on the Dummies website, also by yours truly.

