Random Letter Generation

One of the first problems I ran into in my game was how to create a series of random letters.  Each player’s board has a stream of random letters constantly being produced, and although that doesn’t sound too difficult, I wanted the generator to produce useful letters that players could easily form words with.

If I took the strictly naive route, I’d just write a generator that picks a letter uniformly between A and Z.  In English, however, the letters don’t each turn up 1 time in 26.  Most people know that there are far more E’s than X’s, for example, which is why the X tile in Scrabble is worth more than the E or S tile.  We could look to Scrabble’s tile distribution for insight into letter frequencies, but I found that it hides too many of the details.

Then it occurred to me that I already have a very representative sample of English: the 172,000-word list.  I can just analyze the word list itself to find letter frequencies.  Here’s the algorithm:

  1. Read each word from the word list file.  Trim whitespace and strip any non-alphabetic characters.
  2. Build a dictionary with a char as the key and an integer as the value; this will hold the count of each letter.
  3. Iterate over each word and each character in the word, incrementing that character’s count in the frequency table.  Also keep a running total of all the letters in the word list.
  4. For each letter in the dictionary, divide its count by the total letter count.  The result is the probability that a letter picked at random from the word list is that letter.
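
As a rough illustration, the four steps above could be sketched in Python like this.  This is not the actual Content Processor code; the function name `letter_frequencies` and the tiny `sample_words` list are made up for the example, and the real input would be the full word list file.

```python
from collections import Counter


def letter_frequencies(words):
    """Return {letter: probability} for the letters A-Z found in `words`."""
    counts = Counter()
    for word in words:
        # Step 1: drop whitespace and anything that is not a letter A-Z.
        cleaned = [ch for ch in word.upper() if "A" <= ch <= "Z"]
        # Steps 2-3: tally each letter in the frequency table.
        counts.update(cleaned)
    # Step 3 (continued): total number of letters seen.
    total = sum(counts.values())
    if total == 0:
        return {}
    # Step 4: turn each count into a probability.
    return {letter: count / total for letter, count in counts.items()}


# Hypothetical stand-in for the real word list file.
sample_words = ["apple\n", "bee", " zebra "]
freqs = letter_frequencies(sample_words)
```

With this toy input there are 13 letters in total and 4 of them are E, so `freqs["E"]` comes out to 4/13.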

That analysis is actually performed by the Content Processor I wrote to build my word DFA.  The result is a probability table that is written to disk and then imported by the game’s runtime.

The table that is produced is simply a 1,000-character table in which each letter is repeated in proportion to its probability (for example, E makes up 11.5% of the letters in my word list, so 115 E’s are placed in the table).  When you want a random letter, pick a random integer between 0 and 999 inclusive and use it as an index into the table; the character at that index is the random letter.
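
Here is a minimal Python sketch of that lookup table, not the game’s actual code.  The names `build_letter_table` and `random_letter` are invented for the example, the probabilities are made up, and the way leftover slots are handed out after rounding (largest fractional part first) is my own assumption, since the scheme described above only says each letter is replicated in proportion to its probability.

```python
import random


def build_letter_table(freqs, size=1000):
    """Build a string of `size` letters, each repeated in proportion to its probability."""
    # Each letter's ideal (possibly fractional) share of the table.
    exact = {letter: p * size for letter, p in freqs.items()}
    # Start from the whole-number part of each share.
    slots = {letter: int(share) for letter, share in exact.items()}
    # Hand out the slots lost to rounding, largest fractional part first.
    # (Assumed tie-breaking rule, so the table always has exactly `size` entries.)
    leftover = max(0, size - sum(slots.values()))
    by_remainder = sorted(exact, key=lambda k: exact[k] - slots[k], reverse=True)
    for letter in by_remainder[:leftover]:
        slots[letter] += 1
    return "".join(letter * slots[letter] for letter in sorted(slots))


def random_letter(table, rng=random):
    """Pick one letter by indexing the table with a random integer."""
    return table[rng.randrange(len(table))]


# Made-up probabilities for illustration only; not the real word-list figures.
example_freqs = {"E": 0.5, "T": 0.3, "Q": 0.2}
table = build_letter_table(example_freqs)
letter = random_letter(table)
```

The nice property of the table is that picking a letter is a single array lookup; the cost is that probabilities are only accurate to one part in 1,000.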

The result of all this is that the player gets the right mix of letters to form lots of words and isn’t left with a bunch of infrequently used letters like Q, X, Z, or J.

2 Responses to Random Letter Generation

  1. Mark says:

    Come now. “Laziness is a virtue for a programmer.” Let someone else do your work for you:

    http://www.csm.astate.edu/~rossa/datasec/frequency.html

    Maybe you just have to break down and go to grad school already to learn how to be really lazy 😉

  2. […] it harder if the speed of the game shouldn’t increase too much?  The solution may lie in the random letter generation.  In easier difficulties, I could randomly generate the same way I currently am – use a true […]
