bait

 

Trivia: In N-Word Space (say, the words in some Unabridged Dictionary), what are the answers to the following questions?

 

1) What is the distribution of letters? Is it sparse or dense?

1a) Related question: what does the word distribution look like as a function of word length?

2) Given the answer to 1a - what number of letters would allow the most flexibility with the fewest repeated words for this game?

3) For a given word length (in this game, four), can all possible four-letter words be reached under the rules?

 

Next post will include these answers - I already have them (from an Unabridged Dictionary I used to use for, well... black hat purposes... in another life. :eek: ) These are interesting questions though (to lovers of language and syntax anyway...)

 

Cheers,

 

Patrick

Edited by pagoda

wail

 

More correctly, I should have said that I have some of the answers, but not all... Those I do have are shown below.

 

#1) What does the distribution of letters in the Unabridged Dictionary look like?

 

The Letter a Appears 184697 Times

The Letter b Appears 36966 Times

The Letter c Appears 96102 Times

The Letter d Appears 63378 Times

The Letter e Appears 217977 Times

The Letter f Appears 21933 Times

The Letter g Appears 43713 Times

The Letter h Appears 59952 Times

The Letter i Appears 188408 Times

The Letter j Appears 2717 Times

The Letter k Appears 14630 Times

The Letter l Appears 121765 Times

The Letter m Appears 65693 Times

The Letter n Appears 148412 Times

The Letter o Appears 159708 Times

The Letter p Appears 73194 Times

The Letter q Appears 3394 Times

The Letter r Appears 149137 Times

The Letter s Appears 130612 Times

The Letter t Appears 141357 Times

The Letter u Appears 81553 Times

The Letter v Appears 18515 Times

The Letter w Appears 12406 Times

The Letter x Appears 6385 Times

The Letter y Appears 48375 Times

The Letter z Appears 8122 Times

 

The Total Number of Letters in the Unabridged Dictionary Is: 2099101
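For anyone who wants to run the same kind of tally on their own word list, here is a minimal sketch. It is not the program that produced the numbers above; the function name `letter_counts` and the tiny sample list are made up for illustration, and it assumes the words are plain a-z text.

```python
from collections import Counter

def letter_counts(words):
    """Count how often each letter a-z occurs across all words (hypothetical helper)."""
    counts = Counter()
    for word in words:
        # Lower-case first, then keep only the 26 plain letters.
        counts.update(ch for ch in word.lower() if "a" <= ch <= "z")
    return counts

# Stand-in sample; a real run would read the dictionary file instead.
sample = ["bait", "wail", "tail", "fail", "kyte"]
counts = letter_counts(sample)

for letter in sorted(counts):
    print(f"The Letter {letter} Appears {counts[letter]} Times")
print(f"Total letters: {sum(counts.values())}")
```

Swapping `sample` for the lines of a dictionary file should give a table in the same shape as the one above.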

 

#1a) What is the distribution of words in the Unabridged Dictionary as a function of word length?

 

My Unabridged Dictionary contains 213,583 words. Each individual letter in the English alphabet is considered a "word" even though we don't normally think of all of them in that manner. To be sure, some of them we do (such as "a"), but not the majority of them.

 

Number of Words of Length [ 1]: 26

Number of Words of Length [ 2]: 61

Number of Words of Length [ 3]: 627

Number of Words of Length [ 4]: 2988

Number of Words of Length [ 5]: 7198

Number of Words of Length [ 6]: 14163

Number of Words of Length [ 7]: 20452

Number of Words of Length [ 8]: 27015

Number of Words of Length [ 9]: 29824

Number of Words of Length [10]: 29220

Number of Words of Length [11]: 25021

Number of Words of Length [12]: 19966

Number of Words of Length [13]: 14683

Number of Words of Length [14]: 9672

Number of Words of Length [15]: 5890

Number of Words of Length [16]: 3363

Number of Words of Length [17]: 1808

Number of Words of Length [18]: 838

Number of Words of Length [19]: 428

Number of Words of Length [20]: 197

Number of Words of Length [21]: 81

Number of Words of Length [22]: 40

Number of Words of Length [23]: 17

Number of Words of Length [24]: 5

 

This is a near-Gaussian (roughly bell-shaped) distribution, peaking at length 9. In the context of this Scrabble game, it shows that there are 2,988 possible four-letter words.
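A quick sanity check on the table, as a sketch: the counts below are copied from the list above, and the variable names are only for illustration. The per-length counts should add up to the 213,583 words quoted, and the counts weighted by length should add up to the 2099101 letters from #1 (assuming every entry consists of letters only).

```python
# Word counts by length, copied from the table in this post.
WORDS_BY_LENGTH = {
    1: 26, 2: 61, 3: 627, 4: 2988, 5: 7198, 6: 14163,
    7: 20452, 8: 27015, 9: 29824, 10: 29220, 11: 25021, 12: 19966,
    13: 14683, 14: 9672, 15: 5890, 16: 3363, 17: 1808, 18: 838,
    19: 428, 20: 197, 21: 81, 22: 40, 23: 17, 24: 5,
}

total_words = sum(WORDS_BY_LENGTH.values())
total_letters = sum(length * n for length, n in WORDS_BY_LENGTH.items())
peak_length = max(WORDS_BY_LENGTH, key=WORDS_BY_LENGTH.get)

print("Total words:  ", total_words)
print("Total letters:", total_letters)
print("Most common word length:", peak_length)
print("Mean word length:", total_letters / total_words)
```

Both totals agree with the figures posted, so the two tables are consistent with each other.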

 

The question of sparseness or denseness is really more relevant to words than to letters. So, how many possible letter combinations are there for each word length?

 

For Words of Length 1 There Are: 26 Combinations

For Words of Length 2 There Are: 676 Combinations

For Words of Length 3 There Are: 17576 Combinations

For Words of Length 4 There Are: 456976 Combinations

For Words of Length 5 There Are: 11881376 Combinations

For Words of Length 6 There Are: 308915776 Combinations

For Words of Length 7 There Are: 8031810176 Combinations

For Words of Length 8 There Are: 208827064576 Combinations

For Words of Length 9 There Are: 5429503678976 Combinations

For Words of Length 10 There Are: 141167095653376 Combinations

For Words of Length 11 There Are: 3.67034448698778e+15 Combinations

For Words of Length 12 There Are: 9.54289566616822e+16 Combinations

For Words of Length 13 There Are: 2.48115287320374e+18 Combinations

For Words of Length 14 There Are: 6.45099747032972e+19 Combinations

For Words of Length 15 There Are: 1.67725934228573e+21 Combinations

For Words of Length 16 There Are: 4.36087428994289e+22 Combinations

For Words of Length 17 There Are: 1.13382731538515e+24 Combinations

For Words of Length 18 There Are: 2.94795102000139e+25 Combinations

For Words of Length 19 There Are: 7.66467265200362e+26 Combinations

For Words of Length 20 There Are: 1.99281488952094e+28 Combinations

For Words of Length 21 There Are: 5.18131871275445e+29 Combinations

For Words of Length 22 There Are: 1.34714286531616e+31 Combinations

For Words of Length 23 There Are: 3.50257144982201e+32 Combinations

For Words of Length 24 There Are: 9.10668576953721e+33 Combinations

 

In other words, the set of words that we actually use in our language is VERY sparse compared with the number of possible letter combinations. Anyone know why this is so? (There is a good argument for why our language is this way. Yes, it's partly that our lexicon cannot hold so many words (we could not remember them all), but it goes deeper than that...)
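To put a number on "sparse", one can divide the count of real words by the count of possible letter strings, 26^n, for each length. The sketch below uses a handful of counts taken from the tables above; the function name `density` is just an illustration. Python integers are exact, so the large values do not get rounded to the e+15 style shown in the list.

```python
def density(length, n_words, alphabet=26):
    """Fraction of all possible letter strings of this length that are real words."""
    return n_words / alphabet ** length

# (length, number of words) pairs copied from the tables in this post.
for length, n_words in [(1, 26), (2, 61), (3, 627), (4, 2988), (9, 29824)]:
    combos = 26 ** length  # exact integer, no floating-point rounding
    print(f"length {length:2d}: {n_words:6d} words / {combos} combinations "
          f"= {density(length, n_words):.3e}")
```

For four-letter words this comes out at well under one percent of the 456976 possible strings, and the fraction drops off very quickly as the words get longer.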

 

#2) Given the answer in #1a what length of initial word would allow the most flexibility in terms of not repeating words (as has been done many times in the four letter version)?

 

I don't know the answer, since I don't know how sparse the words are in relation to one another. Because the rules of the game allow only one letter to change per turn, we would need to find the word length whose words are most densely connected, that is, where each word has the most other words one letter away and there are paths leading from word to word. (For example, going from "tail" to "fail" changes only the first letter. The two words are not neighbors in an alphabetical list, but they are near in the sense that matters here: they differ by one letter.) That said, one would guess that the game would have many more playable words at lengths 8, 9, 10 or 11. However, the initial word choice would determine the space of playable words that can actually be reached.
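One way to measure this "one letter away" density is to count neighbors directly. The sketch below is only one possible approach, not something that has been run on the full dictionary: it groups words by wildcard patterns (so "tail" and "fail" both match "*ail") and treats words in the same group as neighbors. The names `build_neighbors` and `average_neighbors` are made up for illustration, and it assumes the words contain no "*" character.

```python
from collections import defaultdict

def build_neighbors(words):
    """Map each word to the set of words that differ from it in exactly one position."""
    buckets = defaultdict(set)
    for word in set(words):
        for i in range(len(word)):
            # Replace position i with a wildcard; one-letter neighbors share a pattern.
            buckets[word[:i] + "*" + word[i + 1:]].add(word)
    neighbors = {word: set() for word in set(words)}
    for group in buckets.values():
        for word in group:
            neighbors[word] |= group - {word}
    return neighbors

def average_neighbors(words):
    """Mean number of one-letter neighbors per word."""
    neighbors = build_neighbors(words)
    return sum(len(n) for n in neighbors.values()) / len(neighbors)

sample = ["tail", "fail", "wail", "bait", "kyte"]  # stand-in list
print(build_neighbors(sample)["tail"])
print(average_neighbors(sample))
```

Running `average_neighbors` separately on the words of each length would show which length is the most densely connected, which is the quantity question #2 really depends on.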

 

#3) To answer this question it is sufficient to find an example of a word, or a small set of words, that is self-contained - that is, one that cannot be turned into any word outside the set by changing one letter. So: is there a single word, or a small cycle of words, of length four that cannot be reached from the starting point chosen for this game of Scrabble? I don't know the answer to this either. Both this problem and the one above should be solvable, but the answers are not obvious to me right now. You can bet I will keep looking. After all, what better way to waste time... :)
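A possible way to attack this, sketched under the assumption that the game only allows substituting one letter per turn: treat the words as a graph with an edge between any two words one letter apart, then split it into connected groups with a breadth-first search. Any group that does not contain the starting word is unreachable. The function names and the sample list are hypothetical, and this has not been run on the real dictionary.

```python
from collections import defaultdict, deque

def one_letter_graph(words):
    """Adjacency map: word -> set of words differing in exactly one position."""
    buckets = defaultdict(set)
    for word in set(words):
        for i in range(len(word)):
            buckets[word[:i] + "*" + word[i + 1:]].add(word)
    graph = {word: set() for word in set(words)}
    for group in buckets.values():
        for word in group:
            graph[word] |= group - {word}
    return graph

def components(words):
    """Return the connected groups of words, largest first, each sorted alphabetically."""
    graph = one_letter_graph(words)
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:  # breadth-first search from this unvisited word
            word = queue.popleft()
            group.append(word)
            for nxt in graph[word]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        result.append(sorted(group))
    return sorted(result, key=len, reverse=True)

sample = ["tail", "fail", "wail", "bail", "bait", "kyte", "kite"]  # stand-in list
for group in components(sample):
    print(len(group), group)
```

If the full list of 2,988 four-letter words produces more than one group, the answer to #3 is "no": the smaller groups are exactly the words that can never be reached from the starting word.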

 

Cheers,

 

Pagoda

Edited by pagoda

(I'm not responding to my own post - I just didn't get this edit in fast enough - ignore this post in the context of the game...)

 

Bruce:

 

kyte

 

That's a great one! My unabridged dictionary says the following:

 

kyte: the paunch; stomach; belly

 

:lol:
