Entropy calculation with acrostic is wrong #127

Open
ilyagr opened this issue Nov 16, 2020 · 1 comment
ilyagr commented Nov 16, 2020

If there are 382 words that start with 'a', then the entropy of the acrostic 'aaaaa' should be log_2(382) * 5. However, this is not what xkcdpass reports:

$  xkcdpass -V --acrostic 'a'
With the current options, your word list contains 382 words.
A 1 word password from this list will have roughly 8 (8.58 * 1) bits of entropy,
assuming truly random word selection.

anyone

$  xkcdpass -V --acrostic 'aaaaa'
With the current options, your word list contains 1910 words.
A 5 word password from this list will have roughly 54 (10.90 * 5) bits of entropy,
assuming truly random word selection.

amiable activism arise aspire ageless

(The correct answer in the second example would be 8.58 * 5, i.e. roughly 42 bits.)

I believe the problem is here:

    if options.acrostic:
        worddict = wordlist_to_worddict(wordlist)
        numwords = len(options.acrostic)
        length = 0
        for char in options.acrostic:
            length += len(worddict.get(char, []))

Currently, it computes the entropy as numwords * log_2(sum of the word-list lengths for each letter in the acrostic). For 'aaaaa' that sum is 5 * 382 = 1910, which is why the output above reports 10.90 bits per word.

The correct formula for the entropy is log_2(# of words for letter 1) + log_2(# of words for letter 2) + ... Separating length and numwords doesn't make sense in this setting.
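
To make the difference concrete, here is a minimal sketch comparing the two formulas. The function names and the `words_per_letter` table are made up for illustration and are not part of xkcdpass; only the count of 382 for 'a' comes from the verbose output above.

```python
import math

# Hypothetical per-letter word counts. 382 for 'a' is taken from the
# verbose output in this issue; the count for 'b' is invented.
words_per_letter = {"a": 382, "b": 300}


def reported_entropy(acrostic, counts):
    """Formula described in this issue as the current behaviour:
    number of words * log2(sum of per-letter word counts)."""
    total = sum(counts.get(ch, 0) for ch in acrostic)
    return len(acrostic) * math.log2(total)


def expected_entropy(acrostic, counts):
    """Sum of per-letter entropies: log2(count_1) + log2(count_2) + ..."""
    return sum(math.log2(counts[ch]) for ch in acrostic)


# For 'aaaaa' the first value is close to 54.5 and the second close to 42.9.
print(round(reported_entropy("aaaaa", words_per_letter), 2))
print(round(expected_entropy("aaaaa", words_per_letter), 2))
```

The first formula overestimates because it treats every position as if it could draw from all 1910 (double-counted) words, while each position is actually restricted to the 382 words for its own letter.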

@ilyagr ilyagr changed the title Entropy calculation with acristics is wrong Entropy calculation with acrostics is wrong Nov 16, 2020

ilyagr commented Nov 16, 2020

So, it should be something like:

    if options.acrostic:
        worddict = wordlist_to_worddict(wordlist)
        entropy = 0.0
        for char in options.acrostic:
            if char not in worddict:
                # Less confusing error message than 'math domain error'
                raise ValueError('No words in list start with letter `{}`.'.format(char))
            entropy += math.log(len(worddict[char]), 2)
        print('The entropy with the acrostic is approximately', int(entropy), 'bits.')
    else:
        # Rest of the function
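
For anyone who wants to try the idea outside of xkcdpass, the snippet above can be sketched as a standalone function. Here `build_worddict` is a hypothetical stand-in that simply groups words by first letter; it is an assumption about what `wordlist_to_worddict` does, not a copy of it, and `acrostic_entropy` is likewise an illustrative name.

```python
import math
from collections import defaultdict


def build_worddict(wordlist):
    """Hypothetical stand-in for wordlist_to_worddict: group words by first letter."""
    worddict = defaultdict(list)
    for word in wordlist:
        if word:
            worddict[word[0]].append(word)
    return dict(worddict)


def acrostic_entropy(acrostic, wordlist):
    """Return the entropy in bits of an acrostic passphrase, assuming each
    word is chosen uniformly at random from the words for its letter."""
    worddict = build_worddict(wordlist)
    entropy = 0.0
    for char in acrostic:
        if not worddict.get(char):
            # Clearer than the 'math domain error' that log2(0) would raise
            raise ValueError("No words in list start with letter `{}`.".format(char))
        entropy += math.log2(len(worddict[char]))
    return entropy


words = ["apple", "amber", "angle", "acorn", "bison", "bread"]
# Four words start with 'a' and two with 'b': log2(4) + log2(2) = 3.0 bits
print(acrostic_entropy("ab", words))
```

Returning the value instead of printing it keeps the calculation separate from the verbose report, which may make it easier to test.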

@ilyagr ilyagr changed the title Entropy calculation with acrostics is wrong Entropy calculation with acrostic is wrong Nov 16, 2020