Friday 31 August 2007

Bulk Password Generation Part 1

I recently received a list of passwords from someone that had been generated from an administration tool in use at their organisation.
They were interested in knowing whether I could create a tool that would reproduce the passwords for them.

Apparently, the way the tool worked was to generate a password using the numeric asset identifier of the PC the password was to be used on.
The tool produced two passwords for each PC. I am told that one is the password for the local administrator account, and the other was a password for the system BIOS.

The list I received contained a total of 1000 passwords, two for each of the first 500 assets (0 to 499).

The passwords were similar in appearance.
The first set of passwords (which I believe were the local administrator passwords) were 8 characters long, and were structured like this:
UlUlUnns
Where U is Uppercase letter, l is lowercase letter, n is number and s was a symbol.
The second set of passwords (which, apparently, were the BIOS passwords) were also 8 characters long, and were structured like this:
UUUUUnns

I decided to start with the BIOS passwords (as they appeared simpler - they did not contain mixed case letters), and in this article I will discuss how the BIOS password could be broken.
In my next article I will examine the local administrator passwords.

To start with, I thought it might be helpful to look at the first few passwords. The file displayed each password next to the asset number used to generate it, thus:
0 KCPGB76#
1 LDQHC87$
2 MERID98%
3 NFSJE09&
4 OGTKF10(
5 PHULG21)

You will, as I did, notice immediately that there is a very obvious pattern to these passwords... With the exception of the symbols, each character in the password was simply moved on one place for each subsequent password!
Well I couldn't believe the scheme could be so simple, and to see whether this was correct, I calculated what the password would be for a few asset numbers (ignoring the symbol) and compared the results:
10 UMZQL76
15 ZREVQ21
20 EWJAV76
25 JBOFA21

I first compared my estimate for asset 10, UMZQL76, and this seems a good match for the generated code UMZQL76& (I'll talk about the symbols in a minute).
The second estimate was for asset 15, which I though was ZREVQ21. This again matches ZREVQ21-.
My third estimate, asset 20, matches the actual EWJAV76&.
The fourth estimate, asset 25, broke the rule. I had estimated a password of JBOFA21. The actual password turned out to be EBOFA21-. You will notice that what I had though was a J turned out to be an E. Although other than the first letter, the password was ok.

I decided to see where I had gone wrong. I knew that for asset 20, I had generated the correct password (ignoring the symbol), so to see where things went wrong, I estimated the next asset, 21.
According to the rule I had so far made, asset 21 should have produced FXKBW87. It actually produced AXKBW87.
Again only the first letter was out (I thought F, actual A).
So unlike the previous 21 assets (0-20), where the first letter had increased by one each time (with Z looping around to A), this time it had moved back from E (in asset 20) to A, a step of -4 instead of +1.
If I included this into my formula, that at step 21, the lettering would go back 4 but then continue adding one every time, I can successfully generate the correct first letter for assets 25, 30, 35, etc. up to asset 78.
At asset 78, the lettering again goes back 4, from E to A.
This happens again at assets 135 and 192.
Looking at these "special" assets together, I see that my formula breaks at these numbers:
21
78
135
192
At this point I decided to use some simple maths to help me along.
The difference between 0 and 21 is, of course, 21.
The difference between 21 and 78 is 57.
The difference between 78 and 135 is 57.
The difference between 135 and 192 is 57.
A pattern has emerged - every 57th asset, the letter goes back 4 instead of going forward one. In every case, this is going from E to A.

To help me work things out more easily, I decided to substitute letters for numbers.
The most obvious way to do this was to say A=0, B=1, etc until Z=25.
I then examined the first few assets from 21 onwards. I chose 21 because this is the first number of the "57" sequence.
21 A 0
22 B 1
...
72 Z 25
73 A 0
74 B 1
75 C 2
76 D 3
77 E 4
78 A 0 <--- Restart sequence
79 B 1
So at asset 78, the lettering reset to start from A again (which happened to be 4(E)-4).

I want a way to calculate which letter should be generated from the assset number. I know that 21 should give 0 (A), and 22 should give 1 (B).
I could do asset-21 modulus 26:
21-21 = 0 mod 26 = 0 (A)
22-21 = 1 mod 26 = 1 (B)
...
72-21 = 51 mod 26 = 25 (Z)
73-21 = 52 mod 26 = 0 (A)
...
77-21 = 56 mod 26 = 4 (E)
78-21 = 57 mod 26 = 5 (F) <--- Incorrect
But that would not work the as soon as we reach one of the "reset" assets...
I already know that the sequence resets every 57 assets.
What about is I tried asset modulus 57 first to get the position in the sequence?
20 mod 57 = 20
21 mod 57 = 21
22 mod 57 = 22
...
77 mod 57 = 20
78 mod 57 = 21
Letter 21 is a V though, and I know that the letter for asset 78 is an A.
Now what if I modify the asset number used to generate the code by the difference between 57 and the offset (the asset number of the password whose letter was reset to A first), which was 21?
So, looking at asset 21:
21 + (57 - 21) = 57
. 57 mod 57 = 0
. 0 = A
Now asset 22:
22 + (57 - 21) = 58
. 58 mod 57 = 1
. 1 = B
And a few more (replacing '(57 - 21)' with 36 for simplicity):
76 + 36 = 112 mod 57 = 55 (?)
77 + 36 = 113 mod 57 = 56 (?)
78 + 36 = 114 mod 57 = 0 (A)
However, the results for assets 76 and 77 (55 and 56) don't mean anything, but if I subsequently modulus these with 26 (the number of letters in the sequence):
76 + 36 = 112 mod 57 = 55 mod 26 = 3 (D)
77 + 36 = 113 mod 57 = 56 mod 26 = 4 (E)
78 + 36 = 114 mod 57 = 0 mod 26 = 0 (A)
That looks a bit better.
I tried it out on some random asset numbers between 78 and 499, and the letter was correct in all cases.
So the formula for the number of the first letter (remember 0=A) turns out to be:
l = (n + 36) mod 57 mod 26
Where l is the letter code, and n is the asset number.

Now I had figured out the first letter was generated, I tried for the next 4.
I took each letter and worked out what number is was in my sequence (where A=0 and Z=25), and then looked at where the sequence was disrupted (reset to A).
For letter 2, I noticed that the sequence was reset first at asset 39, and then subsequently at 106 and 173.
For letter 3, it was 30, 101 and 172.
For letter 4, it was 47, 126 and 205.
For letter 5, it was 30, 113 and 196.
So for letter 2, the sequence length was 67 and the modifier was 28 (67 minus the first offset 39).
For letter 3 the sequence length was 71 and the modifier was 41 (71 - 30).
For letter 4 the sequence length was 79 with a modifier of 32.
For letter 5 the sequence length was 83 with a modifier of 53.

This resulted in formulae for each of these letters of:
1) l = (n + 36) mod 57 mod 26
2) l = (n + 28) mod 67 mod 26
3) l = (n + 41) mod 71 mod 26
4) l = (n + 32) mod 79 mod 26
5) l = (n + 53) mod 83 mod 26
Using these formulae I was able to successfully calculate the first five characters of every single password for the assets up to 499.

Moving onto the two numbers located at positions 6 and 7 in the password, these working in much the same way as the letters, other than the character sequence is 0 - 9 and consists of 10 possibilities, instead of the 26 for the letters (A-Z).
Applying exactly the same method that I used above, I managed to work out a formula for the numbers of:
6) l = (n + 27) mod 99 mod 10
7) l = (n + 116) mod 197 mod 10

Finally, the symbols are all that is left to work out.
Through examining the symbols that are used for various sequential codes, I managed to work out the the symbol sequence was had 10 possibilities, in the order "!#$%&()*+-".
Again, using the same technique described above, I managed to work out the formula for the symbols as:
8) l = (n + 101) mod 107 mod 10

Using the formula for each character of the password, I can now generate the correct BIOS password for any asset number.
For example, for asset 101:
1) l = (101 + 36) mod 57 mod 26
. = 137 mod 57 mod 26
. = 23 mod 26
. = 23 'X'
2) l = (101 + 28) mod 67 mod 26
. = 129 mod 67 mod 26
. = 62 mod 26
. = 10 'K'
3) l = (101 + 41) mod 71 mod 26
. = 142 mod 71 mod 26
. = 0 mod 26
. = 0 'A'
4) l = (101 + 32) mod 79 mod 26
. = 133 mod 79 mod 26
. = 54 mod 26
. = 2 'C'
5) l = (101 + 53) mod 83 mod 26
. = 154 mod 83 mod 26
. = 71 mod 26
. = 19 'T'
6) l = (101 + 27) mod 99 mod 10
. = 128 mod 99 mod 10
. = 29 mod 10
. = 9 '9'
7) l = (101 + 116) mod 197 mod 10
. = 217 mod 197 mod 10
. = 20 mod 10
. = 0 '0'
8) l = (101 + 101) mod 107 mod 10
. = 202 mod 107 mod 10
. = 95 mod 10
. = 5 '('
So the final password for asset 101 is 'XKACT90('.

Some important things to note about this password generation method:
WEAK TECHNIQUE
The technique used to generate these passwords is weak.
If you saw two or three sequential passwords, you would immediately sense a flaw due to the sequential nature of the generated passwords.
To make a highly educated guess at the formula for each character using the technique I described above, you would need less than 300 sequential passwords.
You could actually make a confident guess at 5 out of 8 of the character formulae with less than 150 sequential passwords.

POOR IMPLEMENTATION
Whoever created the scheme had the forsight to use complex password characters, by including the numerics and symbols in the password.
However, because the 6th and 7th character are always numbers and the final character is always one of 10 symbols, this actually reduces the complexity of the password significantly!
You would only need to see two separate passwords (not even sequential) to surmise that the format was fixed, 5 letters followed by 2 numbers and a single symbol.
A better implementation would have been to include numbers and symbols in each character sequence for each letter, thereby increasing the complexity of the password to 46 possibilities per password character.

EASY TO BRUTE FORCE
Consider that to break the password using brute force, you would currently need to break 5 letters (26 possibilities each), 2 numbers (10 possibilities each) and a symbol (10 possibilities):
26^5 x 10^3 = 11,881,376,000
But if a fuller implementation (letters, numbers and symbols in each character set) had been implemented (46 possibilities each):
46^8 = 20,047,612,231,936
This is over 1,687 times the number of combination than for the current password.
The math is simple 11 billion vs 20 trillion combination!


In my next article I will look at the local administrator passwords that comprise the second half of the password list I was given and discuss the use of such methods to generate password lists.

Until then...

No comments: