Wednesday, 11 July 2007

Outlook PST Password Encryption

In my previous article about Microsoft Outlook, I demonstrated how secured Outlook Data Files (PSTs) could be accessed without having the password for them.

During the article, I noticed a call that took my entered password as a criteria and returned a coded value that was then compared to a value on the stack.

I believe that this is where the password is encrypted and today I want to take a look at how this works.

First of all, referring to my previous article, there is a call at +523B2 in MSPST32, which has a parameter of my previous password:


In this case, EAX contains the password I entered, 'fred'. When I step over the CALL at +523B2, I notice that EAX now contains the value 0x1B6D2E42 which, for now, I will assume is the coded value of my password.

Then, later on in the proc, at +523CC, this value (which, in the mean time, has been copied to EBX) is compared with the value at ESI-1C, which is 0xFA7D0C3D.

The result is that they are not equal and a dialog is displayed informing me that the password I entered is incorrect.

Now to double check that I am looking in the right place, I enter what I know to be the correct password, 'adad'. The CALL at +523B2 this time sets EAX to 0xFA7D0C3D, which is the same as the value that it will be compared with at +523CC, so I know that this must be where the encryption takes place. If I run the code from here, I see that the incorrect password dialog is skipped over and the PST file is opened.

So, to start looking at the encryption, I set a breakpoint at +523B2 in OllyDBG, and then step into the proc that is called.

The proc has the following code:


You will quickly see that all this proc is doing is calculating the password length. If the password is longer than the maximum (0xF), the length is set to 0xF (and stored in EAX).

The proc then calls another proc at +2AF7, which I step into.

This proc has the following code:

+2B11SHRESI, 8

Now this is starting to look more like it.

The first few lines, in +2AF8 to +2B02 simply check to see whether the password has a length and sets ECX to the length of the password.

There is a loop between +2B05 to +2B22. If I examine the loop in detail, we can see the following occuring:

  • +2B05 - The next character code for the next character from my password is moved into EDX.
  • +2B08 - The lowest order byte of the DWORD value held at EBP+8 is moved into ESI. This is 0 on the first iteration of the loop.
  • +2B0C - The character code of the current letter (which is in EDX) is XORed with the byte in ESI and is stored in EDX.
  • +2B0E - The value held at EBP+8 is moved into ESI. This is 0 on the first iteration of the loop.
  • +2B11 - The value in ESI is right-shifted 8 bits.
  • +2B14 - A 32-bit modifier value is loaded into EDX from the data segment of the MSPST module (I will talk more about this in a minute). The position of the value is offset by 4*EDX (or essentialy EDX DWORDs). EDX contains the character code which was XORed at +2B0C.
  • +2B1B - The value in EDX (the modifier) is XORed with value in ESI, and the result is stored in EDX
  • +2B1D - The position of the next character is incremented
  • +2B1E - The loop counter is decremented
  • +2B1F - The value in EDX is stored back in EBP+8 for the next loop iteration.
  • +2B22 - Loop unless ECX = 0

This process takes place for each character in the entered password.

The modifier value I mentioned, which is loaded into EDX at +2B14, is used to "encrypt" each character in the password. There is a predifined list of 575 32-bit values in the list. All are different, with the exception of the last 128 values, which are all 0xAAAAAAAA.

It is important to note though, that as I am using the ASCII character set, it is only possible to use the first 256 values (0-0xFF), as any value in the range 0-0xFF XORed with any value in the range 0-0xFF can only yield a result of 0-0xFF.

So, based on what I have discovered in the above loop, lets look at how 'adad' will be encrypted:

1'a' 0x61 (97)0x000000000x000x61 (97)0x000000000x3AB551CE0x3AB551CE
2'd' 0x64 (100)0x3AB551CE0xCE0xAA (170)0x003AB5510x36034AF60x3639FFA7
3'a' 0x61 (97)0x3639FFA70xA70xC6 (198)0x003639FF0x720767850x72315E7A
4'd' 0x64 (100)0x72315E7A0x7A0x1E (30)0x0072315E0xFA0F3D630xFA7D0C3D
 It = loop iteration
 C = character
 PV = previous value (from previous iteration), EBP+8
 BV = lowest order byte of PV (PV = EBP+8)
 P = Position for modifier value, result of C XOR BV (at +2B0C)
 SV = Shifted value, result of PV >> 8 (at +2B11)
 M = Modifier value, looked up from position of data segment + P * 4 (at +2B14)
 NV = New value, result of SV XOR M (at +2B1B)

So you will see from the table above, the 'adad' will give a final value of 0xFA7D0C3D, which is as expected.

In conclusion, I have managed to understand how Outlook PST password encryption takes place. As with SourceSafe's password encryption, this has some potential weaknesses, although it does appear that the encryption for Outlook PST passwords is slightly stronger than SourceSafe's.

In my next article, I hope to look at the weaknesses in the encryption and see if there may be a way to decrypt the password from the encrypted password code.

No comments: