Consider this Python code snippet:
import hashlib

while True:
    print("Enter your password")
    s = raw_input('--> ')  # raw_input: this is Python 2 code
    print(s)
    print("Now the md5sum")
    s = hashlib.md5(s).hexdigest()  # s is rebound to the hex digest
    print(s)
By any measure this is fairly simple code to understand: we use s as a placeholder for the incoming string, compute its md5sum, and replace the value of s with the hex digest. In short, s now contains the md5sum in hex, right? So any plaintext we entered should have vanished, flushed out by the garbage collector in the Python VM, right?
Let's give it a test.
So most people would think any previous plaintext value would be washed out of memory. The string DogFood that we entered won't exist anymore, right? Let's attach a debugger to the running script (I'm using edb debugger, the best thing besides windbg. Sorry Stallman, gdb just sux!!!!).
High-level languages often have data types that are immutable. The program can only write to an immutable object once, at creation time. In other words, s is just a label, and the new string may be stored at the same address or anywhere else in memory. (Note to self: heap/stack/bss/dss are really just labels for regions of memory, there to give us an approximate understanding of what a specific region is used for.)
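To see the "s is just a label" behaviour without a debugger, here is a minimal sketch. It assumes Python 3 (so the string is encoded to bytes before hashing), and the name still_there is made up for illustration. It keeps a second label on the original string, rebinds s, and checks that the original object was left alone.

```python
import hashlib

s = "DogFood"
still_there = s  # second label on the same string object (illustrative name)

# Hashing builds a brand-new string; s is simply re-pointed at it.
s = hashlib.md5(s.encode()).hexdigest()

print(s is still_there)  # False: s now names a different object
print(still_there)       # DogFood: the original string was never modified
```

This only shows the rebinding at the language level. Where the old bytes end up once the last label is gone is up to the interpreter and the allocator, which is why the debugger is still needed to check.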
Let's search for the md5sum string: 36f65df05afee9fb079943b7ba5d9617
The string was stored at a different address!!
So in a high-level language, there is no guarantee that the initial plaintext at a given address will be overwritten by the hashed or encrypted blob/binary. The only way to make sure the overwrite happens is to use a mutable data structure whose elements can be replaced in place.
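A rough sketch of the mutable approach, assuming Python 3 and a bytearray as the mutable structure. The helper name hash_and_wipe is hypothetical, not a library function. It hashes the buffer and then zeroes every byte in the same buffer instead of rebinding a name.

```python
import hashlib

def hash_and_wipe(buf):
    """Return the MD5 hex digest of buf, then zero buf in place.

    buf is expected to be a bytearray (hypothetical helper for illustration).
    """
    digest = hashlib.md5(buf).hexdigest()
    for i in range(len(buf)):
        buf[i] = 0  # overwrite each byte inside the same buffer
    return digest

secret = bytearray(b"DogFood")
digest = hash_and_wipe(secret)
print(digest)
print(secret)  # the buffer now holds only zero bytes
```

This wipes the one buffer we control. Any copy made along the way (for example the str returned by input() before it is converted to bytes) is still an immutable object that we cannot overwrite, so the secret has to arrive in a mutable buffer in the first place for this to help.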
So why did you see chunks of unencrypted/encrypted data in the Heartbleed heap leak? Not a surprise anymore, right?
I like using edb debugger; it helps with things like a binary string search. Since we have replaced the value of s, DogFood, with a hex string, we shouldn't see any DogFood string in memory, right? Unfortunately, that is not true at all :(
[Screenshot: DogFood in hex]
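If your debugger's binary search wants raw bytes rather than text, the pattern to look for can be worked out like this. It is a small Python 3 sketch, and needle is just an illustrative name.

```python
needle = "DogFood".encode("ascii")

print(needle.hex())                          # 446f67466f6f64
print(" ".join("%02x" % b for b in needle))  # 44 6f 67 46 6f 6f 64
```

The same trick gives the byte pattern for the md5sum string, since the hex digest is itself stored as plain ASCII text.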