Generate a consistent (deterministic) hashcode in .NET Core (if that does not cause a security issue)

With .NET Core, Microsoft changed the behaviour of string.GetHashCode() so that the same string produces a different hash code on each program execution.

This is for security reasons: randomizing the hash per process makes it harder to mount hash-flooding (collision-based denial-of-service) attacks and similar attacks that rely on predicting a hash code.

A fuller summary is available at andrewlock.net.


In some cases you may prefer the old behaviour, although strictly speaking this is not a breaking change, since GetHashCode() was never documented as stable across program executions.

So provided it does not introduce a security issue, you may want to keep generating the same hashcode for the same data, across program executions, using a custom hashcode.

An interesting example is from Microsoft themselves, with a deterministic hashcode for a Dictionary.

Unfortunately, Microsoft's deterministic hashcode source is internal, not public (perhaps to reduce security bugs).

So, we can write our own:


Solution 1 - from andrewlock.net

static int GetDeterministicHashCode(this string str)
{
    unchecked
    {
        int hash1 = (5381 << 16) + 5381;
        int hash2 = hash1;

        for (int i = 0; i < str.Length; i += 2)
        {
            hash1 = ((hash1 << 5) + hash1) ^ str[i];
            if (i == str.Length - 1)
                break;
            hash2 = ((hash2 << 5) + hash2) ^ str[i + 1];
        }

        return hash1 + (hash2 * 1566083941);
    }
}
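
An extension method only compiles inside a static class, so the snippet above needs a wrapper before it can be called. The following is a minimal sketch of one way to do that. The class name StringHashExtensions and the helper GetStableBucket are hypothetical names chosen for illustration, not part of the original article:

```csharp
using System;

public static class StringHashExtensions
{
    // Same algorithm as Solution 1, placed in a static class so that it
    // compiles as an extension method.
    public static int GetDeterministicHashCode(this string str)
    {
        unchecked
        {
            int hash1 = (5381 << 16) + 5381;
            int hash2 = hash1;

            for (int i = 0; i < str.Length; i += 2)
            {
                hash1 = ((hash1 << 5) + hash1) ^ str[i];
                if (i == str.Length - 1)
                    break;
                hash2 = ((hash2 << 5) + hash2) ^ str[i + 1];
            }

            return hash1 + (hash2 * 1566083941);
        }
    }

    // Hypothetical helper: maps a key to a bucket in [0, bucketCount).
    // Reinterpreting the hash as uint avoids a negative remainder.
    public static int GetStableBucket(this string key, int bucketCount)
    {
        uint hash = unchecked((uint)key.GetDeterministicHashCode());
        return (int)(hash % (uint)bucketCount);
    }
}

public static class Program
{
    public static void Main()
    {
        string key = "customer-42";

        // Both values stay the same across program executions.
        Console.WriteLine(key.GetDeterministicHashCode());
        Console.WriteLine(key.GetStableBucket(8));
    }
}
```

The bucket helper shows the kind of use where a stable hash matters: the same key always lands in the same bucket, even after a restart.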


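Solution 2 - using an MD5 hash.

MD5 is a hash function, not encryption. It is deterministic and produces a short digest of 16 bytes. It is no longer considered cryptographically secure, so use it here only as a consistent, non-security hash.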

This code seems simpler:

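// Needs: using System.Buffers.Binary; using System.Security.Cryptography; using System.Text;
static int GetDeterministicHashCode(this string str)
{
    using (MD5 md5 = MD5.Create())
    {
        byte[] inputBytes = Encoding.UTF8.GetBytes(str);
        byte[] hashBytes = md5.ComputeHash(inputBytes);

        // Read the first four bytes of the 16-byte digest as the int hash code.
        // A fixed byte order keeps the result the same on every CPU.
        return BinaryPrimitives.ReadInt32LittleEndian(hashBytes);
    }
}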
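
When the full 128-bit digest is more useful than a 32-bit int (for example as a cache key), a string-returning variant is an option. This is a minimal sketch, assuming .NET 5 or later for MD5.HashData and Convert.ToHexString. The names StringDigestExtensions and GetDeterministicHashString are hypothetical:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StringDigestExtensions
{
    // Hypothetical helper: returns the 16-byte MD5 digest of the UTF-8
    // encoded string as 32 uppercase hexadecimal characters.
    public static string GetDeterministicHashString(this string str)
    {
        byte[] inputBytes = Encoding.UTF8.GetBytes(str);
        byte[] hashBytes = MD5.HashData(inputBytes); // one-shot static API
        return Convert.ToHexString(hashBytes);
    }
}

public static class Program
{
    public static void Main()
    {
        // "abc" is one of the MD5 test vectors from RFC 1321.
        Console.WriteLine("abc".GetDeterministicHashString());
        // prints 900150983CD24FB0D6963F7D28E17F72
    }
}
```

The string form is not a drop-in replacement for GetHashCode(), which must return an int, but it keeps all 128 bits and so collides far less often.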
