Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MurmurHash3_x86_32 Algorithm Does Not match the one in Wikipedia #25

Open
GoogleCodeExporter opened this issue Apr 3, 2015 · 3 comments

Comments

@GoogleCodeExporter
Copy link

Your MurmurHash3 implementation uses the following tail code:

  switch(len & 3)
  {
  case 3: k1 ^= tail[2] << 16;
  case 2: k1 ^= tail[1] << 8;
  case 1: k1 ^= tail[0];
          k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
  };


But this does match the (otherwise identical) MurmurHash algorithm in Wikipedia:


    with any remainingBytesInKey
        remainingBytes <- SwapEndianOrderOf(remainingBytesInKey)
        // Note: Endian swapping is only necessary on big-endian machines.
        //       The purpose is to place the meaningful digits towards the low end of the value,
        //       so that these digits have the greatest potential to affect the low range digits
        //       in the subsequent multiplication.  Consider that locating the meaningful digits
        //       in the high range would produce a greater effect upon the high digits of the
        //       multiplication, and notably, that such high digits are likely to be discarded
        //       by the modulo arithmetic under overflow.  We don't want that.

        remainingBytes <- remainingBytes * c1
        remainingBytes <- (remainingBytes << r1) OR (remainingBytes >> (32 - r1))
        remainingBytes <- remainingBytes * c2

        hash <- hash XOR remainingBytes


To make the two versions match, you'd have to change the bitwise-xor in your 
implementation to bitwise-or, as follows:

  switch(len & 3)
  {
  case 3: k1 |= tail[2] << 16;
  case 2: k1 |= tail[1] << 8;
  case 1: k1 |= tail[0];
          k1 *= c1; k1 = ROTL32(k1,15); k1 *= c2; h1 ^= k1;
  };



I'm just wondering which implementation is correct.  I can't figure out where 
the canonical implementation of the algorithm is.


Original issue reported on code.google.com by [email protected] on 17 Aug 2013 at 7:07

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant