Did you create code for calculating all the similarity scores with the Min Hash approach? #4

Chris,

Thank you for the tutorial and code. I found yours to be one of the more understandable explanations. MinHashing seems extremely clever. 

In your runHashMinExample, you commented that you used the direct calculation for the similarities, which I have no problem understanding. However, I have been searching for this "MinHash approach" to computing the similarities. I was wondering if you had written this, or if you could point me in the right direction. I have a huge dataset and I would like to use the MinHash approach.

I am assuming that this would be faster. I am also assuming that the storage code (for the triangle matrix) would remain the same.
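
To make the question concrete, here is a rough sketch of what I imagine the MinHash version looks like. It is not taken from your code: the names `minhash_signature`, `estimate_similarities` and `tri_index` are my own, the hash family `(a*x + b) % prime` and the prime are assumptions on my part, and I am assuming each document is already a set of integer shingle IDs. The estimate for a pair is just the fraction of signature positions that agree, stored in a flat upper-triangle list.

```python
import random

# Assumed prime just above 2**32, so 32-bit shingle IDs fit below it.
PRIME = 4294967311

def make_coeffs(num_hashes, seed=0):
    """Random (a, b) pairs for hash functions of the form (a*x + b) % PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]

def minhash_signature(shingles, coeffs):
    """For each hash function, keep the smallest hash value over the set."""
    return [min((a * s + b) % PRIME for s in shingles) for a, b in coeffs]

def tri_index(i, j, n):
    """Position of pair (i, j), with i < j, in a flat upper-triangle list."""
    return i * n - i * (i + 1) // 2 + (j - i - 1)

def estimate_similarities(signatures):
    """Estimated Jaccard similarity for every pair of documents.

    The estimate is the fraction of signature positions where the two
    documents have the same minimum hash value.
    """
    n = len(signatures)
    num_hashes = len(signatures[0])
    sims = []
    for i in range(n):
        for j in range(i + 1, n):
            matches = sum(1 for x, y in zip(signatures[i], signatures[j])
                          if x == y)
            sims.append(matches / num_hashes)
    return sims

# Toy data: three documents as sets of integer shingle IDs.
docs = [{1, 2, 3}, {1, 2, 3}, {4, 5, 6}]
coeffs = make_coeffs(num_hashes=10)
signatures = [minhash_signature(d, coeffs) for d in docs]
sims = estimate_similarities(signatures)
print(sims[tri_index(0, 1, len(docs))])  # identical sets, so every position matches
```

If this is roughly right, the pairwise loop is still over all pairs, but each comparison only touches a fixed-length signature instead of the full shingle sets, which is where I would expect the speedup to come from.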

Thank you,
Ben
