Not being able to replicate coherence scores from paper #13
Comments
Thanks for using Palmetto and pointing out this important difference.
Great, I will be patiently waiting! :)
I can confirm that I am encountering the same problem; I couldn't replicate the results from Table 8 either. I also tried to reproduce the correlation values for the NYT topics, using the C_V coherence and Wikipedia as reference corpus, but I got 0.781 instead of 0.803. So far, I haven't found a reason why the results are different. Sorry that I couldn't shed more light on this problem yet. On which OS are you running the program?
I'm using Windows 7 (64-bit). I got a much lower coherence when trying to reproduce the correlation values for the NYT topics, but I probably did something wrong there myself.
Feel free to reuse/check my correlation implementations. You can find them at
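The Pearson correlation discussed above (e.g. the 0.781 vs. 0.803 comparison) is straightforward to compute from scratch. A minimal sketch follows; the score lists are made-up illustrative numbers, not values from the paper or this thread.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical numbers, NOT from the paper: per-topic coherence scores
# and the corresponding averaged human ratings.
coherences = [0.52, 0.75, 0.73, 0.31]
ratings = [2.1, 2.8, 2.7, 1.4]
print(pearson(coherences, ratings))
```

Comparing a hand-rolled implementation like this against a library version (e.g. `scipy.stats.pearsonr`) is one way to rule out the correlation step as the source of the discrepancy.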
Is there any more information available on what is causing the difference between the values in the paper and the ones calculated using the provided source? I get similar, but not exactly the same, values as the ones reported in the original issue. This is using the Wikipedia index provided at http://139.18.2.164/mroeder/palmetto/ and compiling the library locally as a jar (I get the same values using the provided jar as well). I am on a Linux system with Fedora 27. The values I get are:
I also checked out an older version (aa8b650) and compiled it; that version returned exactly the same values as the current one. Is there a known commit of the library that returns the same values as the paper? That would greatly help in troubleshooting what is causing these differences. Or could the difference depend on the Wikipedia_db version? But the version provided is dated May 2014, so it should not have changed, right?
Sorry, I still couldn't figure out where the difference comes from. The implementation itself does not seem to cause the problem. I also made sure that for the examples posted above there is no influence from a lemmatizer in the preprocessing. So there are two possible sources left:
In general, it seems like C_V also has a drawback, described in #12: it does not behave well when used on randomly generated word sets. So, finally, I would suggest using C_P, NPMI, or UCI for evaluating topics.
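For readers unfamiliar with the suggested alternatives: NPMI-based coherence averages the normalized pointwise mutual information over all word pairs of a topic, with probabilities estimated from document (co-)occurrence counts. A minimal sketch, not Palmetto's actual implementation (Palmetto uses a sliding-window estimate over a Wikipedia index, among other details):

```python
import math
from itertools import combinations

def npmi_coherence(topic_words, documents, eps=1e-12):
    """Average NPMI over all word pairs of a topic.

    Probabilities are estimated as the fraction of documents containing
    the word(s) -- a simplification of Palmetto's windowed estimation.
    """
    docs = [set(d) for d in documents]
    n = len(docs)

    def p(*words):
        return sum(all(w in d for w in words) for d in docs) / n

    scores = []
    for wi, wj in combinations(topic_words, 2):
        p_ij = p(wi, wj)
        pmi = math.log((p_ij + eps) / (p(wi) * p(wj) + eps))
        scores.append(pmi / -math.log(p_ij + eps))  # normalize to [-1, 1]
    return sum(scores) / len(scores)
```

A pair that always co-occurs scores near 1, an independent pair near 0, which is what makes NPMI easier to interpret than raw PMI.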
So is the stance that we should not use C_V to evaluate LDA topic models, or only if the corpus size is small?
The main issue with respect to our implementation is fixed with #81. It turned out that a parameter was implemented incorrectly and, hence, C_V showed strange behavior. After the fix, tests showed that C_V works as it should, i.e., although the exact values described in this issue are still not reproduced, the Pearson correlation values of C_V fit the values reported in the paper. Does that answer your question?
Dear all,
For my research I want to evaluate a new semantic coherence measure with the ones available in Palmetto, especially C_V and C_A. I'm trying to replicate some results described in your paper ("Exploring the Space of Topic Coherence Measures"), by using the topics and human ratings that you have published. However, I'm not able to replicate the same Pearson correlation scores as in Table 2. Having a closer look, I found that I'm also not able to replicate the coherence scores as shown in Table 8 of the paper. In this table the following coherence scores, using the measure C_V, are displayed:
0.94 company sell corporation own acquire purchase buy business sale owner
0.91 age population household female family census live average median income
0.86 jewish israel jew israeli jerusalem rabbi hebrew palestinian palestine holocaust
Running Palmetto as a jar, using the wikipedia_db downloaded from the link on GitHub and the above topics in a .txt file, I get the following scores:
0.52072 company sell corporation own acquire purchase buy business sale owner
0.75174 age population household female family census live average median income
0.73356 jewish israel jew israeli jerusalem rabbi hebrew palestinian palestine holocaust
Using the web service I also get different scores:
http://palmetto.aksw.org/palmetto-webapp/service/cv?words=company%20sell%20corporation%20own%20acquire%20purchase%20buy%20business%20sale%20owner
http://palmetto.aksw.org/palmetto-webapp/service/cv?words=age%20population%20household%20female%20family%20census%20live%20average%20median%20income
http://palmetto.aksw.org/palmetto-webapp/service/cv?words=jewish%20israel%20jew%20israeli%20jerusalem%20rabbi%20hebrew%20palestinian%20palestine%20holocaust
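For anyone reproducing these web-service checks, the request URLs above follow a simple pattern: measure name as the path segment, topic words joined by URL-encoded spaces. A small helper to build them (the base URL and `cv` endpoint are taken from the links above; actually fetching the score requires network access and is omitted here):

```python
from urllib.parse import quote

BASE = "http://palmetto.aksw.org/palmetto-webapp/service"

def palmetto_url(measure, words):
    """Build a Palmetto web-service request URL for the given coherence measure."""
    return f"{BASE}/{measure}?words={quote(' '.join(words))}"

topic = ["company", "sell", "corporation", "own", "acquire",
         "purchase", "buy", "business", "sale", "owner"]
print(palmetto_url("cv", topic))
```

Note that `urllib.parse.quote` encodes spaces as `%20`, matching the URLs in this thread, whereas `quote_plus` would produce `+`.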
Am I making a mistake somewhere? How can the scores here be different from the C_V scores displayed in the paper?