Adicap : enhancement of regex to match local spelling #158
Comments
Hi @GuillaumePressiat, thanks for this feedback! Of course, feel free to make a PR to improve this pattern! |
Hi @percevalw, thanks for your answer. Nice module, by the way. (I was at "DIM siege AP-HP" when the nlp-segmenter and UIMA pipelines were developed (@parisni and co.), and I tried to contribute a few improvements to the section definitions (editing the file sections.csv)). |
I see! The |
Hi @GuillaumePressiat! Thanks for the feedback :)
should solve this issue! |
Hello @etienneguevel, thanks for the tip! I've modified the two scripts here (patterns.py and adicap.py). For now I don't see where the problem is. Many thanks |
Hello @GuillaumePressiat, I've looked into why your modifications didn't lead to the expected results, and found that there is an issue between the model used for the ADICAP pipeline ( The model used looks like this:
I've opened an issue describing how the sentencizer deals with codes like your ADICAP example: #178 The |
Hello @etienneguevel, Thank you for the feedback! It's quite logical indeed. For now I've just removed all dots in my anapath documents and the basic eds.adicap pipeline works just fine. Thanks for the other issue related to this (eds.sentences cutting codes in different sentences). I will follow this! Guillaume |
Hi @GuillaumePressiat, the ADICAP matcher should now work (in the master branch) without having to modify the text upstream. Please let us know if you still have issues with this component! :) |
Hi @percevalw, It's ok now! Thank you very much! Guillaume |
Description
In my hospital (CHU de Brest), ADICAP codes are written like this:
In this case, the dots spell out the ADICAP structure and the dictionaries for the (d1-d8) part of the code.
Your regex in the ADICAP NER component has no dots, here
Would it be OK if I proposed this modified regex?
It just adds 3 optional dots:
\.{0,1}
ind1_4 = r"[A-Z]\.{0,1}[A-Z]\.{0,1}[A-Z]{2}\.{0,1}"
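As a rough illustration of what the fragment above accepts, here is a minimal sketch using only Python's standard `re` module. The sample strings are made up for the example (they are not real ADICAP codes and not the codes from the original report), and the fragment is checked in isolation rather than inside the full eds.adicap pattern. Note that `\.{0,1}` is equivalent to `\.?`.

```python
import re

# Proposed fragment from this issue: the first four characters of the code,
# with an optional dot after the 1st letter, the 2nd letter and the 3rd-4th pair.
ind1_4 = r"[A-Z]\.{0,1}[A-Z]\.{0,1}[A-Z]{2}\.{0,1}"
pattern = re.compile(ind1_4)

# Hypothetical strings, for illustration only (not real ADICAP codes).
samples = ["BHGS", "B.H.GS.", "B.HGS", "B..HGS"]

# fullmatch checks that the whole sample is covered by the fragment.
results = {s: bool(pattern.fullmatch(s)) for s in samples}
for sample, ok in results.items():
    print(sample, ok)
```

The dotted and undotted spellings both match, while a doubled dot does not, since each dot is allowed at most once per position.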
test :
Many thanks