Pure JavaScript implementation of the LEFFF lemmatizer.
Parts of this README come from the Python implementation: https://github.com/ClaudeCoulombe/FrenchLefffLemmatizer
A French lemmatizer in JavaScript based on the LEFFF (Lexique des Formes Fléchies du Français / Lexicon of French Inflected Forms), a large-scale morphological and syntactic lexicon for French. A lemmatizer returns the lemma, or more simply the dictionary entry, of a word. In French, lemmatizing a verb returns its infinitive; for other words, lemmatization returns the masculine singular form.
Sagot (2010). The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Istanbul, Turkey. Retrieved from Benoît Sagot's webpage about LEFFF.
In this project, we use the morphological lexicon only:
- The .mlex file, which has a simple CSV-like format (4 fields separated by \t)
- Tagset format FRMG, from the ALPAGE project (since 2004)
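To make the 4-field layout concrete, here is a minimal sketch of splitting one .mlex line into its fields. The helper `parseMlexLine`, the field names (`form`, `category`, `lemma`, `tags`) and the sample line are assumptions made for illustration; they are not taken from node-lefff or from the lexicon itself.

```javascript
// Parse one line of a .mlex file into its four tab-separated fields.
// The field names are assumed for illustration, not taken from node-lefff.
function parseMlexLine(line) {
  // Drop a trailing newline, then split on tab characters
  const fields = line.replace(/\r?\n$/, '').split('\t');
  if (fields.length !== 4) {
    throw new Error(`Expected 4 tab-separated fields, got ${fields.length}`);
  }
  const [form, category, lemma, tags] = fields;
  return { form, category, lemma, tags };
}

// Hypothetical sample line, made up for this example
const entry = parseMlexLine('actrices\tnc\tacteur\tfp');
console.log(entry.lemma); // prints "acteur"
```

Rejecting lines that do not have exactly four fields keeps malformed input from silently producing wrong lemmas.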
npm i node-lefff
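const nodeLefff = require('node-lefff');

// In a CommonJS module, await is only valid inside an async function
(async () => {
  const nl = await nodeLefff.load();
  nl.lem('action');   // action
  nl.lem('acteur');   // acteur
  nl.lem('actrices'); // acteur
  nl.lem('Dleyton');  // Dleyton (no known lemma, so the word is returned unchanged)
})();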
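The calls above lemmatize one word at a time. A whole sentence could be handled roughly as sketched below. `lemmatizeText` is a hypothetical helper, not part of node-lefff, and the tokenizer is deliberately naive. A stand-in lemmatizer is used so the sketch runs without the lexicon; with the real library you would pass `(word) => nl.lem(word)` instead.

```javascript
// Lemmatize each word of a text with a supplied `lem` function.
// `lemmatizeText` is a hypothetical helper, not part of node-lefff.
function lemmatizeText(lem, text) {
  // Naive tokenizer: split on every run of non-letter characters
  const tokens = text.split(/[^\p{L}]+/u).filter(Boolean);
  return tokens.map((token) => lem(token));
}

// Stand-in lemmatizer with a one-entry lexicon, made up for this example.
// Like nl.lem, it returns the word itself when no lemma is known.
const fakeLexicon = new Map([['actrices', 'acteur']]);
const fakeLem = (word) => fakeLexicon.get(word) ?? word;

console.log(lemmatizeText(fakeLem, 'Les actrices, en action !'));
// → [ 'Les', 'acteur', 'en', 'action' ]
```

Splitting on non-letters also breaks elided forms such as "m'aimaient" into two tokens, so a real pipeline would probably want a French-aware tokenizer.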
How to use with natural
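const nodeLefff = require('node-lefff');
const Stemmer = require('natural/lib/natural/stemmers/stemmer_fr');

// In a CommonJS module, await is only valid inside an async function
(async () => {
  const nl = await nodeLefff.load();
  // Keep natural's French tokenizing pipeline, but replace its stem step with the LEFFF lemmatizer
  const LefffLemmer = new Stemmer();
  LefffLemmer.stem = nl.lem;
  LefffLemmer.tokenizeAndStem('Mes mémés m\'aimaient mais pas papa'); // ['mémé', 'aimer', 'papa']
})();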
nl.lem(word)
- word <String>: Word
- @return <String>: Lemmatized word, or the word itself if no lemma is known
nl.infos(word)
- word <String>: Word
- @return Array<Object>:
[{
  type: String,
  lemma: String, // Lemmatized word
  mode: String
}, ...]
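Because the return value is an array, a single inflected form can map to several entries. The sketch below shows one way to reduce such an array to its distinct lemmas, optionally keeping only one `type`. `distinctLemmas` is a hypothetical helper, and the sample entries (including their `type` and `mode` values) are made-up placeholders, not real output from the library.

```javascript
// Collect the distinct lemmas from an array of { type, lemma, mode } objects,
// optionally keeping only the entries of one `type`.
// `distinctLemmas` is a hypothetical helper, not part of node-lefff.
function distinctLemmas(entries, type) {
  const kept = type === undefined
    ? entries
    : entries.filter((entry) => entry.type === type);
  // A Set removes duplicates while preserving first-seen order
  return [...new Set(kept.map((entry) => entry.lemma))];
}

// Made-up sample data; the `type` and `mode` values are placeholders
const sample = [
  { type: 'v', lemma: 'porter', mode: 'P3s' },
  { type: 'nc', lemma: 'porte', mode: 'fs' },
  { type: 'v', lemma: 'porter', mode: 'S3s' },
];

console.log(distinctLemmas(sample, 'v')); // → [ 'porter' ]
console.log(distinctLemmas(sample));      // → [ 'porter', 'porte' ]
```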
Learn more about mode and type here:
- mode String: mode string returned by nl.infos
- @return Object:
{
indicatif: Bool,
conditionnel: Bool,
impératif: Bool,
subjonctif: Bool,
participe: Bool,
infinitif: Bool,
présent: Bool,
futur: Bool,
imparfait: Bool,
passéSimple: Bool,
passé: Bool,
premièrePersonne: Bool,
deuxièmePersonne: Bool,
troisièmePersonne: Bool,
masculin: Bool,
féminin: Bool,
singulier: Bool,
pluriel: Bool,
}
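An object of boolean flags like this is easy to turn into a readable description. The sketch below lists the flags that are set. `activeFlags` is a hypothetical helper, and the sample object is made up and abbreviated to a few keys for brevity.

```javascript
// List the names of the flags that are true in an object of boolean flags.
// `activeFlags` is a hypothetical helper, not part of node-lefff.
function activeFlags(parsedMode) {
  return Object.keys(parsedMode).filter((key) => parsedMode[key] === true);
}

// Made-up sample, abbreviated to a few of the keys listed above
const parsed = {
  indicatif: true,
  subjonctif: false,
  présent: true,
  troisièmePersonne: true,
  singulier: true,
  pluriel: false,
};

console.log(activeFlags(parsed).join(', '));
// → indicatif, présent, troisièmePersonne, singulier
```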