If you look at the attributes that spaCy generates for its tokens, you can imagine that some of them could be useful features for machine learning pipelines. To name a few:
- is_oov: is the token out of vocabulary, i.e. does it lack a vector?
- is_stop: is the token a stopword?
- lemma_: what is the lemma of the token?
- pos/tag: coarse/fine-grained part-of-speech information
- morphological features
- grammatical dependency
Each of these can be given a discrete representation, so in general they could be added as features to a Rasa pipeline.
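As a rough illustration of what a "discrete representation" could look like, here is a minimal sketch that turns token attributes into one-hot vectors. It uses only the standard library: `FakeToken` is a hypothetical stand-in that mirrors a few spaCy `Token` attribute names (`is_oov`, `is_stop`, `lemma_`, `pos_`, `tag_`, `dep_`), and `token_to_features`, `build_vocab`, and `one_hot` are illustrative helpers, not part of spaCy or Rasa. The attribute values below are made up for the example rather than produced by a real model.

```python
from dataclasses import dataclass


@dataclass
class FakeToken:
    """Hypothetical stand-in mirroring a few spaCy Token attribute names."""
    text: str
    lemma_: str
    pos_: str
    tag_: str
    dep_: str
    is_oov: bool
    is_stop: bool


def token_to_features(token):
    """Map token attributes to discrete string features such as 'pos=NOUN'."""
    return [
        f"is_oov={token.is_oov}",
        f"is_stop={token.is_stop}",
        f"lemma={token.lemma_}",
        f"pos={token.pos_}",
        f"tag={token.tag_}",
        f"dep={token.dep_}",
    ]


def build_vocab(feature_lists):
    """Assign each distinct feature string a column index, in first-seen order."""
    vocab = {}
    for features in feature_lists:
        for feature in features:
            vocab.setdefault(feature, len(vocab))
    return vocab


def one_hot(features, vocab):
    """Return a 0/1 vector with a 1 for every known feature of the token."""
    vector = [0] * len(vocab)
    for feature in features:
        if feature in vocab:  # features unseen when the vocab was built are skipped
            vector[vocab[feature]] = 1
    return vector


# Made-up attribute values for illustration only.
tokens = [
    FakeToken("cats", "cat", "NOUN", "NNS", "nsubj", is_oov=False, is_stop=False),
    FakeToken("the", "the", "DET", "DT", "det", is_oov=False, is_stop=True),
]
feature_lists = [token_to_features(t) for t in tokens]
vocab = build_vocab(feature_lists)
vectors = [one_hot(f, vocab) for f in feature_lists]
```

Because the helpers only read attributes by name, the same `token_to_features` function should work on real spaCy tokens as well. How such sparse vectors would be wired into a Rasa pipeline is left open here, since that depends on the Rasa version and its custom component interface.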