elijah-potter
Collaborator

Issues

Description

Draft

Demo

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code
  • I have added tests to cover my changes

assert_top3_suggestion_result(
"I had good day.",
MissingArticle::default(),
"I had good day.",

Ironically, is the article missing here?

Lint: Miscellaneous (0 priority)
Message: |
138 | finding it very nice, (it had, in fact, a sort of mixed flavour of cherry-tart,
| ^~~~~~~ Consider adding an article before this noun.

@hippietrail hippietrail Jul 22, 2025

You don't want to add a/an before mass nouns. "A stuff", "an information", "a knowledge", etc.
But you must also keep in mind that most mass nouns also have countable senses: "a beer", "an area", "a flavour", etc. These can now all be fully queried.
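
As a rough sketch of how a linter could use this distinction, the check below only flags a bare noun when it has no mass sense at all. The `Countability` struct, the `lookup` function, and the tiny word list are hypothetical stand-ins for a real dictionary query, not the project's actual API.

```rust
/// Hypothetical countability flags for a noun (not the real metadata type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Countability {
    pub mass: bool,
    pub countable: bool,
}

/// Hypothetical toy lexicon standing in for a real dictionary lookup.
pub fn lookup(noun: &str) -> Option<Countability> {
    match noun {
        // Mass only: "a stuff", "an information" are wrong.
        "stuff" | "information" | "knowledge" => Some(Countability { mass: true, countable: false }),
        // Both senses: "beer" and "a beer" are each fine.
        "beer" | "flavour" | "glass" => Some(Countability { mass: true, countable: true }),
        // Countable only.
        "day" | "cat" => Some(Countability { mass: false, countable: true }),
        _ => None,
    }
}

/// Flag a bare singular noun only when it is countable and has no mass sense.
/// Nouns with both senses are left alone, because the bare form may be the mass reading.
pub fn should_flag_missing_article(noun: &str) -> bool {
    match lookup(noun) {
        Some(c) => c.countable && !c.mass,
        // Unknown word: stay silent rather than guess.
        None => false,
    }
}
```

With this rule "I had good day" would still be flagged, while "mixed flavour" would not.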

Lint: Miscellaneous (0 priority)
Message: |
317 | “Perhaps it doesn’t understand English,” thought Alice; “I daresay it’s a French
| ^~~~~~~ Consider adding an article before this noun.

Oh, I have probably not yet annotated all of the names of languages as mass nouns. Many, like "German", are both mass and countable, but "English" is not. (Linguists do talk about "Englishes", but non-linguists don't.)

Lint: Miscellaneous (0 priority)
Message: |
347 | and beg for its dinner, and all sorts of things—I can’t remember half of
| ^~~~ Consider adding an article before this noun.

@hippietrail hippietrail Jul 22, 2025

This is different. "Half" is only a countable noun. I don't know how to classify this off the top of my head. I'm thinking "half of" as a phrase is a quantifier... Wow, discussions online seem to say this usage of "half" is a mass noun after all.

Lint: Miscellaneous (0 priority)
Message: |
412 | advisable to go with Edgar Atheling to meet William and offer him the crown.
| ^~~~~~~ Consider adding an article before this noun.

This definitely is different. Proper nouns are already "definite" and so don't take articles unless the article is part of the name, as in "The Gambia", or the name is a surname referring to the whole family, as in "The Simpsons".
And of course in a humorous style: "I met two Toms and a William".

Lint: Miscellaneous (0 priority)
Message: |
544 | hurried off at once: one old Magpie began wrapping itself up very carefully,
| ^~~~~~~~ Consider adding an article before this noun.

Here we've run into the fact that verbs in their -ing form can behave as both "gerunds" and "present participles". Gerunds behave as nouns and present participles behave as adjectives, but in either case they still also act like verbs in that they can take objects when they're transitive verbs.

I think in this case it's a gerund because it's acting as a noun that's the object of the verb "began". But I'm not 100% sure and would probably consult two LLMs (-:

My hunch is that the best way forward would be to handle them specifically. I'd better get to work on making sure all the verb form properties are fully propagated. I think .is_verb_progressive_form should already be fully propagated, though, since -ing forms are fully regular.
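
A minimal sketch of what handling them specifically might look like: skip any article candidate that is also an -ing verb form. `WordFlags` and its fields are hypothetical simplifications for illustration; the field name only echoes the property mentioned above, and the real metadata type may look quite different.

```rust
/// Hypothetical, simplified word metadata (the real type may differ).
#[derive(Clone, Copy, Debug, Default)]
pub struct WordFlags {
    pub noun: bool,
    pub verb_progressive_form: bool,
}

/// Treat a word as an article candidate only if it can be a noun
/// and is not an -ing verb form (gerund or present participle).
pub fn is_article_candidate(flags: WordFlags) -> bool {
    flags.noun && !flags.verb_progressive_form
}
```

Under this rule "wrapping" in "began wrapping itself up" would no longer be a candidate, at the cost of also skipping -ing words that really are plain nouns.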

To follow up on my comment about verb properties: all the properties of regular verbs can already be queried, since they're generated by the annotations that derive the inflections. For irregular verbs, though, we need 4 or 5 more annotations for manually entered irregular inflections. I might do this today.

Lint: Miscellaneous (0 priority)
Message: |
624 | When I used to read fairy-tales, I fancied that kind of thing never happened,
| ^~~~~ Consider adding an article before this noun.

We really need more support for hyphenated terms, and further down the line we're going to need support for compound nouns that use spaces too.

Maybe you can already do this using NominalPhrase, but I've been thinking for months about how best to handle the full nominal phrase (including the optional determiner and optional adjectives) versus the core nominal phrase (just a run of nouns). I'm not sure whether a run of nouns always means a compound noun.

For runs of nouns as a unit, whatever the right term is, normally the last one in the group is the "head" and bears the inflection: "man hole covers" and not "men hole cover". But there's a set of special ones like "passers by" and "sons in law". They can probably include possessives in the middle, though. And what happens with gerunds in the middle versus present participles acting as adjectives at the beginning might need some thought and experimentation.
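
The head rule could be sketched roughly as follows. The `HEAD_FIRST` exception list, the function names, and the naive "add s" pluralisation are all assumptions made for illustration; a real implementation would draw on dictionary annotations instead.

```rust
/// Hypothetical list of compounds whose head is the first word, as in "passers by".
const HEAD_FIRST: &[&[&str]] = &[&["passer", "by"], &["son", "in", "law"]];

/// Return the index of the word that bears the inflection:
/// the first word for listed exceptions, otherwise the last word.
pub fn head_index(words: &[&str]) -> Option<usize> {
    if words.is_empty() {
        return None;
    }
    if HEAD_FIRST.iter().any(|compound| *compound == words) {
        return Some(0);
    }
    Some(words.len() - 1)
}

/// Pluralise a compound by naively appending "s" to its head word only.
pub fn pluralize(words: &[&str]) -> Option<String> {
    let head = head_index(words)?;
    let parts: Vec<String> = words
        .iter()
        .enumerate()
        .map(|(i, w)| if i == head { format!("{}s", w) } else { w.to_string() })
        .collect();
    Some(parts.join(" "))
}
```

This deliberately ignores possessives and gerunds inside the run, which are the open questions raised above.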

Message: |
679 | another snatch in the air. This time there were two little shrieks, and more
680 | sounds of broken glass. “What a number of cucumber-frames there must be!”
| ^~~~~ Consider adding an article before this noun.

This reveals two problems. We've already covered that "glass" is both a mass noun and a countable noun. Here it's a mass noun.

But even if it were only countable and needed correction, the article would actually belong before the adjective that precedes the noun: "of a broken glass", not "of broken a glass".
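
One way to find the right insertion point, sketched under the assumption that tokens already carry a coarse part-of-speech tag: start at the flagged noun and walk left over any adjectives. The `Pos` enum and `article_insert_index` are hypothetical names, not existing code in this PR.

```rust
/// Hypothetical coarse part-of-speech tag for one token.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Pos {
    Adjective,
    Noun,
    Other,
}

/// Given the tags of a sentence and the index of the flagged noun,
/// return the token index before which an article should be inserted.
/// Returns None if the index is out of range or does not point at a noun.
pub fn article_insert_index(tags: &[Pos], noun_idx: usize) -> Option<usize> {
    if tags.get(noun_idx) != Some(&Pos::Noun) {
        return None;
    }
    let mut idx = noun_idx;
    // Step left over every adjective directly preceding the noun.
    while idx > 0 && tags[idx - 1] == Pos::Adjective {
        idx -= 1;
    }
    Some(idx)
}
```

For "of broken glass" tagged as Other, Adjective, Noun, the suggested position is before "broken", which gives "of a broken glass".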

Message: |
764 | over; and the moment she appeared on the other side, the puppy made another rush
765 | at the stick, and tumbled head over heels in its hurry to get hold of it; then
| ^~~~ Consider adding an article before this noun.

In this case "head over heels" is an idiom, so the rules are different. Idioms act as single words. Don't confuse "idiom" with "idiomatic", which means "as used naturally by native speakers" and covers other things that are not fixed phrases.

I've been thinking of addressing this problem like this:

  • Mark idioms in the dictionary with a new annotation.
  • When building the FstDictionary by parsing the dictionary.dict and applying the affix/annotation rules, add a property to each word that is either the first or last word of an idiom.

This way, in linters we can check that the following word is not the start of an idiom or the previous word is not the end of an idiom.
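
A standalone sketch of that idea, assuming the idioms are available as plain strings: collect the first and last word of each idiom into sets that a linter can query. `IdiomEdges` and its methods are hypothetical names and do not reflect how FstDictionary actually stores word properties.

```rust
use std::collections::HashSet;

/// Hypothetical index of words that begin or end a known idiom.
pub struct IdiomEdges {
    first_words: HashSet<String>,
    last_words: HashSet<String>,
}

impl IdiomEdges {
    /// Build the index from a list of idioms written as space-separated words.
    pub fn from_idioms(idioms: &[&str]) -> Self {
        let mut first_words = HashSet::new();
        let mut last_words = HashSet::new();
        for idiom in idioms {
            let mut words = idiom.split_whitespace();
            if let Some(first) = words.next() {
                first_words.insert(first.to_lowercase());
                // A one-word entry is both its own first and last word.
                let last = words.last().unwrap_or(first);
                last_words.insert(last.to_lowercase());
            }
        }
        Self { first_words, last_words }
    }

    /// True if some idiom starts with this word (case-insensitive).
    pub fn starts_idiom(&self, word: &str) -> bool {
        self.first_words.contains(&word.to_lowercase())
    }

    /// True if some idiom ends with this word (case-insensitive).
    pub fn ends_idiom(&self, word: &str) -> bool {
        self.last_words.contains(&word.to_lowercase())
    }
}
```

Note that this is coarse: once "head over heels" is registered, every occurrence of "head" counts as a possible idiom start, whether or not the rest of the idiom follows.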

It surely needs further thought.

Lint: Miscellaneous (0 priority)
Message: |
1226 | up into a sort of knot, and then keep tight hold of its right ear and left foot,
| ^~~~ Consider adding an article before this noun.

Here "foot" and even "left foot" are part of a list making up a noun phrase that as a group has the determiner "its". This gets deeper and deeper into syntax parsing!

For now, you can see that the adjective "left" makes it not require an article.

2 participants