Resolve names only once #2169
Comments
Hi @facundominguez, I would like to take up this idea for GSOC. I am making the proposal, how can I get it reviewed? |
Greetings @facundominguez , I'm also interested in the idea! Would get in touch :) |
A little bit of analysis follows.
To do name resolution, the current implementation changes the representation of variables from Symbol (in essence a string) to GHC's Var (which is roughly a Name paired with a Type, and perhaps its unfoldings). Name is better than Symbol because it can tell the origin of an identifier, and can even disambiguate names that come from modules with the same name. Up to this point, we are good.
But then Liquid Haskell discards the Vars and keeps Symbols as identifiers before storing the specs in interface files. Vars have multiple references to various bits of information that make them impractical to serialize. But saving Symbols is also problematic, because it forces resolving names again when the specs are read from the interface files.
A possible solution is to use Name instead of Symbol before storing the specs. This makes it possible to recover the Vars from the Names regardless of the particular environment in which the lookup is made.
Alternative: always qualify the Symbols before storing specs
If the symbols are qualified with their module of origin, it should be possible to recover a Var from them. However, this has some disadvantages:
Implementation strategy
This plan aims to serialize Names instead of Symbols. A goal of the following stages is to keep tests passing all the way to completion. The first two stages are refactorings that set the scene for making the behavior changes in small increments during stage 3.
Stage 1
Before serializing a spec, use a pair
Stage 2
Convert the deserialized Names to Vars, and propagate the Vars to all the places where their Symbols are used, but keep using the Symbols. Sticking to the Symbols makes this a refactoring that shouldn't yet affect the behavior of LH. It is unclear to me whether Names can be converted to Vars without using the IO monad in the GHC API. If the IO monad is needed, they might need to be converted upfront. To integrate these changes, the spec of the current module, which starts with Symbols (that is, the BareSpec), might need to be augmented so that each symbol is accompanied by the corresponding Var. Otherwise, it might not be possible to combine the spec of the current module with the imported specs.
Stage 3
One usage site at a time, stop using the Symbol and start using the corresponding Var. Check that tests continue to pass after every modification. When all usage sites have been changed, stop serializing the Symbols when storing specs.
Stage 4
Remove redundant imports from tests. I have a few in the stitch-lh benchmark, and more in another code base (an experiment to integrate
Tangent: Recovering Vars might be unnecessary for imported specs
Vars are useful to check that a function has a spec compatible with its type, and they might be useful to reflect unfoldings. This is helpful for the spec of the current module, but not for the specs of imported modules, where the checks have already been done when verifying those modules. It is likely that LH is needlessly recovering Vars in these cases. Stopping this, however, will require keeping the imported specs separate from the specs of the current module.
At the moment they are all put together in a single soup called
I'd leave this for another issue, though. |
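
The difference between a qualified Symbol and a Name, as drawn in the analysis above, can be illustrated with a toy model. This is only a sketch: `ToyName`, `ToyVar`, `Defs`, `qualifiedSymbol`, `recoverVar` and the package and module names are hypothetical stand-ins, not the GHC or Liquid Haskell types. It assumes that a Name carries the package and module of origin, and shows that a module-qualified string cannot tell apart two modules with the same name, while a lookup by name needs no scope at all.

```haskell
import qualified Data.Map as Map

-- Toy stand-ins; these are NOT the GHC or Liquid Haskell types.
type Symbol = String

-- A toy "Name": records where an identifier was defined.
data ToyName = ToyName
  { namePackage :: String
  , nameModule  :: String
  , nameOcc     :: String
  } deriving (Eq, Ord, Show)

-- A toy "Var": a name paired with a (textual) type.
data ToyVar = ToyVar
  { varName :: ToyName
  , varType :: String
  } deriving (Eq, Show)

-- All known definitions, keyed by their unique name.
type Defs = Map.Map ToyName ToyVar

-- Qualifying with the module of origin drops the package.
qualifiedSymbol :: ToyName -> Symbol
qualifiedSymbol n = nameModule n ++ "." ++ nameOcc n

-- Recovering a Var from a Name does not depend on what is in scope.
recoverVar :: Defs -> ToyName -> Maybe ToyVar
recoverVar defs n = Map.lookup n defs

-- Two packages that both define Data.Container.size (hypothetical).
sizeA, sizeB :: ToyName
sizeA = ToyName "pkg-a" "Data.Container" "size"
sizeB = ToyName "pkg-b" "Data.Container" "size"

allDefs :: Defs
allDefs = Map.fromList
  [ (sizeA, ToyVar sizeA "Container a -> Int")
  , (sizeB, ToyVar sizeB "Bag a -> Int")
  ]
```

In this model `qualifiedSymbol sizeA` and `qualifiedSymbol sizeB` are the same string, while `recoverVar allDefs sizeB` still finds the intended definition.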
I like this plan to use One thing that's unclear to me though -- you write:
meaning, specifically, that the "disadvantages" 1, 2 (for fully qualifying |
That's my understanding. A Name is not strictly required to solve (1), but then LH has to imitate GHC to resolve the name, and having a Name should require less ceremony. |
This is definitely the way to go, because going from |
#2407 is a first iteration of the plan as shared above. It implements the first 3 stages for type constructors in refinement types. I'm hoping this work can be imitated without much innovation for the other features of LH that require names. #2411 repeats the work for names of data constructors in data type specifications. It also fixes #2092 mentioned below. I can think of: type aliases, expression aliases, names in expressions, asserted specs, assumed specs, reflections and rewrite rules, measures, expressions in predicates. Probably there are a few others that I'm missing now. |
Possibly related: #2092 |
Ten PRs later, perhaps it is time to give a bit of an update.
The plan has worked pretty well for the namespace of Haskell identifiers. Name resolution is now remembered for any annotation that refers to a Haskell name when specs are imported. And I could implement that piece-wise for each annotation.
Things are still in progress with the logic namespace. This is the namespace where predicates in refinement types grab names from. It contains measures, inline functions, reflected functions, and sometimes local variables.
My first try is to build an environment of logic names, and use it to resolve all the names that appear in the AST of refinement types. The resolved names are easily persisted and recovered when importing specs. However, liquid-fixpoint can only handle strings (symbols) and so I'm converting the resolved names to strings in a non-ambiguous (injective) transformation, before sending the queries to liquid-fixpoint.
But I'm still exploring the details. One recent lesson is that liquidhaskell assumes in quite a few places that logic names are rendered as symbols in exactly the same way as the Haskell names they correspond to, and I was not considering this so far. |
This is amazing - thanks @facundo! Yes I think there were all sorts of
implicit assumptions of the kind you mention - waiting for you to step on
like a mine!!!
- Ranjit.
…On Thu, Nov 28, 2024 at 12:02 PM Facundo Domínguez ***@***.***> wrote:
Ten PRs later, perhaps it is time to give a bit of an update.
The plan has worked pretty well for the namespace of Haskell identifiers.
Name resolution is now remembered for any annotation that refers to a
Haskell name when specs are imported. And I could implement that piece-wise
for each annotation.
Things are still in progress with the logic namespace. This is the
namespace where predicates in refinement types grab names from. It contains
measures, inline functions, reflected functions, and sometimes local
variables.
My first try is to build an environment of logic names, and use it to
resolve all the names that appear in the AST of refinement types. The
resolved names are easily persisted and recovered when importing specs.
However, liquid-fixpoint can only handle strings (symbols) and so I'm
converting the resolved names to strings in a non-ambiguous (injective)
transformation, before sending the queries to liquid-fixpoint.
But I'm still exploring the details. One recent lesson is that
liquidhaskell assumes in quite a few places that logic names are rendered
as symbols in exactly the same way as the Haskell names they correspond to,
and I was not considering this so far.
|
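
The update quoted above mentions converting resolved names to strings with an injective transformation before sending queries to liquid-fixpoint. The thread does not say how that transformation is defined, so the following is only one possible sketch: `LogicName`, `encodeName`, `decodeName`, `takeField` and `naiveEncode` are hypothetical names, and the length-prefix scheme is an assumption chosen because it is easy to see that it is injective (it has a decoder). Any constraints that liquid-fixpoint places on the characters of a symbol are not modeled here.

```haskell
import Data.Char (isDigit)

type Symbol = String

-- Hypothetical resolved logic name: module of origin plus identifier.
data LogicName = LogicName
  { lnModule :: String
  , lnOcc    :: String
  } deriving (Eq, Show)

-- Joining with a dot is not injective at the string level:
-- ("A.B", "c") and ("A", "B.c") both render as "A.B.c".
naiveEncode :: LogicName -> Symbol
naiveEncode (LogicName m o) = m ++ "." ++ o

-- Length-prefix each component, so no character inside a component
-- can be mistaken for a separator.
encodeName :: LogicName -> Symbol
encodeName (LogicName m o) = field m ++ field o
  where
    field s = show (length s) ++ "#" ++ s

-- A decoder for the encoding; its existence shows that
-- encodeName never maps two different names to the same string.
decodeName :: Symbol -> Maybe LogicName
decodeName sym = do
  (m, rest)  <- takeField sym
  (o, rest') <- takeField rest
  if null rest' then Just (LogicName m o) else Nothing

-- Read one "<length>#<text>" field and return it with the remainder.
takeField :: String -> Maybe (String, String)
takeField s = case span isDigit s of
  (ds@(_:_), '#':rest)
    | length field == n -> Just (field, rest')
    where
      n = read ds
      (field, rest') = splitAt n rest
  _ -> Nothing
```

For any `n`, `decodeName (encodeName n)` gives back `Just n`, which is what keeps two distinct resolved names from collapsing into the same symbol.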
At the moment resolution of all names is persisted except for type constructors used in qualifier sorts. In addition I'd like to revise type and expression aliases to use import aliases as the other names. Also, #2463 will add a new After these are handled, we could start retiring the old environments that LH is building to qualify names. I already managed to remove the |
#2303 adds a minimal test showing the problem.
Currently, names in specifications are resolved multiple times. Name resolution determines, for each name in a spec, which Haskell definition it is associated with. Type and data constructors exist both in the logic and in Haskell, and signatures with refinement types are also linked to Haskell definitions.
These names are resolved once when compiling the module containing the specs, and they are resolved again when compiling a module that transitively imports the module containing the specs. This is not ideal because of the duplicated work, but it is also problematic because the implementation doesn't ensure that names resolve in the same way on every attempt.
Resolving names requires looking up an identifier in an environment storing all the things that GHC knows to be in scope. When resolving names for transitive dependencies, a single environment is used to resolve the names of all of the transitively imported modules. In order to disambiguate names that could refer to one of many definitions in Haskell, the resolution algorithm considers whether the name has a qualifier, and what is the name in which the spec has been defined. But this strategy does not always yield the same resolution as the one obtained when compiling the imported module, which uses, understandably, a smaller environment to resolve the names.
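
The effect of resolving the same name against two different environments can be sketched with a toy model. Everything here is hypothetical: `Env`, `Resolution`, `resolve` and the module names `Lib.Set` and `Lib.Map` are not part of GHC or Liquid Haskell, and the sketch leaves out the disambiguation heuristics described above. It only shows why a lookup that succeeds in the small environment of the defining module can give a different answer in the larger environment of an importing module.

```haskell
import qualified Data.Map as Map

type Symbol = String
type ModuleName = String

-- Toy environment: for each unqualified symbol, the modules whose
-- definition of that symbol is in scope.
type Env = Map.Map Symbol [ModuleName]

data Resolution
  = Resolved ModuleName
  | Ambiguous [ModuleName]
  | NotInScope
  deriving (Eq, Show)

-- Resolve an unqualified symbol against an environment.
resolve :: Env -> Symbol -> Resolution
resolve env s = case Map.findWithDefault [] s env of
  []  -> NotInScope
  [m] -> Resolved m
  ms  -> Ambiguous ms

-- Environment used when compiling the module that contains the spec.
envDefining :: Env
envDefining = Map.fromList [("size", ["Lib.Set"])]

-- Larger environment used when compiling a module that imports it
-- and also has another definition of "size" in scope.
envImporter :: Env
envImporter =
  Map.unionWith (++) envDefining (Map.fromList [("size", ["Lib.Map"])])
```

Here `resolve envDefining "size"` yields a single definition, while `resolve envImporter "size"` reports an ambiguity for the very same spec.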
This issue is about making name resolution more predictable. The current workaround is to manually fully qualify names in specifications, although there are restrictions on doing this when it comes to data specifications.
Part of the solution would likely be storing specifications in interface files after their names have been fully qualified with the module of origin of the thing referred to by the name.
Another related concern is that variable names should also include the package of origin, in case multiple packages define modules with the same names.
There is work in progress in liquidhaskell and liquid-fixpoint.