-
Notifications
You must be signed in to change notification settings - Fork 25
Finite maps #279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Finite maps #279
Conversation
This commit introduces finite maps implemented as 2-3-trees, where the balance invariant is enforced by types (using nested datatypes approach). While this is not the best implementation in terms of efficiency, its correctness is relatively easy to verify and maintain. The implementation can be seen as a temporary solution, and as a guideline for the interface of a better implementation that will be provided in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a finite map (dictionary) data structure for the standard library, implemented using 2-3 balanced trees with type-level height invariants. The implementation uses a functor pattern where Map.make creates specialized map operations for a given key type with comparison methods. While noted as a temporary solution, it provides a complete API with logarithmic complexity for core operations.
Key Changes
- Implements 2-3 trees using nested datatypes to enforce balance invariants at compile time
- Provides a comprehensive map API including insert, remove, lookup, fold, map, and iteration operations
- Includes randomized testing to verify correctness of operations and size tracking
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| lib/Map.fram | Core implementation of finite maps with 2-3 tree data structure, public API signature, and all map operations |
| test/stdlib/stdlib0008_Map.fram | Test suite with basic assertions and randomized property-based tests for map operations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
wojpok
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code looks fine and could be merged to the master as is. I haven't noticed anything wrong. Once I realized that the smart constructors implement arithmetic on 2-3 counting system, reading became a smooth sailing. I have mentioned some nitpicks, but they can be ignored.
| order of keys. ##} | ||
| , method iter : | ||
| {Val, E} -> T Val -> ({key:Key} -> Val ->[E] Unit) ->[E] Unit | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a show method defined for Maps. It may be empty like the one for Streams, but at least it would enable showing of nested structures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think, the show method should print just #Map? Or, should it try to print the bindings in the map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have decided no to implement proper show method for Streams because it may not terminate in REPL. Since Maps are finite, it would make sense to print all bindings. Maybe we should use Haskell notation and print something similiar to Map.fromList [(key, value), ....]? I am fine with either honestly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With printing all bindings in Haskell style I see two problems. First, I didn't implement fromList function. Second, maps are implemented as a functor, so the name of the module is not know for the generic implementation. Maybe something like #Map.fromList [(key, value), ...]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, show method and fromList function are kind of "nice to have", but not the part of the core functionality. I can implement them as a separate issue, but I prefer to merge this PR as it is. When implementing a standard library, there is always missing functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. We can take care of it later
| the result of addition to a tree represented by `Repr`. #} | ||
|
|
||
| data AddResult Key Val Repr = | ||
| | AR_Ok of Repr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is convention to use "PascalCase" without underscores, but without a doubt this naming scheme is very readable. I like it this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a discussion on this topic in fram-lang/fram#2 (comment). As the prefixes to datatype constructors are common, I think the good convention is to use underscore to separate prefixes built from multiple capital letters, and for other prefixes use "PascalCase".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't aware of that discussion
| finally c => c size | ||
| end | ||
|
|
||
| #============================================================================== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are different line types intentional or an accident?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional. Different line types mimic section/subsection distinction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, now it makes sense
lib/Map.fram
Outdated
| data H1 = H1 | ||
| data HS H = HS of H | ||
|
|
||
| {# Handler for resize effect, used to keep track of the size of the map. #} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definition is straight to the point, but feels intimidating at first. I think this comment should mention it is a state effect explicitely.
| | Node2 ml k v mr => | ||
| RR_Ok (Node2 (Node3 l k1 v1 ml k v mr) k2 v2 r) | ||
| | Node3 ml kl vl mm kr vr mr => | ||
| RR_Ok (Node3 (Node2 l k1 v1 ml) kl vl (Node2 mm kr vr mr) k2 v2 r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k1 and kl from afar looks the same. It may be worth to use numbers instead of l suffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ignore this nitpick? I would like to have number consecutive to easily verify if the order in the tree is preserved. The problems begin when we have to match on a subtree, like in this case. So on the lower level, I use l and r suffixes, because they do not interfere with numbers. I agree, that for some fonts 1 and l looks almost the same. If this is a problem — change the font!
This PR introduces finite maps implemented as 2-3-trees, where the balance invariant is enforced by types (using nested datatypes approach). While this is not the best implementation in terms of efficiency, its correctness is relatively easy to verify and maintain. The implementation can be seen as a temporary solution, and as a guideline for the interface of a better implementation that will be provided in the future.