Skip to content

Conversation

@ppolesiuk
Copy link
Member

This PR introduces finite maps implemented as 2-3-trees, where the balance invariant is enforced by types (using nested datatypes approach). While this is not the best implementation in terms of efficiency, its correctness is relatively easy to verify and maintain. The implementation can be seen as a temporary solution, and as a guideline for the interface of a better implementation that will be provided in the future.

This commit introduces finite maps implemented as 2-3-trees, where the
balance invariant is enforced by types (using nested datatypes
approach). While this is not the best implementation in terms of
efficiency, its correctness is relatively easy to verify and maintain.
The implementation can be seen as a temporary solution, and as a
guideline for the interface of a better implementation that will be
provided in the future.
@ppolesiuk ppolesiuk requested a review from Copilot November 21, 2025 07:52
Copilot finished reviewing on behalf of ppolesiuk November 21, 2025 07:56
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a finite map (dictionary) data structure for the standard library, implemented using 2-3 balanced trees with type-level height invariants. The implementation uses a functor pattern where Map.make creates specialized map operations for a given key type with comparison methods. While noted as a temporary solution, it provides a complete API with logarithmic complexity for core operations.

Key Changes

  • Implements 2-3 trees using nested datatypes to enforce balance invariants at compile time
  • Provides a comprehensive map API including insert, remove, lookup, fold, map, and iteration operations
  • Includes randomized testing to verify correctness of operations and size tracking

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
lib/Map.fram Core implementation of finite maps with 2-3 tree data structure, public API signature, and all map operations
test/stdlib/stdlib0008_Map.fram Test suite with basic assertions and randomized property-based tests for map operations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ppolesiuk ppolesiuk requested a review from Copilot November 21, 2025 08:31
Copilot finished reviewing on behalf of ppolesiuk November 21, 2025 08:35
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Collaborator

@wojpok wojpok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks fine and could be merged to the master as is. I haven't noticed anything wrong. Once I realized that the smart constructors implement arithmetic on 2-3 counting system, reading became a smooth sailing. I have mentioned some nitpicks, but they can be ignored.

order of keys. ##}
, method iter :
{Val, E} -> T Val -> ({key:Key} -> Val ->[E] Unit) ->[E] Unit
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a show method defined for Maps. It may be empty like the one for Streams, but at least it would enable showing of nested structures

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think, the show method should print just #Map? Or, should it try to print the bindings in the map?

Copy link
Collaborator

@wojpok wojpok Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have decided no to implement proper show method for Streams because it may not terminate in REPL. Since Maps are finite, it would make sense to print all bindings. Maybe we should use Haskell notation and print something similiar to Map.fromList [(key, value), ....]? I am fine with either honestly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With printing all bindings in Haskell style I see two problems. First, I didn't implement fromList function. Second, maps are implemented as a functor, so the name of the module is not know for the generic implementation. Maybe something like #Map.fromList [(key, value), ...]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, show method and fromList function are kind of "nice to have", but not the part of the core functionality. I can implement them as a separate issue, but I prefer to merge this PR as it is. When implementing a standard library, there is always missing functionality.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. We can take care of it later

the result of addition to a tree represented by `Repr`. #}

data AddResult Key Val Repr =
| AR_Ok of Repr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is convention to use "PascalCase" without underscores, but without a doubt this naming scheme is very readable. I like it this way

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a discussion on this topic in fram-lang/fram#2 (comment). As the prefixes to datatype constructors are common, I think the good convention is to use underscore to separate prefixes built from multiple capital letters, and for other prefixes use "PascalCase".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of that discussion

finally c => c size
end

#==============================================================================
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are different line types intentional or an accident?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. Different line types mimic section/subsection distinction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, now it makes sense

lib/Map.fram Outdated
data H1 = H1
data HS H = HS of H

{# Handler for resize effect, used to keep track of the size of the map. #}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition is straight to the point, but feels intimidating at first. I think this comment should mention it is a state effect explicitely.

| Node2 ml k v mr =>
RR_Ok (Node2 (Node3 l k1 v1 ml k v mr) k2 v2 r)
| Node3 ml kl vl mm kr vr mr =>
RR_Ok (Node3 (Node2 l k1 v1 ml) kl vl (Node2 mm kr vr mr) k2 v2 r)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k1 and kl from afar looks the same. It may be worth to use numbers instead of l suffix

Copy link
Member Author

@ppolesiuk ppolesiuk Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ignore this nitpick? I would like to have number consecutive to easily verify if the order in the tree is preserved. The problems begin when we have to match on a subtree, like in this case. So on the lower level, I use l and r suffixes, because they do not interfere with numbers. I agree, that for some fonts 1 and l looks almost the same. If this is a problem — change the font!

@forell forell merged commit 1900074 into master Nov 26, 2025
2 checks passed
@forell forell deleted the maps branch November 26, 2025 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants