feat: Protocols #1292
base: master
This commit adds frontend support for two new primitives, `defprotocol` and `instance`.
`defprotocol` defines a new protocol, a collection of zero or more interfaces. Protocols implement a form of subtyping whereby one can assert that a given type implements a number of interfaces. Types that implement a protocol are called `instances` and are added using the `instance` primitive.
This commit does not yet add any logic for enforcing protocols, only the frontend code for adding them to a context.
Initially, we will only support protocols in the specification of interfaces. A protocol may be used in place of any type or type variable.
This commit finalizes an initial implementation of protocols by updating the type system to treat them as instances of constrained polymorphism (that is, like var types but more restrictive). Protocols are a form of type constraint whereby only types that implement a set of interfaces may be used in the place of a protocol.
Here's the gist of how they work:
To define a protocol, a user enters `(defprotocol <name> [interfaces...])`. `defprotocol` adds a fresh protocol to the type environment, inhabited by no types.
Users then register types with the protocol using `(instance <protocol> <type>)`. The type will only be registered if it appears in at least one position of the signatures of the implementations of each of the protocol's interfaces. That is, if the type lacks an existing implementation for any of the `<interfaces>`, it cannot be added to the protocol.
Users can then reference protocols in type signatures by prefixing their names with `!`. For example, `(sig foo (Fn [!Stringable] String))`.
When we type check protocols, we update the state of types based on the current state of the type environment (since a protocol's members may have changed). If the type being passed in the place of a protocol is a member of the protocol, it successfully type checks. Otherwise, it fails to type check.
Protocols can also be used for arbitrary "tagging" or "subtyping" over types by defining empty protocols. If a user wants to simply specify a type union, they can call `(defprotocol Empty)`, which will define a protocol that has no interfaces. Any type may be added to this protocol.
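The registration and membership rules in this commit message can be modelled in a few lines of Haskell. This is only a sketch under stated assumptions: the names `Protocol`, `Implementations`, `defProtocol`, `addInstance`, and `isMember` are hypothetical and are not the data types or functions used in the PR, and types and interfaces are reduced to plain strings.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified model; not the compiler's actual representation.
type TypeName = String
type InterfaceName = String

data Protocol = Protocol
  { protocolInterfaces :: [InterfaceName]    -- interfaces a member must implement
  , protocolMembers    :: Set.Set TypeName   -- types registered via `instance`
  }

-- For each interface, the set of types with an existing implementation.
type Implementations = Map.Map InterfaceName (Set.Set TypeName)

type Protocols = Map.Map String Protocol

-- Models (defprotocol <name> [interfaces...]): a fresh protocol with no members.
defProtocol :: String -> [InterfaceName] -> Protocols -> Protocols
defProtocol name ifaces = Map.insert name (Protocol ifaces Set.empty)

-- Models (instance <protocol> <type>): registration succeeds only if the type
-- already implements every interface of the protocol. An empty protocol
-- therefore accepts any type.
addInstance :: Implementations -> String -> TypeName -> Protocols -> Either String Protocols
addInstance impls pname ty protos =
  case Map.lookup pname protos of
    Nothing -> Left ("unknown protocol: " ++ pname)
    Just p
      | all implements (protocolInterfaces p) ->
          Right (Map.insert pname p { protocolMembers = Set.insert ty (protocolMembers p) } protos)
      | otherwise ->
          Left (ty ++ " does not implement every interface of " ++ pname)
  where
    implements iface = maybe False (Set.member ty) (Map.lookup iface impls)

-- Models the check for a type passed in a !Protocol position.
isMember :: String -> TypeName -> Protocols -> Bool
isMember pname ty protos =
  maybe False (Set.member ty . protocolMembers) (Map.lookup pname protos)
```

Because `all` over an empty interface list is `True`, the `(defprotocol Empty)` case falls out of the same rule without special handling.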
Very interesting PR! Some feedback: Being able to constrain parameters is really nice and important. This should be enabled for regular old interfaces too though, or you'd have to define an interface and a protocol each time you want to constrain a parameter of a function. Maybe this should be solved with a "typeclass" macro that does it all in one go? Having to think about whether to use an interface or a protocol seems like it's making the language harder to use, and a bit less elegant. One way around that is to have the user only interact with the Having to have two concepts could also be a sign that the current My other concern is regarding the syntax and semantics of constraining type variables. First, I didn't catch why the
Scott coming in with the hot features, I dig it 🏅 I’ll take a look at it over the weekend, I’m impressed by how little code this is! Similarly to Erik, I was asking myself whether for the unification algorithm, |
One additional thing: I think that constraining a single type variable to multiple interfaces/protocols is also an important feature. And if we have that, protocols are very similar to a shorthand for constraining the type variable to a set of interfaces. So that's something to consider for making this feature as simple as possible (conceptually).
Some great points about variable constraints. I was just thinking about this last night actually, and an alternative way to implement this is to do something like:
(defprotocol Stringable-Addable [a b] (str (Fn [a] String)) (add [a a] a))
(instance Stringable-Addable [Int] Int.str Int.+)
Which is kind of along the lines of what you mentioned, Erik, where the protocol and interfaces are kept separate while still allowing us greater control over polymorphism. One downside to this approach is that it kind of "reimplements" interfaces in some ways. One nice thing about the current approach is that you can just enforce subtyping based on pre-existing implementations of the interfaces you care about. Another downside to this approach is that we'd need to call special protocol functions instead of interfaces in function bodies, e.g.:
(sig foo (Fn [Stringable-Addable Stringable-Addable] (Stringable-Addable)))
(defn foo [x y] (Stringable-Addable.add))
So perhaps it really is best to align closer with Haskell for maximum control. At which point we'd instead have something like:
(defprotocol Stringable str)
(defprotocol Fooable)
(sig with-foo [(protocol Stringable Fooable a) ...] (Fn [a] String))
(defn with-foo [x] (concat (str x) "foo"))
One advantage to this approach is that it totally does away with the need for an
This is an unfortunate side-effect of the way
xobjToTy (XObj (Sym spath@(SymPath _ s@(firstLetter : rest)) _) _ _)
  | isLower firstLetter = Just (VarTy s)
  | firstLetter == '!' =
    if (not (null rest))
      then Just (ProtocolTy (SymPath [] rest) [])
      else Nothing
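To try the prefix dispatch from this snippet in isolation, here is a self-contained sketch. It is an assumption-laden simplification: `Ty` and `symbolToTy` are hypothetical stand-ins that work on plain strings rather than on the compiler's `XObj` and `SymPath` values, and the fallback to a struct type follows the PR description's note that a name without `!` is looked up as a struct.

```haskell
import Data.Char (isLower)

-- Hypothetical, simplified type representation (not the compiler's Ty).
data Ty
  = VarTy String       -- lowercase symbol: a type variable
  | ProtocolTy String  -- symbol prefixed with '!': a protocol reference
  | StructTy String    -- anything else: treated as a concrete (struct) type
  deriving (Eq, Show)

-- Classify a type symbol by its first character.
symbolToTy :: String -> Maybe Ty
symbolToTy "" = Nothing
symbolToTy s@(firstLetter : rest)
  | isLower firstLetter = Just (VarTy s)
  | firstLetter == '!' =
      -- A bare "!" names no protocol, so it is rejected.
      if null rest then Nothing else Just (ProtocolTy rest)
  | otherwise = Just (StructTy s)
```

Since the prefix is stripped before the protocol name is stored, a protocol and a struct can share a name without colliding.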
First of all this is awesome, I've been wanting this in the language for a while.
For what it's worth, I like this syntax:
(sig my-fun [(where a Stringable Fooable)] (Fn [a] String))
I agree with Erik that having both interfaces and protocols might be a bit confusing. After all, we could still keep the short names of the current interfaces by declaring generic functions:
(sig str [(where a Stringable)] (Fn [a] String))
(defn str [a] (Stringable.str a))
Nice, I like the Before I go any further, maybe we can all agree on the best implementation. It sounds like maybe we want to do the following:
(defprotocol (<Name> <params>)
(<fn-name> <signature>)
)
;; e.g.
(defprotocol (Monoid a)
(combine (Fn [a a] a))
(zero a)
)
(instance (Monoid Int)
(combine [x y] (+ x y))
(zero 0)
)
(sig plus-zero (where (Monoid a)) (Fn [a] a))
(defn plus-zero [x] (Monoid.combine x (Monoid.zero)))
Or with multiple forms:
(defprotocol (Foo a))
(sig plus-zero (where (Monoid a) (Foo a)) (Fn [a] a))
And supporting multi-parameter protocols should probably be a thing too:
(defprotocol (Natural a b)
(transform (Fn [a] b))
)
(sig transform-then-plus (where (Natural a b) (Monoid b)) (Fn [a b] b))
And we should also support parameterizing over higher-kinded types, as Haskell does:
(defprotocol (Functor f)
(fmap (Fn [(Fn [a] b) (f a)] (f b)))
)
(instance (Functor Maybe)
(fmap [f m] (match m (Nothing) (Nothing) (Just x) (Just (f x))))
)
(sig fmap-and-plus (where (Functor f) (Monoid b)) (Fn [(Fn [a] b) (f a)] b))
Sound good? This would be a pretty big breaking change, so we'll have to try to be certain we can provide those generics for all existing interfaces.
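For comparison, the `Monoid` and `Functor` parts of this proposal correspond closely to ordinary Haskell type classes. The sketch below is illustrative only: the names `MonoidC`, `FunctorC`, `fmapC`, and `plusZero` are made up here (the trailing `C` avoids clashing with the Prelude's own `Monoid` and `Functor`), and nothing in it reflects how Carp would implement the feature.

```haskell
-- (defprotocol (Monoid a) (combine (Fn [a a] a)) (zero a))
class MonoidC a where
  combine :: a -> a -> a
  zero :: a

-- (instance (Monoid Int) ...)
instance MonoidC Int where
  combine = (+)
  zero = 0

-- (sig plus-zero (where (Monoid a)) (Fn [a] a))
plusZero :: MonoidC a => a -> a
plusZero x = combine x zero

-- (defprotocol (Functor f) (fmap (Fn [(Fn [a] b) (f a)] (f b))))
-- The class parameter f is a type constructor, not a plain type.
class FunctorC f where
  fmapC :: (a -> b) -> f a -> f b

-- (instance (Functor Maybe) ...)
instance FunctorC Maybe where
  fmapC _ Nothing = Nothing
  fmapC f (Just x) = Just (f x)
```

The multi-parameter `(Natural a b)` form is left out because standard Haskell classes take a single parameter; GHC needs the `MultiParamTypeClasses` extension for that shape.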
@scolsen Looks great 👍 |
Hi! I usually hang out more in PureScript land than here, but Veit got my attention on this PR, so I'll leave a few thoughts (of course feel free to ignore all of this if not relevant at all 😊):
@f-f thanks a lot for the feedback! I think this might be more of a naming issue than anything else – maybe we should stick with the "interface" name to avoid the confusion (or switch to "type class"). |
@f-f Thanks for the pointers and checking this out! There are definitely some bits to mull over. I'll let you know where we end up! |
That's not a bad idea. We could just keep |
I won't get a chance to take a stab at this until Aug 27th or so, so if anyone wants to try implementing it before then, please do! |
Would this support composing abstractions the way Rust does?
trait A: B { // B is a supertrait of A: every type implementing A must also implement B
    fn method(arg: impl C); // arg must implement trait C, but can be absolutely whatever in terms of memory layout and other things
}
where a trait is roughly equivalent to a protocol.
Has the thinking around ad-hoc polymorphism in Carp changed since this PR or is it still relevant? |
Not in any significant way; we haven't touched this topic in a while. I have experimented with making the type system more closely resemble GHC's approach to polymorphism, kinds, and type classes (at least as it stood when "Typing Haskell in Haskell" was written). After that experiment I think I'd at least like to land somewhere in between the GHC approach and the approach we have in Carp currently. Not sure how others feel about it.
Sounds fascinating— thanks. |
This PR adds initial support for `Protocols`, a sort of lightweight form of typeclasses. Given a collection of interfaces, protocols allow users to enforce type restrictions around the implementation of those interfaces. Before we get into the boring minutiae, it might be easier to see an example of how they work.
To define a protocol, use `defprotocol`
Once we have a protocol defined, we can mark types as members of it using `instance`
Of course, if the type doesn't implement the required interfaces, it won't be added to the protocol:
We can reference protocols in types to limit the acceptable types in otherwise fully polymorphic positions. To refer to a protocol in a type signature, prefix it with `!`. For example:
Protocols leverage the same polymorphism resolution as our regular old variables, so nothing is resolved until we call this function.
Let's try it with a valid `Num`:
and what about a type that doesn't implement the protocol, even though it implements the interfaces?
oops. Lucky for us, `Double` already implements our interfaces:
Finally, protocols can also be used for arbitrary "type tagging" or type unions simply by omitting any interfaces. One can use this to constrain the polymorphism of functions where desired without requiring additional interface implementations:
Now that we know how it works, here are a couple of important implementation notes:
- `instance` call time. If an implementation of a required interface doesn't yet exist in the environment when `instance` is called, adding the type to the protocol will fail.
- `resolveProtocols` and `updateProtocols` (depending on whether or not they have access to a full context or just the type environment) in order to ensure they have the latest version of the protocol's memberships.
- `!`. They can share names with structs. When the `!` is absent, the type system will try to find a struct. Think of this as similar to Haskell's `=>` delimiter for type classes vs types.