[PoC] Associated types #18260

Draft
wants to merge 11 commits into
base: master

Member

@Girgias Girgias commented Apr 6, 2025

This is a proof of concept for associated types which are resolved at compile time.
Although these are not generics, they do solve the case of template types in interfaces, as generic types bound to an interface are isomorphic to associated types.

Implementation

Depends on:

The implementation is currently very crude and basic, but it demonstrates the concept nonetheless.
Currently, I am forcing the associated type to be invariant, as I haven't really thought about how sensible it would be to allow different types, or how to decide which one should be the bound one.

Having a type constraint on the associated type turned out to be relatively easy as well.

This basic PoC also does not support an interface extending another interface that has associated types, although I have an idea of how it should work.

Also no opcache/JIT support.

Benefits

This would bring somewhat better semantics for the new interfaces of my Container/Offset RFC

@withinboredom
Contributor

You are almost at full generics here :) ...

Invariant is a good default (and usually the default for any generics). You only need a proper constraint resolver (this is partly why I was working on type trees in #18189, which would let you resolve covariant/contravariant constraints very easily). I've been working on that for zend_type for the last couple of weeks -- which is far more complex. Potentially, between the two of us, we could enable something powerful here.

I don't know if your intent is to get to full generics from here, but this is pretty similar to a couple of other experiments I've done.

@Girgias
Member Author

Girgias commented Apr 6, 2025

I was not really planning on going full generics, as the main issue with them, from my understanding, is determining the type to be bound at runtime in a way that is not terrible for ergonomics and performance. Maybe @arnaud-lb could shed a bit more light.

I didn't even think of a constraint resolver, but a few other people mentioned it and I have an idea of how to implement it, so I will do that soon.

@Girgias force-pushed the associated-types branch from f72c2e6 to 2baed8a on April 7, 2025 03:05

@arnaud-lb
Member

Interesting!

In terms of functionality, this has some similarities with @nikic's "purely abstract" generics [1] as well as @derickr's Collections [2], in that we cannot parameterize types at the point of use, but types can extend/implement parameterized types.

One implication is that we cannot use a type-with-assoc-types in type declarations, because this is not allowed:

function f(I<T: Foo> $i) {}

and this is unsound if I has assoc types:

function g(I $i) {}

Therefore, currently this seems most useful in traits and abstract classes? Could you expand on the relation with the Container/Offset RFC?

Allowing assoc types in traits or abstract classes seems possible, but this increases complexity to a level comparable to [1], as assoc types on properties, method signatures, or method bodies would be handled at runtime (at least on abstract classes).

Allowing I<T: Foo> in type decls seems possible. Some things to consider would be how T is allowed to vary, and the fact that this increases the complexity of sub-type checking against a type-with-assoc-types. I think that variance should be specified in the type statement (and default to invariant) rather than determined implicitly, to avoid unintended variance changes. E.g. T is invariant here:

interface I {
    type T; // invariant by default
    function foo(T $t): T;
}

here as well:

interface J {
    type T; // invariant by default (covariant would be allowed)
    function foo(): T;
}

but covariant here:

interface K {
    type out T; // covariant
    function foo(): T;
}
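
As a point of comparison, current PHP already applies these variance rules to ordinary class types in method signatures, which is roughly what `out` would opt into for an associated type. The following is a minimal sketch in today's PHP (7.4+), not PoC syntax; `Animal`, `Dog`, `AnimalSource`, `DogSource`, `DogSink`, and `AnimalSink` are hypothetical names used only for illustration.

```php
<?php
declare(strict_types=1);

class Animal {}
class Dog extends Animal {}

interface AnimalSource
{
    public function next(): Animal;
}

final class DogSource implements AnimalSource
{
    // Covariant return: narrowing Animal to Dog is accepted by the engine.
    public function next(): Dog
    {
        return new Dog();
    }
}

interface DogSink
{
    public function accept(Dog $dog): string;
}

final class AnimalSink implements DogSink
{
    // Contravariant parameter: widening Dog to Animal is accepted.
    public function accept(Animal $animal): string
    {
        return get_class($animal);
    }
}

$source = new DogSource();
$sink = new AnimalSink();
echo $sink->accept($source->next()), "\n"; // prints "Dog"
```

A type that appears in both a parameter and a return position, as `T` does in interface `I` above, can be neither narrowed nor widened safely, which is why invariance is the only sound choice there.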

> I was not really planning on going full generics. As the main issue with them from my understanding is determining the type to be bound to at runtime in a way that is not terrible for ergonomics and performance. Maybe @arnaud-lb could shed a bit more light.

I confirm. There are some difficulties [3]:

  • Type inference is hard to achieve
  • Big-O complexity of type checking can be quadratic or worse when checking compound types against compound types
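
To illustrate the second point, here is a naive sketch of checking one union of class types against another. It is an assumption-laden toy, not engine code: `isUnionSubtype` is a hypothetical helper, unions are modelled as plain arrays of class names, and only class types are handled. Every member of the candidate union may have to be compared against every member of the target union, so the work grows with the product of the two sizes.

```php
<?php
declare(strict_types=1);

class Animal {}
class Dog extends Animal {}
class Cat extends Animal {}

/**
 * Returns true when every class in $sub is accepted by at least one
 * class or interface in $super. $comparisons counts the pairwise checks.
 *
 * @param list<class-string> $sub
 * @param list<class-string> $super
 */
function isUnionSubtype(array $sub, array $super, int &$comparisons = 0): bool
{
    foreach ($sub as $candidate) {
        $accepted = false;
        foreach ($super as $target) {
            $comparisons++;
            // is_a() with allow_string=true compares class names directly.
            if (is_a($candidate, $target, true)) {
                $accepted = true;
                break;
            }
        }
        if (!$accepted) {
            return false;
        }
    }
    return true;
}

$count = 0;
// Dog|Cat against Countable|Stringable|Animal: the match is last in the list.
var_dump(isUnionSubtype(
    [Dog::class, Cat::class],
    [Countable::class, Stringable::class, Animal::class],
    $count
)); // bool(true)
var_dump($count); // int(6)
```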

[1] PHPGenerics/php-generics-rfc#45
[2] https://wiki.php.net/rfc/collections
[3] arnaud-lb#4

@Girgias
Member Author

Girgias commented Apr 7, 2025

Could you explain the unsoundness argument a bit more? I am struggling to see it.

This experiment was mainly prompted by the discussion of allowing never as a parameter type (#18016), where the main motivation seems to be the ability to define an interface:

<?php
interface I {
    public function set(never $offset, never $value);
    public function get(never $offset): mixed;
}

With the intention that any implementation of said interface would specialize the types to be "sensible" e.g.

<?php
class ListOfAnimals implements I {
    public function set(int $offset, Animal $value) { /* ... */ }
    public function get(int $offset): Animal { /* ... */ }
}

The proposal to allow never as a parameter type seems to run into the same unsoundness issue you are describing (i.e. we cannot know statically if the whole call chain is valid).

However, an associated type, even without being able to specify it in a type declaration, gives you at least the small guarantee that different methods use the same types:

interface I {
    type K : int|string;
    type V : mixed;
    public function set(K $offset, V $value);
    public function get(K $offset): V;
}

class ListOfAnimals implements I {
    public function set(int $offset, Animal $value) { /* ... */ }
    public function get(int $offset): Animal { /* ... */ }
}
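
To make the "different methods use the same types" guarantee concrete, here is a minimal userland sketch in today's PHP (8.0+). It is not how the PoC works: the PoC resolves associated types at compile time, whereas this approximates the check at runtime with Reflection. The helper `usesOneKeyAndValueType` and the class `Inconsistent` are hypothetical, and the classes deliberately do not implement an interface, since current PHP would reject the narrowed parameter types.

```php
<?php
declare(strict_types=1);

class Animal {}

final class ListOfAnimals
{
    /** @var array<int, Animal> */
    private array $items = [];

    public function set(int $offset, Animal $value): void
    {
        $this->items[$offset] = $value;
    }

    public function get(int $offset): Animal
    {
        return $this->items[$offset];
    }
}

// Hypothetical counter-example: set() keys by int, get() keys by string.
final class Inconsistent
{
    public function set(int $offset, Animal $value): void {}

    public function get(string $offset): Animal
    {
        return new Animal();
    }
}

/** Returns true when set() and get() agree on one key type and one value type. */
function usesOneKeyAndValueType(string $class): bool
{
    $set = new ReflectionMethod($class, 'set');
    $get = new ReflectionMethod($class, 'get');

    [$setKey, $setValue] = $set->getParameters();
    [$getKey] = $get->getParameters();

    // Casting a ReflectionType to string yields the declared type name.
    return (string) $setKey->getType() === (string) $getKey->getType()
        && (string) $setValue->getType() === (string) $get->getReturnType();
}

var_dump(usesOneKeyAndValueType(ListOfAnimals::class)); // bool(true)
var_dump(usesOneKeyAndValueType(Inconsistent::class));  // bool(false)
```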

This is basically also how it ties in to the Container/Offset RFC, because instead of needing to use mixed everywhere:

<?php

interface DimensionReadable
{
    public function offsetGet(mixed $offset): mixed;
 
    public function offsetExists(mixed $offset): bool;
}
 
interface DimensionFetchable extends DimensionReadable
{
    public function &offsetFetch(mixed $offset): mixed;
}
 
interface DimensionWritable
{
    public function offsetSet(mixed $offset, mixed $value): void;
}
 
interface DimensionUnsetable
{
    public function offsetUnset(mixed $offset): void;
}
 
interface Appendable
{
    public function append(mixed $value): void;
}
 
interface FetchAppendable extends Appendable
{
    public function &fetchAppend(): mixed;
}

We could use a pair of associated types:

<?php

interface DimensionReadable
{
    type K;
    type V;
    public function offsetGet(K $offset): V;
 
    public function offsetExists(K $offset): bool;
}
 
interface DimensionFetchable extends DimensionReadable
{
    public function &offsetFetch(K $offset): V;
}
 
interface DimensionWritable
{
    type K;
    type V;
    public function offsetSet(K $offset, V $value): void;
}
 
interface DimensionUnsetable
{
    type K;
    public function offsetUnset(K $offset): void;
}
 
interface Appendable
{
    type V;
    public function append(V $value): void;
}
 
interface FetchAppendable extends Appendable
{
    public function &fetchAppend(): V;
}

Where ArrayAccess, to keep BC, resolves both K and V to mixed.
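
For reference, this is roughly what binding both K and V to mixed means for code written against ArrayAccess today: the signatures accept anything, so an implementation that wants narrower types has to enforce them by hand at runtime. A minimal sketch for PHP 8.0+, where `Animal` and `AnimalList` are hypothetical names:

```php
<?php
declare(strict_types=1);

class Animal {}

final class AnimalList implements ArrayAccess
{
    /** @var array<int, Animal> */
    private array $items = [];

    public function offsetExists(mixed $offset): bool
    {
        return is_int($offset) && isset($this->items[$offset]);
    }

    public function offsetGet(mixed $offset): mixed
    {
        return is_int($offset) ? ($this->items[$offset] ?? null) : null;
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        // The interface only promises mixed, so the narrowing is manual.
        if (!is_int($offset) || !$value instanceof Animal) {
            throw new TypeError('AnimalList expects an int offset and an Animal value');
        }
        $this->items[$offset] = $value;
    }

    public function offsetUnset(mixed $offset): void
    {
        if (is_int($offset)) {
            unset($this->items[$offset]);
        }
    }
}

$list = new AnimalList();
$list[0] = new Animal();
var_dump($list[0] instanceof Animal); // bool(true)
var_dump(isset($list['zero']));       // bool(false)
```

With associated types, the `is_int` and `instanceof` checks in `offsetSet` would presumably move into the method signature.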

My main concern with supporting traits is that I would hit the same issue, which I haven't resolved yet, that I hit when trying to resolve self at compile time to the name of the class it is used in. Doing so would, I think, remove a lot of type-checking complexity, as everything would just reuse the same typing infrastructure.

I will also say that for this feature to be fully fleshed out, it does need to support property hooks, which might or might not be a challenge.

@arnaud-lb
Member

> Could you explain the unsoundness argument a bit more? I am struggling to see it.

What I meant is that calling any method in I would be unsound, but now I get that it's intended / it's the purpose.

Thank you for the explanations.

@withinboredom
Contributor

> Type inference is hard to achieve

I have some ideas here. Here's one I'd probably tackle first as a proof-of-concept:

  1. tack a bool on zvals: isAffirmedType or something.
  2. on type checking: if the type of the zval matches the declared type, set isAffirmedType to true.

If the type is affirmed, then the type in the zval can be inferred, otherwise, it is an error.

function foo(SomeInterface $a) {
  new GenericArray($a); // type error: type cannot be inferred from SomeConcreteType
}

foo(new SomeConcreteType());

It's not ideal, but it is pretty straightforward to reason about as a user.
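
A userland simulation of the two steps above, to show the intended behaviour. This is only a sketch under assumptions: the real proposal would live on zvals in the engine, "matches the declared type" is read here as an exact class match, and `TrackedValue`, `checkType`, and `inferType` are hypothetical names.

```php
<?php
declare(strict_types=1);

interface SomeInterface {}
final class SomeConcreteType implements SomeInterface {}

// Stand-in for a zval that carries the proposed flag.
final class TrackedValue
{
    public bool $isAffirmedType = false;

    public function __construct(public object $value) {}
}

// Step 2: the normal type check, which also records whether the runtime
// class is exactly the declared type (assumed meaning of "matches").
function checkType(TrackedValue $tracked, string $declared): void
{
    if (!$tracked->value instanceof $declared) {
        throw new TypeError("Expected {$declared}");
    }
    $tracked->isAffirmedType = get_class($tracked->value) === $declared;
}

// Inference only succeeds for affirmed values; otherwise it is an error.
function inferType(TrackedValue $tracked): string
{
    if (!$tracked->isAffirmedType) {
        throw new TypeError('type cannot be inferred from ' . get_class($tracked->value));
    }
    return get_class($tracked->value);
}

$a = new TrackedValue(new SomeConcreteType());
checkType($a, SomeInterface::class);    // passes, but the type is not affirmed
var_dump($a->isAffirmedType);           // bool(false)

$b = new TrackedValue(new SomeConcreteType());
checkType($b, SomeConcreteType::class); // exact match, so the type is affirmed
var_dump(inferType($b));                // string(16) "SomeConcreteType"
```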

> Big-O complexity of type checking can be quadratic or worse when checking compound types against compound types

I'm working on this, but I lack a lot of practical knowledge of the engine -- but getting there, slowly but surely. Feel free to beat me to it.

3 participants