[PoC] Limited Abstract Generics #18260

Girgias · 2025-04-06T04:37:01Z

This is a proof of concept for a limited abstract generic types feature set, as those can be, and are, resolved at compile/linking time.

Implementation

Depends on:

The implementation is relatively dumb, and partially based on arnaud-lb#4 for parser/AST/compile shenanigans.

The generic types (name and constraint) are stored on the CE in a new generic_parameters list field.

The bound types are also stored on the CE as a HashTable:

lc_interface_name: HashTable<int|string, zend_type>
    int: positionally bound type
    string: positional bound type associated to its template name
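As a rough illustration of the dual keying described above, the following C sketch makes each bound type reachable both by its position and by its template name. It deliberately avoids the engine's HashTable and zend_type: `bound_type_table`, `bind_type`, `bound_type_at`, and `bound_type_named` are hypothetical names, and a bound type is represented as a plain string. It is a sketch under those assumptions, not the PR's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GENERIC_PARAMS 8

/* Hypothetical stand-in for one bound generic parameter. */
typedef struct {
    const char *param_name; /* template name, e.g. "K" */
    const char *bound_type; /* bound type rendered as a string, e.g. "int" */
} bound_type_entry;

/* Hypothetical stand-in for the per-interface table of bound types. */
typedef struct {
    bound_type_entry entries[MAX_GENERIC_PARAMS];
    size_t count;
} bound_type_table;

/* Bind the next parameter; its index in the table is its position. */
static void bind_type(bound_type_table *t, const char *param_name,
                      const char *bound_type)
{
    assert(t->count < MAX_GENERIC_PARAMS);
    t->entries[t->count].param_name = param_name;
    t->entries[t->count].bound_type = bound_type;
    t->count++;
}

/* Integer key: look up by position. Returns NULL if out of range. */
static const char *bound_type_at(const bound_type_table *t, size_t position)
{
    return position < t->count ? t->entries[position].bound_type : NULL;
}

/* String key: look up by template name. Returns NULL if unknown. */
static const char *bound_type_named(const bound_type_table *t,
                                    const char *param_name)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].param_name, param_name) == 0) {
            return t->entries[i].bound_type;
        }
    }
    return NULL;
}
```

Because the table hangs off the interface rather than off an object, every implementer of the same bound interface shares one set of entries, which is the limitation described next.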

This means that this implementation cannot be extended to support concrete generics (i.e. generics on a concrete instantiable class), as those need to be tied to the instance of the CE, not the CE itself.

The generic types are subject to the following restrictions:

They are invariant
They cannot be part of a composite type
They are only allowed on interfaces
There is no support for type declarations

It is possible to extend an interface with generic types, so that a sub-interface can reuse the same generic parameter.
If one of the generic parameters of the interface being extended has a type constraint, this type constraint must be repeated on the child interface, because the type constraints for interfaces are checked when extending.

ToDos

opcache support (currently leaks memory)
Fix some type binding issues for implicit and explicitly implemented interfaces
Properly support this for internal interfaces

Benefits

The lack of type declarations can make this unsound, in that a generic type T of an interface I<T : C> is no better than its type constraint C (which is mixed by default).
However, it does "solve" the primary need behind wanting never to be usable for parameter types: being able to specify the actual type on the concrete implementation, and thus have engine type checking. This is currently prevented by LSP variance rules.

One use case would be to use generic types K, V on the new interfaces of my Container/Offset RFC instead of mixed.

Future scopes

Lifting the restriction on generic parameters needing to be used standalone (i.e. making T|null valid)
Add support for abstract classes
Add support for traits
Add type declarations
Allow variance of generic parameters
Optional generic parameters

withinboredom · 2025-04-06T08:41:12Z

You are almost to full generics here :) ...

Invariant is a good default (and usually the default for any generics). You only need a proper constraint resolver (this is partly why I was working on type trees in #18189, which would let you resolve covariant/contravariant constraints very easily). I've been working on that for zend_type for the last couple of weeks, which is far more complex. Potentially, between the two of us, we could enable something powerful here.

I don't know if your intent is to get to full generics from here, but this is pretty similar to a couple of other experiments I've done.

Girgias · 2025-04-06T14:31:48Z

I was not really planning on going full generics, as the main issue with them, from my understanding, is determining the type to be bound at runtime in a way that is not terrible for ergonomics and performance. Maybe @arnaud-lb could shed a bit more light.

I didn't even think of a constraint resolver, but a few other people mentioned it and I have an idea of how to implement it, so I will do that soon.

arnaud-lb · 2025-04-07T13:18:13Z

Interesting!

In terms of functionality, this has some similarities with @nikic's "purely abstract" generics [1] as well as @derickr's Collections [2], in that we cannot parameterize types at the point of use, but types can extend/implement parameterized types.

One implication is that we can not use a type-with-assoc-types in type declarations, because this is not allowed:

function f(I<T: Foo> $i) {}

and this is unsound if I has assoc types:

function g(I $i) {}

Therefore, currently this seems most useful in traits and abstract classes? Could you expand on the relation with the Container/Offset RFC?

Allowing assoc types in traits or abstract classes seems possible, but this increases complexity to a level comparable to [1], as assoc types on properties, method signatures, or method bodies would be handled at runtime (at least on abstract classes).

Allowing I<T: Foo> in type decls seems possible. Some things to consider would be how T is allowed to vary, and the fact that this increases the complexity of sub-type checking against a type-with-assoc-types. I think that variance should be specified in the type statement (and default to invariant) rather than determined implicitly, to avoid unintended variance changes. E.g. T is invariant here:

interface I {
    type T; // invariant by default
    function foo(T): T;
}

here as well:

interface J {
    type T; // invariant by default (covariant would be allowed)
    function foo(): T;
}

but covariant here:

interface K {
    type out T; // covariant
    function foo(): T;
}

> I was not really planning on going full generics, as the main issue with them, from my understanding, is determining the type to be bound at runtime in a way that is not terrible for ergonomics and performance. Maybe @arnaud-lb could shed a bit more light.

I confirm. There are some difficulties [3]:

Type inference is hard to achieve
Big-O complexity of type checking can be quadratic or worse when checking compound types against compound types
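To make the quadratic cost concrete, here is a minimal C sketch of a naive union-against-union subtype check. Types are modelled as arrays of atomic type names, and "subtype" is simplified to name equality; both are assumptions made for illustration, and `union_type`, `atomic_is_subtype`, and `union_is_subtype` are hypothetical names rather than engine APIs. Every member of the candidate union may be compared against every member of the target union, so the work grows with the product of the two sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a union type: a list of atomic type names. */
typedef struct {
    const char *const *members;
    size_t count;
} union_type;

/* Number of atomic comparisons performed by the last check. */
static size_t comparisons;

/* Simplification: an atomic type is only a subtype of itself. */
static bool atomic_is_subtype(const char *a, const char *b)
{
    comparisons++;
    return strcmp(a, b) == 0;
}

/* A union is a subtype of another union if every member on the left
 * matches some member on the right: a nested loop, O(n * m). */
static bool union_is_subtype(const union_type *sub, const union_type *super)
{
    assert(sub != NULL && super != NULL);
    comparisons = 0;
    for (size_t i = 0; i < sub->count; i++) {
        bool found = false;
        for (size_t j = 0; j < super->count && !found; j++) {
            found = atomic_is_subtype(sub->members[i], super->members[j]);
        }
        if (!found) {
            return false;
        }
    }
    return true;
}
```

With real class types each atomic comparison is itself an inheritance lookup, and intersection or nested types add further levels, which is where the "or worse" comes from.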

[1] PHPGenerics/php-generics-rfc#45
[2] https://wiki.php.net/rfc/collections
[3] arnaud-lb#4

Girgias · 2025-04-07T15:19:25Z

Could you explain the unsoundness argument a bit more? I am struggling to see it.

This experiment was mainly prompted by the discussion of allowing never as a parameter type (#18016), where the main motivation seems to be being able to define an interface:

<?php
interface I {
    public function set(never $offset, never $value);
    public function get(never $offset): mixed;
}

With the intention that any implementation of said interface would specialize the types to be "sensible" e.g.

<?php
class ListOfAnimals implements I {
    public function set(int $offset, Animal $value) { /* ... */ }
    public function get(int $offset): Animal { /* ... */ }
}

The proposal to allow never as a parameter type seems to run into the same unsoundness issue you are describing (i.e. we cannot know statically if the whole call chain is valid).

However, an associated type, even without being able to specify it in a type declaration, gives you at least the small guarantee that different methods use the same types:

interface I {
    type K : int|string;
    type V : mixed;
    public function set(K $offset, V $value);
    public function get(K $offset): V;
}

class ListOfAnimals implements I {
    public function set(int $offset, Animal $value) { /* ... */ }
    public function get(int $offset): Animal { /* ... */ }
}

This is basically also how it ties in to the Container/Offset RFC, because instead of needing to use mixed everywhere:

<?php

interface DimensionReadable
{
    public function offsetGet(mixed $offset): mixed;
 
    public function offsetExists(mixed $offset): bool;
}
 
interface DimensionFetchable extends DimensionReadable
{
    public function &offsetFetch(mixed $offset): mixed;
}
 
interface DimensionWritable
{
    public function offsetSet(mixed $offset, mixed $value): void;
}
 
interface DimensionUnsetable
{
    public function offsetUnset(mixed $offset): void;
}
 
interface Appendable
{
    public function append(mixed $value): void;
}
 
interface FetchAppendable extends Appendable
{
    public function &fetchAppend(): mixed;
}

We could use a pair of associated types:

<?php

interface DimensionReadable
{
    type K;
    type V;
    public function offsetGet(K $offset): V;
 
    public function offsetExists(K $offset): bool;
}
 
interface DimensionFetchable extends DimensionReadable
{
    public function &offsetFetch(K $offset): V;
}
 
interface DimensionWritable
{
    type K;
    type V;
    public function offsetSet(K $offset, V $value): void;
}
 
interface DimensionUnsetable
{
    type K;
    public function offsetUnset(K $offset): void;
}
 
interface Appendable
{
    type V;
    public function append(V $value): void;
}
 
interface FetchAppendable extends Appendable
{
    public function &fetchAppend(): V;
}

Where ArrayAccess, to keep BC, resolves both K and V to mixed.

My main concern with supporting traits is that I would hit the same issue, which I haven't resolved yet, as when trying to resolve self, at compile time, to the name of the class it is used in. Doing so would, I think, remove a lot of type checking complexity, as everything would just reuse the same typing infrastructure.

I will also say that for this feature to be fully fleshed out, it does need to support property hooks, which might or might not be a challenge.

arnaud-lb · 2025-04-07T16:10:31Z

> Could you explain the unsoundness argument a bit more? I am struggling to see it.

What I meant is that calling any method in I would be unsound, but now I get that it's intended / it's the purpose.

Thank you for the explanations.

withinboredom · 2025-04-07T19:13:58Z

> Type inference is hard to achieve

I have some ideas here. Here's one I'd probably tackle first as a proof-of-concept:

tack a bool on zvals: isAffirmedType or something.
on type checking: if the type of the zval matches the declared type, set isAffirmedType to true.

If the type is affirmed, then the type in the zval can be inferred, otherwise, it is an error.

function foo(SomeInterface $a) {
  new GenericArray($a); // type error: type cannot be inferred from SomeConcreteType
}

foo(new SomeConcreteType());

It's not ideal, but it is pretty straightforward to reason about as a user.
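One possible reading of this idea, sketched in C. This is only an illustration: `value` is a hypothetical stand-in for a zval that carries its runtime type as a string, `on_type_check` and `infer_generic_argument` are invented names, and clearing the flag on a non-exact match is an assumption of the sketch. Whether the value is accepted by the declared type at all (the normal subtype check) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a zval with the proposed extra flag. */
typedef struct {
    const char *runtime_type; /* e.g. "SomeConcreteType" */
    bool is_affirmed_type;    /* the proposed isAffirmedType bool */
} value;

/* Called when a value passes a declared-type check. The flag is set
 * only when the runtime type is exactly the declared type. */
static void on_type_check(value *v, const char *declared_type)
{
    assert(v != NULL && declared_type != NULL);
    v->is_affirmed_type = strcmp(v->runtime_type, declared_type) == 0;
}

/* Returns the type to bind a generic argument to, or NULL when the
 * type cannot be inferred (which would surface as a type error). */
static const char *infer_generic_argument(const value *v)
{
    return v->is_affirmed_type ? v->runtime_type : NULL;
}
```

Under this reading, passing a SomeConcreteType value through a SomeInterface parameter leaves the flag unset, so the inference in the example above fails, while a value whose runtime type equals the declared type can be inferred.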

> Big-O complexity of type checking can be quadratic or worse when checking compound types against compound types

I'm working on this, but I lack a lot of practical knowledge of the engine -- but getting there, slowly but surely. Feel free to beat me to it.

arnaud-lb · 2025-04-21T20:07:13Z

@withinboredom this is an interesting idea, as it makes inference work when the runtime and static types match. Unfortunately, I think it's unsound because calling foo() with a type accepted by its signature is an error.


morrisonlevi · 2025-05-08T14:58:29Z

Zend/tests/type_declarations/abstract_generics/big_example.phpt

+--EXPECTF--
+Fatal error: Generic type cannot be part of a union type in %s on line %d


When I saw my example in here, I got excited that you added basic union support. Nope! 😆

One day 🙏🏻

Girgias · 2025-05-12

As I said to Bob, I really want to keep it as small as possible, as it's already hurting my brain a bit! But this should be a rather easy limitation to lift :)


arnaud-lb · 2025-05-14T09:55:31Z

Zend/zend_inheritance.c

-		zend_class_entry *fe_scope, const zend_type fe_type,
+		zend_class_entry *fe_scope, const zend_type *fe_type_ptr,


What is the reason for passing all types as pointers?

Girgias · 2025-05-19

At the beginning, when using the Associated Type syntax, I was storing a pointer in the bound_types field, as the type would have been stored in the arg_info of the first bound function. This doesn't really apply any more. I will revert the first commit.


github-actions bot added the Category: Engine, Extension: opcache, Extension: reflection, Extension: tokenizer, Category: Optimizer, and ABI break labels Apr 6, 2025

Girgias force-pushed the associated-types branch from f72c2e6 to 2baed8a on April 7, 2025 03:05

Girgias force-pushed the associated-types branch from 2baed8a to cc039c9 on April 7, 2025 20:19

Girgias mentioned this pull request Apr 8, 2025: Zend: Use pointer to zend_type for variance checks #18257 (Open)

DanielEScherzer reviewed Apr 22, 2025 (Zend/zend_ast.c, outdated)

Girgias force-pushed the associated-types branch from cc039c9 to fd86e50 on April 28, 2025 13:04

Girgias force-pushed the associated-types branch 2 times, most recently from 14b6bb7 to 4b8cb7e on May 8, 2025 11:19

Girgias changed the title from "[PoC] Associated types" to "[PoC] Limited Abstract Generics" May 8, 2025

morrisonlevi reviewed May 8, 2025

Girgias force-pushed the associated-types branch from 4b8cb7e to 52e85af on May 10, 2025 17:19

php deleted a comment May 13, 2025

Girgias force-pushed the associated-types branch 2 times, most recently from c5cedbd to 3a8b3d2 on May 14, 2025 07:42

github-actions bot added the Extension: zend_test label May 14, 2025

Girgias force-pushed the associated-types branch 2 times, most recently from 3485776 to 1679e6d on May 14, 2025 08:39

arnaud-lb reviewed May 14, 2025

Girgias added 29 commits May 22, 2025 14:23

Improve AT constraint indication when dumping type to string

b1a3546

Fix various unhandled cases where AT is part of a union type

363f817

[skip ci] Get explicit notation working

2198aaf

Have a HT null access somewhere...

Remove most of associated types

be7efd5

rename

87f9544

Did I finally fix generic type binding?

56f2f1c

fix stuff

2d83b54

move stuff

51b689e

name of variable

cad9308

Rewrite generic type binding for inherited interfaces

5e076d0

Do some fixes about implicit and explicit interface implementations

beabf2b

Fix namespace resolution for interfaces

36a4736

Use safe_pemalloc for generic params alloc

7fbdc0a

Fix AST and name resolution

6cdc4f5

Extra fixes

7b38d21

Rename last associated macro stuff to generic

20e21ab

Add bound resolved type to string for error messages

92a45c7

Fix typo in variable name

d1bae38

Update wording for error message

518d7ea

Add failing test

9d17cae

Add some const qualifiers

00372bd

Change to static storage

02695e0

rename interfaces

6fd8860

Fix redeclaration of method, and other issues

ad1ae05

Add an index field to zend_type

fba8f76

Stop relying on the generic type name to find bound type

106c3d2

Use the index that is now part of the zend_type. This should reduce memory consumption, as we are not doing a second deep copy of the types. Fixes one of the implicit interface tests.

Refactor bind_generic_types_for_inherited_interfaces

439458c

Zend: Inherit interfaces early

229fcc9

Fix some implicit bound type issues

891f24d

Girgias force-pushed the associated-types branch from 03d7d81 to 891f24d on May 22, 2025 13:47
