erictraut/cpython

Runtime representation of generic classes

JelleZijlstra opened this issue · 2 comments

My prototype doesn't yet make generic classes actually subscriptable. The PEP says that "There is no need to include Generic as a base class. Its inclusion as a base class is implied by the presence of type parameters, and it will automatically be included in the __mro__ and __orig_bases__ attributes for the class."
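For reference, the behavior the PEP asks for can already be observed with the explicit spelling. The following is a minimal sketch, not taken from the prototype; the class names Box and Plain are illustrative. A class that lists Generic[T] as a base ends up with Generic in __mro__, keeps the subscripted form in __orig_bases__, and becomes subscriptable, while a plain class does not.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):  # explicit spelling of what `class Box[T]` should imply
    pass


class Plain:  # no type parameters, so no Generic base
    pass


mro = Box.__mro__                # Generic appears between Box and object
orig_bases = Box.__orig_bases__  # the subscripted base is preserved here
alias = Box[int]                 # works because Generic supplies __class_getitem__

try:
    Plain[int]
    plain_subscriptable = True
except TypeError:
    # Without a Generic base, the class cannot be subscripted.
    plain_subscriptable = False
```

The new syntax would need to produce the same three observable results without Generic appearing in the source.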

My thinking for how to implement this:

  • Implement Generic in C.
  • Add a new CALL_INTRINSIC opcode in the interpreter that takes a tuple of type params and returns Generic[T, U, ...].
  • In the compiler, add instructions to call this intrinsic and add it to the bases of a generic class before the call to __build_class__.
  • To set __type_params__ (name from #10 (comment)), inject code into the class body that sets the value in the class namespace.
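The steps above can be approximated in pure Python. This is only a rough sketch of the intended effect, not the compiler change itself: subscript_generic is a hypothetical stand-in for the proposed intrinsic, class_body plays the role of the code injected into the class body, and types.new_class stands in for the __build_class__ call.

```python
import types
from typing import Generic, TypeVar


def subscript_generic(type_params):
    """Hypothetical stand-in for the intrinsic: (T, U, ...) -> Generic[T, U, ...]."""
    # Subscripting with a tuple is the same as writing Generic[T, U, ...].
    return Generic[type_params]


T = TypeVar("T")
U = TypeVar("U")
type_params = (T, U)


def class_body(ns):
    # Mirrors the injected code that stores the type params in the namespace.
    ns["__type_params__"] = type_params


# The compiler would append the intrinsic's result to the bases before the
# class is built; types.new_class resolves it to Generic and records the
# original subscripted base in __orig_bases__.
Pair = types.new_class(
    "Pair",
    (subscript_generic(type_params),),
    exec_body=class_body,
)
```

With this arrangement Pair[int, str] is subscriptable, and Generic shows up in both Pair.__mro__ and Pair.__orig_bases__ even though the "source" of the class never names it.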

That's effectively the same technique I used in my earlier prototype, except that I didn't have Generic in C, and I didn't use an intrinsic.

Here's the code from my prototype, in case it's of use to you:

/* Generates a Generic.__class_getitem__ call for type parameters */
static int
compiler_generate_generic_base_class(struct compiler *c,
                                     asdl_typeparam_seq *typeparams)
{
    PySTEntryObject *ste = c->u->u_ste;
    Py_ssize_t i, n, n_active_params;

    n = asdl_seq_LEN(typeparams);
    n_active_params = ste->ste_active_typeparam_count;

    // For now, simulate an import of Generic from the typing module.
    // In the future, we may want to implement this type to avoid the import.
    _Py_DECLARE_STR(generic, "Generic");
    if (!compiler_generate_typing_import(c, &_Py_STR(generic))) {
        return 0;
    }

    ADDOP_NAME(c, LOAD_METHOD, &_Py_ID(__class_getitem__), names);

    for (i = 0; i < n; i++) {
        if (!compiler_load_typevar(c, n_active_params - n + i)) {
            return 0;
        }

        // If this is a TypeVarTuple, unpack it.
        if (typeparams->typed_elements[i]->kind == TypeVarTuple_kind) {
            ADDOP_I(c, UNPACK_SEQUENCE, 1);
        }
    }
    
    ADDOP_I(c, BUILD_TUPLE, n);

    ADDOP_I(c, CALL, 1);
    return 1;
}

I ended up doing something similar, except that I went through a new intrinsic instead of emitting the bytecode directly.