What is a virtual subclass in Python and what are its advantages?

Question:

In Python we can implement an abstract class with the abc module, and one of the ways is to have the class inherit from abc.ABC:

from abc import ABC, abstractmethod

class AbstractClass(ABC):
    @abstractmethod
    def method(self):
        ...

From this class we can define a subclass based on direct inheritance:

class Subclass(AbstractClass):
    def method(self):
        return "Subclass method"

Or by defining a virtual subclass with AbstractClass.register:

class VirtualSubclass:
    def method(self):
        return "Virtual subclass method"

AbstractClass.register(VirtualSubclass)

When checking with the issubclass function, both classes satisfy the condition:

print(issubclass(Subclass, AbstractClass))  # True
print(issubclass(VirtualSubclass, AbstractClass))  # True


So what's the difference between implementing a subclass through inheritance and doing it virtually through register? When should the virtual subclass be used?

Answer:

TL;DR (update): The main motivation for creating this language feature (not the call to .register itself, but the whole concept and mechanism of "virtual subclassing") was almost certainly to be able to register built-in classes implemented in native code, such as dict and list, as subclasses of the more generic types collections.abc.Mapping and collections.abc.Sequence (and other compatible types). Using this mechanism for user-created classes has limited usefulness, as detailed below:

The ugly

The simple answer is: you will practically never need it!

But if you're in a project that uses static type hinting with mypy, you may need something like it (and then you'll probably find that it's not all roses as complexity grows: your choice, although logical and the right thing to do, won't work, because some things along the way are poorly done).

Let's go step by step:

What registering a class MinhaClasse as a virtual subclass of ClasseBase does is this: whenever some code, yours or anyone else's, checks whether MinhaClasse is a subclass of ClasseBase (or whether one of its instances is an instance of ClasseBase), the check succeeds, and issubclass(MinhaClasse, ClasseBase) returns True.
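A minimal sketch of what that means in practice, reusing the names ClasseBase and MinhaClasse (the methods and class bodies are made up for illustration). Registration only changes the answers given by issubclass and isinstance: the virtual subclass does not enter the MRO, inherits nothing, and is not checked for abstract methods.

```python
from abc import ABC, abstractmethod


class ClasseBase(ABC):
    @abstractmethod
    def method(self):
        ...

    def helper(self):
        # Concrete method that real (inheriting) subclasses would get for free.
        return "helper from ClasseBase"


class MinhaClasse:
    # `method` is deliberately NOT implemented here.
    pass


ClasseBase.register(MinhaClasse)

# Instantiation works: abstract methods are not enforced on virtual subclasses.
obj = MinhaClasse()

print(issubclass(MinhaClasse, ClasseBase))  # True
print(isinstance(obj, ClasseBase))          # True
print(ClasseBase in MinhaClasse.__mro__)    # False: not a real base class
print(hasattr(obj, "helper"))               # False: nothing is inherited
```

In other words, register is a promise made by the programmer, not something Python verifies.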

So, going to a more concrete example: you create a class that behaves like a Python sequence. It implements __getitem__, __len__, and other things internally, but it doesn't directly inherit from either list or collections.abc.Sequence. Registering it as a virtual subclass helps, for example, if you were to use your class with a library that works with Python "sequences". If it was your own team that wrote this other code, and you all agreed to always check a passed object with isinstance(obj, collections.abc.Sequence), fine: your call will work, and merging your code with the rest of your team's code will go smoothly.

The catch is: how many times have you actually checked that an object you're going to use as a sequence is a sequence using that? In the real world, libraries that check whether your parameter really is a sequence using this comparison are rare. In practice, your object will be plugged into a for on the other side, and if it doesn't respond well to the "iterable" protocol, an exception is raised, and presto, everyone is happy. 🙂

That is, you run something like this:

...
class MinhaSeq:
    ...

collections.abc.Sequence.register(MinhaSeq)
...

def qualquer_funcao():
    items = MinhaSeq(...)

    biblioteca_x.funcao_y(items)

And on the other side, in biblioteca_x, if the author keeps things simple, the code could be like this:

def funcao_y(obj):
    for item in obj:
        funcao_z(item)

Then, if by chance your MinhaSeq didn't work as a sequence or iterable, the line with the for would raise a TypeError.

Now, if over in biblioteca_x they "thought about it", the code could look like this:

import collections.abc

def funcao_y(obj):
    if not isinstance(obj, collections.abc.Iterable):
        raise TypeError("funcao_y must receive an iterable")
    for item in obj:
        funcao_z(item)

Congratulations: now the TypeError happens exactly one line earlier! And biblioteca_x will only work for those who either inherit from an official Python sequence or remember to call "register". In other words, biblioteca_x did not become any easier for third parties to use, and it gained nothing from this check.
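A small sketch of the two variants side by side. The names funcao_y_simple and funcao_y_checked are made up here to tell them apart, and this MinhaSeq is an assumed minimal implementation that offers only __getitem__ and __len__, which is enough for a for loop but not for the isinstance check:

```python
import collections.abc


class MinhaSeq:
    """Sequence-like class: indexable and sized, but defines no __iter__."""

    def __init__(self, *items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def funcao_y_simple(obj):
    # Duck typing: just iterate and let Python complain if it can't.
    return [item for item in obj]


def funcao_y_checked(obj):
    if not isinstance(obj, collections.abc.Iterable):
        raise TypeError("funcao_y must receive an iterable")
    return [item for item in obj]


items = MinhaSeq(1, 2, 3)

# Works: the for loop falls back to __getitem__ with indexes 0, 1, 2, ...
simple_result = funcao_y_simple(items)

# Rejected, although the object iterates perfectly well.
try:
    funcao_y_checked(items)
    rejected_before_register = False
except TypeError:
    rejected_before_register = True

# Only after registration does the checked variant accept the object.
collections.abc.Sequence.register(MinhaSeq)
checked_result = funcao_y_checked(items)
```

The explicit check rejected an object that the plain loop handled without trouble, which is exactly the cost described above.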

(And you have one more isinstance check, which can make a difference in performance if it sits in a tight loop, although it is rare in Python code for a single isinstance call to affect the performance of a piece of code.)

The correct

Now, with PEP 484 and static checking, biblioteca_x could have code like this:

import typing as T

def funcao_y(obj: T.Iterable[T.Any]):
    for item in obj:
        funcao_z(item)

Note that in this case biblioteca_x can help the user who cares about making correctly typed calls in a large project: the project will run "mypy" at test/QA time, and if a call in your code passes an object that is not an "Iterable", it will be flagged before the code runs or reaches production. If, on the other hand, whoever uses biblioteca_x is not concerned with this check, they won't run "mypy" on their project and will just make the call, which will work as in the first case above, with no extra code at runtime.

Hence the problem with the "virtual subclass": it won't work in this case! "mypy" (and other similar tools) can't figure out, through static analysis, that you are going to call "collections.abc…register" for your class: it will reject your call all the same.


Just to give you a very real example of what I mean when I say that "biblioteca_x" won't test with isinstance: it's not a joke, it doesn't happen in the real world, and it doesn't even happen in the Python standard library. The JSON encoder, for example, requires real instances of dict and list (or direct subclasses) to work, and it won't work with a generic collections.abc.Sequence. Just this week there was a highly complex question on the English Stack Overflow about it, involving users with extremely high reputations and complex issues (like "metaclasses"), and uncovering bugs in Python's own default implementation: https://stackoverflow.com/a/58031309/108205 (disclaimer: I was involved in the question and mine was the accepted answer).
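The JSON behaviour is easy to reproduce. In this sketch MinhaSeq and MinhaLista are made-up classes for illustration: the first is registered as a virtual Sequence, the second really inherits from list.

```python
import collections.abc
import json


class MinhaSeq:
    """Sequence-like class, registered below as a virtual Sequence."""

    def __init__(self, *items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


collections.abc.Sequence.register(MinhaSeq)


class MinhaLista(list):
    """Real (direct) subclass of list."""


virtual = MinhaSeq(1, 2, 3)
print(isinstance(virtual, collections.abc.Sequence))  # True

# The encoder does not care about the registration: it wants a real list.
try:
    json.dumps(virtual)
    virtual_serialized = True
except TypeError:
    virtual_serialized = False

real_output = json.dumps(MinhaLista([1, 2, 3]))

print(virtual_serialized)  # False
print(real_output)         # [1, 2, 3]
```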


(continuing correct): What to do then?

Well, do you want to use some type checking in Python and do things right from an OO standpoint? Then it's worth remembering that when the "register" mechanism and virtual subclassing were created, Python's optional static type checking, which is gaining popularity these days, had not even been thought of. In the case above, instead of worrying about registering MinhaSeq as a virtual subclass of collections.abc.Sequence, you would declare, following the recommendations of PEP 484 and the typing module, that your class respects the typing.Sequence interface.

The problem? Python code typed to work with mypy is tedious to write. For example, if you've already declared the class MinhaSeq in a way that is not compatible, through its inheritance, with the sequence type, you can't just use an "=" to declare that it is now compatible. Declaring the compatibility in the class statement itself looks like this (TL;DR: the example below is the recommended way in modern Python to create classes that present an interface mypy understands, that is, when the project is formally concerned with typing):

import collections.abc
import typing as T

class MinhaSeq(collections.abc.Sequence[T.Any]):
    ... 

It works, but it's obviously not equivalent to calling .register after the class has been created. The closest thing would be to use typing.cast, which does nothing at runtime (it returns the very same object it was called with) but tells the static checker, in this case mypy, the type of the returned object. The problem? The static checker will already have learned about MinhaSeq and won't let you change its type after the declaration, so the return value of cast, which it will understand as a "Sequence" type, has to be bound to a similar name, but not the same one:

import collections.abc
import typing as T

class _MinhaSeq:
    ...

MinhaSeq = T.cast(T.Type[T.Sequence[T.Any]], _MinhaSeq) 

And with that, you would be ready to call funcao_y(obj: T.Iterable[T.Any]), passing instances of MinhaSeq.

The fun
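A quick sketch of the runtime side of that trick only (what mypy reports for it is not verified here). The body of _MinhaSeq is an assumed minimal implementation, and funcao_y is a stand-in that simply collects the items:

```python
import typing as T


class _MinhaSeq:
    """Assumed minimal sequence-like class with fixed contents."""

    def __init__(self):
        self._items = ["a", "b", "c"]

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


# cast() only informs the static checker; at runtime it returns its argument.
MinhaSeq = T.cast(T.Type[T.Sequence[T.Any]], _MinhaSeq)


def funcao_y(obj: T.Iterable[T.Any]) -> T.List[T.Any]:
    return [item for item in obj]


same_object = MinhaSeq is _MinhaSeq
result = funcao_y(MinhaSeq())

print(same_object)  # True
print(result)       # ['a', 'b', 'c']
```

So the two names refer to the very same class; only the type that the checker associates with each name differs.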

Although it has little practical use, even in code that is super pedantic about typing, the interesting thing about virtual subclasses is precisely the "concept". The idea may come back with more force in a few years (if "mypy" paid the same attention to the call to "register" as it pays to the call to "typing.cast", for example, the thing would already work).

Behind the scenes, what is in play are the mechanisms that let a class respond programmatically to isinstance and issubclass calls: the "register" method of ABC classes is just one way of making use of the special methods __subclasscheck__ and __subclasshook__. It is possible that some project will make nice use of them, with practical applications. But, as defined today, it would be difficult to use virtual subclasses in a project with practical application beyond a proof of concept.

A use case
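To make that mechanism concrete, here is a sketch of an ABC that answers issubclass and isinstance through __subclasshook__, with no call to register at all. Drawable, Circle and Number are hypothetical names invented for this example:

```python
from abc import ABC


class Drawable(ABC):
    @classmethod
    def __subclasshook__(cls, candidate):
        # Consulted by ABCMeta.__subclasscheck__ before the registry is.
        if cls is Drawable:
            if any("draw" in vars(klass) for klass in candidate.__mro__):
                return True
        # NotImplemented means "no opinion": fall back to the usual rules.
        return NotImplemented


class Circle:
    def draw(self):
        return "circle"


class Number:
    pass


print(issubclass(Circle, Drawable))  # True: it defines draw()
print(isinstance(Circle(), Drawable))  # True
print(issubclass(Number, Drawable))  # False
```

This is the same approach that lets collections.abc.Iterable recognize any class that defines __iter__.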

As an "exception that proves the rule", over the past weekend I went to implement a feature where "virtual subclassing" seemed to offer something interesting. I wouldn't have remembered this mechanism if I hadn't interacted with that question, and I would have simply subclassed.

There's this free software project I'm developing, a library for drawings and artwork with Unicode in the terminal (https://github.com/jsbueno/terminedia). In it I have a class hierarchy that starts with "Shape" and provides some specialized classes to hold graphical and textual data (for example, one of the classes loads a binary image file and holds the internal data as a PIL image; another holds the data as strings and has a colormap that works like a palette, where each character can represent a different color, etc.).

All the "Shape" classes have in common that their data is read and modified through the __getitem__ and __setitem__ methods. I wanted to provide a "view" class that would let you choose a smaller area inside a Shape, for example the rectangle between positions (5, 5) and (15, 10), and change data in this view as if it were a "Shape", so that changing the content at position (0, 0) of the view would transparently change the content of the original at position (5, 5). And so on for any graphical operation on the view: it has the same methods and attributes as a "Shape", but the internal data belongs to the original instance, and all addressing is done only within the region of interest (ROI).

The logic for such a view is quite simple: it has to provide a few attributes of its own and transparently access all the other attributes of the original instance. In Python this is possible by carefully customizing the __getattribute__ method of a class.

So it doesn't make sense for me to have all the Shape logic in ShapeView, even if only inherited (and it would be inherited if I weren't using virtual subclassing, although that wouldn't consume "more" resources). ShapeView only has to concern itself with the coordinate transformations needed for its proxy role.

Well, it turns out that in the rest of the project there are some points where an object of type "Shape" is expected. Since ShapeView, being a proxy for a Shape, can do everything a Shape can, it makes sense that it can "say it's a Shape, even without directly inheriting from one". This is where virtual subclassing does exactly what is wanted: code that checks the class with isinstance(data, Shape) will think it's a Shape. And even though within the project this only happens in a few places, with virtual subclassing I can encourage this pattern for library users, and it will continue to work.

What did I need to do? First, the base class "Shape" has to become a class that allows registration of virtual subclasses. This can be done by simply inheriting from abc.ABC from the standard library. And, interestingly, the Shape base already had some "abstract" methods, which need to be overridden by subclasses, but I hadn't bothered to inherit from abc.ABC just because of that. As the focus of the project (at least at the moment) is not on being 100% in line with all good OO practices, I simply had a raise NotImplementedError in those methods that need to be rewritten in the subclass. The @abstractmethod of Python's abc module does little more than that, so I wasn't using it. But since I was going to use the ABC base for the virtual subclassing, there's no reason not to use the @abstractmethod decorator in the 3 places where it makes sense.
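For reference, the "little more" that @abstractmethod does is move the failure from call time to instantiation time. A sketch with made-up class names (these are not the terminedia classes):

```python
from abc import ABC, abstractmethod


class ShapeWithRaise:
    def get_raw(self, pos):
        raise NotImplementedError("subclasses must override get_raw")


class ShapeWithABC(ABC):
    @abstractmethod
    def get_raw(self, pos):
        ...


class IncompleteA(ShapeWithRaise):
    pass


class IncompleteB(ShapeWithABC):
    pass


# Plain base class: the object is created, and the error only appears
# when the missing method is finally called.
a = IncompleteA()
try:
    a.get_raw((0, 0))
    fails_on_call = False
except NotImplementedError:
    fails_on_call = True

# ABC base class: the incomplete subclass cannot even be instantiated.
try:
    IncompleteB()
    fails_on_instantiation = False
except TypeError:
    fails_on_instantiation = True

print(fails_on_call, fails_on_instantiation)  # True True
```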

Another point worth noting: I needed to call Shape.register(ShapeView) to "turn on" the virtual subclassing. And, like any callable in Python that takes a callable as its single argument, it can be used with decorator syntax. That is, the registration of the virtual subclass could be done like this:

@Shape.register
class ShapeView:
    ... 

https://github.com/jsbueno/terminedia/blob/140c934da66c0186e52741cbb0dacfa6bc16f0b7/terminedia/image.py#L351

(Note: the .register call works as a decorator because it returns its original argument. If it returned None, that wouldn't work. The documentation notes that this was changed in Python 3.3 so that it could be used this way.)
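Putting the pieces together, a heavily simplified sketch of the idea. This is not the terminedia code: GridShape, the offset/size parameters and the dict storage are assumptions made for the example, and it uses __getattr__ (called only when normal lookup fails) instead of __getattribute__, to keep the proxy short.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def __getitem__(self, pos):
        ...

    @abstractmethod
    def __setitem__(self, pos, value):
        ...


class GridShape(Shape):
    """Hypothetical concrete shape: values stored in a dict keyed by (x, y)."""

    def __init__(self, width, height, fill=" "):
        self.width = width
        self.height = height
        self.data = {(x, y): fill for x in range(width) for y in range(height)}

    def __getitem__(self, pos):
        return self.data[pos]

    def __setitem__(self, pos, value):
        self.data[pos] = value


@Shape.register  # returns the class itself, so ShapeView stays bound to it
class ShapeView:
    def __init__(self, original, offset, size):
        self._original = original
        self._offset = offset
        self.width, self.height = size

    def _translate(self, pos):
        # Map view coordinates to coordinates in the original shape.
        x, y = pos
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("position outside the view")
        return x + self._offset[0], y + self._offset[1]

    def __getitem__(self, pos):
        return self._original[self._translate(pos)]

    def __setitem__(self, pos, value):
        self._original[self._translate(pos)] = value

    def __getattr__(self, name):
        # Everything else is delegated to the original instance.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._original, name)


shape = GridShape(20, 20)
view = ShapeView(shape, offset=(5, 5), size=(10, 5))
view[0, 0] = "#"

print(shape[5, 5])              # '#': the original was changed
print(isinstance(view, Shape))  # True, without inheriting from Shape
print(view.data is shape.data)  # True: attribute access is delegated
```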

Well, finally, to pour some cold water on the enthusiasm: I was excited that the contents of the ShapeView class would be minimal, since it just needed to customize attribute access and implement __setitem__ and __getitem__, in addition to width, height and size, to work like a Shape that does a lot more things. Then I realized that, given how the implementation works, I had to replicate the namespaces with the drawing methods in it, and so it wasn't that far from a "Shape" anymore. So, on that side, ShapeView had to re-implement a lot of Shape, and virtual subclassing didn't bring that many gains. On the other hand, a ShapeView can be associated with any of the other Shape subclasses and provide transparent access to attributes like PixelClass. For this to work right with inheritance, I might have ended up having to give all shapes the functionality that is in ShapeView, or create dynamic subclasses each time a ShapeView was created. So maybe the balance for virtual subclassing is still quite positive.

Disclaimer: from the amount of text needed just to explain this concrete use case, you can see that virtual subclassing isn't a feature you're going to use all the time.