Wednesday, December 8, 2021

Why General Inheritance is Flawed and How to Finally Fix it

 


By leveraging composition and the final keyword in the right way, you can improve your programming skills and become a better Java programmer.  


General inheritance, whereby a public class is extended over package boundaries, poses a number of challenges and drawbacks and should be avoided in almost all cases. Classes and methods can be declared final, meaning that subclassing or overriding, respectively, is disallowed, which effectively prevents inheritance. While this may sound like a strange thing to do in an object-oriented language like Java, it does carry significant benefits for a large number of class types.


But, when should a class or method be final and just why is general inheritance problematic?

Immutable Classes

Immutable classes are classes whose state cannot be observed to change from the outside world. This gives immutable objects the advantage of being inherently thread-safe and they can be reused indefinitely.


Java’s built-in String class is an example of an immutable class. It does have an internal state that is very likely to change the first time hashCode() is called, but this internal state cannot be observed by an outside caller (unless resorting to reflection).


Immutable classes should always be declared final, or else subclasses could compromise the immutability contract simply by adding and exposing mutable state.


For the sake of completeness, it is worth mentioning that an immutable class should declare all its fields private and final, and ensure exclusive access to any mutable sub-component (such as an array), for example by using defensive copying.
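
The rules above can be sketched as follows. This is a minimal illustration only: the class name Samples and its fields are hypothetical and not taken from any library.

```java
import java.util.Arrays;

// Hypothetical example class; the name and fields are illustrative only.
final class Samples {

    private final String name;
    private final int[] values; // mutable sub-component, never shared

    Samples(String name, int[] values) {
        this.name = name;
        this.values = values.clone(); // defensive copy on the way in
    }

    String name() {
        return name;
    }

    int[] values() {
        return values.clone(); // defensive copy on the way out
    }

    int sum() {
        return Arrays.stream(values).sum();
    }
}
```

Because the array is copied both in the constructor and in the accessor, a caller that modifies its own array afterwards does not affect the state held by the instance.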

Non-instantiable Classes (aka Utility Classes)

A non-instantiable class is often informally referred to as a “utility class” and contains only static methods (and perhaps static fields). Static methods do not take part in dynamic dispatch; they are rather global functions attached to a “carrier class”. Ideally, non-instantiable classes should be immutable with respect to their (static) state (if any).


These methods should be called using their carrier-class name followed by the method name (e.g. Collections.emptyList()). Subclassing a non-instantiable utility class can result in non-intuitive behavior and is a likely source of confusion, as the methods cannot be overridden, only hidden, as illustrated below:


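// FooUtil.java
public class FooUtil {

    static void print() {
        lower(); // statically bound to FooUtil.lower()
    }

    static void lower() {
        System.out.println("lower foo");
    }

}

// BarUtil.java
public class BarUtil extends FooUtil {

    // Hides FooUtil.lower(); static methods cannot be overridden
    static void lower() {
        System.out.println("lower bar");
    }

}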


Invoking BarUtil::print will produce “lower foo” and not “lower bar”, meaning that BarUtil::lower did not override FooUtil::lower. However, if BarUtil::lower were called directly, it would print “lower bar”.


Therefore, non-instantiable classes should generally be declared final.


As a side note, non-instantiable classes should have a single no-argument constructor declared private to prevent instantiation (as the name implies).
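
Put together, a non-instantiable class could look like the sketch below. The class TextUtil and its method are hypothetical and only serve as an illustration.

```java
import java.util.Locale;

// Hypothetical utility class; the name and method are illustrative only.
final class TextUtil {

    // Private constructor: prevents instantiation from outside the class
    private TextUtil() {
        throw new UnsupportedOperationException("No instances");
    }

    static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }
}
```

Callers use the carrier-class name, as in TextUtil.lower("FOO").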

Methods Called by a Constructor

Methods called by a constructor of a class should always be final, either by declaring the entire class final or by declaring these methods final. Otherwise, a subclass could override such a method and thereby get hold of an object (e.g. “this”) that is only partially initialized and thus likely in an illegal state. Such a leak may, for example, occur if the not-yet-initialized instance registers itself with a listener. These errors are likely hard to identify once they make it out into the open.
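
The effect can be reproduced with a small sketch. Both classes below are hypothetical; the point is only that the overriding method runs before the subclass fields have been assigned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes; the names are illustrative only.
class Widget {
    Widget() {
        init(); // overridable method invoked from a constructor: the trap
    }

    void init() {
        // intentionally empty in the base class
    }
}

class TrackingWidget extends Widget {
    private final List<String> events = new ArrayList<>();
    boolean eventsWasNull; // records what init() observed

    @Override
    void init() {
        // Runs inside Widget's constructor, before 'events' has been assigned
        eventsWasNull = (events == null);
    }
}
```

After new TrackingWidget(), the flag eventsWasNull is true even though events is declared final and has an initializer. Declaring Widget::init final (or Widget itself final) rules this out.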

General Inheritance

The use/non-use of general inheritance has sparked opinionated discussions for quite some time.


Back in the early days, inheritance was often thought to be the general way of code reuse. As it later turned out, inheritance outside a package could lead to unsatisfactory and erroneous behaviour unless special care is put into providing classes that are suitable to extend across package boundaries [Bloch18, Item 18].


Furthermore, general inheritance breaks encapsulation [Snyder80] because the superclass implementation might change over time, which might cause a subclass to fail even though no changes were made to it. This problem might be avoided if one commits to never changing the superclass, effectively making the superclass a large monolithic fossil API commitment for all eternity. In all fairness, this argument can also be raised against classes using composition even though there are fewer ways problems can leak into the code. So, this is not an argument for finalization but rather a more fundamental problem with code reuse.


Inheritance could produce unintended effects due to self-use, whereby an overridable method calls another overridable method in the base class: Imagine a class that extends HashSet and that is supposed to keep track of the number of elements ever added to the set. If we override add() bumping the counter by one and override addAll(Collection) adding Collection.size() to the counter after which the corresponding super method is called, then we are in for a surprise:


Because HashSet::addAll (inherited from AbstractCollection) happens to self-use add() to individually add the elements, additions via addAll() will count twice. Furthermore, there is no guarantee that this behavior will stay the same over time unless it is documented. Maybe there will be a more performant way of bulk-adding elements in the future whereby elements are inserted directly into the backing structure without calling add()? ArrayList::addAll, for instance, already copies the elements into its backing array without calling add().
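
The double counting is easy to reproduce with HashSet, whose addAll() is inherited from AbstractCollection and adds the elements one by one via add(). The sketch below is modeled on the well-known example in [Bloch18, Item 18]; the class name CountingHashSet is hypothetical, and the observed count relies on current JDK behavior rather than on a documented guarantee.

```java
import java.util.Collection;
import java.util.HashSet;

// Hypothetical subclass that tries to count every element ever added.
class CountingHashSet<E> extends HashSet<E> {
    private int addCount;

    @Override
    public boolean add(E e) {
        addCount++;
        return super.add(e);
    }

    @Override
    public boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        return super.addAll(c); // self-use: calls add() once per element
    }

    int addCount() {
        return addCount;
    }
}
```

Adding three elements via addAll() leaves addCount() at six rather than three, because every element is counted once in addAll() and once more in add().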


Another common problem with self-use is when a subclass overrides a method that is supposed to call one or several other methods but the programmer forgets to call the super method. A related problem is the problem of deciding if an overriding method should call the super method at the beginning or at the end of the overridden method (or indeed somewhere in between).  A solution to some of these problems could be to declare the top method final in the base class and provide overridable protected “hook methods” that can be overridden in a more controlled fashion. 
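
A hook-method arrangement of this kind might look as follows. The classes Task and AuditedTask are hypothetical, and the sketch only illustrates the idea of a final top method combined with protected hooks.

```java
// Hypothetical base class; the names are illustrative only.
abstract class Task {

    // Final "top" method: subclasses cannot alter the call order
    public final String run() {
        StringBuilder log = new StringBuilder();
        before(log);
        log.append("work;");
        after(log);
        return log.toString();
    }

    // Overridable hooks with harmless defaults; no super call is needed
    protected void before(StringBuilder log) { }

    protected void after(StringBuilder log) { }
}

class AuditedTask extends Task {
    @Override
    protected void before(StringBuilder log) {
        log.append("audit;");
    }
}
```

Here, new AuditedTask().run() returns "audit;work;". The subclass can neither forget a super call nor decide where in the sequence its code runs, since both are fixed by the base class.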


General inheritance also opens up potential security vulnerabilities: Suppose an ArrayList was extended to ensure only objects fulfilling a certain predicate could be added (e.g. they must be in a valid state). Then, in a later release, a new way of adding elements was introduced via the base class AbstractList. This new way will now become visible in the supposedly safeguarded class, effectively providing a back-door for adding illegal objects to the list. 


Another problem is “propagating exposure” as exemplified by Arrays.asList("a", "b") which returns a “fixed-size list” (but ought to return an unmodifiable List, and in this case an immutable List as the elements themselves are all immutable). As it turns out, elements in the returned List may now not only be replaced via List::set or a ListIterator but also via List::replaceAll, a method added in JDK 8 after the inception of Arrays::asList.
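
The behavior of the fixed-size list can be checked with a few lines. The class name FixedSizeListDemo is hypothetical; the calls themselves are plain JDK API.

```java
import java.util.Arrays;
import java.util.List;

class FixedSizeListDemo {
    static List<String> demo() {
        List<String> list = Arrays.asList("a", "b");
        list.replaceAll(String::toUpperCase); // allowed: elements are replaced in place
        try {
            list.add("c"); // not allowed: the size is fixed
        } catch (UnsupportedOperationException expected) {
            // structural modification is rejected
        }
        return list;
    }
}
```

The returned list contains "A" and "B": the size could not be changed, but the elements could.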


An additional class of problems might arise if a subclass adds a new method to the ones of the base class. If, at a later stage, a method with the same signature is added to the base class, then this method will be accidentally overridden by the subclass. This is likely not the intended behavior at all. If a method with the same name and parameters is added but with a different return type, then the code will likely fail to compile. So in the general case, it is not possible to ever safely add methods to a non-final public class as there is no control over how the class is subclassed.


Yet another problem could be incidental inheritance. The JDK itself has several problematic inheritances whereby classes were incidentally inherited because it was apparently “convenient” and not because class B indeed was a class A. For example, Stack extends the old Vector class for no good principled reason. This prevents Stack from evolving to a more efficient and performant implementation.


To summarize, a class that is supposed to be generally inherited is very hard to ever change and must [Bloch18, Item 19]:

  • Document its self-use of overridable methods

  • Potentially provide hooks in the form of judiciously chosen protected methods

  • Be accompanied by tests using subclasses

  • Not provide a constructor that invokes overridable methods

  • Not allow serialization to invoke overridable methods



Inheritance also creates constraints and problems if hashCode()/equals() are overridden. If we have a base class called Fruit, then is an Apple with the same color as a Pear equal? Can an instance of SevilleOrange ever be equal to a BergamotOrange instance? Generally, it is not easy to decide these kinds of questions. It is important to remember that any subclass should either override none of these methods or should override them both.


It should be noted that exposing a public non-final class in a public API by definition means that it opens up for inheritance across package boundaries, as user-land code can place extending classes in any package. Since split packages are strongly discouraged or might even be entirely forbidden depending on the use of JPMS, subclassing such a class implies subclassing over package boundaries.


One way of avoiding all these things is to declare classes final and use composition instead of inheritance, effectively abandoning inheritance across packages. This often provides a much cleaner API whereby only interfaces can be exposed and concrete classes do not leak out in the API. This way, any superclass used is only package-private and can, by convention or definition, never be used externally.


Composition with delegation protects against most of the problems mentioned above including unintended self-use, security holes via extra methods in base classes, signature collisions, incidental inheritance, the need for subclass testing, accidental leaks of “this” and many other problems. In the past, it was feared that this would lead to reduced performance but this is simply not the case.
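
As a minimal sketch of composition with delegation, consider a final class that counts added elements by wrapping a Set instead of extending one. The class name CountingSet is hypothetical and only a handful of methods are forwarded to keep the example short.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical final class that wraps a Set instead of extending one.
final class CountingSet<E> {
    private final Set<E> delegate = new HashSet<>();
    private int addCount;

    boolean add(E e) {
        addCount++;
        return delegate.add(e);
    }

    boolean addAll(Collection<? extends E> c) {
        addCount += c.size();
        return delegate.addAll(c); // the delegate's self-use never reaches this class
    }

    boolean contains(Object o) {
        return delegate.contains(o);
    }

    int size() {
        return delegate.size();
    }

    int addCount() {
        return addCount;
    }
}
```

Adding three elements via addAll() now yields an addCount() of three, regardless of how the wrapped Set implements bulk addition. Methods that are later added to Set do not show up in CountingSet unless they are explicitly forwarded.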


Inheritance in Java is, for good reasons, restricted to one superclass which naturally limits the scalability of the concept. Composition, on the other hand, allows an arbitrary number of delegates to be used.


A small drawback of composition can materialize in combination with certain callbacks: if a component (used in composition) registers itself with a listener, then the listener will invoke the component itself and not the composing class. However, this problem can be avoided if proper provisions are put in place.

Sealed Classes

In more recent Java versions, the concept of sealed classes (JEP 409) was introduced. Before this, finality was a boolean property: either a class was extensible (within its declared access level) or it was not. Sealed classes introduce a more granular mechanism whereby it can be said that a Fruit can be either an Apple, a Pear or an Orange but nothing more. This is fundamentally a more generalized form of final. The amount of effort put into the Java language with features like this indicates that class extensibility is an important property. Interestingly, a permitted subclass of a sealed class or interface must itself declare whether it is final, non-sealed or sealed (permitting further specific subclasses).
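
The Fruit example could be expressed as in the sketch below, assuming Java 17 or later. The type names are taken from the text above, except SevilleOrange which is borrowed from the equals() discussion; the way the modifiers are distributed among them is just one possible choice.

```java
// Hypothetical type hierarchy; requires Java 17 or later.
sealed interface Fruit permits Apple, Pear, Orange { }

// final: no further subclasses at all
final class Apple implements Fruit { }

// non-sealed: open for arbitrary extension again
non-sealed class Pear implements Fruit { }

// sealed: only the listed subclasses are allowed
sealed class Orange implements Fruit permits SevilleOrange { }

final class SevilleOrange extends Orange { }
```

Any attempt to declare another direct implementation of Fruit, or another direct subclass of Orange, is rejected by the compiler.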


API Commitments Imposed by Inheritance

In this article, the class Stack was mentioned as a failed inheritance implementation. It basically introduces the methods push(), pop(), peek(), empty() and search(). But, as it inherits from Vector, we also get all the methods and types from List, AbstractList, RandomAccess, Cloneable and Serializable. AbstractList, in turn, inherits from AbstractCollection which implements Collection.


This increases the API weight by orders of magnitude and I am perfectly certain the Java designers are regretting their incidental inheritance 25 years down the line. If Stack were just an interface and there were a static method available that provided a new empty Stack, things would look much better.
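
What such an interface with a static factory might look like is sketched below. SimpleStack is a hypothetical type, not the JDK's java.util.Stack, and the choice of ArrayDeque as the backing structure is an assumption that could be changed later without affecting callers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical narrow stack API; this is not java.util.Stack.
interface SimpleStack<E> {
    void push(E e);
    E pop();
    E peek();
    boolean isEmpty();

    // Static factory: the implementation class is never exposed
    static <E> SimpleStack<E> create() {
        Deque<E> deque = new ArrayDeque<>();
        return new SimpleStack<E>() {
            @Override public void push(E e) { deque.push(e); }
            @Override public E pop() { return deque.pop(); }
            @Override public E peek() { return deque.peek(); }
            @Override public boolean isEmpty() { return deque.isEmpty(); }
        };
    }
}
```

The API surface consists of four methods and one factory, and nothing from the collection hierarchy leaks through.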


Classes that are Serializable or subject to other serialization mechanisms are often particularly problematic as the binary (or other) format more often than not limits the way implementations can ever evolve over time.


As seen above and in previous sections, a public non-final class can, in many cases, never change.

Should Inheritance Across Package Boundaries Ever be Used?

This is a matter of opinion. 


Many times, it is better to use composition. In simpler cases, delivering functions to a concrete class’s constructor to provide tailored functionality would be preferable over allowing subclassing and overriding methods. To give an example of this, instead of overriding a handler method, a handler function could be provided via the constructor to a non-extensible class.
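
A minimal sketch of this approach follows. The class MessageDispatcher is hypothetical; the tailored behavior is passed in as a java.util.function.Consumer instead of being supplied by an overriding subclass.

```java
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical non-extensible class that receives its behavior as a function.
final class MessageDispatcher {
    private final Consumer<String> handler;

    MessageDispatcher(Consumer<String> handler) {
        this.handler = Objects.requireNonNull(handler);
    }

    void dispatch(String message) {
        handler.accept(message);
    }
}
```

A caller could write new MessageDispatcher(System.out::println) or pass any other lambda, while the dispatcher itself stays final.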


If, after very careful consideration, one arrives at the conclusion that one should provide an extensible class (across packages), then all the constraints above must be taken into careful consideration. Just allowing subclassing by default is an outright mistake, particularly for library and API designers. Instead, classes should be marked final by default, and only after careful review and testing should opening them up for subclassing be considered.

A Final Note

As I moved away from using inheritance across packages and switched to exposing just interfaces, many other advantages became apparent. It becomes much easier to keep internal considerations… well internal.


Composition, whereby potentially several components can be used in a single class, provides more code reuse capability than inheritance, albeit requiring a bit more code ceremony in the using class. It can also simplify testing of the code and provide better test coverage with far fewer and less brittle tests.


It also fits very well with the module system (JPMS).  Providing components as pure services, for example, using Java’s ServiceLoader, adds flexibility while minimizing the API footprint. This makes it easier to learn and use the API and provides much more flexibility to evolve libraries over time. 


Finally, it all makes sense...

References

[Bloch18]

Bloch, Joshua. Effective Java, Third Edition, ISBN 0-13-468599-7, 2018.


[Snyder80]

Snyder, Alan. “Encapsulation and Inheritance in Object-Oriented Programming Languages”. In Object-Oriented Programming Systems, Languages, and Applications Proceedings, 35-45, New York, NY, ACM Press.


