The Hunt for a Broken Liskov Substitution Principle

Over the course of my programming career, I’ve come across plenty of code that breaks this principle. In this article, we will recap what the principle is about, look at a common symptom, and discuss what it means when we find a codebase with a broken LSP.

The Principle

The Liskov Substitution Principle (LSP) is one of the most important object-oriented programming principles. It was introduced by Barbara Liskov, an American computer scientist and one of the first women in the United States to be granted a doctorate in computer science. It is the famous ‘L’ in SOLID, the set of principles popularized by Robert C. Martin (‘Uncle Bob’).

Understanding LSP is critical for designing good object hierarchies. It defines a rule that is inherent in object inheritance. In their paper, Barbara Liskov and Jeannette Wing describe the principle as follows:

“Subtype requirement: Let φ(x) be a property provable about objects x of type T. Then φ(y) should be true for objects y of type S where S is a subtype of T.”

Wikipedia explains LSP further: if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of the program. In simpler words, LSP states that wherever an object of a parent class is expected, an object of any of its child classes must be usable in its place without any consequences to the correctness of the application.
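
To make the substitution rule concrete, here is a minimal sketch. All of the names in it (Customer, LoyalCustomer, AnonymousCustomer, Greet, Welcome) are hypothetical and chosen only for illustration; the assumed contract is that Greet() always returns a greeting and never throws.

```csharp
using System;

// Hypothetical types, used only to illustrate the principle.
public class Customer
{
    // Assumed contract: always returns a non-empty greeting and never throws.
    public virtual string Greet() => "Hello, customer";
}

// Honors the parent's contract, so it can stand in for Customer anywhere.
public class LoyalCustomer : Customer
{
    public override string Greet() => "Welcome back!";
}

// Breaks the parent's contract by throwing where the parent promised a value.
public class AnonymousCustomer : Customer
{
    public override string Greet() =>
        throw new NotSupportedException("Anonymous customers cannot be greeted");
}

public static class Demo
{
    // Written against Customer only; it should not care which subtype it receives.
    public static string Welcome(Customer customer) => customer.Greet().ToUpperInvariant();

    public static void Main()
    {
        Console.WriteLine(Welcome(new Customer()));
        Console.WriteLine(Welcome(new LoyalCustomer()));
        try
        {
            Console.WriteLine(Welcome(new AnonymousCustomer()));
        }
        catch (NotSupportedException ex)
        {
            // The subtype could not be substituted for its parent.
            Console.WriteLine("Substitution failed: " + ex.Message);
        }
    }
}
```

LoyalCustomer can replace Customer without Welcome noticing. AnonymousCustomer cannot, so every caller of Welcome would have to know about it, which is exactly the kind of knowledge LSP is meant to make unnecessary.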

Consequence

The principle, once understood, is simple. It is, however, difficult to follow in practice. Let’s look at the consequences of breaking the principle.

In object-oriented programming, violating this rule means creating a parent-child relationship that does not hold even in the context it was created for. A caller of the parent class must know the behavior of each child. Consequently, whenever a new child class is created, the caller must be changed to accommodate the new child’s behavior.

Any change to a child class must be accompanied by a full regression test of everything that affects the caller. This is the very definition of a distributed monolith: a small change to the system requires changes in other places that are not necessarily relevant. The change becomes a risky exercise as the deployments of different parts of the application become entangled with one another.

Example

One of the most common symptoms of a broken LSP is a switch on the object’s type, like the following:

public void ProcessCustomer(Customer cust)
{
    // Dispatch on the runtime type of the customer.
    switch (cust)
    {
        case Lawyer lawyer:
            ProcessLawyerCustomer(lawyer);
            break;
        case Doctor doctor:
            ProcessDoctorCustomer(doctor);
            break;
        case Police police:
            ProcessPoliceCustomer(police);
            break;
        default:
            // Any customer type not listed above ends up here.
            throw new Exception("Unknown Customer Job Type");
    }
}

The innocent-looking snippet above is a symptom of a broken LSP that developers need to keep an eye on. As we follow the code down, the broken principle usually becomes more apparent. Quite likely, we’d find something similar to the following code:

public void ProcessDoctorCustomer(Doctor doctor) 
{
    var specialty = doctor.GetSpecialty();
    ...
}

public void ProcessLawyerCustomer(Lawyer lawyer) 
{
    var lawyerType = lawyer.GetType();
    ...
}

public void ProcessPoliceCustomer(Police police) 
{
    var rank = police.GetRank();
    ...
}

The ProcessCustomer method is a user of the parent class ‘Customer’. With the code above, whenever we add a new customer type, ProcessCustomer must be modified as well. This is a direct violation of both LSP and the Open-Closed Principle (OCP) in SOLID.
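
The failure mode can be reproduced in a few lines. The sketch below is a simplified stand-in for the code above, not the original implementation: ProcessCustomer returns a label instead of doing real work, and Teacher is a hypothetical customer type added after the switch was written.

```csharp
using System;

public class Customer { }
public class Lawyer : Customer { }
public class Doctor : Customer { }

// Hypothetical customer type, added after ProcessCustomer was written.
public class Teacher : Customer { }

public static class Demo
{
    // Simplified stand-in for ProcessCustomer: returns a label instead of doing work.
    public static string ProcessCustomer(Customer cust)
    {
        switch (cust)
        {
            case Lawyer lawyer:
                return "lawyer processed";
            case Doctor doctor:
                return "doctor processed";
            default:
                throw new Exception("Unknown Customer Job Type");
        }
    }

    public static void Main()
    {
        Console.WriteLine(ProcessCustomer(new Doctor()));
        try
        {
            // Teacher is a perfectly valid Customer, yet the caller cannot handle it.
            ProcessCustomer(new Teacher());
        }
        catch (Exception ex)
        {
            Console.WriteLine("Failed: " + ex.Message);
        }
    }
}
```

The compiler accepts Teacher wherever a Customer is expected, so nothing warns us that ProcessCustomer needs another case. The mismatch only shows up at runtime.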

Another way of handling different object types is to use a generic factory that creates a processor for the given type T. With this approach, creating a new customer type does not require any change to the caller.

The caller code becomes:

public class CustomerProcessor
{
    private readonly ICustomerJobProcessorFactory _customerJobProcessorFactory;

    public CustomerProcessor(ICustomerJobProcessorFactory customerJobProcessorFactory)
    { … }

    public void ProcessCustomer(Customer cust)
    {
        _customerJobProcessorFactory.CreateProcessor(cust).Process();
    }
}

The ICustomerJobProcessorFactory interface is as follows:

public interface ICustomerJobProcessorFactory
{
    CustomerJobProcessor CreateProcessor<T>(T customer);
}

Abstracting the processing out into a separate class makes CustomerJobProcessor extensible. If a new customer type is added, we only need to create a new CustomerJobProcessor for it. The code now conforms to OCP.
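
The factory itself is not shown here, so the following is only one possible sketch of it. It assumes a registry keyed by the customer’s runtime type. RegistryCustomerJobProcessorFactory, Register, DoctorJobProcessor, LawyerJobProcessor, GetPracticeArea, the shape of CustomerJobProcessor, and the `where T : Customer` constraint are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public abstract class Customer { }
public class Doctor : Customer { public string GetSpecialty() => "cardiology"; }
public class Lawyer : Customer { public string GetPracticeArea() => "civil"; } // hypothetical method

// Assumed shape: all we know is that a processor exposes Process().
public abstract class CustomerJobProcessor
{
    public abstract void Process();
}

public class DoctorJobProcessor : CustomerJobProcessor
{
    private readonly Doctor _doctor;
    public DoctorJobProcessor(Doctor doctor) { _doctor = doctor; }
    public override void Process() => Console.WriteLine("Doctor: " + _doctor.GetSpecialty());
}

public class LawyerJobProcessor : CustomerJobProcessor
{
    private readonly Lawyer _lawyer;
    public LawyerJobProcessor(Lawyer lawyer) { _lawyer = lawyer; }
    public override void Process() => Console.WriteLine("Lawyer: " + _lawyer.GetPracticeArea());
}

public interface ICustomerJobProcessorFactory
{
    CustomerJobProcessor CreateProcessor<T>(T customer) where T : Customer;
}

// Hypothetical registry-based factory: new customer types are registered, not switched on.
public class RegistryCustomerJobProcessorFactory : ICustomerJobProcessorFactory
{
    private readonly Dictionary<Type, Func<Customer, CustomerJobProcessor>> _builders =
        new Dictionary<Type, Func<Customer, CustomerJobProcessor>>();

    public void Register<T>(Func<T, CustomerJobProcessor> builder) where T : Customer
    {
        _builders[typeof(T)] = customer => builder((T)customer);
    }

    public CustomerJobProcessor CreateProcessor<T>(T customer) where T : Customer
    {
        // Look up by runtime type, because T may be inferred as the base Customer.
        if (_builders.TryGetValue(customer.GetType(), out var builder))
        {
            return builder(customer);
        }
        throw new InvalidOperationException("No processor registered for " + customer.GetType().Name);
    }
}

public static class Demo
{
    public static void Main()
    {
        var factory = new RegistryCustomerJobProcessorFactory();
        factory.Register<Doctor>(d => new DoctorJobProcessor(d));
        factory.Register<Lawyer>(l => new LawyerJobProcessor(l));

        Customer cust = new Doctor();
        factory.CreateProcessor(cust).Process();
    }
}
```

With this design, supporting a new customer type means adding a processor class and one Register call at startup; ProcessCustomer itself stays untouched. The trade-off is that a missing registration is still only detected at runtime.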

To make the code more robust, the Customer, Lawyer, Doctor, and Police classes must be restructured further. Customer and its children become:

public interface ICustomer 
{
    string GetJobDescription();
}

public class Doctor : ICustomer 
{
    public string GetJobDescription() { … }
}

public class Lawyer : ICustomer 
{
    public string GetJobDescription() { … }
}

public class Police : ICustomer 
{
    public string GetJobDescription() { … }
}

If there is a method that needs to be shared across all child classes, we can create an abstract Customer class. We can also declare methods that every child class is required to override. Let’s force the child classes to override the GetJobDescription() method.

public abstract class AbstractCustomer : ICustomer
{
    public abstract string GetJobDescription();
}

We then change the child classes to inherit from the abstract class.

public class Police : AbstractCustomer
{
    public override string GetJobDescription()
    {
        return _getRank();
    }
    ...
}

Using these simple inheritance techniques, we have addressed both the OCP and LSP issues. The solution now conforms to both principles. We have also tried to address the distributed monolith problem: adding a customer job type does not necessarily require a full regression of the other job types.
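
As a rough illustration of what the restructuring buys us, the sketch below has a caller that depends only on ICustomer. The constructor arguments, the private fields, the Teacher class, and the Describe helper are assumptions added for this example.

```csharp
using System;
using System.Collections.Generic;

public interface ICustomer
{
    string GetJobDescription();
}

public abstract class AbstractCustomer : ICustomer
{
    public abstract string GetJobDescription();
}

public class Police : AbstractCustomer
{
    private readonly string _rank; // assumed field, for illustration only
    public Police(string rank) { _rank = rank; }
    public override string GetJobDescription() => "Police (" + _rank + ")";
}

public class Doctor : AbstractCustomer
{
    private readonly string _specialty; // assumed field, for illustration only
    public Doctor(string specialty) { _specialty = specialty; }
    public override string GetJobDescription() => "Doctor (" + _specialty + ")";
}

// Hypothetical type added later; nothing below needs to change for it.
public class Teacher : AbstractCustomer
{
    public override string GetJobDescription() => "Teacher";
}

public static class Demo
{
    // The caller depends on ICustomer only: no type switch, no casts.
    public static List<string> Describe(IEnumerable<ICustomer> customers)
    {
        var descriptions = new List<string>();
        foreach (var customer in customers)
        {
            descriptions.Add(customer.GetJobDescription());
        }
        return descriptions;
    }

    public static void Main()
    {
        var customers = new List<ICustomer>
        {
            new Police("Sergeant"),
            new Doctor("Cardiology"),
            new Teacher()
        };
        foreach (var line in Describe(customers))
        {
            Console.WriteLine(line);
        }
    }
}
```

Every child can be substituted for ICustomer, so Describe works for Teacher without modification. Because GetJobDescription() is abstract, the compiler rejects any new child class that forgets to implement it.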

But hold on! Is the new solution actually better? Implementing a generic factory and restructuring the code carries an inherent cost in the form of extra layers. Whether we should take these further steps depends on whether we truly need modularity in our codebase. In a smaller codebase, the risk of change is not that high, and hence modularity is not that important. In a larger codebase, modularity plays a critical part in reducing the cost of change.

Wider Issues

A broken LSP, when found, is usually felt in conjunction with other problems. It is an indication of wider issues in software development.

Abuse of YAGNI

In my experience, one of the usual culprits is improper use of the ‘You Aren’t Gonna Need It’ (YAGNI) principle.

The YAGNI principle is sometimes interpreted as: don’t create extra layers. This, however, is a short-sighted reading. As the example above suggests, conforming to the principles requires extra layers. These layers, in turn, allow the application to be modular.

As developers, we must be able to balance short-term and long-term gains. Striking this balance requires both experience and skill. We need the experience to know when to start thinking about the long-term gain, as well as the skill necessary to make the changes themselves.

Too Late To Fix

When a broken LSP is identified, as the example suggests, other principles are usually broken as well. Refactoring the code becomes quite costly and, perhaps, a destructive exercise.

The symptom is usually only felt when the entanglement has become too much. When the application requires a change, several other classes need to be modified, making the change itself a risky and costly exercise.

Once a broken LSP has been identified, if it is not tackled promptly, it may never be fixed at all. The cost of fixing it becomes substantial and can no longer be covered by a simple fix. A rewrite of the entire codebase may become cheaper.

Final Words

A broken LSP is a symptom of wider issues. It indicates that the application may already be a distributed monolith. A simple change in the application requires regression testing and deployment of several other parts of the application. The change becomes a risky exercise.

Understanding the LSP is the first step. Being able to follow it is another. The ability to adhere to the principle, to identify a violation, and to fix it is critical in creating a modular, robust application that is maintainable in the long term.
