I recently had a revelation about monolithic codebases that I want to share with you. A monolith is not a bad design pattern; rather, I’d argue microservices are just a collection of monoliths. That’s not the revelation I wanted to share, though. The revelation is that if your monolith isn’t working for you, it’s likely because you’ve failed to see that it is hosting multiple domains under one roof, which leads to chaos.
Hmmmm, go on…
It is very common to have a large codebase that many developers work in. A typical n-tier app looks like this in your solution:
Web Layer (MVC/WebAPI)
Core (business logic)
Data (holds all the entities and repositories)
So far nothing seems amiss: we’ve separated three concerns and we pat ourselves on the back. We might even go a step further and have a few “web layers” that represent different deployables, plus some test projects, like so:
Customer-facing MVC app project
Admin-facing MVC app project
Core business logic project
Data project
Customer-facing MVC tests
Admin-facing MVC tests
Core tests
Data tests
Our DRY (don’t repeat yourself) principle is shining brightly and our OCD tendencies are calm. When it comes time to add a feature, we add the requisite code to each layer as needed. As time goes by, we decide on a folder naming scheme to keep things tidy. Even further down the timeline, we decide some areas of the business are fairly specialized and begin to organize teams around them.
What could possibly go wrong?
A subtle failed assumption
At this point in our monolithic evolution, everything has been going swimmingly. Teams who need data can implement a public data repo that any other team can use, because all of the entities and data repositories live in the data project. Other teams can simply depend on some other dev’s data repository from their own service in the Core project. In effect, the low-level repositories become common code that anyone can use, so teams don’t have to re-implement them.
I argue that these “good intentions” are exactly how your monolith turns against you.
Competing Concerns
Common code will work, but only if there is a single business domain in a code repository. What doesn’t work is having multiple domains in a single code repository if there are no hard boundaries.
When DB entities (e.g. an account entity) are shared across domains, there isn’t a clear owner of that entity; it becomes communal. The entity is not a good fit for any one team. It is a one-size-fits-none entity. Communal entities end up with many properties/columns to satisfy everyone, rather than only what is necessary for a single domain use case.
The same is true for the data repositories: who is in charge of adding and maintaining the fetch methods? Worse, entities in a data project are typically defined with { get; set; } auto-properties, which effectively lets anyone calling these repos make any mutation they want with no domain-logic accountability.
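To make that risk concrete, here is a minimal sketch of a communal entity. The names `Account`, `IsFrozen`, and `OtherTeamCode.ApplyCredit` are hypothetical and only for illustration.

```csharp
using System;

// Hypothetical communal entity living in the shared Data project.
public class Account
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
    public bool IsFrozen { get; set; }
}

// Hypothetical code written by some other team that happens to reference Data.
public static class OtherTeamCode
{
    public static void ApplyCredit(Account account, decimal amount)
    {
        // Nothing stops this write: no check that the account is frozen,
        // no check that the balance stays non-negative.
        account.Balance += amount;
    }
}
```

Calling `OtherTeamCode.ApplyCredit(frozenAccount, -500m)` compiles and runs without complaint, even if the domain that cares about accounts would never allow it.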
Fences make for good neighbors
You can make a monolith work for you but there needs to be proper separation and encapsulation. But how can we create that separation? More folders, more “rules” that must be honored?
The answer is to physically separate and encapsulate things. I may trigger your OCD with what comes next, but let’s first look at restructuring your monolith into a domain-monolith. The solution structure would differ from the one above and look like the following:
Customer-facing MVC app project
Admin-facing MVC app project
Customer-facing MVC tests
Admin-facing MVC tests
Domain one business logic
Domain one Data project
Domain one Data tests
Domain two business logic
Domain two Data project
Domain two Data tests
… and so on with the domains.
To explain a little deeper, each domain is its own project and each domain has its own implementation of its data layer. Why such madness? Any data repo/entity that a domain uses becomes private to the domain. The benefits are the following:
The domain has free rein over its own data layer.
The domain chooses whether to eager/lazy load data.
The domain chooses what methods should exist on the repos.
There is clear ownership of the entities and repositories.
A domain can choose its own ORM if desired (Entity Framework/Dapper/NHibernate/etc).
The data layer of this domain is private and is impossible to get at unless someone adds a dependency on your data project (which you will quickly reject).
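As a rough sketch of what those benefits can look like, two domains might each define their own view of the same accounts table. The `Billing` and `Support` domains, their namespaces, and every member below are hypothetical; in a real solution each namespace would be its own project, referenced only by that domain's business logic project. The `init` accessors assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

namespace Billing.Data
{
    // Billing's private view of an account row: only the columns billing needs.
    public sealed class AccountRecord
    {
        public int Id { get; init; }
        public decimal Balance { get; init; }
    }

    // Billing decides which fetch methods exist on its repository.
    public interface IAccountRepository
    {
        bool TryFindById(int id, out AccountRecord record);
    }
}

namespace Support.Data
{
    // Support reads the same table but cares about different columns.
    public sealed class AccountRecord
    {
        public int Id { get; init; }
        public string Email { get; init; } = "";
    }

    public interface IAccountRepository
    {
        IReadOnlyList<AccountRecord> FindByEmailDomain(string emailDomain);
    }
}
```

Neither entity has to grow a property just because the other domain needs it, and each repository interface stays as small as its single owner wants it to be.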
The downsides of this become:
You’re still sharing the lowest level of the DB (e.g. which columns exist on a table, the names of the columns/tables, what indexes there are). Additive changes won’t affect another domain (in most cases), but destructive breaking changes will (e.g. column renames, drops, etc). To offset this, you ideally want your own database per domain. If you go this route, you may as well move the domain out into a microservice.
Each domain has to implement a data layer.
By reorganizing away from common data, you put up an effective fence that has to be abided by. In fact, domains should prefer that their data model is hidden from other domains. A domain’s business logic then becomes the only way to mutate the state of that domain. This is what we want.
To fully encapsulate a domain’s state, the domain should never return the underlying entity to any service callers. That way, there is no way for any other domain to mutate state unless the owning domain agrees to it via its public interfaces.
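One way to keep entities from leaking is to hand callers a read-only snapshot instead. The following is a minimal sketch under stated assumptions: `BillingService`, `AccountSummary`, and the in-memory dictionary standing in for the domain's private data layer are all hypothetical, and the `record` syntax assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

namespace Billing
{
    // Read-only snapshot handed to callers; holding one gives no way to change domain state.
    public sealed record AccountSummary(int Id, decimal Balance);

    public sealed class BillingService
    {
        // Stand-in for the domain's private data layer (hypothetical in-memory store).
        private readonly Dictionary<int, AccountEntity> _accounts = new();

        public AccountSummary Open(int id, decimal openingBalance)
        {
            if (openingBalance < 0)
                throw new ArgumentOutOfRangeException(nameof(openingBalance));

            var entity = new AccountEntity { Id = id, Balance = openingBalance };
            _accounts.Add(id, entity);
            return ToSummary(entity);
        }

        public AccountSummary Withdraw(int id, decimal amount)
        {
            var entity = _accounts[id];
            if (amount <= 0 || amount > entity.Balance)
                throw new InvalidOperationException("Withdrawal rejected.");

            entity.Balance -= amount;
            return ToSummary(entity);
        }

        private static AccountSummary ToSummary(AccountEntity e) => new(e.Id, e.Balance);

        // The mutable entity is a private nested type, so it cannot leave this class.
        private sealed class AccountEntity
        {
            public int Id { get; set; }
            public decimal Balance { get; set; }
        }
    }
}
```

A caller can keep the `AccountSummary` for as long as it likes, but the only way to change a balance is to go back through `Withdraw`, where the domain's rules live.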
Entropy is what the universe wants, but we don’t
If there are no hard blockers (code that won’t compile, private setters), then developers will naturally find a way around a list of “rules”. The Second Law of Thermodynamics basically says that, given enough time, disorder will come to all things unless something prevents it.
This means the “honor” system and code reviews aren’t good enough. What is needed is the correct use of access modifiers and dependency relationships. We shouldn’t trust user input, nor should we trust coders to do the right thing. This is why, when you download a NuGet package or use a third-party API, you don’t get the keys to the kingdom. You often find private setters and factory methods for creating things. A domain should be a reinforced strongbox exposing only limited functionality in a deliberate fashion.
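For illustration, private setters plus a factory method might look like the sketch below. `Subscription`, `Create`, and `Cancel` are hypothetical names, not taken from any real package.

```csharp
using System;

public sealed class Subscription
{
    // Readable by anyone, writable only from inside this class.
    public string Plan { get; private set; }
    public bool IsActive { get; private set; }

    // Private constructor: callers must go through the factory method.
    private Subscription(string plan)
    {
        Plan = plan;
        IsActive = true;
    }

    public static Subscription Create(string plan)
    {
        if (string.IsNullOrWhiteSpace(plan))
            throw new ArgumentException("A plan name is required.", nameof(plan));

        return new Subscription(plan);
    }

    public void Cancel()
    {
        if (!IsActive)
            throw new InvalidOperationException("Subscription is already cancelled.");

        IsActive = false;
    }
}
```

Outside the class, `subscription.IsActive = true;` and `new Subscription("basic")` are compile errors rather than review comments, which is the kind of blocker that does not depend on anyone remembering the rules.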
Truly common code
Think you have a use case for truly common code? Put that code in a NuGet package and then each domain can optionally depend on it. As a bonus, each domain can choose the version as it sees fit. Avoid the temptation to centralize things for the sake of being DRY. Be DRY, but only inside a single domain.
The setup
If your monolith is set up this way, you have positioned yourself nicely to split one or more domains into a microservice. The only thing you have to do is provide ways for data to come in and out of your newly hosted service.
The end
A monolith that hosts a single domain can be super DRY. However, a monolith that hosts multiple domains must put up fences or succumb to the spaghetti monster. Once spaghetti sets in, it becomes a sea of good intentions clashing with competing concerns. It’s never too late to build those fences, but catching it earlier rather than later will save you a good amount of headache medicine.
I hope your OCD isn’t offended by my asking you to duplicate the data layers for privacy/encapsulation purposes, but I find the tradeoff to be beneficial.
Happy coding!