The Repository Pattern And Why Not

The repository pattern is one of the most misunderstood patterns in software design and engineering. Although Martin Fowler was clear about what it is and what it's for, the developer community interpreted it as something simpler than what it really is.

Let's look at the definition of the repository:
"Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects."
There are four actors in this statement: the repository, the domain, the data mapping layers and the domain objects. Based on this definition, the repository sits between the domain and the data mapping layers. The domain objects may sit somewhere in between as well, perhaps as knowledge "shared" by the other three actors.

Now let's look at how Martin Fowler introduced the pattern.
"A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper [sic], that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated."
The data mapper is cited as an example of a data mapping layer, which is defined by Martin Fowler as follows:
"A layer of Mappers [sic] that moves data between objects and a database while keeping them independent of each other and the mapper itself."
Which adds two more actors: the database and the mapper, which Martin Fowler defined as:
"An object that sets up a communication between two independent objects."
That's six actors! If you think about it, you need a database to begin with. For that database you create mappers, which make up your data mapping layer. On top of that layer you create another abstraction, the repository, to mediate between the domain and the data mapping layer. And we haven't even dug further into the definitions, which also mention queries, specifications, criteria and what have you.
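To make the layering concrete, here is a minimal sketch of those actors in Python, using the standard library's sqlite3 as the database. The names Person, PersonMapper and PersonRepository, the table layout and the named() query are all assumptions made up for illustration; they are not from Martin Fowler's text, and a real mapping layer does far more than this.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Person:
    """Domain object: knows nothing about SQL or tables."""
    id: int
    name: str


class PersonMapper:
    """Mapper (hypothetical): moves data between Person objects and the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS person (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def insert(self, person: Person) -> None:
        self._conn.execute(
            "INSERT INTO person (id, name) VALUES (?, ?)", (person.id, person.name)
        )

    def find(self, where: str = "1 = 1", params: tuple = ()) -> Iterator[Person]:
        # `where` is a trusted SQL fragment supplied only by the repository below;
        # user-provided values always travel through `params`.
        rows = self._conn.execute(
            f"SELECT id, name FROM person WHERE {where} ORDER BY id", params
        )
        for row_id, name in rows:
            yield Person(id=row_id, name=name)


class PersonRepository:
    """Repository (hypothetical): collection-like facade where query construction lives."""

    def __init__(self, mapper: PersonMapper) -> None:
        self._mapper = mapper

    def add(self, person: Person) -> None:
        self._mapper.insert(person)

    def get(self, person_id: int) -> Optional[Person]:
        return next(self._mapper.find("id = ?", (person_id,)), None)

    def named(self, name: str) -> List[Person]:
        return list(self._mapper.find("name = ?", (name,)))

    def __iter__(self) -> Iterator[Person]:
        return self._mapper.find()


# Domain code: talks only to the repository, as if it were a collection.
conn = sqlite3.connect(":memory:")
people = PersonRepository(PersonMapper(conn))
people.add(Person(1, "Ada"))
people.add(Person(2, "Grace"))
names = [p.name for p in people]
```

Even this toy version needs three classes and a database connection before the domain can ask for a single Person, which is the point of counting the actors.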

Do you know what that means? It's simple, silly: the repository pattern is complicated!!!

So let's breathe in and breathe out to break that down a little, shall we? Assuming you have a database, which is easy enough, you first need to create a data mapping layer that can hide the details of accessing the database. Rings a bell? Yes, of course it does! These are ORMs! So that means if you get an ORM, then you only need to create the repository, right? Wrong! You know why? Because all popular ORMs already implement the repository pattern, and they do it better than you ever could by implementing it together with the Unit of Work pattern, which Martin Fowler defined as follows:
"Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems."
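As a rough illustration of that definition, here is a minimal Unit of Work sketch in Python. Everything in it is an assumption for the sake of the example: InMemoryStore is a hypothetical stand-in for a database table, the register_* method names are made up, and the sketch deliberately leaves out the concurrency handling that the definition mentions. Real ORMs track changes automatically and do much more.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Person:
    id: int
    name: str


class InMemoryStore:
    """Hypothetical stand-in for a database table keyed by id."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}
        self.writes = 0  # counts write operations, for illustration only

    def upsert(self, person: Person) -> None:
        self.rows[person.id] = person.name
        self.writes += 1

    def delete(self, person: Person) -> None:
        self.rows.pop(person.id, None)
        self.writes += 1


class UnitOfWork:
    """Tracks objects touched by one business transaction and writes them out together."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store
        self._new: List[Person] = []
        self._dirty: List[Person] = []
        self._removed: List[Person] = []

    def register_new(self, person: Person) -> None:
        self._new.append(person)

    def register_dirty(self, person: Person) -> None:
        # An object that is already tracked will be written once at commit anyway.
        if person not in self._new and person not in self._dirty:
            self._dirty.append(person)

    def register_removed(self, person: Person) -> None:
        if person in self._new:
            self._new.remove(person)  # never stored, so there is nothing to delete
            return
        if person in self._dirty:
            self._dirty.remove(person)
        self._removed.append(person)

    def commit(self) -> None:
        # Nothing reaches the store until the whole transaction is committed.
        for person in self._new + self._dirty:
            self._store.upsert(person)
        for person in self._removed:
            self._store.delete(person)
        self._clear()

    def rollback(self) -> None:
        self._clear()

    def _clear(self) -> None:
        self._new, self._dirty, self._removed = [], [], []


store = InMemoryStore()
uow = UnitOfWork(store)
ada, grace = Person(1, "Ada"), Person(2, "Grace")
uow.register_new(ada)
uow.register_new(grace)
ada.name = "Ada Lovelace"
uow.register_dirty(ada)  # already registered as new, so it is not tracked twice
writes_before_commit = store.writes
uow.commit()
```

The store sees no writes until commit() is called, and the renamed object is written once rather than twice.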
That brings us to Microsoft's Entity Framework. EF implements the unit of work and repository patterns internally. That means, if you are using EF and you implement a "repository pattern" on top of it, you are already over-abstracting! However, if you read Microsoft's own publications, they actually have materials teaching developers how to implement the UoW and repository patterns on top of EF. A lot of developers did the same, defining abstractions over abstractions over abstractions of how the repository (and unit of work) patterns should be done with EF. What sorcery is this? Everyone is doing it! Are they really still the repository pattern per se? Regardless, should you or should you not?

The truth about business applications is that they do not just perform create, read, update, delete and search (CRUDS) operations. Yes, you can start with a lot of those. However, business applications can grow and mature over time, with a potentially unending list of evolving and varying purposes and reasons. The simple Person object of yesteryear can be the PersonWithSpecialPrivileges, PersonEmeritus and Royalty of tomorrow, all existing at the same time. The Create() method is now joined by CreateWithStyle(), CreateForFame() and CreateByAssociationAndInfluence(). Search() now sits alongside SearchByDignity(), SearchByShame() and SearchWithIndifference(). Now here's the kicker: all of these can be achieved without dramatic changes to the original data source/store. Next thing you know, you are creating objects for each specific app, specific UI, specific module and specific report, until you no longer know how many entities the business is really made of.

Is that really worth all the abstractions?

A sound recommendation when using ORMs is to skip abstracting the data layer and to jump right ahead to creating your business/service layers. Setting up the ORM is a big enough task by itself. Let the ORM guys do their thing and let the rest of the team focus immediately on what the business actually needs. If abstraction is really your thing, focus on abstracting the business/service layer to promote maintainability, productivity, quality and testability. Be careful, though, that your abstractions do not limit creativity and invention, two things that you should never take away from your team.
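Here is a minimal sketch of what that recommendation could look like, written in Python. OrmSession is a hypothetical stand-in for whatever session or context object your ORM gives you (it is not a real library API), and PersonService, register() and grant_emeritus() are made-up names for illustration. The abstraction that callers depend on is the business operation, not the data access.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Person:
    id: int
    name: str
    emeritus: bool = False


class OrmSession:
    """Hypothetical stand-in for an ORM session/context; not a real library API."""

    def __init__(self) -> None:
        self._people: Dict[int, Person] = {}
        self.commits = 0

    def add(self, person: Person) -> None:
        self._people[person.id] = person

    def get(self, person_id: int) -> Optional[Person]:
        return self._people.get(person_id)

    def commit(self) -> None:
        self.commits += 1


class PersonService(Protocol):
    """The abstraction callers depend on: business operations, not data access."""

    def register(self, person_id: int, name: str) -> Person: ...

    def grant_emeritus(self, person_id: int) -> Person: ...


class OrmPersonService:
    """Business rules written straight against the session, with no repository in between."""

    def __init__(self, session: OrmSession) -> None:
        self._session = session

    def register(self, person_id: int, name: str) -> Person:
        if not name.strip():
            raise ValueError("name is required")
        if self._session.get(person_id) is not None:
            raise ValueError("person is already registered")
        person = Person(person_id, name.strip())
        self._session.add(person)
        self._session.commit()
        return person

    def grant_emeritus(self, person_id: int) -> Person:
        person = self._session.get(person_id)
        if person is None:
            raise LookupError(f"no person with id {person_id}")
        person.emeritus = True
        self._session.commit()
        return person


session = OrmSession()
service: PersonService = OrmPersonService(session)
service.register(1, " Ada ")
promoted = service.grant_emeritus(1)
```

UI code or tests can swap in another PersonService implementation without anyone having written a repository layer first.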

The value of your efforts today should consider what's really in store for tomorrow. There are a lot of things in software design and engineering that the business doesn't really care about. Looking back, is the bulk of your effort serving the developers or is it serving the business? Think about it.
