As we continue to think aloud about MVC patterns... When you get rid of the Repository, you also get rid of the "sad tragedy" of repository architecture debate theater. If you want to see today's real lesson, go ahead and skip to the end.

CodeBetter -- DDD: The Generic Repository

Consider the following code:

Repository<Customer> repository = new Repository<Customer>();
foreach (Customer c in repository.FetchAllMatching(CustomerAgeQuery.ForAge(19))) { }

The intent of this code is to enumerate all of the customers in my repository that match the criteria of being 19 years old. This code is fairly good at expressing its intent in a readable way to someone who may have varying levels of experience dealing with the code. This code also is highly factored allowing for aggressive reuse.

Especially due to the aggressive reuse the above code is commonly seen in domains. Developers are trained that reuse is good and therefore tend towards designs where reuse is applied.

It bugs me that anyone could use "code reuse" as a positive when talking about repositories (but skip to the end to see what's really going on here). By definition, all this jive is repeated code -- or at least code that runs through a Rube Goldberg machine before it becomes SQL, which is worse.
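For anyone who hasn't had to build one, here is a minimal sketch of the kind of machinery that might sit behind the quoted snippet. It is not the CodeBetter implementation: the `Query<T>` wrapper, the `Customer` properties, and the constructor that takes a seed list are all assumptions made so the example runs on its own, and the repository is an in-memory stand-in rather than anything that talks to a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; only Age matters for this sketch.
public class Customer
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

// Hypothetical query object: it does nothing but wrap a predicate.
public class Query<T>
{
    public Func<T, bool> Predicate { get; }
    public Query(Func<T, bool> predicate) { Predicate = predicate; }
}

public static class CustomerAgeQuery
{
    public static Query<Customer> ForAge(int age) =>
        new Query<Customer>(c => c.Age == age);
}

// In-memory stand-in for a generic repository. A database-backed version
// would have to translate the query into SQL (or load every row) here.
public class Repository<T>
{
    private readonly List<T> _items;

    public Repository(IEnumerable<T> items) { _items = items.ToList(); }

    public IEnumerable<T> FetchAllMatching(Query<T> query) =>
        _items.Where(query.Predicate);
}
```

That is three types and a layer of indirection to express what, against an assumed `Customer` table with an `Age` column, would be a single `SELECT ... WHERE Age = 19`.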

Let me say again that I believe entities make some sense when you're looking to enforce business logic, but then I'm challenging you again to let me know why that isn't better handled -- ONCE! -- by your RDBMS. Your entities are your data objects, and your capital-r Reads don't give a flying flip about them other than joining them together to produce their views.

CodeBetter -- DDD Specification or Query Object

One of the nice benefits of a Specification is that one could write some code like the following:

IEnumerable<Customer> customers =
CustomerRepository.AllMatching(CustomerSpecifications.IsGoldCustomer);

Writing code like this has allowed the developer to reuse a specification from the domain within their repository as a method for querying. While this may seem to be a good thing at the outset this mentality introduces a host of problems.

Performance

The first and largest problem that one will run into when dealing with this type of API is that the Repository is necessarily a leaky abstraction. The GoldCustomerSpecification is a piece of code, it represents a predicate for whether a single customer is or is not a gold customer. In order to return a set of customers that represents all of the customers matching the GoldCustomerSpecification the repository will need to run the specification on every customer. ... On the read side of your domain (a different layer if you use cqs) you want clients to be able to pass query objects directly to your repositories. Keep in mind that these are not the repositories on the transactional side (read: domain) but are supporting the complex reporting behaviors needed. It is often times not possible to completely isolate every type of report you may like to run (but you should still try to do this where possible as the strong contract has benefits).
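The "run the specification on every customer" point can be made concrete with a small sketch. Everything here is assumed for illustration: the `OrdersPlaced` property, the threshold of 10 orders, and the `SpecificationDemo` class are not from the quoted article. The first member treats the specification as an opaque delegate and counts how often it is invoked; the second holds the same rule as an expression tree, which is the form a query provider would need in order to inspect the rule instead of calling it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity for this sketch.
public class Customer
{
    public string Name { get; set; } = "";
    public int OrdersPlaced { get; set; }
}

public static class SpecificationDemo
{
    // Assumed rule: a "gold" customer has placed at least 10 orders.
    public const int GoldThreshold = 10;

    // Opaque delegate: the only way to apply it is to call it once per
    // loaded customer. Returns how many times the predicate was invoked.
    public static int CountEvaluations(IEnumerable<Customer> allRows,
                                       out List<Customer> gold)
    {
        int calls = 0;
        Func<Customer, bool> isGold = c =>
        {
            calls++;
            return c.OrdersPlaced >= GoldThreshold;
        };
        gold = allRows.Where(isGold).ToList();
        return calls;
    }

    // The same rule as an expression tree: data that could be inspected
    // and translated, rather than compiled code that can only be run.
    public static readonly Expression<Func<Customer, bool>> IsGold =
        c => c.OrdersPlaced >= GoldThreshold;
}
```

With the delegate version, the number of predicate calls equals the number of rows pulled into memory, however few of them turn out to be gold. The expression version only avoids that if something downstream actually translates it, which is exactly where the abstraction starts to leak.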

CodeBetter -- CQRS and Event Sourcing
related: Martin Fowler -- Event Sourcing

If we were to say use a relational database, object database, or anything else that only keeps current state we would have a slight issue. The issue is that we have two different models that we cannot keep in sync with each other. Consider that we are publishing events to the read model/other integration points, we are also saving our current state with a tool like nhibernate. How can we rationalize that what nhibernate saved to the database is actually the same meaning as the events we published, what if they are not?

Ayende -- Repository is the new Singleton

The most commonly used definition of Repository is the one in Patterns of Enterprise Application Architecture:

A system with a complex domain model often benefits from a layer, such as the one provided by Data Mapper, that isolates domain objects from details of the database access code. In such systems it can be worthwhile to build another layer of abstraction over the mapping layer where query construction code is concentrated. This becomes more important when there are a large number of domain classes or heavy querying. In these cases particularly, adding this layer helps minimize duplicate query logic.

That's actually pretty interesting -- I mean, query repetition is very obviously the problem with what I'm proposing (a SQL query per controller action), and the quote words that problem fairly well. Of course my response is that there's nothing wrong with a defensive separation of logic. Think smartly self-contained microservice.
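To show what "a SQL query per controller action" might look like in practice, here is a rough sketch written against the provider-neutral `System.Data` interfaces. The class name, the `Customer` table, its `Name` and `Age` columns, and the `@age` parameter syntax (which varies by provider) are all assumptions; the caller is assumed to hand in a connection that is already open.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical per-action query: the SQL lives next to the one action
// that needs it, with no repository or query object in between.
public static class CustomersByAgeAction
{
    // Assumed schema: a Customer table with Name and Age columns.
    public const string Sql = "SELECT Name FROM Customer WHERE Age = @age";

    public static List<string> Run(IDbConnection openConnection, int age)
    {
        using (IDbCommand cmd = openConnection.CreateCommand())
        {
            cmd.CommandText = Sql;

            // Parameterized rather than concatenated, to avoid SQL injection.
            IDbDataParameter p = cmd.CreateParameter();
            p.ParameterName = "@age";
            p.Value = age;
            cmd.Parameters.Add(p);

            var names = new List<string>();
            using (IDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    names.Add(reader.GetString(0));
                }
            }
            return names;
        }
    }
}
```

The trade-off is the one the quote names: a second action that needs nearly the same rows gets its own nearly identical statement. The upside is that each statement can be read and tuned on its own, without touching anything else.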

PlanetGeek.ch -- What is that all about the repository anti pattern?

Complex queries should be placed into query objects according to his article. So do we really need a repository? This article tries to answer this question.

No. ;^D

SapiensWorks -- The Generic Repository Is An Anti-Pattern

A repository is a concept to abstract the access to the persistence, that is not to depend on data access implementation details. There is no formula and no rules. ... Other offender in regard to generic repositories is the fact that lots of developers just use it to wrap the DAO (Database Access Object) or an underlying ORM (like EF or Nhibernate). Doing so they add only a useless abstraction, pretty much just making the code more complex with no benefits. A DAO makes it easy to work with a database, an ORM makes it easy to access a database as an OOP virtual storage and to eventually abstract the access to a specific database.

Emphasis mine. Thanks for that line. Phew. Though I still dislike most ORM-based implementations, I think.

Moneyball for today

Also from the above link:

But the repository should abstract the whole persistence layer, hiding implementation details like database engine or what DAO or ORM the app is using but also providing a contract that makes sense from the application point of view. The repository serves the application needs, NOT the database needs.

Now we're getting somewhere, aren't we? THIS, not DRYness, is a repository's real advantage. And who the heck really swaps out the datastore of a mature app? Bueller? Then why abstract it?!?!!!1!

If you're not going to abstract the engine from the application, you don't use a repository. And if you want performance, you don't want to abstract the engine. Trust me.

That is, in brief, my bets are on SQL (though SQL is less important than your data persistence model -- and I'm leaving myself open to situationally microservice my way away from whatever persistence model I initially pick too), not the convoluted code overhead and repetition of Repositories.

If you're honest with yourself, you're very likely already betting on [your code persistence model]. If you're not factoring your persistence engine into your code, you're almost certainly going to see performance problems at scale. That is, if you're "hiding implementation details like database engine or what DAO or ORM the app is using", you've already eliminated too many possibilities for optimization and made your codebase more difficult to maintain. Lose lose, man, lose lose.
