
Effective Aggregate Design

Part I: Modeling a Single Aggregate
Vaughn Vernon: vvernon@shiftmethod.com

Clustering entities and value objects into an aggregate with a carefully crafted consistency boundary may at first seem like quick work, but among all [DDD] tactical guidance, this pattern is one of the least well understood.

To start off, it might help to consider some common questions. Is an aggregate just a way to cluster a graph of closely related objects under a common parent? If so, is there some practical limit to the number of objects that should be allowed to reside in the graph? Since one aggregate instance can reference other aggregate instances, can the associations be navigated deeply, modifying various objects along the way? And what is this concept of invariants and a consistency boundary all about? It is the answer to this last question that greatly influences the answers to the others.

There are various ways to model aggregates incorrectly. We could fall into the trap of designing for compositional convenience and make them too large. At the other end of the spectrum we could strip all aggregates bare, and as a result fail to protect true invariants. As we'll see, it's imperative that we avoid both extremes and instead pay attention to the business rules.

Designing a Scrum Management Application

The best way to explain aggregates is with an example. Our fictitious company is developing an application to support Scrum-based projects, ProjectOvation. It follows the traditional Scrum project management model, complete with product, product owner, team, backlog items, planned releases, and sprints. If you think of Scrum at its richest, that's where ProjectOvation is headed. This provides a familiar domain to most of us. The Scrum terminology forms the starting point of the ubiquitous language. It is a subscription-based application hosted using the software as a service (SaaS) model. Each subscribing organization is registered as a tenant, another term for our ubiquitous language.

The company has assembled a group of talented Scrum experts and Java developers.¹ However, their experience with DDD is somewhat limited. That means the team is going to make some mistakes with DDD as they climb a difficult learning curve. They will grow, and so can we. Their struggles may help us recognize and change similar unfavorable situations we've created in our own software.

¹ Although the examples use Java and Hibernate, all of this material is applicable to C# and NHibernate, for instance.

The concepts of this domain, along with its performance and scalability requirements, are more complex than any of them have previously faced. To address these issues, one of the DDD tactical tools that they will employ is aggregate.

How should the team choose the best object clusters? The aggregate pattern discusses composition and alludes to information hiding, which they understand how to achieve. It also discusses consistency boundaries and transactions, but they haven't been overly concerned with that. Their chosen persistence mechanism will help manage atomic commits of their data. However, that was a crucial misunderstanding of the pattern's guidance that caused them to regress. Here's what happened. The team considered the following statements in the ubiquitous language:

• Products have backlog items, releases, and sprints.
• New product backlog items are planned.
• New product releases are scheduled.
• New product sprints are scheduled.
• A planned backlog item may be scheduled for release.
• A scheduled backlog item may be committed to a sprint.

From these they envisioned a model, and made their first attempt at a design. Let's see how it went.

First Attempt: Large-Cluster Aggregate

The team put a lot of weight on the words “Products have” in the first statement. It sounded to some like composition, that objects needed to be interconnected like an object graph. Maintaining these object life cycles together was considered very important. So, the developers added the following consistency rules into the specification:

• If a backlog item is committed to a sprint, we must not allow it to be removed from the system.
• If a sprint has committed backlog items, we must not allow it to be removed from the system.
• If a release has scheduled backlog items, we must not allow it to be removed from the system.
• If a backlog item is scheduled for release, we must not allow it to be removed from the system.


As a result, Product was first modeled as a very large aggregate. The root object, Product, held all BacklogItem, all Release, and all Sprint instances associated with it. The interface design protected all parts from inadvertent client removal. This design is shown in the following code, and as a UML diagram in Figure 1:

    public class Product extends ConcurrencySafeEntity {
        private Set<BacklogItem> backlogItems;
        private String description;
        private String name;
        private ProductId productId;
        private Set<Release> releases;
        private Set<Sprint> sprints;
        private TenantId tenantId;
        ...
    }

Figure 1: Product modeled as a very large aggregate.

The big aggregate looked attractive, but it wasn't truly practical. Once the application was running in its intended multi-user environment it began to regularly experience transactional failures. Let's look more closely at a few client usage patterns and how they interact with our technical solution model. Our aggregate instances employ optimistic concurrency to protect persistent objects from simultaneous overlapping modifications by different clients, thus avoiding the use of database locks. Objects carry a version number that is incremented when changes are made and checked before they are saved to the database. If the version on the persisted object is greater than the version on the client's copy, the client's is considered stale and updates are rejected.

Consider a common simultaneous, multi-client usage scenario:

• Two users, Bill and Joe, view the same Product marked as version 1, and begin to work on it.
• Bill plans a new BacklogItem and commits. The Product version is incremented to 2.
• Joe schedules a new Release and tries to save, but his commit fails because it was based on Product version 1.

Persistence mechanisms are used in this general way to deal with concurrency.² If you argue that the default concurrency configurations can be changed, reserve your verdict for a while longer. This approach is actually important to protecting aggregate invariants from concurrent changes.

² For example, Hibernate provides optimistic concurrency in this way. The same could be true of a key-value store because the entire aggregate is often serialized as one value, unless designed to save composed parts separately.

These consistency problems came up with just two users. Add more users, and this becomes a really big problem. With Scrum, multiple users often make these kinds of overlapping modifications during the sprint planning meeting and in sprint execution. Failing all but one of their requests on an ongoing basis is completely unacceptable.

Nothing about planning a new backlog item should logically interfere with scheduling a new release! Why did Joe's commit fail? At the heart of the issue, the large cluster aggregate was designed with false invariants in mind, not real business rules. These false invariants are artificial constraints imposed by developers. There are other ways for the team to prevent inappropriate removal without being arbitrarily restrictive. Besides causing transactional issues, the design also has performance and scalability drawbacks.

Second Attempt: Multiple Aggregates

Now consider an alternative model as shown in Figure 2, in which we have four distinct aggregates. Each of the dependencies is associated by inference using a common ProductId, which is the identity of Product considered the parent of the other three.

Figure 2: Product and related concepts are modeled as separate aggregate types.

Breaking the large aggregate into four will change some method contracts on Product. With the large cluster aggregate design the method signatures looked like this:

    public class Product ... {
        ...
        public void planBacklogItem(
            String aSummary, String aCategory,
            BacklogItemType aType, StoryPoints aStoryPoints) {
            ...
        }
        ...

        public void scheduleRelease(
            String aName, String aDescription,
            Date aBegins, Date anEnds) {
            ...
        }
        public void scheduleSprint(
            String aName, String aGoals,
            Date aBegins, Date anEnds) {
            ...
        }
        ...
    }

All of these methods are [CQS] commands. That is, they modify the state of the Product by adding the new element to a collection, so they have a void return type. But with the multiple aggregate design, we have:

    public class Product ... {
        ...
        public BacklogItem planBacklogItem(
            String aSummary, String aCategory,
            BacklogItemType aType, StoryPoints aStoryPoints) {
            ...
        }
        public Release scheduleRelease(
            String aName, String aDescription,
            Date aBegins, Date anEnds) {
            ...
        }
        public Sprint scheduleSprint(
            String aName, String aGoals,
            Date aBegins, Date anEnds) {
            ...
        }
        ...
    }

These redesigned methods have a [CQS] query contract, and act as factories. That is, they each respectively create a new aggregate instance and return a reference to it. Now when a client wants to plan a backlog item, the transactional application service must do the following:

    public class ProductBacklogItemService ... {
        ...
        @Transactional
        public void planProductBacklogItem(
            String aTenantId, String aProductId,
            String aSummary, String aCategory,
            String aBacklogItemType, String aStoryPoints) {

            Product product =
                productRepository.productOfId(
                    new TenantId(aTenantId),
                    new ProductId(aProductId));

            BacklogItem plannedBacklogItem =
                product.planBacklogItem(
                    aSummary,
                    aCategory,
                    BacklogItemType.valueOf(aBacklogItemType),
                    StoryPoints.valueOf(aStoryPoints));

            backlogItemRepository.add(plannedBacklogItem);
        }
        ...
    }

So we've solved the transaction failure issue by modeling it away. Any number of BacklogItem, Release, and Sprint instances can now be safely created by simultaneous user requests. That's pretty simple.

However, even with clear transactional advantages, the four smaller aggregates are less convenient from the perspective of client consumption. Perhaps instead we could tune the large aggregate to eliminate the concurrency issues. By setting our Hibernate mapping optimistic-lock option to false, the transaction failure domino effect goes away. There is no invariant on the total number of created BacklogItem, Release, or Sprint instances, so why not just allow the collections to grow unbounded and ignore these specific modifications on Product? What additional cost would there be in keeping the large cluster aggregate? The problem is that it could actually grow out of control. Before thoroughly examining why, let's consider the most important modeling tip the team needed.

Rule: Model True Invariants In Consistency Boundaries

When trying to discover the aggregates in a bounded context, we must understand the model's true invariants. Only with that knowledge can we determine which objects should be clustered into a given aggregate.

An invariant is a business rule that must always be consistent. There are different kinds of consistency. One is transactional, which is considered immediate and atomic. There is also eventual consistency. When discussing invariants, we are referring to transactional consistency. We might have the invariant:

    c = a + b

Therefore, when a is 2 and b is 3, c must be 5. According to that rule and conditions, if c is anything but 5, a system invariant is violated. To ensure that c is consistent, we model a boundary around these specific attributes of the model:

    AggregateType1 {
        int a; int b; int c;
        operations...
    }

The consistency boundary logically asserts that everything inside adheres to a specific set of business invariant rules no matter what operations are performed. The consistency of everything outside this boundary is irrelevant to the aggregate. Thus, aggregate is synonymous with transactional consistency boundary. (In this limited example, AggregateType1 has three attributes of type int, but any given aggregate could hold attributes of various types.)

When employing a typical persistence mechanism we use a
single transaction³ to manage consistency. When the transaction commits, everything inside one boundary must be consistent. A properly designed aggregate is one that can be modified in any way required by the business with its invariants completely consistent within a single transaction. And a properly designed bounded context modifies only one aggregate instance per transaction in all cases. What is more, we cannot correctly reason on aggregate design without applying transactional analysis.

³ The transaction may be handled by a unit of work.

Limiting the modification of one aggregate instance per transaction may sound overly strict. However, it is a rule of thumb and should be the goal in most cases. It addresses the very reason to use aggregates.

Since aggregates must be designed with a consistency focus, it implies that the user interface should concentrate each request to execute a single command on just one aggregate instance. If user requests try to accomplish too much, it will force the application to modify multiple instances at once.

Therefore, aggregates are chiefly about consistency boundaries and not driven by a desire to design object graphs. Some real-world invariants will be more complex than this. Even so, typically invariants will be less demanding on our modeling efforts, making it possible to design small aggregates.

Rule: Design Small Aggregates

We can now thoroughly address the question: What additional cost would there be in keeping the large cluster aggregate? Even if we guarantee that every transaction would succeed, we still limit performance and scalability. As our company develops its market, it's going to bring in lots of tenants. As each tenant makes a deep commitment to ProjectOvation, they'll host more and more projects and the management artifacts to go along with them. That will result in vast numbers of products, backlog items, releases, sprints, and others. Performance and scalability are non-functional requirements that cannot be ignored.

Keeping performance and scalability in mind, what happens when one user of one tenant wants to add a single backlog item to a product, one that is years old and already has thousands of backlog items? Assume a persistence mechanism capable of lazy loading (Hibernate). We almost never load all backlog items, releases, and sprints all at once. Still, thousands of backlog items would be loaded into memory just to add one new element to the already large collection. It's worse if a persistence mechanism does not support lazy loading. Even being memory conscious, sometimes we would have to load multiple collections, such as when scheduling a backlog item for release or committing one to a sprint; all backlog items, and either all releases or all sprints, would be loaded.

To see this clearly, look at the diagram in Figure 3 containing the zoomed composition. Don't let the 0..* fool you; the number of associations will almost never be zero and will keep growing over time. We would likely need to load thousands and thousands of objects into memory all at once, just to carry out what should be a relatively basic operation. That's just for a single team member of a single tenant on a single product. We have to keep in mind that this could happen all at once with hundreds and thousands of tenants, each with multiple teams and many products. And over time the situation will only become worse.

Figure 3: With this Product model, multiple large collections load during many basic operations.

This large cluster aggregate will never perform or scale well. It is more likely to become a nightmare leading only to failure. It was deficient from the start because the false invariants and a desire for compositional convenience drove the design, to the detriment of transactional success, performance, and scalability.

If we are going to design small aggregates, what does “small” mean? The extreme would be an aggregate with only its globally unique identity and one additional attribute, which is not what's being recommended (unless that is truly what one specific aggregate requires). Rather, limit the aggregate to just the root entity and a minimal number of attributes and/or value-typed properties.⁴ The correct minimum is the ones necessary, and no more.

⁴ A value-typed property is an attribute that holds a reference to a value object. I distinguish this from a simple attribute such as a String or numeric type, as does Ward Cunningham when describing Whole Value; see http://fit.c2.com/wiki.cgi?WholeValue

Which ones are necessary? The simple answer is: those that must be consistent with others, even if domain experts don't specify them as rules. For example, Product has name
and description attributes. We can't imagine name and description being inconsistent, modeled in separate aggregates. When you change the name you probably also change the description. If you change one and not the other, it's probably because you are fixing a spelling error or making the description more fitting to the name. Even though domain experts will probably not think of this as an explicit business rule, it is an implicit one.

What if you think you should model a contained part as an entity? First ask whether that part must itself change over time, or whether it can be completely replaced when change is necessary. If instances can be completely replaced, it points to the use of a value object rather than an entity. At times entity parts are necessary. Yet, if we run through this exercise on a case-by-case basis, many concepts modeled as entities can be refactored to value objects. Favoring value types as aggregate parts doesn't mean the aggregate is immutable since the root entity itself mutates when one of its value-typed properties is replaced.

There are important advantages to limiting internal parts to values. Depending on your persistence mechanism, values can be serialized with the root entity, whereas entities usually require separately tracked storage. Overhead is higher with entity parts, as, for example, when SQL joins are necessary to read them using Hibernate. Reading a single database table row is much faster. Value objects are smaller and safer to use (fewer bugs). Due to immutability it is easier for unit tests to prove their correctness.

On one project for the financial derivatives sector using [Qi4j], Niclas [Hedhman] reported that his team was able to design approximately 70% of all aggregates with just a root entity containing some value-typed properties. The remaining 30% had just two to three total entities. This doesn't indicate that all domain models will have a 70/30 split. It does indicate that a high percentage of aggregates can be limited to a single entity, the root.

The [DDD] discussion of aggregates gives an example where multiple entities make sense. A purchase order is assigned a maximum allowable total, and the sum of all line items must not surpass the total. The rule becomes tricky to enforce when multiple users simultaneously add line items. Any one addition is not permitted to exceed the limit, but concurrent additions by multiple users could collectively do so. I won't repeat the solution here, but I want to emphasize that most of the time the invariants of business models are simpler to manage than that example. Recognizing this helps us to model aggregates with as few properties as possible.

Smaller aggregates not only perform and scale better, they are also biased toward transactional success, meaning that conflicts preventing a commit are rare. This makes a system more usable. Your domain will not often have true invariant constraints that force you into large composition design situations. Therefore, it is just plain smart to limit aggregate size. When you occasionally encounter a true consistency rule, then add another few entities, or possibly a collection, as necessary, but continue to push yourself to keep the overall size as small as possible.

Don't Trust Every Use Case

Business analysts play an important role in delivering use case specifications. Since much work goes into a large and detailed specification, it will affect many of our design decisions. Yet, we mustn't forget that use cases derived in this way do not carry the perspective of the domain experts and developers of our close-knit modeling team. We still must reconcile each use case with our current model and design, including our decisions about aggregates. A common issue that arises is a particular use case that calls for the modification of multiple aggregate instances. In such a case we must determine whether the specified large user goal is spread across multiple persistence transactions, or if it occurs within just one. If it is the latter, it pays to be skeptical. No matter how well it is written, such a use case may not accurately reflect the true aggregates of our model.

Assuming your aggregate boundaries are aligned with real business constraints, then it's going to cause problems if business analysts specify what you see in Figure 4. Thinking through the various commit order permutations, you'll see that there are cases where two of the three requests will fail.⁵ What does attempting this indicate about your design? The answer to that question may lead to a deeper understanding of the domain. Trying to keep multiple aggregate instances consistent may be telling you that your team has missed an invariant. You may end up folding the multiple aggregates into one new concept with a new name in order to address the newly recognized business rule. (And, of course, it might be only parts of the old aggregates that get rolled into the new one.)

Figure 4: Concurrency contention exists between three users all trying to access the same two aggregate instances, leading to a high number of transactional failures.

⁵ This doesn't address the fact that some use cases describe modifications to multiple aggregates that span transactions, which would be fine. A user goal should not be viewed as synonymous with transaction. We are only concerned with use cases that actually indicate the modification of multiple aggregate instances in one transaction.
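
To make the contention in Figure 4 concrete, here is a minimal sketch of the version check described earlier: a version number that is incremented on change and compared before saving. It is an illustration under stated assumptions, not code from ProjectOvation. VersionedStore, ContentionScenario, and the aggregate identifiers are hypothetical names, and an in-memory map stands in for the persistence mechanism. All three users read both aggregates before anyone commits, which is only one of the possible orderings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a persistence mechanism with optimistic concurrency.
final class VersionedStore {
    private final Map<String, Integer> versions = new HashMap<>();

    // Every newly registered aggregate starts at version 1.
    void register(String aggregateId) {
        versions.put(aggregateId, 1);
    }

    int versionOf(String aggregateId) {
        return versions.get(aggregateId);
    }

    // One transaction that modifies every aggregate named in readVersions.
    // It is rejected as a whole if any stored version is newer than the version the client read.
    boolean commit(Map<String, Integer> readVersions) {
        for (Map.Entry<String, Integer> read : readVersions.entrySet()) {
            if (versions.get(read.getKey()) > read.getValue()) {
                return false; // the client's copy is stale
            }
        }
        for (String aggregateId : readVersions.keySet()) {
            versions.merge(aggregateId, 1, Integer::sum); // increment the version on change
        }
        return true;
    }
}

final class ContentionScenario {
    // Three users read the same two aggregates, then each tries to commit changes to both.
    static List<Boolean> threeUsersTwoAggregates() {
        VersionedStore store = new VersionedStore();
        store.register("backlogItem-1"); // hypothetical identifiers
        store.register("sprint-1");

        // All three users take their snapshots before any commit happens.
        List<Map<String, Integer>> snapshots = new ArrayList<>();
        for (int user = 0; user < 3; user++) {
            Map<String, Integer> snapshot = new HashMap<>();
            snapshot.put("backlogItem-1", store.versionOf("backlogItem-1"));
            snapshot.put("sprint-1", store.versionOf("sprint-1"));
            snapshots.add(snapshot);
        }

        // The users then commit one after another.
        List<Boolean> outcomes = new ArrayList<>();
        for (Map<String, Integer> snapshot : snapshots) {
            outcomes.add(store.commit(snapshot));
        }
        return outcomes;
    }
}
```

Calling ContentionScenario.threeUsersTwoAggregates() yields one successful commit followed by two rejections: the first commit moves both aggregates to version 2, so the other two snapshots are stale. A use case that touched only one aggregate per request would narrow the contention to users working on that same instance.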

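The purchase order rule mentioned above (the sum of all line items must not surpass a maximum allowable total) is a true invariant, so one way to protect it is to keep the limit and the line items inside a single consistency boundary. The sketch below shows that idea only; it is not presented as the [DDD] solution, and PurchaseOrder, LineItem, and their members are hypothetical names chosen for illustration.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical value object: immutable, so it is replaced rather than modified.
final class LineItem {
    private final String description;
    private final BigDecimal amount;

    LineItem(String description, BigDecimal amount) {
        this.description = description;
        this.amount = amount;
    }

    String description() {
        return description;
    }

    BigDecimal amount() {
        return amount;
    }
}

// Hypothetical aggregate root. The limit and the line items live inside one
// consistency boundary, so the invariant is checked on every modification.
final class PurchaseOrder {
    private final BigDecimal maximumTotal;
    private final List<LineItem> lineItems = new ArrayList<>();

    PurchaseOrder(BigDecimal maximumTotal) {
        this.maximumTotal = maximumTotal;
    }

    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (LineItem item : lineItems) {
            sum = sum.add(item.amount());
        }
        return sum;
    }

    // Invariant: total() must never surpass maximumTotal.
    void addLineItem(LineItem item) {
        if (total().add(item.amount()).compareTo(maximumTotal) > 0) {
            throw new IllegalStateException(
                "line items would surpass the maximum allowable total");
        }
        lineItems.add(item);
    }
}
```

With a limit of 100, adding items of 60 and 40 succeeds and any further positive amount is rejected. If the whole PurchaseOrder is saved in one transaction under optimistic concurrency, two users adding line items at the same time cannot collectively exceed the limit, because the second commit is rejected as stale and must be retried against the updated total.
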
So a new use case may lead to insights that push us to remodel the aggregate, but be skeptical here, too. Forming one aggregate from multiple ones may drive out a completely new concept with a new name, yet if modeling this new concept leads you toward designing a large cluster aggregate, that can end up with all the problems of large aggregates. What different approach may help?

Just because you are given a use case that calls for maintaining consistency in a single transaction doesn't mean you should do that. Often, in such cases, the business goal can be achieved with eventual consistency between aggregates. The team should critically examine the use cases and challenge their assumptions, especially when following them as written would lead to unwieldy designs. The team may have to rewrite the use case (or at least re-imagine it if they face an uncooperative business analyst). The new use case would specify eventual consistency and the acceptable update delay. This is one of the issues taken up in Part II of this essay.

Coming in Part II

Part I has focused on the design of a number of small aggregates and their internals. There will be cases that require references and consistency between aggregates, especially when we keep aggregates small. Part II of this essay covers how aggregates reference other aggregates as well as eventual consistency.

Acknowledgments

Eric Evans and Paul Rayner did several detailed reviews of this essay. I also received feedback from Udi Dahan, Greg Young, Jimmy Nilsson, Niclas Hedhman, and Rickard Öberg.

References

[CQS] Martin Fowler explains Bertrand Meyer's Command-Query Separation: http://martinfowler.com/bliki/CommandQuerySeparation.html
[DDD] Eric Evans, Domain-Driven Design—Tackling Complexity in the Heart of Software, Addison-Wesley, 2003, ISBN 0-321-12521-5.
[Hedhman] Niclas Hedhman, http://www.jroller.com/niclas/
[Qi4j] Rickard Öberg, Niclas Hedhman, Qi4j framework, http://qi4j.org/

Biography

Vaughn Vernon is a veteran consultant, providing architecture, development, mentoring, and training services. This three-part essay is based on his upcoming book on implementing domain-driven design. His QCon San Francisco 2010 presentation on context mapping is available on the DDD Community site: http://dddcommunity.org/library/vernon_2010. Vaughn blogs here: http://vaughnvernon.co/, and you can reach him by email here: vvernon@shiftmethod.com

Copyright © 2011 Vaughn Vernon. All rights reserved. Effective Aggregate Design is licensed under the Creative Commons Attribution-NoDerivs 3.0 Unported License: http://creativecommons.org/licenses/by-nd/3.0/