Which Java Objects exist? When do they exist? No more CRUD!

Which Java objects exist? When do they exist? Is it enough that objects are instantiated and have memory allocated? What about lazy initialization? How can temporary objects be non-existing when they exist in my code and program? What about soft-deleted entity objects? How about objects that are waiting for the GC? There are two aspects to consider. First, from a technical point of view, objects exist after they have been created, i.e. they have been allocated memory and instantiated. Second, objects at a higher level of abstraction might not exist even after they have been created: this is the realm of domain objects (DOs). For example, if an object represents a real-world object and a certain state of it, we should define its existence. Only when the object has been created and its state has been loaded is the object considered to be “real” (which means that it exists too). This post discusses the existence of higher-level objects.

Domain objects should consist of simple JPA entities – not generally speaking of course, but assuming JPA is used. The idea is a bit like using DAOs, but different. The entities should be anemic, and separate domain objects would be the base of a proper domain model! For a long time I used the repository pattern with CRUD methods, but then I started to wonder what the point is of source code lines like “myCar.update()”. I moved to thinking that at least the CRUD operations should be hidden, i.e. protected. Why? IMO, any operation with which an object changes its own state should be encapsulated – remember objects? But I still didn’t see the point of the repository pattern anymore: shouldn’t all updates to any object be stored automatically to persistence storage, e.g. at the end of a service method call? The answer to this problem came only after a lot of thinking.

Then it hit me. If I use domain objects (and a proper domain model) together with anemic data entities (a proper data model), I could replace the repository pattern with my Renovator Pattern. But first it needs a bit of tweaking. Image 1 shows the improved version.

When objects exist?
Image 1: The improved Renovator Pattern

I have introduced Actions (PBRD): Plan, Build, Renovate, and Destroy. Additionally, I introduce the corresponding states (EPBRD): Empty, Planned, Built, Renovated, and Destroyed. Note that existing, i.e. real, objects are in the states “Built” and “Renovated”: built objects have never been stored to persistence storage. The actions are similar to the operations in a repository (CRUD), i.e. Create, Read, Update and Delete, and the corresponding entity states are described in the JPA Entity Life Cycle model. But there is a crucial difference: real objects are mapped to domain objects and domain objects are real objects, and real objects either exist or don’t. This makes life easier for programmers – please read on.

DEFINITION: only Objects which can be read after a system failure are considered to be Existing, otherwise they are Non-existing. When the system fails the existing objects must maintain their states! This implies that write operations are “more powerful” than read operations, because only writing can change state.
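The states and actions above can be sketched as a small state machine. This is only a minimal sketch: the names LifeCycleState, LifeCycleAction and LifeCycle are made up for illustration, and the transition table is my assumption, since the post shows the transitions only in Image 1.

```java
/** Life-cycle states from the post (EPBRD). */
enum LifeCycleState {
    EMPTY, PLANNED, BUILT, RENOVATED, DESTROYED;

    /** Per the post, only Built and Renovated objects are "real", i.e. existing. */
    boolean isExisting() {
        return this == BUILT || this == RENOVATED;
    }
}

/** Actions from the post (PBRD). */
enum LifeCycleAction {
    PLAN, BUILD, RENOVATE, DESTROY
}

/** Hypothetical helper; the allowed transitions below are an assumption. */
final class LifeCycle {
    private LifeCycle() {}

    static LifeCycleState apply(LifeCycleState from, LifeCycleAction action) {
        switch (action) {
            case PLAN:
                if (from == LifeCycleState.EMPTY) return LifeCycleState.PLANNED;
                break;
            case BUILD:
                if (from == LifeCycleState.PLANNED) return LifeCycleState.BUILT;
                break;
            case RENOVATE:
                // Assumed: only existing objects can be renovated.
                if (from.isExisting()) return LifeCycleState.RENOVATED;
                break;
            case DESTROY:
                // Assumed: only existing objects can be destroyed.
                if (from.isExisting()) return LifeCycleState.DESTROYED;
                break;
        }
        throw new IllegalStateException("Cannot " + action + " an object in state " + from);
    }
}
```

With this table, a planned object is non-existing until it is built, and trying to destroy an already destroyed object is rejected with an exception. Whether a second destroy should fail or be silently ignored is a design choice.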

Both domain objects and data entities must have a default object that always has the same identifier, for example Long id = 0L, because database IDs typically start from one. A default object is in the “Planned” state, which makes it non-existing, and therefore you cannot invoke any business logic methods on it. To summarize: I argue that domain objects should consist of data entities and that only existing domain objects should be used at runtime!

// Outcome is an existing object (Car.class)
CarEntityBuilder.build();
// Outcome is a non-existing object
Long registrationID = 2343L;
CarEntityBuilder().renovate(registrationID).destroy();

Life-cycle of a Car domain object.

// Non-existing (note: "new" operation is allowed only within DO)
Car c = CarEntityBuilder().setRegistrationNumber(CarRegistrationNumber.Default()).plan();
// Build it to make it existing by definition!
CarRegistrationNumber nb = RegistrationCenter.requestNewCarRegistrationNumber();
c = c.setRegistrationNumber(nb).build();
// Copies must be allowed
Car copy = CarEntityBuilder().renovate(c.getRegistrationNumber());
// Finally destroy it
c = c.destroy();
// Cannot destroy a car twice => "destroy if exists" cf. "create if not exists"
copy.destroy();
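
The life-cycle above can be tried out with a minimal, self-contained sketch. Everything in it is an assumption for illustration: CarStore is a hypothetical in-memory stand-in for the persistence storage, the registration number is a plain String, and the method signatures differ from the CarEntityBuilder calls above. Here “destroy if exists” is modelled as a no-op on the second call.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the persistence storage. */
final class CarStore {
    private final Map<Long, String> rows = new HashMap<>();
    private long nextId = 1L; // IDs start from one; 0L is reserved for the default object

    long insert(String registrationNumber) {
        long id = nextId++;
        rows.put(id, registrationNumber);
        return id;
    }

    boolean contains(long id) { return rows.containsKey(id); }

    void delete(long id) { rows.remove(id); }
}

/** Simplified Car domain object; "new" is only used inside the class itself. */
final class Car {
    static final long DEFAULT_ID = 0L;

    private final CarStore store;
    private final long id;

    private Car(CarStore store, long id) {
        this.store = store;
        this.id = id;
    }

    /** Plan: returns the non-existing default object. */
    static Car plan(CarStore store) { return new Car(store, DEFAULT_ID); }

    /** Build: writes to the store, which makes the car existing by definition. */
    Car build(String registrationNumber) {
        return new Car(store, store.insert(registrationNumber));
    }

    /** Renovate: another handle to the same stored car, or the default object. */
    static Car renovate(CarStore store, long id) {
        return store.contains(id) ? new Car(store, id) : plan(store);
    }

    boolean exists() { return id != DEFAULT_ID && store.contains(id); }

    /** Destroy if exists: destroying an already destroyed car does nothing. */
    Car destroy() {
        store.delete(id);
        return plan(store);
    }

    long id() { return id; }
}
```

Because existence is decided by the store and not by the Java reference, the copy becomes non-existing at the same moment the original is destroyed, even though both objects are still reachable in memory.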

But what about objects that do not need persistence but should be existing? Think of beans that are, for instance, application scoped or singletons: they can always be read and instantiated, provided they have no dependencies on other objects that require persistence. The implication is that such objects have no state, or that their immutable state is coded into the source code. The recommendation is NOT to put these objects into a database! For example, if you use a domain model, the domain objects can exist without the persistence layer as long as they are in accordance with the DEFINITION. However, note that they still have identifiers, because these higher-level objects have different semantics for them – and are stateless too.

Why Is Object-Relational Mapping (ORM) Wrong?

Domain objects are high-level business objects. Programmers often use the “domain object” concept at the wrong level: the wrong usage can often be described as giving a “domain object” the same name as a database table. There is nothing wrong with ORM or with using it, per se, but there is one caveat: ORM is about mapping relational tables into objects, which is a flawed concept in an object-oriented world (a related book: Succeeding with Object Databases: A Practical Look at Today’s Implementations with Java and XML by Chaudhri and Zicari).

“An object database (also object-oriented database management system) is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented. Object-relational databases are a hybrid of both approaches.” Object databases suit objects in OO programming better than ORM does, but there is a better alternative. JPA 2.1 even has a thing called “Entity Graph”, which leads us to “Graph Databases”: but do not confuse entity graphs with graph databases!

A real-world modern graph database could be, for example, OrientDB or Neo4j. Both of these are graph databases, and at least Neo4j is a better match for modeling and using objects than relational databases with ORM. For instance, take a look at JCypher, which brings Neo4j's query language Cypher to Java: Cypher is similar to SQL but more readable. Object-oriented data naturally forms graphs, so there is no need for an extra mapping layer.

For example, you can have a database table called “Insurance” that is mapped with ORM to an “Insurance” object. However, that object is wrongly named, because “Insurance” is a domain object and hence a grouping object: an “Insurance” object holds every piece of information that is needed to use that object in business methods. An “Insurance” persisted in a relational database is not really an insurance but rather an insurance model that will get richer/smarter when it is loaded into a program: it is a plain container for relational data.

ORM maps data between database tables and software objects. Is this mapping really necessary? Could it be that it causes pain to programmers and should be removed? I will start by asking a question: “What if there was no need for databases?” The question hints that RDBMSs are efficient but force programmers to use an anemic domain model, where data is represented as entities without business logic. In a rich domain model, business methods can be constructed in smart ways: all CRUD operations are hidden (by encapsulation) from other domain objects, relations are independent, separate domain objects, and domain objects expose their (business) behavior!

Next I am going to lay out some rules I have found to be useful in rich domain models. First, every domain object (DO) must have a business ID that is different from the row ID in relational databases. Second, every DO must have a default implementation, and therefore we also need a default ID. The default ID is zero (for empty objects), and every other object should have an ID that is neither zero nor null (note that NULL is never allowed!). Third, collections should be represented with independent objects (as separate classes): for example, use GroupMembership between many Groups and many GroupMembers (M-to-N relation) instead of a mapped Collection.
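
The three rules can be sketched roughly as follows. This is a minimal illustration and not a definitive design: the class Memberships, the accessor names and the choice of a primitive long for the business ID (which rules out null by construction) are my assumptions; only Group, GroupMember and GroupMembership are named in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Domain object with a business ID; a primitive long can never be null. */
final class Group {
    /** Default object: business ID zero, used wherever null would otherwise appear. */
    static final Group DEFAULT = new Group(0L, "");

    private final long businessId;
    private final String name;

    Group(long businessId, String name) {
        this.businessId = businessId;
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    long businessId() { return businessId; }
    String name() { return name; }
    boolean isDefault() { return businessId == 0L; }
}

final class GroupMember {
    private final long businessId;

    GroupMember(long businessId) { this.businessId = businessId; }

    long businessId() { return businessId; }
}

/** The M-to-N relation as an independent object instead of a mapped Collection. */
final class GroupMembership {
    private final Group group;
    private final GroupMember member;

    GroupMembership(Group group, GroupMember member) {
        this.group = Objects.requireNonNull(group, "group must not be null");
        this.member = Objects.requireNonNull(member, "member must not be null");
    }

    Group group() { return group; }
    GroupMember member() { return member; }
}

/** Hypothetical holder for all memberships; lookups never return null. */
final class Memberships {
    private final List<GroupMembership> all = new ArrayList<>();

    void add(GroupMembership membership) {
        all.add(Objects.requireNonNull(membership, "membership must not be null"));
    }

    /** Returns the first group of the member, or Group.DEFAULT if there is none. */
    Group firstGroupOf(GroupMember member) {
        for (GroupMembership m : all) {
            if (m.member().businessId() == member.businessId()) {
                return m.group();
            }
        }
        return Group.DEFAULT;
    }
}
```

Callers can then test isDefault() instead of checking for null, and neither Group nor GroupMember needs a collection field of the other type.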

There could be two types of domain objects: non-persisted DOs, which get their IDs from a sequence (that in turn should be persisted if needed), and persisted DOs, which get their IDs from the persistence provider. Growing IDs that are never reused (immutability) would make it possible to override “equals” and “hashCode” without any bigger problems – at the business level. Caching would also be relatively easy to implement. A concept of grouping must also be supported because of the association: see the Appendix below.
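
As a rough sketch of the first kind of domain object: the class IdSequence below is a hypothetical in-memory sequence (persisting it is left out), and the Insurance class with its holder field is invented for illustration. Because an ID is assigned once and never reused, equals and hashCode can rely on the ID alone.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical ID source for non-persisted domain objects. */
final class IdSequence {
    // Starts at zero so that the first issued ID is 1; 0L stays reserved for the default object.
    private final AtomicLong last = new AtomicLong(0L);

    /** Returns a strictly growing ID that is never handed out twice. */
    long next() {
        return last.incrementAndGet();
    }
}

/** Domain object whose identity is its immutable business ID. */
final class Insurance {
    private final long id;
    private String holder; // mutable state does not take part in equality

    Insurance(long id, String holder) {
        this.id = id;
        this.holder = holder;
    }

    long id() { return id; }

    void changeHolder(String newHolder) { this.holder = newHolder; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Insurance)) return false;
        return id == ((Insurance) other).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```

Since the hash code never changes during the lifetime of the object, an Insurance can safely be used as a key in a HashMap-based cache even while its other fields are being modified.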

What else?

  • Every domain object must have a unique numeric ID (primary key) but also other candidate keys, because sometimes the numeric primary key is not what the rich domain model requires!
  • As said, “associations” should be mapped as separate objects, e.g. Club, ClubMember, and ClubMembership (Club, ClubMember). Associations include unidirectional and bidirectional ones with one-to-one, one-to-many, etc. cardinalities.
  • APIs take in and put out only data in String format! ALWAYS! This is what should be different in JSONs: {issue: “this should be like {‘issue’: ‘no complaints’}”}…

I think that rich domain models allow a more efficient and natural expression of business logic in source code, but it is likely a trade-off: maintainability and readability vs. performance. So it is not a magical solution for every problem the software industry suffers from, but I think that it would be better suited in many cases – at least better than traditional solutions that use ORM.

APPENDIX. Relationship between collection IDs and REST APIs:

Collections.

  • /api/v1/groups => (return metadata for this category)
  • /api/v1/groups/default => (a reference object, replaces NULL)
  • /api/v1/groups/all => (i.e. every collection)
  • /api/v1/groups/ => (i.e. return filtered collections)
  • /api/v1/groups/none => (i.e. no collections)
  • /api/v1/groups/group => (return metadata for the group objects)

Objects in a collection.

  • Zero ID (0L) is the default object => /api/v1/groups/group/default
  • One ID (1L) is a real group => /api/v1/groups/group/id=1
  • Two+ ID (2L+) is a real group => /api/v1/groups/group/id=2
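
To make the path scheme above concrete, here is a small sketch that classifies a request path. It is not tied to any web framework; the names GroupRoute and GroupRoutes are hypothetical, and treating “id=0” as the default object is my assumption based on the zero ID rule.

```java
/** Hypothetical route kinds, one per path in the appendix. */
enum GroupRoute {
    CATEGORY_METADATA, DEFAULT_COLLECTION, ALL_COLLECTIONS, FILTERED_COLLECTIONS,
    NO_COLLECTIONS, GROUP_METADATA, DEFAULT_GROUP, GROUP_BY_ID, UNKNOWN
}

final class GroupRoutes {
    static final String BASE = "/api/v1/groups";
    private static final String ID_PREFIX = "/group/id=";

    private GroupRoutes() {}

    /** Maps a non-null path to its route kind; unrecognized paths give UNKNOWN. */
    static GroupRoute classify(String path) {
        if (!path.startsWith(BASE)) return GroupRoute.UNKNOWN;
        String rest = path.substring(BASE.length());
        switch (rest) {
            case "":               return GroupRoute.CATEGORY_METADATA;
            case "/":              return GroupRoute.FILTERED_COLLECTIONS;
            case "/default":       return GroupRoute.DEFAULT_COLLECTION;
            case "/all":           return GroupRoute.ALL_COLLECTIONS;
            case "/none":          return GroupRoute.NO_COLLECTIONS;
            case "/group":         return GroupRoute.GROUP_METADATA;
            case "/group/default": return GroupRoute.DEFAULT_GROUP;
            default:
                long id = parseId(rest);
                if (id == 0L) return GroupRoute.DEFAULT_GROUP; // assumed: zero ID is the default object
                return id > 0L ? GroupRoute.GROUP_BY_ID : GroupRoute.UNKNOWN;
        }
    }

    /** Returns N for "/group/id=N", or -1 if the rest of the path has another shape. */
    static long parseId(String rest) {
        if (!rest.startsWith(ID_PREFIX)) return -1L;
        try {
            return Long.parseLong(rest.substring(ID_PREFIX.length()));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }
}
```

A handler behind DEFAULT_GROUP would return the reference object instead of NULL, which keeps the “no NULL, ever” rule visible at the API level as well.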