Why Is Object-Relational Mapping (ORM) Wrong?

Domain objects are high-level business objects. Programmers often apply the “domain object” concept at the wrong level: typically, a class is called a “domain object” simply because it has the same name as a database table. There is nothing wrong with ORM or with using it, per se, but there is one caveat: ORM is about mapping relational tables into objects, which is a flawed concept in the object-oriented world (a related book: Succeeding with Object Databases: A Practical Look at Today’s Implementations with Java and XML by Chaudhri and Zicari).

“An object database (also object-oriented database management system) is a database management system in which information is represented in the form of objects as used in object-oriented programming. Object databases are different from relational databases which are table-oriented. Object-relational databases are a hybrid of both approaches.” Object databases suit object-oriented programming better than ORM does, but there is a better alternative. Even JPA (since version 2.1) has a feature called “Entity Graph”, which leads us to “Graph Databases”: but do not confuse entity graphs with graph databases!

A real-world modern graph database could be, for example, OrientDB or Neo4j. Both of these are graph databases, and at least Neo4j is a better match for modeling and using objects than a relational database with ORM. For instance, take a look at JCypher, a Java library for building queries in Neo4j's Cypher query language, which is similar to SQL but more readable. Object-oriented data naturally forms graphs, so there is no need for an extra mapping layer.

For example, you can have a database table called “Insurance” that is mapped with ORM to an “Insurance” object. However, that object is misnamed, because “Insurance” is a domain object and hence a grouping object: an “Insurance” object holds every piece of information that is needed to use it in business methods. An “Insurance” persisted in a relational database is really not an insurance but rather an insurance model that gets richer and smarter once it is loaded into a program: by itself it is a plain container for relational data.

ORM maps data between database tables and software objects. Is this mapping really necessary? Could it be that it causes pain to programmers and should be removed? I will start by asking a question: “What if there was no need for databases?” The question implies that RDBMSs are efficient but force programmers to use an anemic domain model, where data is represented as entities without business logic. In a rich domain model, business methods can be constructed in smart ways: all CRUD operations are hidden (by encapsulation) from other domain objects, relations are independent, separate domain objects, and domain objects expose their (business) behavior!
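
To make the contrast concrete, here is a minimal Java sketch. The class names InsuranceRow and Insurance, the validity-period fields, and the methods covers and extendedByMonths are all assumptions made up for illustration; persistence is deliberately left out.

```java
import java.time.LocalDate;

// Anemic style: a plain container for relational data, with no behavior.
final class InsuranceRow {
    long rowId;
    LocalDate validFrom;
    LocalDate validTo;
}

// Rich style (hypothetical): state is private and only business behavior is exposed.
final class Insurance {
    private final LocalDate validFrom;
    private final LocalDate validTo;

    Insurance(LocalDate validFrom, LocalDate validTo) {
        if (validTo.isBefore(validFrom)) {
            throw new IllegalArgumentException("validTo must not be before validFrom");
        }
        this.validFrom = validFrom;
        this.validTo = validTo;
    }

    // Business question: is the given day inside the validity period (inclusive)?
    boolean covers(LocalDate day) {
        return !day.isBefore(validFrom) && !day.isAfter(validTo);
    }

    // Business operation: returns a new Insurance instead of letting callers edit fields.
    Insurance extendedByMonths(int months) {
        return new Insurance(validFrom, validTo.plusMonths(months));
    }
}
```

Callers of Insurance never touch the dates directly; they can only ask business questions, which is the encapsulation the paragraph above argues for.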

Next I am going to lay out some rules that I have found useful in rich domain models. First, every domain object (DO) must have a business ID that is different from the row ID in a relational database. Second, every DO must have a default implementation, so we also need a default ID. The default ID is zero (for empty objects); every other object should have an ID that is neither zero nor null (note that NULL is never allowed!). Third, collections should be represented by independent objects (as separate classes): for example, use GroupMembership between many Groups and many GroupMembers (an M-to-N relation) instead of a mapped Collection.
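
The three rules could be sketched in Java roughly as follows. Group, GroupMember and GroupMembership are the names used above; the fields, the DEFAULT constants and the accessor names are assumptions for illustration only.

```java
import java.util.Objects;

final class Group {
    // Rule 2: the default (empty) object carries business ID 0 and stands in for null.
    static final Group DEFAULT = new Group(0L, "");

    // Rule 1: a business ID, independent of any database row ID.
    // A primitive long cannot be null, which enforces the "no NULL" rule for IDs.
    private final long businessId;
    private final String name;

    Group(long businessId, String name) {
        this.businessId = businessId;
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    long businessId() { return businessId; }
    String name() { return name; }
    boolean isDefault() { return businessId == 0L; }
}

final class GroupMember {
    static final GroupMember DEFAULT = new GroupMember(0L, "");

    private final long businessId;
    private final String name;

    GroupMember(long businessId, String name) {
        this.businessId = businessId;
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    long businessId() { return businessId; }
    String name() { return name; }
}

// Rule 3: the M-to-N relation is a class of its own, not a mapped Collection
// hidden inside Group or GroupMember.
final class GroupMembership {
    private final Group group;
    private final GroupMember member;

    GroupMembership(Group group, GroupMember member) {
        this.group = Objects.requireNonNull(group, "use Group.DEFAULT instead of null");
        this.member = Objects.requireNonNull(member, "use GroupMember.DEFAULT instead of null");
    }

    Group group() { return group; }
    GroupMember member() { return member; }
}
```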

There could be two types of domain objects: non-persisted DOs, which get their IDs from a sequence (that in turn should be persisted if needed), and persisted DOs, which get their IDs from the persistence provider. Growing IDs that are never reused (immutability) would make it possible to override “equals” and “hashCode” without any major problems, at the business level. Caching would also be relatively easy to implement. A concept of grouping must also be supported because of the associations: see the Appendix below.
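
As a rough illustration, an in-memory sequence and ID-based equality might look like this in Java. IdSequence and Customer are hypothetical names, and the sequence here is not persisted; a real one would have to be if IDs must survive restarts.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hands out growing IDs that are never reused. 0 stays reserved for default objects.
final class IdSequence {
    private final AtomicLong last = new AtomicLong(0L);

    long next() {
        return last.incrementAndGet(); // first call returns 1
    }
}

// Hypothetical domain object whose identity is its immutable business ID.
final class Customer {
    private final long id;

    Customer(long id) {
        this.id = id;
    }

    long id() { return id; }

    // Because IDs are immutable and never reused, equality can rest on the ID alone.
    @Override
    public boolean equals(Object other) {
        return other instanceof Customer && ((Customer) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```

With equals and hashCode defined this way, a cache can be as simple as a HashMap keyed by the ID or a HashSet of the objects themselves.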

What else?

  • Every domain object must have a numeric unique ID (primary key) but also other candidate primary keys, because sometimes the first numeric primary key is not what the rich domain model requires!
  • As said, “associations” should be mapped as separate objects, e.g. Club, ClubMember, and ClubMembership (Club, ClubMember). Associations can be unidirectional or bidirectional, with one-to-one, one-to-many, etc. cardinalities.
  • APIs take in and put out data only in String format! ALWAYS! This is what should be different in JSON: {issue: “this should be like {‘issue’: ‘no complaints’}”}…
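
One possible reading of the first and last points, sketched in Java: a lookup that accepts either the numeric ID or a second candidate key, and that takes and returns only Strings. Club comes from the list above; ClubDirectory, shortName, describe and the output format are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class Club {
    static final Club DEFAULT = new Club(0L, "default"); // replaces null on failed lookups

    final long id;          // numeric unique ID
    final String shortName; // a second candidate key, assumed unique and non-numeric

    Club(long id, String shortName) {
        this.id = id;
        this.shortName = shortName;
    }
}

final class ClubDirectory {
    private final Map<Long, Club> byId = new HashMap<>();
    private final Map<String, Club> byShortName = new HashMap<>();

    void add(Club club) {
        if (byId.containsKey(club.id) || byShortName.containsKey(club.shortName)) {
            throw new IllegalArgumentException("both keys must be unique");
        }
        byId.put(club.id, club);
        byShortName.put(club.shortName, club);
    }

    // String in, String out: all-digit input is treated as the numeric ID,
    // anything else as the short name. Unknown keys yield the default club.
    String describe(String key) {
        Club club = key.matches("\\d+")
                ? byId.getOrDefault(Long.parseLong(key), Club.DEFAULT)
                : byShortName.getOrDefault(key, Club.DEFAULT);
        return "club " + club.id + " (" + club.shortName + ")";
    }
}
```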

I think that rich domain models allow a more efficient and natural expression of business logic in source code, but there is likely a trade-off: maintainability and readability vs. performance. So it is not a magical solution for every problem the software industry suffers from, but I think that it would be better suited in many cases, at least better than traditional solutions that use ORM.

APPENDIX. Relationship between collection IDs and REST APIs:

Collections.

  • /api/v1/groups => (return metadata for this category)
  • /api/v1/groups/default => (a reference object, replaces NULL)
  • /api/v1/groups/all => (i.e. every collection)
  • /api/v1/groups/ => (i.e. return filtered collections)
  • /api/v1/groups/none => (i.e. no collections)
  • /api/v1/groups/group => (return metadata for the group objects)

Objects in a collection.

  • Zero ID (0L) is the default object => /api/v1/groups/group/default
  • One ID (1L) is a real group => /api/v1/groups/group/id=1
  • Two+ ID (2L+) is a real group => /api/v1/groups/group/id=2
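
A small Java sketch of how the object paths above could be turned into group IDs. GroupPaths and groupId are hypothetical names, and rejecting non-positive values in "id=N" is an assumption based on zero being reserved for the default object.

```java
final class GroupPaths {
    private static final String PREFIX = "/api/v1/groups/group/";

    // Returns the group ID addressed by the path: 0 for "default", N for "id=N".
    static long groupId(String path) {
        if (!path.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a group object path: " + path);
        }
        String selector = path.substring(PREFIX.length());
        if (selector.equals("default")) {
            return 0L; // the default object, used instead of null
        }
        if (selector.startsWith("id=")) {
            // Long.parseLong throws NumberFormatException (an IllegalArgumentException)
            // when the text after "id=" is not a number.
            long id = Long.parseLong(selector.substring("id=".length()));
            if (id <= 0L) {
                throw new IllegalArgumentException("real groups have positive IDs: " + id);
            }
            return id;
        }
        throw new IllegalArgumentException("unknown group selector: " + selector);
    }
}
```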
