I have been using NHibernate for about a year now and have become very comfortable with many of its nuances to meet the requirements my current project's domain has presented. I have a long way to go! I am at a point, though, where I am starting to make some assumptions regarding what developers know or don't know about one of the most important decisions in their application - data access. Before I forget what I had to learn to become effective, I want to begin logging some of the things about using NHibernate, my OR/M of choice, as if I were writing to myself a year ago.
I work by myself and have learned, for better or worse, everything on my own through lots of head scratching and a slew of great blogosphere writers to whom I am indebted. Specifically, I want to write about the conceptual and impedance difficulties which I discovered coming from a mountain of Microsoft 'best practices' articles, application blocks, etc. I am not at all interested in slamming Microsoft...frankly I tire of really talented software developers only blogging about how evil Microsoft is while not contributing any meaningful snippets of usable information to the community in their posts. But there was a dramatic difference in design assumptions which I needed to uncover in order to work with an OR/M like NHibernate.
Before beginning, here are some things I want to write about (off the top of my head):
- Impedance mismatch between Domains-which-use-objects (not DataSets) and Relational concepts (i.e., normalization)
- What a 'data layer' is and what it is not - the differences between conceptual and physical layers
- Things to avoid when embracing an OR/M - pick a tool and then use it for its intended purpose
- What a 'Domain' is and how to relate to it from your ORM
- Getting data from here to there but without making your code look like it should have meatballs on it
DATA-CENTRIC versus OBJECT-CENTRIC
Tonight, I just want to note the first thing I had to overcome. Microsoft uses lots of terminology borrowed from OOP practitioners and even goes so far as to promote applying OOP in your applications, but their whole 3-tier architecture thing would always confuse me. On one hand, I'd read that I shouldn't have so many dependencies, and then I'd only find code that had DataSets getting filled up in code-behind web pages. "AAAARGH!" I'd say...you mean each time I have to make a change to my DataSet I have to change my ADO.NET code and then the controls which bind to it? On top of that I am supposed to have validation and something called a Eunuch At Work...eh....Unit of Work? The problem was, I didn't understand where Microsoft was coming from in solving business problems. This was a battle between Data-Centric 'domains' and Object-Centric domains and I was becoming a casualty.
Everything I was reading about the richness of OOP just FELT right to me, but I couldn't find a single full-blown sample application that actually used lots of these objects. I just saw some helper objects that performed actions on DataSets and lots of repetitive ADO.NET code with helpers for that. I'll share more about how I came to use an ORM later, but I finally had to realize that I simply wasn't going to find what I was looking for in Microsoft documentation or in most of its very visible writers-of-books. Objects, it seemed from those samples, just carried property values from the database to the web controls. That didn't feel right, but who am I?
Nowadays, I think of Microsoft as being data-centric in their approach to solving business problems. The majority of samples, whether online or in really expensive books, have data getting stuffed into a DataSet from home-grown ADO.NET code after writing sometimes elaborate Stored Procedures. In other words, the assumption is always, "Golly...you need to have that 'object' geterself loaded wit dat der data from de sql server, so jes sit down and write yerself out some ado.net gotcha code and bind the returned set into one of our controls." Ok...I am not interested in the DataSet debate, but the problem I kept having was that I'd read about Object-Oriented Principles being driven by a desire to encapsulate and reduce repetitive code, all the while seeing samples that repeated tons of code.
I work by myself and maintainability is more important to me than speed. Ironically, having finally come to understand the richness a robust domain model offers, I am faster than I ever was writing acres of ADO.NET and data-layer-coupled web forms. Change isn't scary to me anymore. Now, it's fun.
An example of the difference in these approaches follows.
I need to write a client relationship management solution (CRM) for my project. Something kind of light...not too heady. So where do I start?
DATA CENTRIC
In the 'old days' I'd leap onto my whiteboard and start drawing out my tables, identifying the various properties each 'object' will need along the way, being very careful to stay in at least third normal form all the way. Woohoo! I am a DB STUD! Awesome! Now, let's write the Data Access code to populate a...um...DataSet? Well, I was hoping that my Person and Entity classes could subclass from a Contact base class (that's the only OOP trick I know). What's the difference between a typed and untyped DataSet? Wait a minute, I read somewhere that DataSets are slow and then this other Famous Guy Who Writes Lots of Books says he doesn't ever use DataSets. Dagnabbit! Ok...let's just inherit from an untyped DataSet and we'll have the column names the same for both types of Contacts. Feels funny, but I gotta do something....
Soon, I would have written lots of code and stored procedures (since I had read that they are the only secure way of doing things...) and still wouldn't have gotten around to actually implementing a thing I heard of called 'Business Rules'. This, to me, was the Data-Centric approach to solving problems. Now...there are some who work this way and have had lots of success, and if it works for them, I am glad. I found my way of working to be more effective with the opposite approach though.
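To make that coupling concrete, here is a minimal sketch in Python (not ADO.NET; I picked it only because it is short). The contact table, its columns, and every function name are my own hypothetical stand-ins, and a list of plain dictionaries plays the part of the DataSet. Notice that the column names show up in the schema, in the query, and again in the 'page' that renders the rows.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical contact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contact ("
        "id INTEGER PRIMARY KEY, kind TEXT, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO contact (kind, name, email) VALUES (?, ?, ?)",
        [
            ("person", "Ada", "ada@example.com"),
            ("entity", "Acme", "info@example.com"),
        ],
    )
    return conn


def load_contacts(conn):
    """Fetch rows as dictionaries keyed by column name (the DataSet stand-in)."""
    cursor = conn.execute("SELECT id, kind, name, email FROM contact ORDER BY id")
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def render_contact_list(rows):
    """The 'page': it binds straight to column names, so a schema change lands here too."""
    return [f"{row['name']} <{row['email']}>" for row in rows]
```

Rename the email column and you get to touch the table, the query, and the rendering code. Nothing in here knows what a Contact is supposed to DO, which is the part that kept bugging me.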
OBJECT CENTRIC
Now, before I leap to the whiteboard and start identifying loads of properties and how the relational model should look, I leap into a Test and start to flesh out what I NEED and how I'll make it WORK even if there weren't ever a thing called a database. My emphasis changes to identifying, weaving, and enforcing BUSINESS RULES, and my data access plays a supporting role to my Domain needs. What the objects in my code need from a Database my ORM provides, not vice versa. Instead of starting with a table called Person and making it relate to Address and Phone and Email tables as if all were equals, now I start with a simple class called Person and determine whether my IAddress, IPhone, and IEmail implementations will indeed be autonomous with their own identity (Entity) or simply be recognized by their unique values (Value Object). HUGE difference in approach that ultimately makes me write code instead of stored procedures that I can't refactor very easily and makes my ideas resilient to change.
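Here is a rough sketch of that Entity versus Value Object decision, again in Python rather than C#, with no database and no ORM anywhere. The Email and Person classes, the add_email method, and the no-duplicates rule are all hypothetical, made up only to illustrate the idea; they are not from my project and not part of NHibernate.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass(frozen=True)
class Email:
    """Value Object: recognized only by its value, so equal addresses are equal Emails."""
    address: str

    def __post_init__(self):
        # Hypothetical rule: reject anything that is obviously not an address.
        if "@" not in self.address:
            raise ValueError(f"not an email address: {self.address!r}")


class Person:
    """Entity: has its own identity, independent of its current property values."""

    def __init__(self, name, person_id=None):
        self.id = person_id if person_id is not None else uuid4()
        self.name = name
        self._emails = []

    def add_email(self, email):
        # Hypothetical business rule: a person may not list the same email twice.
        if email in self._emails:
            raise ValueError("this person already has that email")
        self._emails.append(email)

    @property
    def emails(self):
        # Hand out a read-only view so rules cannot be bypassed from outside.
        return tuple(self._emails)

    def __eq__(self, other):
        return isinstance(other, Person) and self.id == other.id

    def __hash__(self):
        return hash(self.id)
```

Two Person objects named "Ada" are still two different people, while two Email objects with the same address are interchangeable. The rule lives in the class where a test can hit it directly, and how any of it gets stored is a question for later.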
This is a very brief contrast between the two approaches that I uncovered, but it should at least indicate that when I am thinking in terms of Objects and not Data, I am thinking in terms of the BUSINESS driving my decisions. I make the assumption that the technology will support whatever business need I have. Conversely, when I have to conform my objects to meet my relational requirements (i.e., normalization), my business concerns get squeezed in between the technology solution.
So that was the first growing pain - either be Data Centric or be Object Centric. But don't be Data Centric while calling your portable data structures a rich Domain Model. I am sure this is reductionistic, but I at least needed to know there was a difference in ways to tackle business problems in software.