Saturday, 29 August 2009

Re-use throughout the .NET stack - from discrete functions to ESBs

I have had this conversation with myself and my peers hundreds of times. Should we re-use? Should we version and branch the code base? Should we create small discrete web services? Are we following the single responsibility principle of SOLID? Why do we have so many web service calls? How do we achieve loose coupling? When do we compromise principles for performance and speed-to-market?


Whenever I enter into a conversation about re-use, the discussion often gets heated as a result of the following...

Re-use contradictions

Re-use should be encouraged: why maintain the same logic in multiple places?
Re-use should be discouraged: re-use introduces unnecessary dependencies in your code and discourages refactoring and progress.

Use versioning to ensure that changes to shared code do not introduce breaking changes.
Do not version code; write enough unit tests to ensure that changes are not breaking, and keep everyone on the latest version.


Re-use should be planned. Effort up front to make code re-usable will save time in the long run.
Re-use should be realised, not planned. Planning re-use wastes valuable resources for something that may never be realised.

Code should be written such that re-use is possible.
Code should be written to meet known requirements.


I don't think that there is one answer. There is no rule of thumb, no silver bullet, but hopefully this will be food for thought. From my recent years working on a major SOA development, this is my viewpoint to date:


Variables/Fields

Arguably the lowest level of granularity in modern times (ignoring the old school guys who enjoy passing bits and bytes through registers), the variable should never be re-used.
On the basis that a function should not be more than 15-20 lines, variables should be used for a single purpose only. No exceptions. Re-using an iteration variable, such as the infamous "for(i...", should not be necessary if the function is short. Re-using variables introduces bugs because a) variables need to be re-initialised, which is often missed, and b) it normally means that functions are too long - a sign of fragile code.
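A contrived C# sketch (the Order type and the values are my own invention) shows how easily the re-initialisation bug creeps in:

    using System;
    using System.Collections.Generic;

    class Order { public decimal Value; }

    class VariableReuseDemo
    {
        static void Main()
        {
            var openOrders = new List<Order> { new Order { Value = 10m } };
            var closedOrders = new List<Order> { new Order { Value = 5m } };

            decimal total = 0;
            foreach (Order order in openOrders)
                total += order.Value;
            decimal openOrderValue = total;

            // Bug: 'total' is re-used without being reset to zero, so the
            // closed order total silently includes the open order total.
            foreach (Order order in closedOrders)
                total += order.Value;
            decimal closedOrderValue = total; // 15, not the expected 5

            Console.WriteLine("{0} {1}", openOrderValue, closedOrderValue);
        }
    }

Two single-purpose variables (or better, a small function that sums an order list) remove the possibility altogether.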

Local Function Re-use

With modern refactoring tools such as Refactor from DevExpress, function re-use should be actively encouraged. Twinned with the idea that functions should be less than 20 lines, constant refactoring is necessary to keep the code clean, readable and testable. These small functions make re-use possible because each performs a single task - a task that is often required by other functions within your code. Adding unit tests that cover both the re-used (or potentially re-usable) functions and the consumers of those functions ensures that breaking changes are limited. Versioning should not be necessary in order to avoid breaking changes - unit tests replace the cautious versioning approach, and refactoring tools identify the dependencies.
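As a minimal sketch (the Order type is illustrative and the tests assume NUnit), a short single-purpose function is both re-usable and directly testable:

    using System.Collections.Generic;
    using System.Linq;
    using NUnit.Framework;

    public class Order { public decimal Value; public bool IsOpen; }

    public static class OrderCalculations
    {
        // A small, single-purpose function: short enough to read at a
        // glance and re-usable by any caller that needs the same sum.
        public static decimal TotalValue(IEnumerable<Order> orders)
        {
            return orders.Sum(o => o.Value);
        }

        // Re-use realised locally: this calculation consumes TotalValue.
        public static decimal OpenOrderValue(IEnumerable<Order> orders)
        {
            return TotalValue(orders.Where(o => o.IsOpen));
        }
    }

    [TestFixture]
    public class OrderCalculationsTests
    {
        // The test guards the re-used function against breaking changes,
        // removing the need to version it.
        [Test]
        public void TotalValue_SumsAllOrders()
        {
            var orders = new List<Order>
            {
                new Order { Value = 2m },
                new Order { Value = 3m }
            };
            Assert.AreEqual(5m, OrderCalculations.TotalValue(orders));
        }
    }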

Library function re-use and versioning


By definition, when a library function is created, or realised as a result of a refactoring exercise, there is known or planned re-use.
There is an expectation that library functions should a) be versioned, and b) be backwardly compatible. The versioning approach allows clients to choose when they move to the latest version, yet it stems from a nervousness that changes will be breaking. The backward-compatibility expectation contradicts the versioning argument: either the library is backwardly compatible in every respect _or_ a new version is needed because breaking changes have been introduced with the latest changes. Again, unit tests can help here and can avoid the need for versioning and branching of the library code base.

Versioning should only be used to denote additions to the library that are not available in previous versions, NOT as a mechanism to shift responsibility to the client to recognise bugs and breaking changes in the latest version of the library. In stark contradiction to this argument is the common practice of using a major version to draw a line between the old world and the new. A major version may cut ties with the old world and no longer support many of the functions of previous versions, forcing a significant re-write or refactor for existing clients. This decision is always subjective, with a major factor being the size of the client base that is using the existing library.
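A sketch of the additive approach in C# (the PriceCalculator library is hypothetical): new capability arrives as a new overload, the old signature keeps working, and [Obsolete] signals the direction of travel without forcing clients to move:

    using System;

    public static class PriceCalculator
    {
        // The v1 API is kept intact so existing clients compile and
        // behave exactly as before.
        [Obsolete("Prefer the overload that supports a discount.")]
        public static decimal CalculatePrice(decimal unitPrice, int quantity)
        {
            return CalculatePrice(unitPrice, quantity, 0m);
        }

        // The v1.1 addition: a new overload rather than a breaking change
        // to the existing signature. Unit tests over both overloads guard
        // the contract instead of a version/branch of the code base.
        public static decimal CalculatePrice(decimal unitPrice, int quantity,
            decimal discount)
        {
            return (unitPrice * quantity) - discount;
        }
    }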


Web Services

Web services are the building blocks of a SOA. Whilst many argue that web services are not necessary for SOA, I have seen very few SOA implementations that use anything other than SOAP web services or REST as the mechanism for achieving a loosely coupled distributed architecture. Where SOAP/REST is not used, some bespoke HTTP/XML combination typically is. I will focus on SOAP services.

One of the four well known tenets of SOA is "Services share schema and contract, not class". This, by definition, encourages re-use yet imposes a restriction on what can and can't change. Clearly the underlying classes that implement the service logic can change independently of the schema and contract without breaking the client (assuming the changes do not upset the behaviour of the service); the contract itself, however, becomes something that needs to be set in stone.
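In WCF terms, for example (the contract below is a hypothetical illustration), the contract and schema are declared explicitly while the implementing class stays free to change behind them:

    using System.Runtime.Serialization;
    using System.ServiceModel;

    // The contract: set in stone once clients depend on it.
    [ServiceContract]
    public interface ICustomerService
    {
        [OperationContract]
        CustomerSummary GetCustomer(int customerId);
    }

    // The schema: shared with clients as XML, never as a .NET class.
    [DataContract]
    public class CustomerSummary
    {
        [DataMember] public int Id { get; set; }
        [DataMember] public string Name { get; set; }
    }

    // The implementation: free to change (ORM, caching, refactoring) so
    // long as the behaviour observed through the contract is preserved.
    public class CustomerService : ICustomerService
    {
        public CustomerSummary GetCustomer(int customerId)
        {
            // ...load from the underlying data store...
            return new CustomerSummary { Id = customerId, Name = "example" };
        }
    }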

The challenge of introducing a re-use tenet to web services is simply that not all web services are created equal.

Data services can be used to surface a data model as a web service. Re-use is expected and difficult to avoid, as the underlying data model is surfaced with little or no abstraction other than a change in the transport mechanism (from ADO to SOAP, for example). This makes for an extremely fragile service interface: a change to the data model breaks the web service; fixing the web service interface breaks the web service client(s). Not good. Re-use is unavoidable yet undesirable in this situation. You can easily infer that changes to the schema break the contract - something that should not happen, should it?
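A sketch of the problem (the table and type names are hypothetical): when the contract simply mirrors a table, every schema change surfaces as a contract change:

    using System.Runtime.Serialization;
    using System.ServiceModel;

    // The "contract" is just the CUSTOMER table in XML clothing.
    [DataContract]
    public class CustomerRow
    {
        [DataMember] public int CUSTOMER_ID { get; set; }
        [DataMember] public string ADDR_LINE_1 { get; set; }
        // Renaming or splitting ADDR_LINE_1 in the database now means
        // changing this schema, which breaks every client of the service.
    }

    [ServiceContract]
    public interface ICustomerDataService
    {
        [OperationContract]
        CustomerRow GetCustomerRow(int customerId);
    }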

A repository pattern exposed as a web service improves the re-use story. The repository pattern uses business objects, or domain objects (or fragments of business/domain objects), as input/output parameters to methods. This approach maps the domain object (exposed by the web service schema) to the underlying database schema, allowing the database schema to change independently of the web service schema. This is where an ORM tool can come in handy. This sounds like a good thing, and in many ways it is, but there are some gotchas:

1) There is a tendency to fabricate business/domain objects ahead of known requirements by logically dividing the database up into "entity views" over the database schema that "make sense". These views typically contain either too much data or require modification to meet the requirements of the client. The resultant contract is a compromise and prone to change, causing problems with existing clients.

2) Large object graphs exposed on the service boundary are a) desirable, as they discourage chatty web service interfaces by reducing the number of calls necessary for persistence/retrieval operations, yet b) brittle, as changes to any area of the model/graph force a version change of the contract. These large object graphs also encourage re-use by clients that see the contract as containing "most" or "all" of what is needed. This compounds a bad situation: the service is re-used heavily, transports unwanted data, is fragile due to the large object graph, and encourages clients to request ongoing changes to fulfil the missing 1% that is "just a minor tweak".

The repository pattern, used appropriately, allows re-use to be realised within the same business domain. Domain objects that are suffixed with Summary/Detail are probably too generic or contain too much information. A fine balance between a non-chatty interface, appropriate re-use and light-weight domain objects is the end goal here.
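As a sketch of that balance (the types are hypothetical), a repository-style contract can expose a focused domain fragment rather than a catch-all graph:

    // A focused input/output type: enough for the known requirement, with
    // no speculative "CustomerDetail" graph dragging in orders, addresses
    // and preferences "just in case".
    public class CustomerContactDetails
    {
        public int CustomerId { get; set; }
        public string Email { get; set; }
        public string Telephone { get; set; }
    }

    public interface ICustomerRepository
    {
        // The database schema can change freely behind this mapping;
        // only this small surface is set in stone.
        CustomerContactDetails GetContactDetails(int customerId);
        void UpdateContactDetails(CustomerContactDetails details);
    }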


Business Objects

Many of the arguments for re-using business objects mirror the arguments for how and when to re-use web services - web services essentially replace the DCOM/COM+/Remoting options of old, so I will not go into any detail on this.


Business processes and Service Composition

Business processes, in contrast to data services, perform a specific task. They sit at one of the higher levels of abstraction, orchestrating services or aggregating content and logic from other services, business objects and data repositories. It is near impossible to re-use these processes for anything other than a single purpose. Any attempt to re-use business processes should be reviewed: is there enough re-use here? Have I written something to satisfy a business requirement, or am I satisfying a habit of trying to make everything re-usable?
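A sketch of why such processes resist re-use (the service interfaces are hypothetical): each step may be re-usable, but the orchestration encodes exactly one business flow:

    public interface ICreditCheckService { bool IsAcceptable(int customerId); }
    public interface IWelcomePackService { void Send(int customerId); }
    public interface ICustomerStatusService { void Activate(int customerId); }

    // A composed business process: the individual services are candidates
    // for re-use, but the sequence below serves a single business purpose.
    public class NewCustomerOnboardingProcess
    {
        private readonly ICreditCheckService _creditCheck;
        private readonly ICustomerStatusService _status;
        private readonly IWelcomePackService _welcomePack;

        public NewCustomerOnboardingProcess(ICreditCheckService creditCheck,
            ICustomerStatusService status, IWelcomePackService welcomePack)
        {
            _creditCheck = creditCheck;
            _status = status;
            _welcomePack = welcomePack;
        }

        public void Onboard(int customerId)
        {
            // Re-using this flow for anything other than onboarding would
            // mean bending the process, not re-using it.
            if (_creditCheck.IsAcceptable(customerId))
            {
                _status.Activate(customerId);
                _welcomePack.Send(customerId);
            }
        }
    }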


Inversion of Control

"Inversion of control" and "dependency injection" are terms used to describe classes that support a pluggable provider model. The "provider" is a class that implements a specific behaviour, as required by the hosting class, to deliver "pluggable" logic via a pre-defined interface.

This means that re-use can be looked at in a slightly different way. Rather than introducing fragility via a dependency between the client and the implementing class, stability is achieved through a defined interface and dependent classes are "injected". This removes the requirement to version and re-test the client every time an "injected" class changes.
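A minimal constructor-injection sketch in C# (the interface and class names are my own):

    // The pre-defined interface: the only thing the host depends on.
    public interface ITaxProvider
    {
        decimal CalculateTax(decimal amount);
    }

    public class UkTaxProvider : ITaxProvider
    {
        public decimal CalculateTax(decimal amount) { return amount * 0.15m; }
    }

    // The hosting class: the provider is injected, so a new or changed
    // provider does not force the host to be re-versioned or re-tested.
    public class InvoiceCalculator
    {
        private readonly ITaxProvider _taxProvider;

        public InvoiceCalculator(ITaxProvider taxProvider)
        {
            _taxProvider = taxProvider;
        }

        public decimal Total(decimal net)
        {
            return net + _taxProvider.CalculateTax(net);
        }
    }

    // Usage: the concrete provider is chosen at composition time.
    // var calculator = new InvoiceCalculator(new UkTaxProvider());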


Business Events and ESBs

Seen as the answer to re-use, scalability, loose coupling and hub-and-spoke monoliths, business events are arguably the next layer of abstraction on top of the development stack. A business event, such as "customer address changed", is published as a light-weight, business-oriented event. Systems or services that care about the event subscribe to this message (via the Enterprise Service Bus) and then react to the event accordingly. The order processing system may re-retrieve the updated data from the customer repository to update outstanding orders; the CRM system may update its mailing list to send the next promotional mail to the updated address.
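A sketch of the idea (the IServiceBus abstraction and event type are hypothetical; real ESB products supply their own publish/subscribe APIs):

    using System;

    // A light-weight, business-oriented event: just enough to tell
    // subscribers what happened, not the full customer record.
    public class CustomerAddressChanged
    {
        public int CustomerId { get; set; }
        public DateTime OccurredAt { get; set; }
    }

    // A hypothetical bus abstraction standing in for the ESB.
    public interface IServiceBus
    {
        void Publish<TEvent>(TEvent businessEvent);
        void Subscribe<TEvent>(Action<TEvent> handler);
    }

    // The publisher neither knows nor cares who reacts.
    public class CustomerAddressService
    {
        private readonly IServiceBus _bus;

        public CustomerAddressService(IServiceBus bus) { _bus = bus; }

        public void ChangeAddress(int customerId /*, new address details */)
        {
            // ...persist the change, then announce it...
            _bus.Publish(new CustomerAddressChanged
            {
                CustomerId = customerId,
                OccurredAt = DateTime.UtcNow
            });
        }
    }

    // Subscribers react independently, e.g.:
    // bus.Subscribe<CustomerAddressChanged>(
    //     e => orderSystem.RefreshCustomer(e.CustomerId));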

It remains to be seen whether this (relatively) new paradigm will introduce new considerations around re-use, but it is certainly an exciting development in the Enterprise development space.

Summary

It would be easy to draw conclusions from this that could result in over-architecting something that should be a simple piece of code; that additional layers of abstraction reduce the number of touch points; that abstraction is a requirement for stability; that loose coupling is required to prevent fragile dependency chains; that an event-driven architecture is the ultimate in re-use and flexibility. The truth is that the scale of the system will dictate how few or how many of these arguments are applicable to your solution. The one thing that is consistent is that unit testing is necessary to offer and evolve re-usable components with confidence.
