ROI and SOA
Two ZDNet analysts, Dana Gardner and Joe McKendrick, have had recent posts (I’ve linked their names to the specific posts) regarding ROI and SOA. This isn’t something I’ve blogged on in the past, so I thought I’d offer a few thoughts.
First, let’s look at the whole reason for ROI in the first place. Simply put, it’s a measurement to justify investment. Investment is typically quanitified in dollars. That’s great, now we need to associate dollars with activities. Your staff has salaries or bill rates, so this shouldn’t be difficult, either. Now is where things get complicated, however. Activities are associated with projects. SOA is not a project. An architecture is a set of constraints and principles that guide an approach to a particular problem, but it’s not the solution itself. Asking for ROI on SOA is similar to asking for ROI on Enterprise Architecture, and I haven’t seen much debate on that. That being said, many organizations still don’t have EA groups, so there are plenty of CIOs that may still question the need for it as a formal job classification. Getting back to the topic, we can and do estimate costs associated with a project. What is difficult, however, is determining the cost at a fine-grained level. Can you determine the cost of developing services in support of that project accurately? In my past experience, trying to use a single set of fine-grained activities for both project management and time accounting was very difficult. Invariably, the project staff spent time that was focused on interactions that were needed to determine what the next step was. These actions never map easily into a standard task-based project plan, and as a result, caused problems when trying to charge time. (Aside: For an understanding on this, read Keith Harrison-Broninski’s book Human Interactions or check out his eBizQ blog.) Therefore, it’s going to be very difficult to put a cost on just the services component of a project, unless it’s entire scope of the project, which typically isn’t the case.
Looking at the benefits side of the equation, it is certainly possible to quantify some expected benefits of the project, but again, only a certain level. If you’re strictly looking at IT, your only hope of coming up with ROI is to focus on cost reduction. IT is typically a cost center, with at best, an indirect impact on revenue generation. How are costs reduced? This is primarily done by reducing maintenance costs. The most common approach is through a reduction in the number of vendor products involved and/or a reduction in the number of vendors involved. More stuff from fewer vendors typically means more bundling and greater discounts. There are other options, such as using open source products with no licensing fees, or at least discounted fees. You may be asking, “What about improved productivity?” This is an indirect benefit, at best. Why? Unless there is a reduction in headcount, the cost to the organization is fixed. If a company is paying a developer $75,000 / year, that developer gets that money regardless of how many projects get done and what technologies are used. Theoretically, however, if more projects are completed within a given time, you’d expect that there is a greater potential for revenue. That revenue is not based upon whether SOA was used or not, it’s based upon the relevance of that project to business efforts.
So now we’re back to the promise of IT – Business agility. For a given project, ROI should be about measuring the overall project cost (not specific actions within it) plus any ongoing costs (maintenance) against business benefits (revenue gain) and ongoing cost reduction. So where will we get the best ROI? We’ll get the best ROI by picking projects with the best business ROI. If you choose a project that simply rebuilds an existing system using service technologies, all you’ve done is incurred cost unless those services now create the potential for new revenue sources (a business problem, not a technology problem), or cost consolidation. Cost consolidation can come from IT on its own through reduction in maintenance costs, although if you’re replacing one homegrown system with another, you only reduce costs if you reduce staff. If you get rid of redundant vendor systems, clearly there should be a reduction in maintenance fees. If you’re shooting for revenue gain, however, the burden falls not to IT, but to the business. IT can only control the IT component of the project cost and we should always be striving to reduce that through reuse and improved tooling. Ultimately, however, the return is the responsibility of the business. If the effort doesn’t produce the revenue gain due to inaccurate market analysis, poor timing, or anything else, that’s not the fault of SOA.
There are two last points I want to make, even though this entry has gone longer than I expected. First, Dana made the following statement in his post:
So in a pilot project, or for projects driven at the departmental level, SOA can and should show financial hard and soft benefits over traditional brittle approaches for applications that need integration and easy extensibility (and which don’t these days?).
I would never expect a positive ROI on a pilot project. Pilots should be run with the expectation that there are still unknowns that will cause hiccups in the project, causing it to run at a higher cost that a normal project. A pilot will then result in a more standardized approach for subsequent projects (the extend phase in my maturity model discussions) where the potential can be realized. Pilots should be a showcase for the potential, but they may not be the project that realizes it, so be careful in what you promise.
Dana goes on to discuss the importance of incremental gains from every project, and this I agree with. As he states, it shouldn’t be an “if we build it, they will come” bet. The services you choose to build in initial projects should be ones that you have a high degree of confidence that they will either be reused, or, that they will be modified in the future but where the more fine-grained boundaries allow those modifications to be performed at a lower cost than previously the case.
Second, SOA is an exercise in strategic planning. Every organization has staff that isn’t doing project work, and some subset of that staff is doing strategic planning, whether formally or informally. Without the strategic plan, you’ll be hard pressed to have accurate predictions on future gains, thus making all of your ROI work pure speculation, at best. There’s always an element of speculation in any estimate, but it shouldn’t be complete speculation. So, the question then is not about separate funding for SOA. It’s about looking at what your strategic planners are actually doing. Within IT, this should fall to Enterprise Architecture. If they’re not planning around SOA, then what are they planning? If there are higher priority strategic activities that they are focused on, fine. SOA will come later. If not, then get to work. If you don’t have enterprise architecture, then who in IT is responsible for strategic planning? Put the burden on them to establish the SOA direction, at no increase in cost (presuming you feel it is higher priority than their other activities). If no one is responsible, then your problem is not just SOA, it’s going to be anything of a strategic nature.
Is the word “Application” still relevant?
Gary So of webMethods asked this question in a recent blog entry. He states:
With SOA, the concept of what constitutes an application will have to be rethought. You might have dozens of different services – potentially originating from different vendors, with some on-premise and some served up on demand – that, individually, don’t fulfill a complete business requirement but are collectively arranged to address a business need. Some of these services could also be used in addressing other, unrelated, business requirements. Traditional application logic – logic that would normally reside in the middle tier of a three-tier application architecture — is no longer monolithically implemented in an application server, but distributed among the services themselves and, potentially, process engines that manage the orchestration of these services, governance engines that apply policies related to the use of and interaction between services, and rules engines that externalize logic from everything else. Where in this scenario does one application begin and another end? Is the word “application” even relevant anymore?
I’ve actually thought about this a lot, and agree with Gary. The term application implies boundaries and silo-based thinking. Typical IT projects were categorized as one of two things: application development or Application Integration. In the case of application development, the view is of end-to-end ownership of the entire codebase. In the case of application integration, it was about finding a bunch of glue to stick between two or more rigid processing points that could not be modified. Neither of these models are where we want to be with SOA. Some have to tried to differentiate the future by referring to systems of the past as monolithic applications. Personally, I don’t like that either. Why? Simple. For me, the term application has always implied some form of end-to-end system, most typically with a user interface, with the exception of the batch application. If this is the case, what are we building when a project only produces services? It’s not really an application, because it will just sit there until a consumer comes along. I also don’t think that we should wait to build key services until consumers are ready, however. It’s true that many services will be built this way, but there is no reason that some analysis can’t occur to identify key or “master” services that will be of value to the enterprise.
The terminology I’ve begun using is solution development. An IT project produces a solution, it’s as simple as that. That solution may involve the creation of a user interface, executable business processes, new tasks within a workflow system, services, databases, etc. or any subset of them. It may also make use of existing (whether previously built, purchased and installed, or hosted externally) user interface components, business processes, workflow activities, services, and databases. While vendors may still bundle all of these together and sell them as an “application,” a key evaluation point will be whether those “applications” are build on a solid, composable architecture that allows pieces to be interchanged as needed.
I think that this topic is a good example of the cultural change that is associated with becoming a service-oriented enterprise. If all of your projects can still be defined as application development, you’re not there. In fact, you’re at risk of just creating a bunch of services, as those application boundaries are not conducive to the type of thinking required for success with SOA. If you’re running into projects where the term “application development” just doesn’t seem to make sense, then you’re starting to experience the culture change that comes with the territory. Have fun building your solutions that leverage your SOA!
Is the SOA Suite good or bad?
I haven’t listed to the podcast yet, but Joe McKendrick posted a summary of the discussion in a recent Briefings Direct SOA Insights conversation organized by Dana Gardner. In his entry, Joe asks whether vendors are promoting an oxymoron in offering SOA suites. He states:
“Jumbo shrimp” and “government organization” are classic examples of oxymorons, but is the idea of an “SOA suite” also just as much a contradiction of terms? After all, SOA is not supposed to be about suites, bundles, integration packages, or anything else that smacks of vendor lock-in.
“The big guys — SAP, Oracle, Microsoft, webMethods, lots of software vendors — are saying, ‘Hey, we provide a bigger, badder SOA suite than the next guy,'” Jim Kobelius pointed out. “That raises an alarm bell in my mind, or it’s an anomaly or oxymoron, because when you think of SOA, you think of loose coupling and virtualization of application functionality across a heterogeneous environment. Isn’t this notion of a SOA suite from a single vendor getting us back into the monolithic days of yore?”
Personally, I have no issue with SOA suites. The big vendors are always going to go down this route, and if anything, it simply demonstrates just how far we have to go on the open integration front. If you follow this blog, you know that I’ve discussed SOA for IT. SOA for IT, in my mind, is all about integration across the technology infrastructure at a horizontal level, not a vertical level. SOA for business is concerned about the vertical level semantics of the business, and allowing things to integrate at a business sense. SOA for IT is about integration at the technical level. Can my Service Management infrastructure talk to my Service Execution infrastructure? Can my Service Execution infrastructure talk to my Service Mediation infrastructure? Can my Service Mediation infrastructure talk to my Service Management infrastrucutre? The list goes on. Why is their still a need for these SOA suites? Simply put, we still lack standards for communication between these platforms. It’s one thing to say all of the infrastructure knows how to speak with a UDDI v3 registry. It’s another thing to have the infrastructure agree on the semantics of the metadata placed in a registry/repository (note, there’s no standard repository API), and leverage that information successfully across a heterogeneous set of environments. The smaller vendors try to form coalitions to make this a reality, as was the case with Systinet’s Governance Interoperability Framework, but as they get swallowed up by the big fish, what happens? IBM came out with WebSphere Registry/Repository and it introduced new, proprietary APIs. Competitive advantage for an all IBM environment? Absolutely. If I don’t have an all IBM environment, am I that much worse off however? If I have AmberPoint or Actional for SOA management, I’m still dealing with their proprietary interfaces and policy definitions, so vendor lock-in still exists. I’m just locked in to multiple vendors, rather than one.
The only way this gets fixed is if customers start demanding open standards for technology integration as part of their evaluation criteria. While the semantics of the information exchange may not exist yet, you can at least ask whether or not the vendor exposes management interfaces as services. Put another way, the internal architecture of the product needs to be consistent with the internal architecture of your IT systems. If you desire to have separation of management from enforcement, then your vendor products must expose management services. If the only way to configure their product is through a web-based user interface or by attempting to directly manipulate configuration files, this is going to be very costly for you if you’re trying to reduce the number of independent management console that operations needs to deal with. Even if it’s all IBM, Oracle, Microsoft, or whoever, the internal architecture of that suite needs to be consistent with your vendor-independent target architecture. If you haven’t taken the time to develop one, then you’re allowing the vendors to push their will on you.
Let’s use the city planning analogy. A suite vendor is akin to the major developer. Do the city planners simply say, “Here’s your 80,000 acres, have fun?” That probably wouldn’t result in a good deal for the city. Taking the opposite extreme, the city doesn’t want individual property owners to do whatever they want, either. Last year, there was article about a nearby town that had somehow managed to allow an adult store to set up shop next door to a daycare center in a strip mall. Not good. The right approach, whether you want to have a diverse set of technologies, or a very homogenous set is to keep the power in the hands of the planners, and that is done through architecture. If you can remain true to your architecture with a single vendor? Great. If you prefer to do it with multiple vendors, that’s great at well. Just make sure that you’re setting the rules, not them.
Models of EA
One thing that I’ve noticed recently is that there is no standard approach to Enterprise Architecture. Some organizations may have Enterprise Architecture on the organizational chart, other organizations may merely have an architectural committee. One architecture team may be all about strategic planning, while another is all about project architecture. Some EA teams may strictly be an approval body. I think the lack of a consistent approach is an indicator of the relative immaturity of the discipline. While groups like Shared Insights have been putting on Enterprise Architecture conferences for 10 years now, there are still many enterprises that don’t even have an EA team.
So what is the right approach to Enterprise Architecture? As always, there is no one model. The formation of an EA team is often associated with some pain point in the enterprise. In some organizations, there may be a skills gap necessitating the formation of an architecture group that can fan out across multiple projects, providing project guidance. A very common pain point is “technology spaghetti.â€? That is, over time the organization has acquired or developed so many technology solutions that the organization may have significant redundancy and complexity. This pain point can typically result in one of two approaches. The first is an architecture review board. The purpose of the board is to ensure that new solutions don’t make the situation any worse, and if possible, they make it better. The second approach is the formation of an Enterprise Architecture group. The former doesn’t appear on the organization chart. The latter does, meaning it needs day to day responsibilities, rather than just meeting when an approval decision is needed. Those day to day activities can be the formation of reference architectures and guiding principles, or they could be project architecture activities like the first scenario discussed. Even in these scenarios, however, Enterprise Architecture still doesn’t have the teeth it needs. Reference architectures and/or guiding principles may have been created, but these end state views will only be reached if a path is created to get there. This is where strategic planning comes into play. If the EA team isn’t involved in the strategic planning process, then they are at the mercy of the project portfolio in achieving the architectural goals. It’s like being the coach or manager of a professional sports team but having no say whatsoever in the player personnel decisions. The coach will do the best they can, but if they were handed players who are incompatible in the clubhouse or missing key skills necessary to reach the playoffs, they won’t get there.
You may be thinking, “Why would anyone ever want an EA committee over a team?â€? Obviously, organizational size can play a factor. If you’re going to form a team, there needs to enough work to sustain that team. If there isn’t, then EA becomes a responsibility of key individuals that they perform along with their other activities. Another scenario where a committee may make sense is where the enterprise technology is largely based on one vendor, such as SAP. In this case, the reference architecture is likely to be rooted in the vendor’s reference architecture. This results in a reduction of work for Enterprise Architecture, which again, has the potential to point to a responsibility model rather than a job classification.
All in all, regardless of what model you choose for your organization, I think an important thing to keep in mind is balance. An EA organization that is completely focused on reference architecture and strategic planning runs the risk of becoming an ivory tower. They will become disconnected from the projects that actually make the architecture a reality. The organization runs a risk that a rift will form between the “practitionersâ€? and the “strategists.â€? Even if the strategists have a big hammer for enforcing policy, that still doesn’t fix the cultural problems which can lead to job dissatisfaction and staff turnover. On the flip slide, if the EA organization is completely tactical in nature, the communication that must occur between the architects to ensure consistency will be at risk. Furthermore, there will still be no strategic plan for the architecture, so decisions will likely be made according to short term needs dominated by individual projects. The right approach, in my opinion, is to maintain a balance of strategic thinking and tactical execution within your approach to architecture. If the “officialâ€? EA organization is focused on strategic planning and reference architecture, they must come up with an engagement model that allows bi-directional communication with the tactical solution architects to occur often. If the EA team is primarily tasked with tactical solution architecture, then they must establish an engagement model with the IT governance managers to ensure that they have a presence at the strategic planning table.
How many versions?
While heading back to the airport from a recent engagement, Alex Rosen and I had a brief discussion on service versioning. He said, “you should blog on that.” So, thanks for idea Alex.
Sooner or later, as you build up your service library, you’re going to have to make a change to the service. Agility is one of the frequently touted benefits of SOA, and one way of looking at it is the ability to respond to change. When this situation arises, you will need to deal with versioning. In order to remain agile, you should prepare for this situation in advance.
There are two extreme viewpoints on versioning, and not surprisingly, they match up with the two endpoints associated with a service interchange- the service consumer and the service provider. From the consumers point of view, the extreme stance would be that the number of versions of a service remain uncapped. In this way, systems that are working fine today don’t need to be touched if a change is made that they don’t care about. This is great for the consumer, but it can become a nightmare for the provider. The number of independent implementations of the service that must be managed by the provider is continually growing, increasing the management costs thereby reducing the potential gains that SOA was intended to achieve. In a worst-case scenario, each consumer would have their own version of the service, resulting in the same monolithic architectures we have today, except with some XML thrown in.
From the providers point of view, the extreme stance would be that only one service implementation ever exists in production. While this minimizes the management cost, it also requires that all consumers move in lock step with the service, which is very unlikely to happen when there are more than one consumer involved.
In both of these extreme examples, I’m deliberately not getting into the discussion of what the change is. While backwards compatibility can have influence on this, regardless of whether the service provider claims 100% backwards compatibility or not, my experiences have shown that both the consumer and the provider should be executing regression tests. My father was an electrician, and I worked with him for a summer after my freshman year in college. He showed me how to use a “wiggy” (a portable voltage tester) for checking whether power was running to an outlet, and told me “If you’re going to work an outlet, always check if it’s hot. Even if one of the other electricians or even me tells you the power is off, you still check if it’s hot.” Simply put, you don’t want to get burned. Therefore, there will always be a burden on the service consumers when the service changes. The provider should provide as much information as possible so that the effort of the consumer is minimized, but the consumer should never implicitly trust that what the provider says is accurate without testing.
Back to the discussion- if we have these two extremes, the right answer is somewhere in the middle. Choosing an arbitrary number isn’t necessarily a good approach. For example, suppose the service provider states that no more than 3 versions of a service will be maintained in a production. Based upon high demand, if that service changes every 3 months, that means that the version released in January will be decommissioned in September. If the consumer of that first version is only touched every 12 months, you’ve got a problem. You’re now burdening that consumer team with additional work that did not fit into their normal release cycle.
In order to come up with a number that works, you need to look at both the release cycle of the consuming systems as well as the release cycle of the providers and find a number that allows consumers to migrate to new versions as part of their normal development efforts. If you read that carefully, however, you’ll see the assumption. This approach assumes that a “normal release cycle” actually exists. Many enterprises I’ve seen don’t have this. Systems may be released and not touched for years. Unfortunately, there’s no good answer for this one. This may be a symptom of an organization that is still maturing in their development processes, continually putting out fires and addressing pain points, rather than reaching a point of continuous improvement. This is representative of probably the most difficult part of SOA- the culture change. My advice for organizations in this situation is to begin to migrate to a culture of change. Rather than putting an arbitrary cap on the number of service versions, you should put a cap on how long a system can go without having a release. Even if it’s just a collection of minor enhancements and bug fixes, you should ensure that all systems get touched on a regular basis. When the culture knows that regular refreshes are part of the standard way of doing business, funding can be allocated off the top, rather than having to be continually justified against major initiatives that will always win out. It’s like our health- are you better off having regular preventative visits to your doctor in addition to the visits when something is clearly wrong? Clearly. Treat your IT systems the same way.
The value of events
Joe McKendrick quoted a previous blog entry of mine, but he prefaced my quotes with the statement that I was “questioning the value of EDA to many businesses.” One of things that any speaker/author has to always deal with is the chance that the message we hope comes out doesn’t, and I think this is one of those cases. That being said, if you feel like you are misrepresented, it probably means you didn’t explain it well in first place! So, in the event that others are feeling that I’m questioning the value of EDA, I thought I’d clarify my stance. I am a huge fan of events and EDA. Events can be very powerful, but just as there has been lots of discussions around the difference between SOA and ABOS- a bunch of services, the same holds true for EDA.
The problem does not lie with EDA. EDA, as a concept, has the potential to create value. EDA will fail to produce value, just as SOA will, if it is incorrectly leveraged. Everyone says SOA should begin with the business. Guess what, EDA should as well. While the previous entries I’ve posted and the great comments from the staff at Apama and their postings have called out some verticals where EDA is already being applied successfully, I’m still of the opinion that many businesses would be at significant risk of creating ABOE- a bunch of events. This isn’t a knock on the potential value of events, it’s a knock on the readiness of the business to realize that potential. If the business isn’t thinking of themselves in a service-oriented context, they are unlikely to reach the full potential of SOA. If the business isn’t thinking of themselves in an event-driven context, they are unlikely to reach the full potential of EDA.
Teaming Up for SOA
I recently “teamed up” with Phil Windley. He interviewed me for his latest InfoWorld article on SOA Governance which is now available online, and is in the March 5th print issue. Give it a read and let me know what you think. I think Phil did a great job in articulating a lot of the governance challenges that organizations run into. Of the areas where I was quoted, the one that I think is a significant culture change is the funding challenge. It’s not just about getting funding for shared services which is a challenge on its own. It’s also a challenge of changing the way that organizations make decisions to include architectural elements in the decision. Many organizations that I have dealt with all tend to be schedule driven. That is, the least flexible element of the project is schedule. Conversely, the thing that always gives is scope. Unfortunately, it’s not usually visible scope, it’s usually the difference in taking the quickest path (tactical) versus the best path (strategic). If you’re one of many organizations trying to do grass roots SOA, this type of IT governance makes life very difficult as the culture rewards schedule success, not architectural success. It’s a big culture shift. Does your Chief Architect have a seat at the IT Governance table?
Anyway, I hope you enjoy the article. Feel free to post your questions here, and I’d be happy to followup.
IT in homes, schools
I’ve had some lightweight posts on SOA for the home in the past, and for whatever reason, it seems to be tied to listening to IT Conversations. Well, it’s happened again. In Phil and Scott’s discussion with Jon Udell, they lamented the problems of computers in the home. Phil discussed the issues he’s encountered with replacing servers in his house and moving from 32-bit to 64-bit servers (nearly everything had to be rebuilt, he indicated that he would have been better off sticking with 32-bit servers). Jon and Phil both discussed some of the challenges that they’ve had in helping various relatives with technology.
It was a great conversation and made me think of a recent email exchange concerning my father-in-law’s school. He’s a grade school principal, and I built their web site for them several years ago. They host it themselves, and the computer teacher has done a great job in keeping it humming along. That being said, there’s still room for improvement. Many of the teachers still host their pages externally. My father-in-law sends a letter home with the kids each week that is a number of short paragraphs and items that have occurred throughout the week. Boy, that could easily be syndicated as a blog. Of course, that would require installing WordPress on the server, which while relatively easy for me, is something that could get quite frustrating for someone not used to operating at the command line. Anyway, the email conversation was about upgrading the server. One of the topics that came up was hosting email ourselves. Now, while it’s very easy to set up a mail server, the real concern here comes up with reliability. People aren’t going to be happy if they can’t get to their email. Even if we just look at the website, as it increasingly becomes part of the way the school communicates with the community, it starts to become critical.
When I was working in an enterprise, redundancy was the norm. We had load balancers and failover capabilities. How many people have a hardware load balancer at home? I don’t. You may have a linux box that does this, but it’s still a single point of failure. A search at Amazon really didn’t turn up too many options for the consumer, or even a cash-strapped school for that matter. This really brings up something that will become an increasing concern as we march toward a day where connectivity is ubiquitous. Vendors are talking about the home server, but when corporations have entire staffs dedicated to keeping those same technologies running, how on earth are we going to expect Mom and Pop in Smalltown U.S.A. to be able to handle the problems that will occur?
Think about this. Today, I would argue that most households still have normal phones and answering machines. Why don’t we have the email equivalent? Wouldn’t it be great if I could purchase a $100 device that I just into my network and now have my own email server? Yes, it would be okay if I had to call my Internet provider and say, “please associate this with biske.com” just as I must do when I establish a phone line. What do I do, however, if that device breaks? What if it gets hacked and becomes a zombie device contributing to the dearth of spam on the Internet? How about a device that enables me to share videos and pictures with friends and family? Again, while hosted solutions are nice, it would be far more convenient to merely pull them off the camcorder and digital camera and make it happen. I fully believe that the right thing is to always have a mix of options. Some people will be fine with hosted solutions. Some people will want the control and power of being able to do it themselves, and there’s a marketplace for both. I get tired of these articles that say things like “hosted productivity apps will end the dominance of Microsoft Office.” Phooey. It won’t. It will evolve to somewhere in the middle, rather than one side or the other. Conversations like that are always like a pendulum, and the pendulum always swings back. I’m off on a tangent, here. Back to the topic- we are going to need to make improvements in orders of magnitude on the management of systems today. Listen to the podcast, and here the things that Jon and Phil, two leading technologists that are certainly capable of solving most any problem, lament. Phil gives the example of calls from his wife (I get them as well) that “this thing is broken.” While he immediately understands that there must be a way to fix it, because we understand the way computers operate behind the scenes, the average joe does not. We’ve got a long way to go to get the ubiquity that we hope to achieve.
Starter SOA
Jeff Schneider has posted a series of entries on “Starter SOA” on his blog. The first deals with what he “believes is at the heart of the SOA issue.” It recommends attacking three specific areas in getting started: portfolio management, enterprise architecture, and information management. I think this is right on, for very straightforward reasons. First, portfolio management deals with what services should be created. If you don’t make any changes to this discipline, you’re simply going to get the same solutions you always have, except with some services thrown in. That’s not SOA. Secondly, enterprise architecture is the technical counterpart to the portfolio management side. While portfolio management is concerned about the business aspects of shared services, enterprise architecture needs to be concerned about the technical aspects of shared services. Finally, information management is the source of consistency across our services. If every service team defines its own service schemas, we really haven’t made things much better, as additional effort must now be made to mediate between the information models of every consumer and every service that must talk to each other. Get two or more services and consumers involved, and it simply increases in complexity.
In the next entry, Jeff discusses the fact that SOA will challenge the organizational structure of SOA. How are organizations supposed to address these challenges? He suggests forming an SOA Steering Committee. The committee consists of a cross-discipline team of people who are normally thinking in enterprise terms, rather than project-specific. Importantly, however, he emphasizes that this committee must interact with their project-specific counterparts. That is, the enterprise architect works with application architects. The portfolio analyst works with the project analyst. The PMO rep works with the project manager, and so on. An important aspect of this group is that they can make enterprise decisions as things progress with SOA. An enterprise architect trying to drive SOA on his or her own isn’t left trying to find an open ear when they determine that organizational change is needed, or that a project should be split into multiple projects.
In part 3 (I don’t know if he has more parts planned!), he gets into a more sensitive and difficult area: money. The most important thing that he introduces here is the simple notion that the funding model has to change. Where funding was previously all about getting the “application” completed, we now need models that fund shared items- shared services, shared infrastructure. This shouldn’t be new to organizations, as shared infrastructure is certainly something that they should be dealing with today, this now just extends it into the application development domain.
It’s good to get back to the basics every now and then. Those of us that are out there commenting on this on a regular basis can get into modes where the only other people who care about what we’re saying are other commentators, and not everyone is at that point.
Metrics, metrics, metrics
James McGovern threw me a bone in a recent post, and I’m more than happy to take it. In his post, “Why Enterprise Architects need to noodle metrics…” he asks:
Hopefully bloggers such as Robert McIlree, Scott Mark, Todd Biske and others would be willing to share not only successes within their own enterprise when it comes to metrics but also any unintended consequences in terms of collecting them.
I’m a big, big fan of instrumentation. One of the projects that I’m most proud of was when we built a custom application dashboard using JMX infrastructure (when JMX was in its infancy) for a pretty large web-based system. The people that used it really enjoyed the insight it gave them into the run-time operations of the system. I personally didn’t get to use it, as I was rolled onto another project, but the operations staff loved it. Interesting, my first example of metrics being useful comes from that project, but not from the run time management. It came from our automated build system. At the time, we had an independent contractor who was acting as a project management / technical architecture mentor. He would routinely visit the web page for the build management system and record the number of changed files for each build. This was a metric that the system captured for us, but no one paid much attention to it. He started posting graphs showing the number of changed files over time, and how we had spikes before every planned iteration release. He let us know that those spikes disappeared, we weren’t going live. Regardless of the number of defects logged, the significant amount of change before a release was a red flag for risk. This message did two things: first, it kept people from working to a date, and got them to just focus on doing their work at an appropriate pace. Secondly, I do think it helped up release a more stable product. Fewer changes meant more time for integration testing within the iteration.
The second area where metrics have come into play was the initial use of Web Services. I had response time metrics on every single web service request in the system. This became valuable for many reasons. First, because the thing collecting the new metrics was new infrastructure, everyone wanted to blame it when something went wrong. The metrics it collected easily showed that it wasn’t the source of any problem, and actually was a great tool in narrowing where possible problems were. The frustration switched more to the systems that didn’t have these metrics available because they were big, black boxes. Secondly, we caught some rogue systems. A service that typically had 200,000 requests per day showed up on Monday with over 3 million. It turns out a debugging tool had been written by a project team, but that tool itself had a bug and started flooding the system with requests. Nothing broke, but had we not had these metrics and someone looking at them, it eventually would have caused problems. This could have went undetected for weeks. Third, we saw trends. I looked for anything that was out of the norm, regardless of whether any user complained or any failures occurred. When the response time for a service had doubled over the course of two weeks, I asked questions because that shouldn’t happen. This exposed a memory leak that was fixed. When loads that had been stable for months started going up consistently for two weeks, I asked questions. A new marketing effort had been announced, resulting in increased activity for one service consumer. This marketing activity would have eventually resulted in loads that could have caused problems a couple months down the road, but we detected it early. An unintended consequence was a service that showed a 95% failure rate, yet no one was complaining. It turns out a SOAP fault was being used for a non-exceptional situation at the request of the consumer. The consuming app handled it fine, but the data said otherwise. Again, no problems in the system, but it did expose incorrect use of SOAP.
While these metrics may not all be pertinent to the EA, you really only know by looking at them. I’d much rather have an environment where metrics are universally available and the individuals can tailor the reporting and views to information they find pertinent. Humans are good at drawing correlations and detecting anomalies, but you need the data to do so. The collection of these metrics did not have any impact on the overall performance of the system, however, they were architected to ensure that. Metric collection should be performed as an out-of-band operation. As far the practice of EA is concerned, one metric that I’ve seen recommended is watching policy adherence and exception requests. If your rate of exception requests is not going down, you’re probably sitting off in an ivory tower somewhere. Exceptions requests shouldn’t be at zero, either, however, because then no one is pushing the envelope. Strategic change shouldn’t solely come from EA as sometimes the people in the trenches have more visibility into niche areas for improvement. Policy adherence is also needed to determine what policies are important. If there are policies out there that never even come up in a solution review, are they even needed?
The biggest risk I see with extensive instrumentation is not resource consumption. Architecting an instrumentation solution is not terribly difficult. The real risk is in not provided good analytics and reporting capabilities. It’s great to have the data, but if someone has to perform extracts to Excel or write their own SQL and graphing utilities, they can waste a lot of time that should be spent on other things. While access to the raw data lets you do any kind of analysis that you’d like, it can be a time-consuming exercise. It only gets worse when you show it to someone else, and they ask whether you can add this or that.
Why SOA?
I’ve been meaning to post an entry on this for some time now. Back in early February, Joe McKendrick had a post entitled “A CTO underwhelmed by SOA.” I recently had a conversation with a CIO (not specifically on SOA) who lamented that we’re still solving the same old problems. A google search on “trough of disillusionment” and “SOA” yields about 12,000 hits. So why are we doing SOA?
If you’ve seen my bio at a conference, or looked at my “About Me” page here, you’ll know that I’ve always been interested in human computer interaction and usability. Why? It’s because I’ve always felt that the user interface can be a differentiator. While that was probably more the case in the early 90’s during my graduate school days, I do think that it still holds true today. So how did I make the switch to being such an advocate of SOA? Largely, it was for very similar reasons. I looked at SOA and felt that if done properly, it could be a differentiator to the bottom line. While you could argue that the Web did the same thing, that whole craze was far more about using Java and J2EE than about changing the business. The best thing about SOA is that the discussion has been dominated by SOA and not by underlying technologies.
So how can SOA be a differentiator? Well, here’s a recent case study that was published. Yes, it’s a classic problem of getting that single view of the customer in the hands of the people that need it most. What’s great about this case study it clearly shows the difference between the monolithic application world (they had to have the Data Warehouse operators generate an Excel-based extract) to the world of SOA (the applications they use pulled it from the data warehouse directly via services). Break down those application walls, and start viewing things in a bigger picture. While the hype is going down, there are still plenty of opportunities available. Keep in mind, however, that the business strategy should be the driver of where your SOA efforts are focused. Kudos to the person at Helzberg that recognized that they needed to make customer information available at any store. That was the business need, and it gave them an excellent opportunity to leverage SOA appropriately.
EDA begins with events
Joe McKendrick asks, “Is EDA the ‘new’ SOA?” First, I’ll agree with Brenda Michelson that EDA is an architecture that can effectively work in conjunction with SOA. While others out there view EDA as part of SOA, I think a better way of viewing it would be that services and events must both be part of your technology architecture.
The point I really want to make however, which expounds on my previous post, is that I simply think event-oriented thinking is the exception, rather than the norm for most businesses. I’m not speaking about events in the technical sense, but rather, in the business sense. What businesses are truly event driven, requiring rapid response to change? Certainly, the airlines do, as evidenced by JetBlue’s recent difficulties. There are some financial trading sectors that must operate in real-time, as well. What about your average retail-focused company, however? Retail thinking seems to be all about service-based thinking. While you may do some cold calls, largely, sales happen when someone walks into the store, goes to the website, or calls on the phone. It’s a service-based approach. They ask, you sell. What are the events that should be monitored that would trigger a change in the business? For companies that are not inherently event-driven, the appropriate use of events are for collecting information and spotting trends. Online shopping can be far more informative for a company than brick-and-mortar shopping because you’ve got the clickstream trail. Even if I don’t buy something, the company knows what I entered in the search box, and what products I looked at. If I walk into Home Depot and don’t purchase anything, is there any record of why I came into the store that day?
Again, how do we begin to go down the direction of EDA? Let’s look at an event-driven system. The October 2006 issue of Business 2.0 had a feature on Steve Sanghi, CEO of Microchip Technology. The article describes how he turned around Microchip by focusing on commodity processors. As an example, the articles states that Intel’s automotive-chip division was pushing for “a single microprocessor near the engine block to control the vehicle’s subsystems and accessories.” Microchip’s approach was “to sprinkle simpler, cheaper, lower-power chips throughout the vehicle.” Guess what, today’s cars have about 30 micro-controllers.
So, what this says is that the appropriate event-based architecture is to have many, smaller points of control that can emit information about the overall system. This is the way that many systems management products work today- think SNMP. To be appropriate for the business, however, this approach needs to be generating events at the business level. Look at the applications in your enterprise’s portfolio and see how many of them actually publish any sort of data on how it is being used, even if it’s not in real time. We need to begin instrumenting our systems and exposing this information for other purposes. Most applications are like the checkout counter at Home Depot. If I buy something, it records it. If I don’t buy something and just exit the store, what valuable information has been missed that could improve things the next time I visit?
I’d love to see events become more mainstream, and I fully believe it will happen. I certainly won’t argue that event-driven systems can be more loosely coupled, however, I’ll also add that the events we’re talking about then are not necessarily the same thing as business events. Many of those things will never be exposed outside of IT, nor should they be. It’s the proper application of business events that will drive companies opening up their wallets to purchase new infrastructure built around that concept.
What’s the big deal about BPEL
Courtesy of this news release from InfoWorld, I found out that Microsoft Windows Workflow Foundation (WWF, which has nothing to do with endangered animals or professional wrestlers) is going to support BPEL. This is a good thing, but what does it really mean?
I’ll admit that I’ve never gotten very excited about BPEL. My view has always been that it’s really important as an import/export format. You should have a line item on your RFI/RFP that says, “supports import/export of BPEL.” You should ask the sales guy to demonstrate it during a hands-on section. Beyond this, however, what’s the big deal?
The BPM tools I’ve seen (I’ll admit that I haven’t seen them all nor even a majority of them) all have a nice graphical editor where you drag various shapes and connectors around, and they probably have some tree-like view where you draw lines between your input structure and your output structure. At best, you may need to hand code some XPath and some very basic expressions. The intent of this environment is to extract the “business process” from the actual business services where the heavy duty processing occurs. If you sort through the marketing hype, you’ll understand that this is all part of a drive to raise the level of abstraction and allow IT systems to be leveraged more efficiently. While we may not be there yet, the intent is to get tools into the hands of the people driving the requirements for IT- the business. Do you want your business users firing up XML Spy and spending their time writing BPEL? I certainly don’t.
What are the important factors that we should be concerned about with our BPM technologies, then? Repeating a common theme you’ve seen on this blog, it’s the M: Management. No one should need to see BPEL unless you’re migrating from one engine to another. There shouldn’t be a reason to exchange BPEL between partners, because it’s an execution language. Each partner executes their own processes, so the key concern is the services that allow them to integrate, not the behind the scenes execution. What is important is seeing the metrics associated with the execution of the processes to gain an understanding of the bottlenecks that can occur. You can have many, many moving parts in an orchestration. Your true business process (that’s why it was in quotes earlier) probably spans multiple automated processes (where BPEL applies), multiple services, multiple systems, and multiple human interactions. Ultimately, the process is judged by the experiences of the humans involved, and if they start complaining, how do you figure out where the problems are? How do you understand how other forces (e.g. market news, company initiatives, etc.) influence the performance of those processes. I’d much rather see all of the vendors announcing support for BPEL begin announcing support for some standard way of externalizing the metrics associated with process execution for a unified business process management view of what’s occurring, regardless of the platforms where everything is running, or how many firewalls need to be traversed.
IT Conversations Podcast
Ed Vazquez and I were invited to join Phil Windley and Scott Lemon on Phil’s Technometria series within IT Conversations for a discussion on SOA, with quite a bit of detail on SOA Governance. Give it a listen and feel free to follow up with any questions.
You can download the podcast here.
More consolidation: Cisco to acquire Reactivity
The consolidation in the vendor space continues. From tmcnet.com:
Cisco Systems Inc., a leading supplier of Internet network equipment, has announced its intention to acquire Reactivity Inc., a provider of gateway solutions that simplify web services management. The agreement, which is subject to the standard closing conditions, states that Cisco will pay $135 million and assumed options of Reactivity.
Jeff will need to update his scorecard. This certainly puts two big boys head-to-head in this space with IBM previously having acquired DataPower. Cisco has been very quiet for some time regarding its AON technology, so I’m wondering how this acquisition impacts that effort.
The more important question, however, is what this means for the bigger picture. I previously posted on the need for an open, integrated world between service management, service connectivity, and service hosting. Unfortunately, these acquisitions may point more toward single-vendor, proprietary integrations, rather than open solutions. Will the Cisco version of Reactivity still partner with companies like AmberPoint, or will the management focus now shift all to Cisco technologies, starting with the Application Control Engine mentioned in Cisco’s press release? What’s been happening with the Governance Interoperability Framework now that HP owns Systinet? IBM’s Registry/Repository solution doesn’t support UDDI and has a new API.
While bigger fish eating the smaller fish is way things frequently work with technology startups, these players need to keep in mind the principles of SOA. Their products should be providing services to enterprise, and those services should be accessible in an open, standards-based way. Just as the business doesn’t want a bunch of packaged business applications that require redundant data and can’t talk to each other, IT doesn’t want a bunch of infrastructure that requires redundant management capabilities with only proprietary APIs to work with. We need SOA for IT, and any vendor selling in this space should be making that a reality, not making it worse.