Service-Oriented Architecture

From Jonathan Gardner's Tech Wiki

Abstract

A short investigation of the Service-Oriented Architecture and some best practices.

Introduction

Typically, web applications are written as one giant program or set of templates, for example in PHP. The web app does every task, from processing incoming requests to accessing and modifying data in the database to rendering the HTML response. The problem with this model is that it doesn't scale well: it becomes very difficult to separate the components so that they can be scaled individually.

The solution to this problem is to build the website using the principles of a Service-Oriented Architecture.

Overview

A website built using the Service-Oriented Architecture consists of multiple processes running on multiple machines. Each of these processes is a service, meaning that it provides a limited number of features.

Some services are actually web servers. These take incoming HTTP requests and pass along work to other services. The results are aggregated and returned in the resulting HTML response.

Some services are data services. These handle mundane requests such as creating, modifying, and destroying certain objects. They also provide information through queries. These kinds of services are really simple wrappers on top of a database, but the abstraction is useful because the database behind the service can be changed or replaced without affecting the service's clients.

Other services execute complicated business rules. These will talk to other business services and data services to do what they are asked to do.

All of the services use some protocol to communicate with one another. Ideally, there is only one protocol that will satisfy all the needs of any client, present or future. In reality, services eventually provide multiple interfaces through multiple protocols.

Each service publishes an interface, a document that describes what methods the service provides and how they are provided. It also describes what each method should do.
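As a rough illustration of a published interface, the sketch below declares a hypothetical `CartService` as a Python abstract class and derives a machine-readable description from its signatures and docstrings. The class, its methods, and the `describe` helper are all invented for this example; a real service might publish something quite different.

```python
import inspect
from abc import ABC, abstractmethod


class CartService(ABC):
    """Hypothetical interface for a shopping-cart service."""

    @abstractmethod
    def add_item(self, cart_id: str, sku: str, quantity: int) -> None:
        """Add `quantity` units of `sku` to the cart."""

    @abstractmethod
    def get_items(self, cart_id: str) -> dict:
        """Return a mapping of sku to quantity for the cart."""


def describe(interface: type) -> dict:
    """Build a publishable description: method name -> signature and docs."""
    description = {}
    for name, member in inspect.getmembers(interface, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        description[name] = {
            "signature": str(inspect.signature(member)),
            "doc": inspect.getdoc(member),
        }
    return description
```

Calling `describe(CartService)` yields a plain dictionary that could be serialized and served alongside the service, so that both people and programs can read what methods exist and what each one should do.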

Orthogonality

"Orthogonal" means that two lines are at right angles (90°) to each other. Orthogonality is a property of certain coordinate systems. As you may recall from vector calculus, a coordinate system is defined by a set of basis vectors with which every point in space can be described. An orthogonal coordinate system is one in which every basis vector is at right angles to all the other basis vectors. In other words, no basis vector has any component along another basis vector. The benefit of this is that each coordinate of a point is independent of the others: moving along one axis changes only that one coordinate.

Orthogonality in terms of service interface design says that if you want to do a certain task, there is pretty much only one set of operations that will get it done and that each operation is as simple as it can get. For instance, you don't have one operation to add items to the cart and total the amount of each of the items and also calculate shipping, and another that does all that and then calculates the tax as well. Instead, you should have operations to add items to a cart, operations to calculate the total of the items in the cart, operations to calculate shipping, and operations to calculate taxes, each separate.

It is important that you keep your services as orthogonal as possible. This lets you predict what your clients will do and keeps the interfaces as simple as possible. If you are building an orthogonal system, then you can easily take an existing service and break it down into simpler services, since each method is orthogonal to the others. In my previous example, this means that you could break out the methods that are used to calculate shipping into an entirely separate service.
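The cart example can be sketched as four independent operations. This is only an illustration: the function names, the flat shipping rate, and the tax rate are assumptions made up for the example, and the cart is a plain dictionary rather than a real service.

```python
from decimal import Decimal

# Hypothetical flat rates, chosen only for illustration.
SHIPPING_PER_ITEM = Decimal("1.50")
TAX_RATE = Decimal("0.08")


def add_item(cart: dict, sku: str, price: Decimal, quantity: int = 1) -> None:
    """Add items to the cart; this operation does nothing else."""
    _, old_quantity = cart.get(sku, (price, 0))
    cart[sku] = (price, old_quantity + quantity)


def subtotal(cart: dict) -> Decimal:
    """Total the price of the items in the cart."""
    return sum((price * qty for price, qty in cart.values()), Decimal("0"))


def shipping(cart: dict) -> Decimal:
    """Calculate shipping from the number of items only."""
    return SHIPPING_PER_ITEM * sum(qty for _, qty in cart.values())


def tax(amount: Decimal) -> Decimal:
    """Calculate tax on an amount, rounded to cents."""
    return (amount * TAX_RATE).quantize(Decimal("0.01"))


cart = {}
add_item(cart, "book", Decimal("10.00"), 2)
add_item(cart, "pen", Decimal("2.50"))

# The client composes the operations it needs; no operation repeats another.
order_total = subtotal(cart) + shipping(cart) + tax(subtotal(cart))
```

Because `shipping` depends on nothing but the cart contents, it could later be moved behind a separate service without touching the other three functions.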

Writing Web Apps

In order to write web apps in the Service-Oriented Architecture way, it is important to begin abstracting at points you normally wouldn't.

The Database

The first step is to abstract the interactions with the database. Libraries such as SQLAlchemy do a good job of creating objects that can be used to access the data. However, this is not enough. You want to create a limited set of functions with a simple interface that supports every transaction you need to perform. These functions should work with a limited set of objects that represent concepts within the database.

The set of functions and objects and their behavior defines the interface of this service. Even though it doesn't live in a separate process (yet), the abstraction is an important step forward. Already, you can make major changes to the database and not have to change any surrounding code as long as the interface doesn't change.
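A minimal sketch of such a data layer follows. To stay self-contained it uses the standard-library `sqlite3` module instead of SQLAlchemy, and the `User` object, the `UserStore` class, and the `users` table are all hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str


class UserStore:
    """Hypothetical data proto-service: the only code that touches the database."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def create_user(self, name: str) -> User:
        with self._db:  # commits on success, rolls back on error
            cursor = self._db.execute(
                "INSERT INTO users (name) VALUES (?)", (name,)
            )
        return User(id=cursor.lastrowid, name=name)

    def get_user(self, user_id: int) -> Optional[User]:
        row = self._db.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return User(*row) if row else None

    def rename_user(self, user_id: int, name: str) -> None:
        with self._db:
            self._db.execute(
                "UPDATE users SET name = ? WHERE id = ?", (name, user_id)
            )

    def delete_user(self, user_id: int) -> None:
        with self._db:
            self._db.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

Callers see only `User` objects and four methods. The table layout, or the database engine itself, could change without any caller noticing, provided these methods keep behaving the same way.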

The Business Logic

The next step is to abstract the business logic into a set of simple functions with a defined interface. These functions may use the same objects that the data abstraction layer uses, or they may have their own objects. Either way, the business logic layer is defined separately from the data abstraction layer.
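As one possible illustration, the sketch below keeps a business rule (no overdrafts) in its own function, which reaches the data only through the data layer's interface. `InMemoryAccounts`, `InsufficientFunds`, and `transfer` are hypothetical names; the in-memory class merely stands in for a real data abstraction layer.

```python
class InMemoryAccounts:
    """Stand-in for a data abstraction layer (hypothetical)."""

    def __init__(self) -> None:
        self._balances = {}

    def get_balance(self, account_id: str) -> int:
        return self._balances.get(account_id, 0)

    def set_balance(self, account_id: str, cents: int) -> None:
        self._balances[account_id] = cents


class InsufficientFunds(Exception):
    """Raised when a transfer would overdraw the source account."""


def transfer(accounts, source: str, target: str, cents: int) -> None:
    """Business rule: amounts must be positive and may not overdraw."""
    if cents <= 0:
        raise ValueError("transfer amount must be positive")
    if accounts.get_balance(source) < cents:
        raise InsufficientFunds(source)
    # Only the data layer's interface is used; no storage details leak in here.
    accounts.set_balance(source, accounts.get_balance(source) - cents)
    accounts.set_balance(target, accounts.get_balance(target) + cents)
```

Since `transfer` depends only on `get_balance` and `set_balance`, the data layer behind it can be swapped for a remote service stub later without changing the rule itself.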

Scaling

When you need to scale, it should be relatively easy to see how often each method in each of the proto-services is being called, and to measure how long each call takes to run. Looking at this data should give you ideas about which pieces should be broken into a separate service first. As you break the services out, you replace the methods with stubs that call out to the right service and wait for the result. Of course, you will want to do a bit of parallelization here, rather than tying up valuable resources waiting for something that could take a while (in computer time).
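One simple way to collect call counts and timings is a decorator applied to each proto-service method. The sketch below is an assumption about how this might be done, not a prescribed tool: `STATS`, `measured`, `report`, and `calculate_shipping` are hypothetical names, and the bookkeeping is not thread-safe.

```python
import time
from collections import defaultdict
from functools import wraps

# Maps function name -> [call count, total seconds spent].
STATS = defaultdict(lambda: [0, 0.0])


def measured(func):
    """Record how often `func` is called and how long it runs in total."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = STATS[func.__qualname__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start

    return wrapper


@measured
def calculate_shipping(item_count: int) -> float:
    return 1.5 * item_count


def report() -> list:
    """Return (name, [calls, seconds]) pairs, slowest in total first."""
    return sorted(STATS.items(), key=lambda kv: kv[1][1], reverse=True)


for n in range(3):
    calculate_shipping(n)
```

The methods at the top of `report()` are the natural first candidates for being broken out into their own service.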

As you scale, you will still be dealing with the fundamental problem of how much work can be done in any unit of time. However, you will have the advantage of being able to add more hardware to a particular set of features, or even completely redesign how those tasks are done, without interfering with other services and clients. This is the goal of the Service-Oriented Architecture: a system in which scaling isn't a difficult task and the abstraction is sufficient that redesigns and reworkings can be accomplished with relative ease.
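The stubs and parallelization mentioned in this section could be sketched with a thread pool. Everything here is hypothetical: the stub names and return values are invented, and `time.sleep` stands in for the network round trip a real stub would make to a remote service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def shipping_stub(cart_id: str) -> float:
    """Hypothetical stub; a real one would send a request to a shipping service."""
    time.sleep(0.1)  # simulate network latency
    return 4.50


def tax_stub(cart_id: str) -> float:
    """Hypothetical stub; a real one would send a request to a tax service."""
    time.sleep(0.1)  # simulate network latency
    return 1.80


def checkout_summary(cart_id: str) -> dict:
    """Issue both remote calls at once instead of waiting for each in turn."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        shipping = pool.submit(shipping_stub, cart_id)
        tax = pool.submit(tax_stub, cart_id)
        return {"shipping": shipping.result(), "tax": tax.result()}
```

With both calls in flight together, the caller waits roughly as long as the slower call rather than the sum of the two.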

Protocols

Which protocol should be used? My ideal protocol has these attributes:

  1. Simple for the computer to encode and interpret. If encoding and decoding the messages isn't trivial, you will pay a price on every transaction. Also, the simpler the protocol, the easier it is to implement in different languages and environments.
  2. Ideal packet size. Each message should fit in about one IP packet. If we can do this, then a whole world of opportunity opens up. If the messages are so small that most of each packet is left unused, then the encoding is more compact than it needs to be. If they are so big that they need more than one packet, then you will run into additional networking issues such as fragmentation.
  3. Discovery. Ideally, you should be able to plug in a service into a network and have clients find it without any operator intervention.
  4. Published interfaces. Ideally, the interface for a service should be published as part of the service. This means people can read the documentation on the live service, and programs can see what methods are available and how they work, and raise errors on the client side when the service is being used improperly.
  5. Versioning. The protocol should allow the service to be upgraded seamlessly. For instance, if you need to deprecate an older version, you should be able to tell clients this so that they know they need to upgrade. It should be possible to have two or more versions of a service running at the same time, on the same machine or separate machines. If an upgrade is available and compatible, then the clients should immediately begin using that and not the older service.
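Two of these attributes, packet size and versioning, lend themselves to a small sketch. The code below is illustrative only: the JSON message layout, the 1400-byte budget (an assumed margin under the common 1500-byte Ethernet MTU), and the compatibility rule (same major version, equal or newer minor version) are all assumptions, not part of any particular protocol.

```python
import json
from typing import Optional

# Assumed budget: a typical 1500-byte Ethernet MTU minus room for headers.
MAX_PAYLOAD_BYTES = 1400


def encode_request(method: str, params: dict, version: tuple) -> bytes:
    """Encode a request compactly and refuse anything over one packet."""
    message = {"v": list(version), "method": method, "params": params}
    data = json.dumps(message, separators=(",", ":")).encode("utf-8")
    if len(data) > MAX_PAYLOAD_BYTES:
        raise ValueError(
            f"request is {len(data)} bytes; limit is {MAX_PAYLOAD_BYTES}"
        )
    return data


def pick_version(client: tuple, offered: list) -> Optional[tuple]:
    """Pick the newest offered version that the client is compatible with.

    Assumed rule: same major number, and a minor number at least the client's.
    """
    compatible = [v for v in offered if v[0] == client[0] and v[1] >= client[1]]
    return max(compatible, default=None)
```

Given a client built against version `(1, 2)` and services offering `(1, 1)`, `(1, 4)`, and `(2, 0)`, `pick_version` selects `(1, 4)`: the client moves to the compatible upgrade straight away, while the incompatible `(2, 0)` can run side by side without affecting it.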