Monday, November 9, 2009

Synchronization Architectures

Enterprise data synchronization is not a straightforward task. Being able to communicate with a variety of enterprise systems over wireless and wireline networks to an assortment of client applications requires some flexibility at the architectural level. Considering that there are half a dozen mobile operating systems, with at least that many networks, and even more back-end systems that have to be accessed, the synchronization layer is often one of the most complex components of a complete smart client application.

After you take a look at the most common synchronization architecture, along with common synchronization topologies and data access methodologies, you will have a good foundation from which you can make informed decisions on implementing data synchronization within your mobile solution.

Architecture Overview
Synchronization is most often implemented in a distributed computing architecture with a client layer, a middle-tier layer, and an enterprise data layer. Each layer can be implemented using varying techniques, all aimed at accomplishing the same goal: providing a way to extend enterprise data to a variety of mobile devices.

We are going to take a closer look at each synchronization component and how it contributes to the overall synchronization process.


Figure: Synchronization architecture.

Client
When synchronizing data between an enterprise server and a persistent data store on the client device, a synchronization layer must be present to manage the two-way data communication (see the figure above). Ideally, this layer will have a minimal impact on your client application, while still providing a simple, easy-to-use client API for controlling the synchronization process. By implementing a modular, self-contained synchronization layer, you can control the entire synchronization process with little interaction from the client application. In some cases, all that is required from the client is the invocation of the synchronization process; the synchronization layer does everything else from there.
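To make the idea of a self-contained client layer concrete, here is a minimal sketch in Python. The class name `SyncClient`, its methods, and the `fake_transport` function are all hypothetical names invented for illustration; they do not come from any particular synchronization product. The transport is a stand-in for the network call to the synchronization middleware.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SyncClient:
    """Hypothetical client-side synchronization layer (illustration only)."""
    # The transport takes the pending local changes and returns the
    # changes the server wants the client to apply.
    transport: Callable[[List[dict]], List[dict]]
    local_store: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def record_change(self, key, value):
        """Write to the local store and remember the change for upload."""
        self.local_store[key] = value
        self.pending.append({"key": key, "value": value})

    def synchronize(self) -> int:
        """Upload pending changes, apply downloaded ones, return how many were applied."""
        incoming = self.transport(list(self.pending))
        self.pending.clear()
        for change in incoming:
            self.local_store[change["key"]] = change["value"]
        return len(incoming)


def fake_transport(uploaded):
    """Stand-in for the network: ignores the upload and returns one server change."""
    return [{"key": "order-7", "value": "shipped"}]


client = SyncClient(transport=fake_transport)
client.record_change("order-9", "new")
applied = client.synchronize()  # the only call the application has to make
```

The application only records its changes and calls `synchronize()`; everything about transport and applying server changes stays inside the layer.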

Because so many client devices are on the market, the synchronization client must have support for the leading mobile devices, including laptops, Windows CE devices, Palm OS devices, Symbian OS devices, as well as specialized devices with add-on features such as barcode scanners and other industrial components. Each of these devices can have a different mobile operating system with different network protocol support. The synchronization layer on the client takes care of the network communication from the device back to the synchronization middleware.

Middleware
The synchronization server is where most of the synchronization logic is contained. The figure above illustrates the role of the synchronization server in relation to the other components of the synchronization architecture. This server is responsible for communicating with the client application to send and receive required data packets. In order to do this, it has to be able to communicate over a protocol that the client application understands. Most of the time, this protocol is IP-based, and often is HTTP. When the synchronization server receives the data from the client, it then has to execute the synchronization logic to determine how this data is transferred into the enterprise data source.

Many of the advanced synchronization features are implemented within the synchronization server. Some of these features include data subsetting, conflict detection and resolution, data transformation, data compression, and security. All of these features have to be implemented while still maintaining server performance and scalability.
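As one illustration of conflict detection and resolution, the sketch below uses a per-row version number: a client change is accepted only if it was based on the version the server currently holds. This is one common approach, not necessarily what any given synchronization server does. The function name, the row layout, and the two resolution policies are assumptions made for the example.

```python
def apply_client_change(server_rows, change, resolve="server_wins"):
    """Apply one client change to server_rows and report what happened.

    server_rows maps key -> {"value": ..., "version": int}.
    change is {"key": ..., "value": ..., "base_version": int}, where
    base_version is the row version the client last downloaded.
    """
    row = server_rows.get(change["key"])
    current_version = row["version"] if row else 0

    if change["base_version"] == current_version:
        # No one else changed the row since the client read it.
        server_rows[change["key"]] = {
            "value": change["value"],
            "version": current_version + 1,
        }
        return "applied"

    # Conflict: the row changed on the server after the client read it.
    if resolve == "client_wins":
        server_rows[change["key"]] = {
            "value": change["value"],
            "version": current_version + 1,
        }
        return "conflict_took_client"
    return "conflict_kept_server"


server_rows = {"cust-1": {"value": "Acme", "version": 3}}

# First client edits version 3: accepted, row moves to version 4.
first = apply_client_change(
    server_rows, {"key": "cust-1", "value": "Acme Corp", "base_version": 3})

# Second client also edited version 3: detected as a conflict.
second = apply_client_change(
    server_rows, {"key": "cust-1", "value": "ACME", "base_version": 3})
```

A real server would typically offer more policies (for example, merging by column or deferring to a custom handler), but the detection step usually reduces to a comparison like this one.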

Two common synchronization server implementations exist: as a standalone server application, or as Java servlets running in a servlet engine. Both of these methods have benefits and drawbacks. The standalone synchronization server is convenient because it does not require any additional software to execute. These servers are usually written in C and take advantage of OS-level calls, which leads to enhanced performance. This also means that the server has to be available for the operating system to which you are deploying, or you are out of luck. In terms of scalability and availability, the server can either have its own built-in load-balancing and failover mechanisms or use third-party load-balancing solutions, such as the hardware-based systems provided by Cisco Systems.

For Java servlet-based synchronization servers, you will need a servlet engine for deployment. Since J2EE application servers are now commonplace in most organizations, this requirement does not usually pose a problem. By using an outside servlet engine, the performance, scalability, and availability of the synchronization server now rely on the capabilities of the application server/servlet engine being used. The same goes for the server operating systems that are supported; that is, as long as the servlet engine works on a given platform, the synchronization servlet should work as well. That said, you should give extra consideration to the synchronization vendor's supported platforms and recommended application servers when deciding which application server and operating system to use.

Enterprise Integration
The final part of a complete synchronization solution is the enterprise integration layer. While this layer is often part of the synchronization server, we are discussing it separately because it provides different functionality. The enterprise integration layer enables you to communicate with various back-end data sources. If you are using a commercial mobile relational database on the client, you will most likely have integration to enterprise relational databases on the server using ODBC, JDBC, or native drivers.

In addition to providing integration to relational databases, you may also require access to other forms of enterprise data, such as ERP systems, CRM systems, or XML data. If this is the case, you will have to look for additional enterprise adapters for the solution you are implementing; or if you have the resources, you can create your own adapter.
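If you do write your own adapter, the usual shape is a small common interface that the synchronization server calls, with one implementation per back-end. The sketch below assumes a hypothetical `EnterpriseAdapter` interface with a single `read_rows` method; the two adapters (one over XML text, one over an in-memory stand-in for a CRM system) are invented for illustration.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class EnterpriseAdapter(ABC):
    """Hypothetical interface the synchronization server would call."""

    @abstractmethod
    def read_rows(self, entity: str) -> list:
        """Return all records of the given entity as dictionaries."""


class XmlAdapter(EnterpriseAdapter):
    """Reads records from XML elements, using their attributes as columns."""

    def __init__(self, xml_text: str):
        self.root = ET.fromstring(xml_text)

    def read_rows(self, entity: str) -> list:
        return [dict(element.attrib) for element in self.root.iter(entity)]


class InMemoryCrmAdapter(EnterpriseAdapter):
    """Stand-in for a CRM system; a real adapter would call the CRM's API."""

    def __init__(self, records: dict):
        self.records = records

    def read_rows(self, entity: str) -> list:
        return list(self.records.get(entity, []))


def collect(adapters, entity):
    """Gather one entity from every configured back-end, in adapter order."""
    rows = []
    for adapter in adapters:
        rows.extend(adapter.read_rows(entity))
    return rows


xml_source = '<data><customer id="1" name="Acme"/><customer id="2" name="Globex"/></data>'
crm_source = {"customer": [{"id": "3", "name": "Initech"}]}

customers = collect([XmlAdapter(xml_source), InMemoryCrmAdapter(crm_source)], "customer")
```

Because the server only depends on the interface, adding a new back-end means adding one class rather than changing the synchronization logic.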

Publish/Subscribe Model
One of the most common models of data access being used by synchronization solutions today is the publish/subscribe model, so it is worthwhile to explore how it works.

The publish/subscribe synchronization model is based on the concept of having a master copy of your data, the publisher, and one or more copies of this data, the subscriber(s). The data between the publisher and subscribers is updated periodically to keep the data consistent. The update process, or synchronization, is bidirectional, allowing the publisher to update the subscriber data, and vice versa.

The publisher is responsible for defining which tables (or subsets of those tables) are available for external access. The data sets that are defined as being available are called publications. The application clients that are interested in having access to that information are called subscribers. They define which data they want access to using subscriptions. It is possible to have a number of publications defining the available data. These publications can have parameters, which makes it possible to use different subsets of data for different users. In this way, one publication can be used to send a different subset of data to each subscriber.

The figure below shows an example of a field service application's data set. In this diagram, you can see that the publisher has the complete set of master data. This data contains all of the information for each field service representative. In this example, the field service representatives are the subscribers. They require only the subset of the publisher's data that pertains to their territory, so rather than keeping a copy of the entire set of data, they have a subset of data that is based on their individual subscriptions. In this way, you can easily define which data is available for mobile users and which parts of the data need to be synchronized down to the mobile devices.


Figure: Publish/subscribe data synchronization.
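The field service example can be sketched with a parameterized publication. The snippet below uses Python's built-in `sqlite3` module as a stand-in for the publisher database; the table name `work_orders`, the publication name `orders_by_territory`, and the sample rows are all made up for illustration.

```python
import sqlite3

# The publisher holds the complete master data set.
publisher = sqlite3.connect(":memory:")
publisher.execute(
    "CREATE TABLE work_orders (id INTEGER PRIMARY KEY, territory TEXT, task TEXT)")
publisher.executemany(
    "INSERT INTO work_orders VALUES (?, ?, ?)",
    [
        (1, "north", "Replace meter"),
        (2, "south", "Inspect line"),
        (3, "north", "Install sensor"),
    ],
)

# A publication is a named, parameterized query over the master data.
PUBLICATIONS = {
    "orders_by_territory": (
        "SELECT id, territory, task FROM work_orders "
        "WHERE territory = :territory ORDER BY id"
    ),
}


def rows_for_subscription(conn, publication, params):
    """Return the subset of published rows that one subscription selects."""
    return conn.execute(PUBLICATIONS[publication], params).fetchall()


# Each representative subscribes with their own territory as the parameter.
north_rows = rows_for_subscription(
    publisher, "orders_by_territory", {"territory": "north"})
south_rows = rows_for_subscription(
    publisher, "orders_by_territory", {"territory": "south"})
```

One publication definition serves every representative; only the parameter value differs from subscriber to subscriber.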

Common Synchronization Configurations
A variety of network and data management configurations are commonly used with enterprise synchronization. Synchronization configurations (or topologies) are arrangements of publisher and subscriber databases that transmit data to one another. The publish/subscribe model of data access supports both peer-to-peer and hierarchical configurations.

The more common of the two is the hierarchical configuration, wherein every database has a single parent database, except for the master enterprise database, which has no parent. A diagram of common hierarchical layouts can be seen in the figure below. In this diagram are two hierarchical configurations; the configuration on the right is commonly referred to as a hub-and-spoke configuration. In both cases, hierarchical configurations are good when a publisher needs to publish data to a large number of subscribers, or, in other words, when there are a large number of remote workers who need to have access to the enterprise data. In this configuration, the master database contains all of the changes made by any of the remote databases. Those updates can then be propagated to the other remote databases during their next synchronization.


Figure: Hierarchical database configurations.
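The hub-and-spoke behavior described above, where the master collects every remote change and hands it to the other remotes on their next synchronization, can be sketched as follows. The class names and the change-log design are assumptions for the example, and conflict handling is deliberately left out: changes are simply replayed in the order the master received them.

```python
class RemoteDatabase:
    """A spoke: holds local data plus the changes not yet sent to the master."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []   # local changes waiting for the next synchronization
        self.cursor = 0     # how much of the master's log this remote has seen

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))


class MasterDatabase:
    """The hub: an ordered log of every change made by any remote."""

    def __init__(self):
        self.log = []

    def synchronize(self, remote):
        # Upload: append the remote's pending changes to the master log.
        self.log.extend(remote.pending)
        remote.pending = []
        # Download: replay every log entry the remote has not seen yet,
        # in log order, so the remote ends up matching the master's state.
        for key, value in self.log[remote.cursor:]:
            remote.data[key] = value
        remote.cursor = len(self.log)


master = MasterDatabase()
alice = RemoteDatabase("alice")
bob = RemoteDatabase("bob")

alice.write("job-1", "done")
master.synchronize(alice)   # the master now holds alice's change
before = dict(bob.data)     # bob has not synchronized yet
master.synchronize(bob)     # bob receives it on his next synchronization
```

Note that `bob` never talks to `alice` directly; the change reaches him only through the master, which is what makes the configuration hierarchical.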

Unlike the hierarchical configuration, peer-to-peer configurations do not have a single common enterprise data store. In the figure, you can see that peer-to-peer configurations enable each database to contain the same information, with no single database acting as a central server. In this configuration, you can still use the publish/subscribe model, but the data updates are not going to be propagated to all databases in the configuration. For this reason, peer-to-peer configurations are best suited for situations in which only two systems need to be synchronized, or when it is not important for each remote user to have the updates from all other users. In general, a peer-to-peer configuration is not recommended, because there is no central server responsible for collecting updates from the remote users and propagating those changes to the other users in the system. Other difficulties of peer-to-peer configurations include maintaining data integrity, implementing conflict detection, and programming synchronization logic.
