Solutions for Developers
"What exactly is distributed caching? How can it help me build applications that deliver high performance on server farms?" You may be asking yourself these questions as you develop your next server-farm application. Let's take a quick dive into distributed caching from a developer's point of view and see what it's all about.
Eliminating the Scalability Bottleneck
You already know that server farms can deliver big performance gains, and do so cost-effectively, by using industry-standard servers. For example, say you are running a Web application on one server, and that server begins to max out. If you add a second server running another copy of your application, you can ideally double the throughput. This means that you will be able to handle twice the load without increasing response times for Web users. As you keep adding servers, you should be able to keep scaling throughput.
The trick, of course, is to make sure that adding servers does not also create bottlenecks that keep throughput from scaling linearly. The problem is that the servers may have to share a resource, such as a database server, and this shared resource may become overloaded. Web apps often use a database server to store session-state so that it can be accessed across the farm. They may also repeatedly retrieve popular data, such as product descriptions, or they may create application-wide data, such as "top 10" lists, schedules, or interim stock trading results, all of which have to be accessible to all servers. It is critical to avoid creating a storage bottleneck when storing and accessing this data.
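To make the bottleneck concrete, here is a minimal back-of-the-envelope sketch. The model and every number in it are illustrative assumptions, not measurements: each Web server is assumed to handle a fixed number of requests per second, each request is assumed to make a fixed number of database calls, and the shared database is assumed to have a fixed capacity. Farm throughput is then the smaller of what the servers can serve and what the database can support.

```csharp
using System;

public static class ThroughputModel
{
    // Hypothetical model: throughput is capped by the shared database.
    public static int Effective(int servers, int perServerRps,
                                int dbCallsPerSecond, int dbCallsPerRequest)
    {
        int farmCapacity = servers * perServerRps;
        int databaseCap = dbCallsPerSecond / dbCallsPerRequest;
        return Math.Min(farmCapacity, databaseCap);
    }
}

public static class ThroughputDemo
{
    public static void Main()
    {
        // Made-up figures: 100 requests/s per server, 2 database calls
        // per request, and a database that sustains 600 calls/s.
        for (int servers = 1; servers <= 5; servers++)
        {
            int rps = ThroughputModel.Effective(servers, 100, 600, 2);
            Console.WriteLine(servers + " server(s): " + rps + " requests/s");
        }
        // Throughput climbs 100, 200, 300 and then stays at 300:
        // the fourth and fifth servers add nothing.
    }
}
```

Under these assumptions, scaling stops at three servers no matter how many more you add, which is the kind of ceiling that the shared resource imposes.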
This is where distributed caching can step in and help you avoid storage bottlenecks. By storing data in a scalable, in-memory cache that spans the whole server farm, you can be sure that access throughput scales with the farm. Since the distributed cache makes data uniformly available to all servers, you don't have to implement a mechanism for passing data between servers. And the distributed cache automatically creates replicas and handles server failures to keep data safe.
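As a rough illustration of how a shared cache takes load off the database for popular data such as product descriptions, consider the following sketch. It is not ScaleOut's API: SharedCacheStandIn is a hypothetical single-process dictionary standing in for the farm-wide cache, and ProductCatalog and its database delegate are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a farm-wide cache; a real distributed cache
// would hold these entries in memory across all servers.
public class SharedCacheStandIn
{
    private readonly Dictionary<string, string> _entries =
        new Dictionary<string, string>();

    public bool TryGet(string key, out string value)
    {
        return _entries.TryGetValue(key, out value);
    }

    public void Put(string key, string value)
    {
        _entries[key] = value;
    }
}

// Hypothetical per-server component that reads product descriptions.
public class ProductCatalog
{
    private readonly SharedCacheStandIn _cache;
    private readonly Func<string, string> _loadFromDatabase;

    public int DatabaseReads { get; private set; }

    public ProductCatalog(SharedCacheStandIn cache,
                          Func<string, string> loadFromDatabase)
    {
        _cache = cache;
        _loadFromDatabase = loadFromDatabase;
    }

    public string GetDescription(string productId)
    {
        // Serve from the shared cache when the entry is already there.
        string cached;
        if (_cache.TryGet(productId, out cached))
            return cached;

        // Cache miss: read the database once, then share the result.
        DatabaseReads++;
        string description = _loadFromDatabase(productId);
        _cache.Put(productId, description);
        return description;
    }
}

public static class CacheAsideDemo
{
    public static void Main()
    {
        var cache = new SharedCacheStandIn();
        // Two catalog instances play the role of two servers sharing one cache.
        var serverA = new ProductCatalog(cache, id => "Description of " + id);
        var serverB = new ProductCatalog(cache, id => "Description of " + id);

        Console.WriteLine(serverA.GetDescription("widget-1"));
        Console.WriteLine(serverB.GetDescription("widget-1"));
        // Only the first request reached the database: prints 1.
        Console.WriteLine(serverA.DatabaseReads + serverB.DatabaseReads);
    }
}
```

The second "server" never touches the database because the first one already placed the description in the shared cache.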
If you are just storing ASP.NET session-state, you can license SessionServer and let it transparently store session-state objects with just a one-line change to your web.config file. If you want the distributed cache to handle additional types of data, such as cached database data and application-wide calculations, you can upgrade to ScaleOut StateServer and use its APIs to directly access the cache.
Easy to Use APIs
Let's see just how easy it is to store data in ScaleOut StateServer's distributed cache. The product's APIs give your application a simple view of the cache that looks the same on all servers. Objects are stored as binary, serialized objects identified by a 256-bit key that you supply. Access methods are basically Add, Retrieve, Update, and Remove. These methods also take care of synchronization between servers by locking a cached object when you retrieve it and unlocking it after it's updated.
Here's a simple example. Let's say you want to store a "top 10" list. You just create a ScaleOut built-in helper object, called a "cached data accessor," to set up the object's key, and then you add the object to the distributed cache, as the following code snippet illustrates:
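// key is the 256-bit key that you supply to identify the object
TopTenList myList;
// (initialize myList)
CachedDataAccessor cda = new CachedDataAccessor(key);
// Serialize myList and store it in the distributed cache
cda.Add(myList);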
The Add method serializes the object myList and then stores it in the cache. Now you can retrieve the list, modify it, and re-cache it, as follows:
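// Retrieve locks the cached object so other servers cannot change it
myList = (TopTenList)cda.Retrieve();
// (modify the list in the local server)
// Update stores the new version and unlocks the object
cda.Update(myList);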
As you can see, the APIs make access to the distributed cache easy and hide all of the machinery needed to implement global accessibility, scalability, and data replication for high availability. Here is another example of some of distributed caching's hidden power. Let's say you get a database invalidation event that requires that you remove the cached top 10 list. With one line of code you can instantly signal all servers in the farm that the cached data has been invalidated:
cda.Remove();
Imagine how much work it would be to develop code that removes the top 10 list from all servers in a synchronized manner, while handling possible server failures. ScaleOut Software's distributed caching efficiently and reliably takes care of this for you.
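To picture the synchronization described earlier, where an object is locked when you retrieve it and unlocked after it is updated, here is a small in-process sketch. It is a hypothetical stand-in, not ScaleOut StateServer's implementation: LockingAccessorStandIn uses string keys, a static dictionary in place of the farm-wide cache, and threads in place of servers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical in-process stand-in for a cached data accessor.
public class LockingAccessorStandIn
{
    private static readonly Dictionary<string, object> Store =
        new Dictionary<string, object>();
    private static readonly Dictionary<string, SemaphoreSlim> Locks =
        new Dictionary<string, SemaphoreSlim>();
    private static readonly object Gate = new object();

    private readonly string _key;

    public LockingAccessorStandIn(string key) { _key = key; }

    // Returns the per-object lock, creating it on first use.
    private SemaphoreSlim LockFor()
    {
        lock (Gate)
        {
            SemaphoreSlim s;
            if (!Locks.TryGetValue(_key, out s))
            {
                s = new SemaphoreSlim(1, 1);
                Locks[_key] = s;
            }
            return s;
        }
    }

    public void Add(object value)
    {
        lock (Gate) { Store.Add(_key, value); }
    }

    // Waits for the object's lock and returns the object still holding it.
    // If the object is missing, the lock is released and null is returned.
    public object Retrieve()
    {
        SemaphoreSlim s = LockFor();
        s.Wait();
        lock (Gate)
        {
            object value;
            if (Store.TryGetValue(_key, out value)) return value;
        }
        s.Release();
        return null;
    }

    // Stores the new value and releases the lock taken by Retrieve.
    public void Update(object value)
    {
        lock (Gate) { Store[_key] = value; }
        LockFor().Release();
    }

    public void Remove()
    {
        lock (Gate) { Store.Remove(_key); }
    }
}

public static class LockingDemo
{
    public static void Main()
    {
        new LockingAccessorStandIn("hits").Add(0);

        ThreadStart work = () =>
        {
            var cda = new LockingAccessorStandIn("hits");
            for (int i = 0; i < 1000; i++)
            {
                int hits = (int)cda.Retrieve(); // locks "hits"
                cda.Update(hits + 1);           // unlocks "hits"
            }
        };

        var t1 = new Thread(work);
        var t2 = new Thread(work);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        var reader = new LockingAccessorStandIn("hits");
        int total = (int)reader.Retrieve();
        reader.Update(total);
        Console.WriteLine(total); // prints 2000: no increment was lost
    }
}
```

Because each read-modify-write happens while the object is locked, the two threads never overwrite each other's increments. A distributed cache has to provide the same guarantee across machines and through server failures, which is the work the product takes off your hands.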
Powerful Capabilities
As you begin to explore the possibilities for distributed caching, you will find that it is a powerful tool for storing application data that needs to be quickly accessed and shared across a server farm. Distributed caching can be used in a wide variety of applications to help boost performance and avoid bottlenecks to scalable throughput.
To see an example of how ScaleOut Software's distributed caching can easily scale your application, consider the problem of handling object expiration after a timeout, such as a session timeout for a Web session-state object. You can assign a timeout value to an object when it is added to the cache, and if a timeout occurs, the cache signals an asynchronous event to notify your application. Because the cache has spread objects across the servers within the farm, it can simultaneously signal timeout events on different servers to scale the event-handling load. The cache also takes care of server failures by re-signaling events as necessary to ensure that they are reliably delivered. So your application automatically benefits from the cache's scalable, highly available event handling without any code development on your part.
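The timeout-and-notify pattern described above can be sketched in a few lines. This is a simplified, hypothetical stand-in rather than the product's API: ExpiringCacheStandIn runs in one process, raises its event synchronously, and takes the current time as a parameter so that the example is deterministic. It does not attempt the cross-server distribution or re-signaling after failures that the real cache provides.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical single-process cache that expires objects after a timeout.
public class ExpiringCacheStandIn
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, Entry> _entries =
        new Dictionary<string, Entry>();

    // Raised once for each object whose timeout has elapsed.
    public event Action<string, object> Expired;

    public void Add(string key, object value, TimeSpan timeout, DateTime now)
    {
        _entries[key] = new Entry { Value = value, ExpiresAt = now + timeout };
    }

    public bool Contains(string key)
    {
        return _entries.ContainsKey(key);
    }

    // Removes timed-out objects, notifies subscribers, and returns the count.
    public int Sweep(DateTime now)
    {
        var expiredKeys = new List<string>();
        foreach (var pair in _entries)
        {
            if (pair.Value.ExpiresAt <= now) expiredKeys.Add(pair.Key);
        }

        foreach (string key in expiredKeys)
        {
            object value = _entries[key].Value;
            _entries.Remove(key);
            if (Expired != null) Expired(key, value);
        }
        return expiredKeys.Count;
    }
}

public static class ExpirationDemo
{
    public static void Main()
    {
        var cache = new ExpiringCacheStandIn();
        cache.Expired += (key, value) => Console.WriteLine("expired: " + key);

        DateTime start = new DateTime(2024, 1, 1, 12, 0, 0);
        cache.Add("session-1", "cart A", TimeSpan.FromMinutes(20), start);
        cache.Add("session-2", "cart B", TimeSpan.FromMinutes(60), start);

        // Thirty minutes later only session-1 has timed out.
        int removed = cache.Sweep(start.AddMinutes(30));
        Console.WriteLine(removed);                     // prints 1
        Console.WriteLine(cache.Contains("session-2")); // prints True
    }
}
```

In this sketch your application only supplies the event handler; deciding when objects expire and delivering the notification is the cache's job.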
Because objects are automatically load-balanced across the farm, the cache also provides an excellent means for distributing your application's workload, as we just saw with the event handling example. ScaleOut Software will continue to add API capabilities to simplify the design of server farm and "grid computing" applications so that you can take full advantage of the scalability and high availability provided by distributed caching. Stay tuned for more exciting developments in the months to come.