ScaleOut Product Suite Help


Introduction

Welcome

Welcome to ScaleOut’s Suite of Products, Version 5 for Windows

Thank you for selecting ScaleOut Software’s product suite for in-memory data grid and in-memory computing. This software product runs on every server within a Web or application server farm to store mission-critical workload data. ScaleOut’s in-memory data grid with integrated, in-memory distributed caching provides extremely fast access to your critical but fast-changing data, and its performance and capacity grow as you add servers to the farm. The software automatically replicates stored data between servers so that critical data are not lost if a server fails, and it maintains scalable, highly available access that a standalone database server (or even a failover database cluster) cannot duplicate. You can also use the Windows version to transparently save and retrieve ASP.NET session-state. Altogether, ScaleOut StateServer delivers a highly effective middleware storage tier for e-commerce session-state and other mission-critical workload data combined with powerful tools for in-memory computing, including stateful stream processing and real-time, parallel data analysis.

The following help topics explain how to use ScaleOut’s product suite. Step-by-step instructions for installation and configuration help you to get your in-memory data grid up and running quickly and easily. Additional topics help you to troubleshoot problems and obtain technical support if necessary.

We want your feedback! Please send your comments on the product, documentation, or Web site to [email protected]. Thank you.

What Is an In-Memory Data Grid?

To capture the evolution of distributed caching and its integration with other advanced technologies, ScaleOut Software uses the term in-memory data grid to describe its scalable, distributed in-memory data storage. Also called distributed data grids, in-memory data grids combine distributed, in-memory caching with powerful analysis and management tools to give you a complete solution for managing fast-changing data in a server farm or compute grid. Distributed, in-memory data grids now have become an essential component of scalable, mission-critical applications and are increasingly relied upon for data-parallel analysis and computation.

What Is In-Memory Computing?

Beyond just serving as a fast, scalable repository for live data, in-memory data grids (IMDGs) provide the foundation for stateful stream processing and real-time analytics on in-memory data. By harnessing the scalable computing power of the server clusters on which they run, IMDGs enable large, in-memory data sets to be analyzed in parallel, delivering immediate results and important feedback to live systems. While they can serve distributed queries to select data of interest for client applications, their real power lies in the ability to host data-parallel computations within the grid (moving computing to where the data lives) to deliver blazing performance and eliminate bottlenecks to scalability.

ScaleOut’s Product Suite

ScaleOut Software’s suite of products comprises:

  • ScaleOut StateServer®: scalable, in-memory data grid for Windows and Linux with integrated parallel query
  • ScaleOut ComputeServer®: in-memory computing for real-time, data-parallel analysis; includes all the features of ScaleOut StateServer
  • ScaleOut StreamServer™: in-memory computing for stateful stream processing and real-time, data-parallel analysis; includes all the features of ScaleOut StateServer and ScaleOut ComputeServer
  • ScaleOut hServer®: in-memory computing for source code-compatible Apache Hadoop MapReduce on in-memory data
  • ScaleOut GeoServer®: optional extension to ScaleOut StateServer for WAN-based data replication and global data access
  • ScaleOut SessionServer™: subset of ScaleOut StateServer for scalable, highly available ASP.NET session-state storage

What’s New in Version 5.7

Version 5.7 contains numerous performance enhancements that accelerate server-to-server communications and event processing under heavy load and eliminate bottlenecks. In addition, the execution path for single method invocations has been streamlined to reduce overhead and latency. New performance counters report rates for event posting, queries, single method invocations, and parallel method invocations. Object metadata and the ScaleOut Object Browser now report object creation and last update times.

With this release, the open source time windowing libraries for managing streaming events in ScaleOut StreamServer have been upgraded to a full production version. The ASP.NET Core 2.0 NuGet package also has been upgraded with support for remote client (public) gateways. Multicast discovery now can be configured dynamically and globally without the need for a server restart.

Version 5.7 also introduces a preview of the ScaleOut Web Console, which lets users manage ScaleOut’s in-memory data grid from a web browser running on a local network. The web console offers all of the management capabilities of the ScaleOut Windows Management Console.

What’s New in Version 5.6

Version 5.6 introduces ScaleOut StreamServer™, a new software platform for stateful stream processing. This platform offers important new capabilities for analyzing streaming data by enabling applications to model and track the behavior of data sources instead of just analyzing the telemetry they emit. This allows applications to implement deeper introspection and more effective alerting on streaming data across a wide range of applications, including medical device monitoring, financial services, manufacturing and logistics, and the Internet of Things (IoT). ScaleOut StreamServer includes all of the features and capabilities of ScaleOut ComputeServer.

ScaleOut StreamServer includes open source time windowing libraries for managing streaming events; these libraries are available for Java and .NET on GitHub. It also includes support for posting events using the ReactiveX APIs and for integrating with Kafka message queues.
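To illustrate what time windowing means for streaming events, the following minimal Python sketch groups timestamped events into consecutive fixed-width (tumbling) windows. It is not the API of ScaleOut's libraries; the function name tumbling_windows and the (timestamp, payload) event layout are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def tumbling_windows(events, start, width):
    """Group (timestamp, payload) events into consecutive fixed-width windows.

    Hypothetical helper for illustration; not part of any ScaleOut library.
    Returns a list of (window_start, [payloads]) sorted by window start.
    """
    buckets = defaultdict(list)
    for timestamp, payload in events:
        if timestamp < start:
            continue  # ignore events that precede the first window
        index = (timestamp - start) // width  # timedelta // timedelta gives an int
        buckets[index].append(payload)
    return [(start + i * width, buckets[i]) for i in sorted(buckets)]
```

An application tracking a data source could then analyze each window's payloads separately, for example to compute a per-window average of telemetry readings.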

Version 5.6 also adds support for distributed caching to Microsoft’s ASP.NET Core 2.0 platform that lets application developers transparently take advantage of ScaleOut StateServer’s in-memory data grid technology. This distributed caching library is made available as a NuGet package.

Other features in version 5.6 include support for Docker containers, support for OpenSSL 1.1, and enhancements to the ScaleOut Object Browser.

What’s New in Version 5.5

Version 5.5 introduces a new .NET API called Distributed ForEach for data-parallel programming in ScaleOut ComputeServer. Modeled after .NET’s Parallel.ForEach, this operator lets developers easily structure data-parallel computations that span all (or a queried subset of) objects within a name space in the in-memory data grid. This enables applications to handle much larger workloads than would be possible on a single server, deliver scalable throughput by adding servers, and maintain fast execution times. In addition, this operator streamlines garbage collection during parallel execution to deliver the best possible performance.
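The general shape of a ForEach-style data-parallel computation can be sketched in Python as follows. This is an illustration of the pattern only (per-worker local state, a body applied to each object, and a final merge), not the Distributed ForEach API itself; the name distributed_for_each and its parameters are hypothetical, and each list in partitions stands in for the objects held by one grid server.

```python
from concurrent.futures import ThreadPoolExecutor

def distributed_for_each(partitions, local_init, body, merge):
    """Run body over every object, one worker per partition, then merge.

    Hypothetical illustration of the Parallel.ForEach pattern with
    per-worker local state; not a ScaleOut API.
    """
    def run_partition(objects):
        local = local_init()           # fresh accumulator for this worker
        for obj in objects:
            local = body(obj, local)   # fold each object into the accumulator
        return local

    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partial_results = list(pool.map(run_partition, partitions))

    result = local_init()
    for partial in partial_results:    # combine the per-partition results
        result = merge(result, partial)
    return result
```

For example, summing a numeric property across all partitions would pass an initializer returning 0, a body that adds the property to the accumulator, and a merge function that adds two partial sums.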

Version 5.5 also adds asynchronous .NET APIs for grid access and query. These APIs let grid operations integrate fully into .NET applications that use the async/await asynchronous programming model. Other new features for Windows include new PowerShell cmdlets, which enable IT administrators to use PowerShell scripts to deploy and manage the in-memory data grid, and support for ISO 19770-2 software tagging that helps system administrators identify software assets.

This version adds important new optimizations that reduce memory usage for stored objects and accelerate performance. By default, all objects are now allocated on the heap instead of using pre-allocated memory buffers, and query indexes are no longer allocated unless in use. Also, memory overhead for string keys has been reduced and integrated into object allocation for higher efficiency.

Version 5.5 introduces a preview of distributed, push-based notifications for C# and Java. This new feature adds operators compatible with the popular ReactiveX library and lets applications scale the throughput of real-time event processing by transparently distributing notifications across the in-memory data grid and its integrated compute engine.

What’s New in Version 5.4

Version 5.4 incorporates numerous performance enhancements designed to take full advantage of large, multicore systems. All aspects of ScaleOut’s internal implementation have been redesigned to distribute the workload across all available cores and extract maximum performance. In addition, the Object Browser has been enhanced to handle very large numbers of objects with faster performance and lower memory usage.

The Windows version of ScaleOut StateServer now includes the Windows Server AppFabric Caching Compatibility Library. This API provides complete, source-level compatibility with the Windows AppFabric Caching API, including support for regions, tag-based query, and event notifications. In most cases, applications previously designed to use Windows AppFabric Caching as a distributed cache can easily migrate to ScaleOut StateServer with only a recompile. In addition, these applications can access ScaleOut StateServer’s extended functionality, such as fully distributed LINQ query, by calling native APIs side-by-side with AppFabric Caching API.

Version 5.4 adds support for Windows PowerShell cmdlets to manage the in-memory data grid. To assist former AppFabric Caching users, it also includes aliases for the corresponding AppFabric Caching cmdlets where applicable. Because of its highly scalable, peer-to-peer design, ScaleOut StateServer makes administration of an AppFabric Caching-compatible distributed cache much easier than ever before.

What’s New in Version 5.3

Version 5.3 adds new APIs for operational intelligence across ScaleOut’s product offerings. This release introduces ScaleOut ComputeServer™, which integrates a scalable, in-memory compute engine within ScaleOut’s in-memory data grid and lets applications perform fast, data-parallel computations on memory-based data; this product replaces ScaleOut Analytics Server®. Version 5.3 also adds several optimizations which enhance the performance of parallel method invocations within the .NET client libraries to take better advantage of large multicore systems.

Complementing support for executing standard MapReduce applications in ScaleOut hServer®, new SimpleMR APIs simplify and streamline MapReduce applications by avoiding the need for standard Hadoop libraries. These APIs are integrated directly into ScaleOut’s in-memory compute engine and in-memory data grid to deliver extremely fast execution times. SimpleMR eliminates the need to install and reference Hadoop MapReduce libraries from standard distributions in order to run in-memory, data-parallel computations with MapReduce semantics, further reducing execution times. These APIs are available for both Java and C#, and now .NET applications can run in-memory MapReduce.

ScaleOut StateServer® extends its property-based query APIs for in-memory data grids with the introduction of InvokeFilter methods for both Java and C#. This new feature allows applications to run data-parallel methods which can analyze properties when selecting objects within a parallel query. Applications now can eliminate the restrictions imposed by standard query techniques and harness the power of data-parallel computation to implement much deeper query analysis. In C#, invoke filters are integrated into Microsoft LINQ as extension methods which simplify program structure.

What’s New in Version 5.2

Version 5.2 increases ease of use and deepens support for operational intelligence across ScaleOut’s product offerings, including ScaleOut Analytics Server® and ScaleOut hServer®. Version 5.2 introduces REST APIs, support for Apache Hive and YARN, and .NET APIs for small object storage.

The new REST API service allows customer applications to easily access objects in ScaleOut’s in-memory storage using HTTP with built-in SSL for security. Objects now can be remotely accessed from applications written in almost any programming language. This new web service can be deployed either using its own built-in, high-performance embedded web server or as a FastCGI module behind an existing web server.

Version 5.2 includes support for Hadoop YARN, enabling Hadoop MapReduce applications to take advantage of ScaleOut hServer’s fast in-memory execution engine and integrated, in-memory data storage. This new capability lets ScaleOut hServer function as an alternative MapReduce execution framework within a Hadoop YARN cluster, running MapReduce applications with significantly lower execution times and zero code changes. ScaleOut hServer lets MapReduce applications analyze live, operational data and has demonstrated more than 40x faster execution than Apache MapReduce in benchmark testing.

Version 5.2 also allows ScaleOut hServer users to run Apache Hive queries using hServer’s fast, in-memory execution engine, thereby accelerating execution and enabling query of in-memory data. Standard Hive distributions can run queries without changes using ScaleOut hServer as a MapReduce framework under YARN. Most popular Hive distributions, including those from Cloudera and Hortonworks, are compatible with ScaleOut hServer.

Complementing ScaleOut’s existing Java support for large numbers of small objects, version 5.2 adds full small-object support for .NET users through a new "NamedMap" API in the Soss.Client.Concurrent namespace. This new storage model streamlines in-memory storage and accelerates parallel analysis when using ScaleOut’s Parallel Method Invocation engine. To maximize ease of use, APIs for this storage model follow the standard semantics of Java’s NamedMap and .NET’s ConcurrentDictionary, adding methods to those familiar interfaces in order to support parallel query and data-parallel analysis.

What’s New in Version 5.1

Version 5.1 further expands the features introduced in version 5.0 and makes ScaleOut StateServer even faster, more secure, more adaptive, and more versatile.

Starting with version 5.1, ScaleOut StateServer uses a new server-unit licensing (SUL) model, where a server unit is defined as 8 logical processors. Previous versions of ScaleOut StateServer were licensed under a per-host model. Existing customers upgrading to version 5.1 should contact their sales representative with any licensing questions.

Version 5.1 adds support for ScaleOut hServer to Windows. ScaleOut hServer extends StateServer’s analytics capability to the Hadoop market, integrating its in-memory data grid and computation engine with Hadoop technologies, which enables Hadoop MapReduce code to execute in parallel and in-memory without necessitating a Hadoop cluster. Alternatively, ScaleOut hServer can be used as an HDFS cache in an existing Hadoop environment, greatly accelerating data access for repeated HDFS operations.

Version 5.1 welcomes the addition of a native, open source C++ API to the existing Java, .NET, and C APIs. The first release of this new API brings the Named Cache to C++ applications, including support for many advanced Named Cache features such as parallel query, backing store integration, and an in-process deserialized client cache.

ScaleOut StateServer now optionally uses secure connections between clients and hosts and between remote sites, encrypted with industry-standard SSL technology. This enables SOSS to be deployed in environments where plain-text transmission of serialized object data is a security concern, such as over untrusted WAN links.

The ScaleOut GeoServer option has been extended to support cloud-hosted environments, such as Amazon EC2 and Microsoft Azure, with minimal configuration, enabling your application data to be replicated to and accessible from the cloud. This feature enables cloud-hosted stores to benefit from GeoServer’s redundancy and protection against complete datacenter outages without data loss.

Support for deploying ScaleOut StateServer in public cloud environments has been extended with additional configuration options. Amazon Web Services deployments can now launch instances into a Virtual Private Cloud (VPC), and Microsoft Azure deployments can now select either Windows Server 2012 or Windows Server 2012 R2, in addition to Windows Server 2008 R2, as the base operating system.

ScaleOut StateServer’s internal network protocols have been significantly improved. The transport protocol used during internal load balancing has been optimized to deliver up to 5X higher performance, resulting in faster load balancing during membership changes and recovery due to a host failure. The heart-beating protocol used to determine overall store health has been enhanced with new adaptive heuristics, reducing the chance of heartbeat failure in heavily congested networks, especially in virtual server environments.

What’s New in Version 5.0

Version 5.0 introduces several exciting new features that dramatically extend ScaleOut StateServer’s capabilities. These features make version 5.0 significantly faster, more scalable, and cloud-ready.

ScaleOut StateServer’s membership architecture has been redesigned to enable the in-memory data grid to easily scale well beyond 100 hosts. This enables SOSS to take full advantage of the elastic computing resources quickly becoming available in public and private clouds. The new membership mechanism also lets hosts join and leave the in-memory data grid significantly more quickly. To accommodate cloud and other enterprise infrastructures, the use of UDP multicast can now be disabled and SOSS can be restricted to using only TCP for all network communication by manually configuring host groups. Please see the Introduction to Management section for more details.

The ScaleOut GeoServer option has been extended with a new replication model that enables applications to transparently access stored objects from remote SOSS stores as they are needed. Called pull-based replication, this new capability allows objects to be shared by a geographically diverse network of in-memory data grids, with policies on each object dictating how frequently remote datacenters should update their copies of the object. Furthermore, the authoritative "master" copy of an object can migrate from datacenter to datacenter as demand dictates.

Version 5.0 adds a major new method for querying the in-memory data grid. Prior to version 5.0, the grid was queried using metadata-based index values assigned to stored objects. Now the C# or Java properties of stored objects can be directly queried. C# applications can make use of .NET’s Language Integrated Query (LINQ) to structure queries using SQL-like semantics. Java applications can make use of "filter methods" to compose queries with logical and comparison operators. For high performance, SOSS’s client libraries automatically extract selected properties and store them as deserialized data during object updates. In addition, to ensure fast parallel queries, index tables are transparently used to accelerate each host’s portion of a parallel query.
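As a rough illustration of composing a property-based query from logical and comparison operators, consider the Python sketch below. The helper names (greater_than, equal, and_, or_, query) and the Stock class are hypothetical and do not correspond to the LINQ or Java filter APIs; each inner list stands in for the objects held by one host, and the index tables mentioned above are not modeled.

```python
from dataclasses import dataclass

@dataclass
class Stock:
    ticker: str
    price: float
    volume: int

# Comparison operators: each returns a predicate over one object property.
def greater_than(prop, value):
    return lambda obj: getattr(obj, prop) > value

def equal(prop, value):
    return lambda obj: getattr(obj, prop) == value

# Logical operators: combine predicates into a larger predicate.
def and_(*filters):
    return lambda obj: all(f(obj) for f in filters)

def or_(*filters):
    return lambda obj: any(f(obj) for f in filters)

def query(hosts, predicate):
    """Evaluate the predicate on each host's objects and combine the matches."""
    matches = []
    for objects in hosts:
        matches.extend(obj for obj in objects if predicate(obj))
    return matches
```

A query for stocks priced above 50 that either have a given ticker or exceed a volume threshold would be written as and_(greater_than("price", 50.0), or_(equal("ticker", "BBB"), greater_than("volume", 1000))).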

ScaleOut ComputeServer™ extends ScaleOut StateServer’s capabilities for parallel computation on stored data by adding support for invocation grids to prestage application code on grid servers for use in Parallel Method Invocation (PMI). This new capability dramatically simplifies code deployment for parallel data analysis.

In addition, a new feature called Single Method Invocation (SMI) lets C# and Java applications invoke a method on a specified object, supply parameters to the invocation, and receive the method’s result value. Because of its highly optimized implementation which avoids all unnecessary network copies, SMI can be used to efficiently analyze a targeted set of stored objects as an alternative to the map/reduce model provided by PMI. In addition, applications can use SMI to efficiently update stored objects without replacing their full contents in a manner similar to the use of stored procedures in database systems.
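The idea of invoking a method where an object is stored, rather than fetching and replacing the whole object, can be sketched in Python as follows. The GridHost class and add_item function are hypothetical stand-ins for illustration, not the SMI API; a real invocation would also involve routing the request to the server that holds the object.

```python
class GridHost:
    """Toy stand-in for one grid server holding objects by key."""

    def __init__(self):
        self.objects = {}

    def invoke(self, key, method, param):
        """Run method against the stored object in place and return its result."""
        obj = self.objects[key]
        return method(obj, param)

def add_item(cart, item):
    """Example method: update a stored shopping cart without replacing it."""
    cart["items"].append(item)
    return len(cart["items"])
```

Here the caller sends only the key, the method, and a small parameter, and receives only the result value, which mirrors the stored-procedure analogy made above.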

With version 5.0, ScaleOut StateServer introduces two mechanisms for authorizing access to named caches within the in-memory data grid. (In this release, these mechanisms are intended for use within a secure datacenter by a single organization and do not secure the in-memory data grid from malicious attack.) The default login mechanism checks the application’s current login name against a list of authorized login names that have been associated with the named cache using the soss.exe command line program (see Command-line Control Programs). This management tool also can be used to authorize either read/write access or read-only access. The user also can implement an extensible authorization policy. When the application logs in to a named cache, the SOSS client passes the application’s encoded credentials to the user’s authorization provider, which is associated with SOSS using the soss.exe management tool. This provider validates the credentials using a user-defined mechanism and then returns an authorization ticket back to SOSS along with read/write or read-only authorization.
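The default login mechanism described above amounts to a lookup of the login name in a per-cache list that records read/write or read-only access. The Python sketch below models only that lookup; the class and function names are hypothetical, and it does not reproduce the soss.exe tool, the credential encoding, or the extensible authorization provider.

```python
from enum import Enum

class Access(Enum):
    NONE = "none"
    READ_ONLY = "read-only"
    READ_WRITE = "read/write"

class NamedCacheAccessList:
    """Hypothetical per-cache list of authorized login names."""

    def __init__(self):
        self._grants = {}  # login name -> Access level

    def authorize(self, login_name, access):
        """Record the access level granted to a login name."""
        self._grants[login_name] = access

    def login(self, login_name):
        """Return the access level for a login name, or Access.NONE."""
        return self._grants.get(login_name, Access.NONE)

def can_read(access):
    return access in (Access.READ_ONLY, Access.READ_WRITE)

def can_write(access):
    return access is Access.READ_WRITE
```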

Document Version 5.7.0

©2018 by ScaleOut Software, Inc.

ScaleOut StateServer, ScaleOut ComputeServer, ScaleOut hServer, and ScaleOut GeoServer are registered trademarks and ScaleOut StreamServer, ScaleOut SessionServer, and ScaleOut Management Pack are trademarks of ScaleOut Software, Inc. Windows is a registered trademark of Microsoft Corporation; Windows Azure and Microsoft Azure are trademarks of Microsoft Corporation in the United States and other countries. Amazon Web Services, AWS, Virtual Private Cloud, VPC, Elastic Compute Cloud, and EC2 are either trademarks or registered trademarks of Amazon Web Services, LLC or its affiliated companies. Hadoop is a registered trademark of the Apache Software Foundation.

Notice: While ScaleOut Software strives to ensure the accuracy of the information contained herein, product requirements, specifications, and limitations are subject to change without notice.

Overview

ScaleOut StateServer®

ScaleOut StateServer provides a software-based, distributed, in-memory data grid, simply called a store, for mission-critical workload data. Intended for use within a data center, the software is installed as a Windows service on all servers within your Web or application server farm, and uses your Web farm’s existing LAN. The software stores, reads, updates, and removes contiguous, opaque, binary data objects such as session-state, e-commerce shopping carts, cached DBMS data, and business logic state, based on an identifying 256-bit key. These data objects either have been serialized from datasets or record sets that were previously accessed from a database or are generated as application objects. Stored objects can be uniformly accessed from any server in the farm.
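The store, read, update, and remove operations on opaque binary objects identified by a 256-bit key can be pictured with the following Python sketch. It is a single-process toy, not the ScaleOut client API; the ToyStore class is hypothetical, and deriving the key with SHA-256 is an assumption chosen only because it yields exactly 256 bits.

```python
import hashlib

def make_key(name):
    """Derive a 256-bit (32-byte) key from a string name (assumed scheme)."""
    return hashlib.sha256(name.encode("utf-8")).digest()

class ToyStore:
    """Hypothetical in-process stand-in for a key/value object store."""

    def __init__(self):
        self._data = {}  # 32-byte key -> opaque bytes

    def create(self, key, value):
        if key in self._data:
            raise KeyError("object already exists")
        self._data[key] = bytes(value)

    def read(self, key):
        """Return the stored bytes, or None if the key is absent."""
        return self._data.get(key)

    def update(self, key, value):
        if key not in self._data:
            raise KeyError("object not found")
        self._data[key] = bytes(value)

    def remove(self, key):
        """Remove the object; return True if it existed."""
        return self._data.pop(key, None) is not None
```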

The following diagram shows ScaleOut StateServer installed on a Web farm with four servers:

images/diagrams/WebFarm.png

Once installed and configured, ScaleOut StateServer provides fast, scalable in-memory storage for applications by saving and retrieving objects using powerful, easy-to-use APIs. It also operates in a transparent manner to Microsoft ASP.NET applications by saving and retrieving session-state objects for Internet clients. Under normal operations, the software automatically balances the amount of storage used by each server in the farm, adjusting the relative usage by each server to the amount you specify. To ensure highly available access, up to three copies of each stored object are maintained on different servers. If a server fails or loses network connectivity, ScaleOut StateServer automatically retrieves its session objects from replicas stored on other servers in the farm, and it creates new replicas to maintain redundant storage. When an offline server later rejoins the farm, it automatically regains its share of the storage workload.

As new servers running ScaleOut StateServer are added to the Web farm, they automatically expand the distributed cache’s storage capacity and scale its aggregate throughput once enough servers have been added to hold all replicas. This helps ensure that fast response times are maintained even as the Web farm grows to handle increasing load.

In summary, the key features of ScaleOut StateServer include:

  • self-discovery and self-aggregation of servers within a farm,
  • uniform access to stored data objects from any server in the farm,
  • scalable throughput as servers are added (beyond the minimum number needed to hold all replicas) and the storage load grows,
  • automatic partitioning and dynamic load-balancing of data objects among participating servers to maximize scalability,
  • local caching of recently used objects to minimize access times,
  • automatic, intelligent replication of data objects on up to three servers to provide high availability,
  • automatic detection of host and communication failures followed by fast recovery for single failures,
  • recovery usually without data loss after failure of any one or two servers (for farms with three or more servers and depending on the number of replicas specified),
  • automatic self-healing to restore data redundancy after permanent server failures,
  • highly available operation even after failure of N-1 servers within an N server farm,
  • fine-grained control of network usage that adapts to slow networks and virtualized servers,
  • transparent storage of ASP.NET session-state objects without programming changes,
  • programmatic access to the store with asynchronous event handling for .NET, Java, and unmanaged C/C++ applications,
  • programmatic access to the store for HTTP-enabled applications via a REST API service,
  • global management of the distributed store from any participating server,
  • networked access to the distributed store from remote client systems,
  • optional data replication and transparent access across multiple SOSS stores running in multiple data centers using the ScaleOut GeoServer option,
  • parallel query and parallel data analysis using an integrated computational engine (with the optional ScaleOut ComputeServer),
  • a complete set of Performance Counters for each SOSS host that report memory usage, access rates, and host status, and
  • sample code that shows how to make SOSS a second-level cache provider for NHibernate.

More about Workload Data

Relational DBMSs have proven their value as the repository for essential line of business (LOB) data, such as inventory, purchase orders, billing records, etc. With the advent of "stateless" Web and application server farms, DBMSs have increasingly been used to hold mission-critical but relatively short-lived workload data, such as e-commerce shopping carts, SOAP requests, session-state, and intermediate business logic results. Workload data typically are updated several times prior to committing changes to the LOB database. The following table compares LOB data and workload data:

Characteristic          LOB Data          Workload Data
Volume                  High              Low
Lifetime/turnover       Long/slow         Short/fast
Access patterns         Complex           Simple
Data preservation       Critical          Less critical
Access/update ratio     ~4:1              ~1:1
Fast access & update    Less important    More important

To maintain quality of service, workload data are often stored in a DBMS so that they can be preserved across outages of Web or application servers. This creates significant traffic to and from the data storage tier and delays responses to clients. It also consumes resources within the expensive DBMS while making only rudimentary use of its feature-rich capabilities. By storing workload data in ScaleOut StateServer, you can avoid the overhead and expense of using a DBMS server, while simultaneously improving response time, scalability, and availability. In effect, ScaleOut StateServer provides a new, middleware-based storage tier that complements your existing storage tiers to maximize performance and cost-effectiveness.

Self-Discovery, Self-Aggregation, and Uniform Access

ScaleOut StateServer was designed to be as simple to install and use as possible. After you install ScaleOut StateServer on a server in your farm, you can create a store simply by activating the service. When you activate ScaleOut StateServer on additional servers in the farm on the same network subnet, they automatically find the store that you originally created on other hosts (in a process called self-discovery), and then they join the store (using self-aggregation) and take on a portion of its workload. Application programs can uniformly access and update any stored data object from any participating server in the store.

Scalability and Load-Balancing

ScaleOut StateServer delivers scalable throughput and fast response time by partitioning and dynamically load-balancing workload data across the servers within a farm. It works seamlessly with an IP load-balancer and enables every server to access any workload data object stored in the farm. ScaleOut StateServer’s distributed data store eliminates the bottleneck created by a centralized DBMS and provides fast response time by avoiding disk access and DBMS overhead. It also speeds up access by allowing simultaneous access to multiple data objects stored on different servers, and performance scales as the farm grows.

When a new server is added to the farm and joined to the distributed store, ScaleOut StateServer automatically integrates the server into the store and migrates a portion of the data (called a region) to it. This technique, called load balancing, controls the amount of data stored and managed by each server in the farm and ensures that each server handles an appropriate portion of the overall workload as determined by its memory capacity and CPU speed. You can specify the total amount of memory on each server to be used for storing data objects.

images/diagrams/Partitions.png
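One simple way to picture proportional load balancing is to split a fixed number of regions among servers according to a relative capacity weight, then recompute the split when a server joins. The Python sketch below does this with the largest-remainder method; it is an assumed illustration, not ScaleOut's actual balancing algorithm, and the integer weights are a hypothetical stand-in for memory capacity and CPU speed.

```python
def apportion_regions(weights, region_count):
    """Split region_count regions among servers in proportion to weights.

    weights maps server name -> positive integer capacity weight.
    Uses the largest-remainder method so shares always sum to region_count.
    Hypothetical helper; not ScaleOut's load-balancing algorithm.
    """
    total = sum(weights.values())
    # Whole regions each server is entitled to, using integer arithmetic.
    shares = {name: region_count * w // total for name, w in weights.items()}
    remainders = {name: region_count * w % total for name, w in weights.items()}
    leftover = region_count - sum(shares.values())
    # Hand out remaining regions to the largest remainders (name breaks ties).
    ranked = sorted(weights, key=lambda n: (remainders[n], n), reverse=True)
    for name in ranked[:leftover]:
        shares[name] += 1
    return shares
```

With 16 regions and weights of 2, 1, and 1, the servers hold 8, 4, and 4 regions; adding a fourth server with weight 1 changes the split to 7, 3, 3, and 3, so existing servers each give up part of their share to the newcomer.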

Data Replication, Self-Healing, and Recovery

ScaleOut StateServer uses patented technology to keep your workload data safe and highly available. It automatically and intelligently replicates data across up to two additional servers using a patented quorum-based updating algorithm which ensures that updates are reliably committed, even in the case of server or network failures. The following diagram illustrates the creation of two replica objects (shown in red) for each data object (shown in blue).

images/diagrams/Replicas.png

You can adjust the number of replica objects that ScaleOut StateServer creates for each stored object (1 or 2). Setting the number of replicas lets you make the appropriate tradeoff between high availability and memory usage.
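The tradeoff can be estimated with simple arithmetic: each stored object occupies memory once for itself and once per replica, and with copies kept on different servers an object can survive as many simultaneous server failures as it has replicas. The Python sketch below encodes that estimate; the function name is hypothetical, and it counts payload bytes only, ignoring metadata and index overhead.

```python
def replication_tradeoff(object_bytes, object_count, replicas):
    """Estimate memory use and failure tolerance for a replica setting.

    replicas is the number of replica copies per stored object (1 or 2).
    Hypothetical helper: counts payload bytes only and ignores any
    per-object metadata or index overhead.
    """
    if replicas not in (1, 2):
        raise ValueError("replicas must be 1 or 2")
    copies = 1 + replicas  # the stored object plus its replicas
    return {
        "copies_per_object": copies,
        "total_bytes": object_bytes * object_count * copies,
        "server_failures_survivable": replicas,
    }
```

For one million objects of 10,000 bytes each, one replica needs about 20 GB across the farm and two replicas need about 30 GB.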

This data replication technology automatically initiates recovery and self-healing after server failures. Recovery often requires less than ten seconds to fail over and resume access to the affected data (versus one or more minutes for a clustered DBMS). After initial recovery, ScaleOut StateServer removes a failed or inaccessible server from the store and reconstructs data replicas on the remaining servers. This self-healing process (which may take a minute or more to complete) restores full data redundancy in case of a subsequent server failure.

ScaleOut StateServer maintains service to clients even after up to N-1 servers have failed in an N-server store (although data may be lost if all replicas of an object are lost due to failures). The service reports if it detects the possibility that data was lost due to a multi-server failure, and then it heals the store to maintain service to clients. Although data may be lost in various failure scenarios, ScaleOut StateServer maintains and/or restores service to client applications whenever possible.
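The condition for data loss stated above (all copies of an object reside on failed servers) can be checked with a few lines of Python. The function name and the placement mapping are hypothetical and serve only to make the rule concrete; the sketch assumes the failures occur before self-healing has rebuilt any replicas.

```python
def lost_objects(placement, failed):
    """Return keys whose every copy was on a failed server.

    placement maps object key -> list of servers holding a copy.
    Hypothetical helper for illustration; assumes no self-healing has
    occurred between the failures.
    """
    failed = set(failed)
    return sorted(key for key, servers in placement.items()
                  if set(servers) <= failed)
```

With copies of one object on servers s1, s2, and s3, losing s2 and s3 leaves the object available from s1, whereas losing all three servers at once loses the object even though the remaining servers in the store continue to provide service.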

Internal Server and Client Caching

To maximize ScaleOut StateServer’s performance, each server accesses objects from local replica copies whenever possible. In addition, each server maintains a local cache of recently used objects that were accessed from other servers. This minimizes access times and reduces data motion across the network. The size of each server’s local cache can be set using a configuration parameter.

In addition to each server’s local cache of recently accessed objects, ScaleOut StateServer incorporates a transparent, coherent, internal cache for deserialized data objects within its .NET, Java, and C++ client libraries. When objects are repeatedly read from the in-memory data grid, this cache reduces access response time by eliminating data motion and deserialization overhead for objects that have not been updated in the SOSS distributed store. Performance tests with the client-side cache show a dramatic reduction in average response time. Please see the section Performance Considerations for details.

Because ScaleOut StateServer’s server and client caches are automatically kept coherent with the in-memory data grid, they operate transparently to applications (i.e., applications do not have to keep track of whether to access a client-side cache versus the grid to obtain the most recently updated data). If an application updates an object in the grid, it can be sure that any subsequent access to that object will return the latest data. This simplifies the structure of applications while maintaining fast access times.
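One common way to keep a client-side cache coherent is to revalidate a cached copy against a version number before returning it. The sketch below assumes that approach for illustration; the class names and the version counter are hypothetical, and this topic does not describe the mechanism that ScaleOut StateServer actually uses.

```python
class VersionedStore:
    """Stand-in for the shared store: keeps a value and a version per key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        version = self._data.get(key, (None, 0))[1] + 1
        self._data[key] = (value, version)

    def version(self, key):
        return self._data[key][1]

    def get(self, key):
        return self._data[key]


class ClientCache:
    """Caches values locally and revalidates them by version before use."""

    def __init__(self, store):
        self._store = store
        self._local = {}
        self.full_reads = 0  # Counts reads that had to fetch the whole object.

    def read(self, key):
        cached = self._local.get(key)
        if cached is not None and cached[1] == self._store.version(key):
            return cached[0]  # Cached copy is current; skip the full read.
        self.full_reads += 1
        value, version = self._store.get(key)
        self._local[key] = (value, version)
        return value


store = VersionedStore()
store.put("cart", ["book"])
cache = ClientCache(store)
first = cache.read("cart")           # Full read fills the cache.
second = cache.read("cart")          # Served from the local copy.
store.put("cart", ["book", "pen"])   # Another client updates the object.
third = cache.read("cart")           # Version changed, so the copy is refreshed.
```

The application code calls only `read`; it never has to decide whether the cached copy or the store holds the latest data.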

Transparent Support for ASP.NET in Windows

The Windows version of ScaleOut StateServer transparently stores ASP.NET session state, providing a seamless environment for your ASP.NET applications. Once installed, ScaleOut StateServer automatically saves ASP.NET session state objects in its distributed store and retrieves them when needed to complete a Web request. As your IP load balancer directs Web requests to different servers within a farm, ScaleOut StateServer keeps their associated session objects immediately accessible, regardless of the server that handles the request.

By default, ScaleOut StateServer separates the session objects created by different ASP.NET applications and makes them subject to memory reclamation when necessary. However, multiple Web applications optionally can share the same session objects instead of placing them into separate application name spaces. A configuration parameter within the ASP.NET web.config configuration file specifies the name space to be used for session objects and overrides ScaleOut StateServer’s default use of the Web application’s name for this purpose. In addition, automatic memory reclamation for ASP.NET session objects can be selectively disabled with a second web.config parameter.

API Support for .NET Languages, Java, C/C++, and HTTP REST Clients

You also can directly access ScaleOut StateServer’s distributed store from .NET, Java, and C/C++ applications using the application programming interfaces (APIs) supplied with the product. This gives you the flexibility you need to incorporate ScaleOut StateServer into your existing applications and delivers the best performance. The .NET APIs support all .NET languages, including C#, C++, and Visual Basic. Additional APIs support Java and standalone (“unmanaged code”) C/C++ applications, as well as any other applications with an HTTP client.

API Libraries

ScaleOut StateServer provides three sets of .NET APIs: the Named Cache APIs, the Data Accessor (DA) APIs, and the Cached Data Accessor (CDA) APIs. Most developers should use the Named Cache APIs for building distributed applications. Additionally, ScaleOut StateServer provides the Java Named Cache (JNC) APIs for Java, and Native Client APIs for C++.

The Named Cache APIs provide the core APIs that application developers should use to access and analyze data stored in ScaleOut StateServer. These APIs have collection-oriented semantics for managing groups of logically related objects. They also simplify the developer’s view of data shared across the in-memory data grid. For example, they take care of polling for object locks, and they handle synchronization issues that occur when multiple hosts attempt to create an object with the same key.

ScaleOut StateServer includes two additional sets of .NET APIs for lower level access to the in-memory data grid; they should rarely be used by application developers. The Data Accessor APIs provide low-level, direct access to the distributed store and manage state information for individual objects being accessed. The Cached Data Accessor APIs build upon the Data Accessor APIs to manage the complexities inherent in sharing global, application-level data and thereby simplify application design.

ScaleOut StateServer also includes C and C++ APIs for use in C and C++ applications, and a REST API service for HTTP access. For more information on the APIs, please consult the associated help files listed in Components.

API-Based Access

ScaleOut StateServer’s APIs provide simple, straightforward access to the distributed store so that you can:

  • store serialized data objects identified by a string or a 256-bit key and a namespace identified by a string or a 32-bit value,
  • read previously stored data objects,
  • update previously stored data objects, and
  • remove data objects from the store.

Data objects are stored as contiguous, opaque, binary data. All data objects are uniformly accessible from any server in the farm that runs ScaleOut StateServer and participates in the distributed store.

Objects are stored in user-defined namespaces, which are typically used to store logically related objects, such as shopping carts, stock prices, etc. The Named Cache APIs integrate the use of these namespaces into the concept of object collections in C#, Java, and C++; this makes object namespaces easy and intuitive to use. Many advanced features, including object locking, backing store integration, remote store access, and parallel query, can be managed on a per-namespace basis.
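The four operations listed above, applied to opaque binary data within a namespace, can be pictured with a toy in-process store. This Python sketch is not the ScaleOut API; the class, its method names, and the use of JSON for serialization are all assumptions chosen for illustration.

```python
import json


class NamespacedStore:
    """Toy in-process store holding opaque bytes keyed by (namespace, key)."""

    def __init__(self):
        self._objects = {}

    def create(self, namespace, key, data):
        if (namespace, key) in self._objects:
            raise KeyError(f"{namespace}/{key} already exists")
        self._objects[(namespace, key)] = bytes(data)

    def read(self, namespace, key):
        return self._objects[(namespace, key)]

    def update(self, namespace, key, data):
        if (namespace, key) not in self._objects:
            raise KeyError(f"{namespace}/{key} does not exist")
        self._objects[(namespace, key)] = bytes(data)

    def remove(self, namespace, key):
        del self._objects[(namespace, key)]


store = NamespacedStore()
cart = {"items": ["book"]}
# The store sees only bytes; the application chooses the serialization.
store.create("shopping-carts", "user-42", json.dumps(cart).encode("utf-8"))
cart["items"].append("pen")
store.update("shopping-carts", "user-42", json.dumps(cart).encode("utf-8"))
restored = json.loads(store.read("shopping-carts", "user-42"))
store.remove("shopping-carts", "user-42")
```

Keeping the namespace as part of the key lets logically related objects, such as all shopping carts, be managed together.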

Objects optionally can be locked for up to ninety (90) seconds to synchronize distributed access from multiple threads running on different hosts. For example, a server can read and lock an object to maintain exclusive access until it subsequently updates and unlocks the object. Both pessimistic and optimistic locking models are supported.
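The optimistic model can be sketched as a version check at update time: a writer succeeds only if the object has not changed since the writer read it. The class and parameter names below are hypothetical and do not correspond to ScaleOut's APIs; the pessimistic read-and-lock pattern described above is not shown.

```python
class OptimisticStore:
    """Toy store in which an update succeeds only if the version is unchanged."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Returns (value, version); a missing key reads as (None, 0).
        return self._data.get(key, (None, 0))

    def update(self, key, value, expected_version):
        _, current = self.read(key)
        if current != expected_version:
            return False  # Someone else updated the object first.
        self._data[key] = (value, current + 1)
        return True


store = OptimisticStore()
store.update("price", 100, expected_version=0)
_, seen_by_a = store.read("price")
_, seen_by_b = store.read("price")
a_ok = store.update("price", 101, seen_by_a)  # First writer wins.
b_ok = store.update("price", 102, seen_by_b)  # Stale version is rejected.
```

A rejected writer would typically re-read the object and retry, which avoids holding a lock while the application does its work.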

To manage object lifetimes, objects can be assigned either a fixed (absolute) or a sliding timeout value from one (1) minute to forty-five (45) days after which the object is removed from the store. (ASP.NET session-state objects use the ASP.NET session timeout value.) An object’s sliding timeout is reset whenever the object is accessed. Objects can maintain dependency relationships to other objects so that they can be expired in logically related groups. ScaleOut StateServer also automatically reclaims memory used by the least recently accessed objects (which optionally can be marked as non-reclaimable) when a user-specified threshold is reached.
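The difference between a fixed (absolute) and a sliding timeout can be shown with a short Python sketch. The class is hypothetical, and time is passed in explicitly as a number of seconds so that the behavior is easy to follow.

```python
class TimedEntry:
    """Toy cache entry with either an absolute or a sliding timeout."""

    def __init__(self, value, timeout, sliding, now):
        self.value = value
        self.timeout = timeout
        self.sliding = sliding
        self.expires_at = now + timeout

    def access(self, now):
        """Return the value, or None if the entry has expired by `now`."""
        if now >= self.expires_at:
            return None
        if self.sliding:
            # Each access pushes a sliding expiration further out.
            self.expires_at = now + self.timeout
        return self.value


fixed = TimedEntry("a", timeout=60, sliding=False, now=0)
sliding = TimedEntry("b", timeout=60, sliding=True, now=0)

fixed_at_50 = fixed.access(now=50)        # Still valid.
sliding_at_50 = sliding.access(now=50)    # Valid; now expires at 110.
fixed_at_100 = fixed.access(now=100)      # Expired at 60.
sliding_at_100 = sliding.access(now=100)  # Valid; now expires at 160.
sliding_at_200 = sliding.access(now=200)  # Expired after 100 idle seconds.
```

The fixed entry expires 60 seconds after creation regardless of use, while the sliding entry stays alive as long as it is accessed at least once every 60 seconds.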

Windows Server AppFabric Caching Compatibility Library

ScaleOut StateServer includes the Windows Server AppFabric Caching Compatibility Library, which provides complete, source-level compatibility with the Windows AppFabric Caching APIs, including support for regions, tag-based query, and event notifications. In most cases, applications previously designed to use Windows AppFabric Caching as a distributed cache can easily migrate to ScaleOut StateServer with only a recompile. In addition, these applications can access ScaleOut StateServer’s extended functionality, such as fully distributed LINQ query, by calling native APIs side-by-side with AppFabric Caching APIs.

In addition to native management tools, ScaleOut StateServer provides PowerShell cmdlets that implement relevant AppFabric Caching management commands. Several management steps, such as creating a configuration store and granting Windows account access to data caches, are not required by ScaleOut StateServer.

Scalable Event Handling

Applications can catch asynchronous session and timeout events so that session and application objects can be examined and explicitly re-saved or removed after a timeout occurs. For example, session data can be saved to a database server awaiting a future login to provide very long term storage for user sessions. ScaleOut StateServer takes full advantage of its built-in scalability and high availability to automatically distribute the event handling load across the server farm and to ensure that timeout events are delivered with high availability in case a server or network outage occurs.

ScaleOut StateServer extends its scalable event handling mechanism to remote clients by automatically distributing events across remote clients. In addition, it automatically handles the failure of remote clients by redirecting events to other remote clients if they are available. (This mechanism requires that there are at least as many remote clients as there are servers within the in-memory data grid.)

Parallel Query

Applications can quickly query the in-memory data grid to obtain a list of all objects that match a set of specified criteria. ScaleOut StateServer uses fully parallel lookup and internal indexing across all servers to deliver fast, scalable query performance.

Objects can be queried by class properties in .NET, Java, and C++ applications. (Stored objects also can be assigned explicit index values as metadata for querying by all applications.) C# applications can make use of .NET’s Language Integrated Query (LINQ) to structure queries using SQL-like semantics. Java and C++ applications can make use of "filter methods" to compose queries with logical and comparison operators. For high performance, SOSS’s client libraries automatically extract selected properties and store them as deserialized data during object updates.
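The shape of a parallel query (the same filter applied on every server at once, with the matches merged) can be sketched with Python's standard thread pool. The partition layout, function names, and sample data are made up for illustration; this is not ScaleOut's query API.

```python
from concurrent.futures import ThreadPoolExecutor


def query_partition(partition, predicate):
    """Return the keys of objects in one partition that satisfy predicate."""
    return [key for key, obj in partition.items() if predicate(obj)]


def parallel_query(partitions, predicate):
    """Run the same filter on every partition at once and merge the keys."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = pool.map(lambda p: query_partition(p, predicate), partitions)
        return sorted(key for keys in results for key in keys)


# Each dict stands in for the objects held by one server (sample data only).
partitions = [
    {"AAPL": {"price": 190, "sector": "tech"}},
    {"XOM": {"price": 105, "sector": "energy"},
     "MSFT": {"price": 410, "sector": "tech"}},
    {"CVX": {"price": 150, "sector": "energy"}},
]
matches = parallel_query(
    partitions, lambda s: s["sector"] == "tech" and s["price"] > 150
)
```

Because each partition is filtered independently, adding servers adds filtering capacity, and only the matching keys travel back to the caller.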

Backing Store Integration

ScaleOut StateServer includes support for integrating the in-memory data grid with a backing store, such as a database server or file system. This feature is available in the Named Cache APIs for C#, Java, and C++. It incorporates a rich set of capabilities which let the user choose the appropriate policy for keeping the in-memory data grid in sync with a backing store to maximize application performance and minimize the load on the backing store. Two synchronous access policies (read-through and write-through) are settable for a named cache. Two asynchronous access policies (refresh-ahead and write-back) are settable for individual objects and run on a periodic basis using a specified timeout.
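The two synchronous policies can be sketched with a toy cache in front of a dictionary that stands in for the database. The class and its method names are hypothetical, and the asynchronous policies (refresh-ahead and write-back) are not shown.

```python
class ReadWriteThroughCache:
    """Toy cache that loads misses from, and writes updates to, a backing dict."""

    def __init__(self, backing_store):
        self._backing = backing_store
        self._cache = {}
        self.backing_reads = 0

    def get(self, key):
        if key not in self._cache:
            # Read-through: a miss is filled from the backing store.
            self.backing_reads += 1
            self._cache[key] = self._backing[key]
        return self._cache[key]

    def put(self, key, value):
        # Write-through: the backing store is updated before the call returns.
        self._backing[key] = value
        self._cache[key] = value


database = {"order-1": "pending"}
cache = ReadWriteThroughCache(database)
first = cache.get("order-1")    # Miss: loaded from the backing store.
second = cache.get("order-1")   # Hit: no backing-store access.
cache.put("order-1", "shipped")
```

Only the first read reaches the backing store, and every write reaches it immediately, so the cache and the backing store never disagree in this model.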

Authorization

With version 5.0, ScaleOut StateServer introduces two mechanisms for authorizing access to namespaces (either on a per-namespace basis, or at a global control level) within the in-memory data grid. Authorization is disabled by default, making all namespaces fully permissive. If enabled, the default login mechanism checks the application’s current login name against a list of authorized login names that have been associated with the named cache using the set_user_auth command of the soss.exe command-line control program (see Command-line Control Programs). This management tool also can be used to authorize either read/write access or read-only access. The user also can implement an extensible authorization policy. When the application logs in to a named cache, the SOSS client passes the application’s encoded credentials to the user’s authorization provider, which is associated with SOSS using the soss.exe management tool. This provider validates the credentials using a user-defined mechanism and then returns an authorization ticket back to SOSS along with read/write or read-only authorization.

[Note] Note

In this release, these authorization mechanisms are intended for use within a secure datacenter by a single organization and do not secure the in-memory data grid from malicious attack.
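The default login-name check can be pictured as a per-namespace access list that is fully permissive until authorization is enabled. The Python class below is a hypothetical model of that behavior; it does not represent the soss.exe tool or the client libraries.

```python
class NamespaceAuthorizer:
    """Toy access list: login name -> access level for one namespace."""

    def __init__(self, enabled=False):
        self.enabled = enabled
        self._grants = {}

    def grant(self, login, read_only=False):
        self._grants[login] = "read-only" if read_only else "read/write"

    def can_read(self, login):
        # With authorization disabled, every login is permitted.
        return not self.enabled or login in self._grants

    def can_write(self, login):
        return not self.enabled or self._grants.get(login) == "read/write"


auth = NamespaceAuthorizer()
open_write = auth.can_write("anyone")   # Authorization is off by default.

auth.enabled = True
auth.grant("alice")
auth.grant("bob", read_only=True)
alice_writes = auth.can_write("alice")
bob_writes = auth.can_write("bob")
bob_reads = auth.can_read("bob")
carol_reads = auth.can_read("carol")    # Not on the list.
```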

ScaleOut GeoServer® Option

An increasing number of companies employ multiple data centers to improve their quality of service and to help mitigate the impact of catastrophic events such as earthquakes and floods. If one data center goes offline, its workload can be handled by another, healthy data center to avoid service interruptions. For this strategy to be effective, changes to application data must be continuously replicated to a remote site so that it is quickly ready to handle the workload.

With version 3.0, ScaleOut StateServer introduced the GeoServer option and allowed distributed caching to extend across multiple, geographically distributed data centers. This option replicates stored objects between ScaleOut StateServer stores running on server farms at different sites. The original GeoServer approach accomplishes replication by asynchronously pushing changes across WAN links as soon as objects are updated. For example, a datacenter in New York City can continuously push its updates to Los Angeles, allowing the L.A. site to remain fully synchronized with NYC at every moment.

Version 5.0 of ScaleOut StateServer introduces a new replication model that allows a remote site to pull objects from the local site as they are needed. Pull-based replication allows objects to be shared by a geographically diverse network of in-memory data grids, with policies on each object dictating how frequently remote datacenters should update their copies of the object. Furthermore, the authoritative “master” copy of an object can migrate from datacenter to datacenter as demand dictates.

images/diagrams/grid-diagram.png

[Note] Note

The ScaleOut GeoServer option is licensed separately from ScaleOut StateServer. Its functions are enabled by a license key from ScaleOut Software. Push-only replication can be licensed with the ScaleOut GeoServer DR option. Please contact ScaleOut Software sales for details.

Push Replication

The ScaleOut GeoServer option quickly and efficiently replicates cached data to a remote server farm for access after a site-wide failure. To start replication, you simply connect to a server on a remote farm using an existing virtual private network or other secure communications channel. ScaleOut GeoServer automatically configures itself to distribute traffic among servers within the remote server farm and then automatically forwards storage updates to the remote farm. To maximize performance and availability, all servers within both StateServer farms participate in data replication. As servers are added or removed at each farm, GeoServer automatically reconfigures its network connections to maintain the best possible replication performance without the need for manual intervention.

Pull Replication

GeoServer pull replication was introduced in version 5.0 of ScaleOut StateServer. Pull replication differs from push replication in that objects are only transmitted across a WAN to remote datacenters as the objects are needed. Furthermore, the frequency that remote sites refresh their copies (called proxies) of an object can be adjusted by setting up a coherency policy that controls how out-of-date a replicated object may become before it is refreshed in a remote datacenter. A loose coherency policy can result in reduced bandwidth usage between sites when compared to push replication, since fewer updates are sent across the link between datacenters.
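A coherency policy of this kind can be modeled as a maximum age for a proxy: the remote site re-fetches the object only when its copy is older than the policy allows. The class, the `max_age` parameter, and the explicit `now` values below are assumptions made for illustration and are not GeoServer configuration settings.

```python
class Proxy:
    """Remote copy of an object, refreshed when older than max_age seconds."""

    def __init__(self, fetch_master, max_age, now):
        self._fetch_master = fetch_master
        self._max_age = max_age
        self.fetches = 0  # Counts transfers across the (simulated) WAN.
        self._refresh(now)

    def _refresh(self, now):
        self.value = self._fetch_master()
        self._fetched_at = now
        self.fetches += 1

    def read(self, now):
        if now - self._fetched_at > self._max_age:
            self._refresh(now)
        return self.value


master = {"value": 1}
proxy = Proxy(lambda: master["value"], max_age=30, now=0)
master["value"] = 2                   # The master copy changes.
within_policy = proxy.read(now=10)    # Still returns the older copy.
after_policy = proxy.read(now=40)     # Copy is too old, so it is re-fetched.
```

A larger `max_age` means fewer transfers between sites at the cost of reading older data, which is the bandwidth tradeoff described above.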

Updates performed on a remote, unlocked proxy of a replicated object will cause the updated value to be pushed across the WAN back to the datacenter containing the master copy of the object.

Ownership of an object can migrate from datacenter to datacenter. The datacenter that contains the master copy of an object will relinquish its ownership if an application in a remote datacenter acquires a lock on a proxy of the object (by either reading or locking the object). GeoServer’s pull replication architecture thereby extends ScaleOut StateServer’s distributed locking model to span datacenters, allowing a single thread on a single machine to acquire a lock so that no other client code across a globally distributed network of caches will be allowed to acquire a lock on that same object.

Because the mastership of an object can migrate from store to store, it is common for GeoServer pull replication to be configured in both directions (that is, Los Angeles would be configured to pull from New York, and New York would be configured to pull from Los Angeles). This allows the master copy of an object to migrate from datacenter to datacenter as needed. GeoServer can also manage pull replication across more than two sites. For example, if three datacenters are involved, then each site will need to be separately configured to pull from the other two. If the remote sites will not be locking any replicated objects, then configuring bi-directional replication is not necessary, since the master copy of the object will always remain in the local site.

Remote Client Option

In some situations, it may be advantageous to run ScaleOut StateServer on a separate server farm that is networked to a Web or application server farm. This approach lets you provision the SOSS farm with the dedicated CPU, memory, and networking resources required to handle the largest possible storage loads. It also offloads SOSS’s use of these resources from your Web and application server farm.

To support this usage, the Remote Client option lets your application access an SOSS store from a networked computer. Once configured, the Remote Client option automatically connects to all of the servers in the SOSS distributed store, and it load-balances its access requests to the servers within the store. The Remote Client option automatically handles membership changes in the SOSS store by retrying access requests if a server fails and by tracking the addition of new servers to the SOSS store.

images/diagrams/RemoteClient.png

[Note] Note

The ScaleOut Remote Client option is licensed separately from ScaleOut StateServer. Its functions are enabled by a license key from ScaleOut Software.

ScaleOut ComputeServer®

The optional ScaleOut ComputeServer integrates a powerful computational platform with ScaleOut StateServer’s distributed, in-memory storage. This enables applications to quickly and easily analyze data stored in the in-memory data grid using a technology called "parallel data analysis." It builds upon ScaleOut StateServer’s parallel query capability to analyze data with the grid servers instead of moving queried data back into a client system for lengthy, sequential analysis.

ScaleOut ComputeServer introduces a parallel method invocation (PMI) API that lets developers quickly write data-parallel programs which implement "map/reduce" semantics on a queried set of stored objects. This powerful mechanism enables applications to obtain scalable performance by seamlessly distributing data-parallel tasks across both CPU cores and servers. Both the map and reduce methods only need to manipulate in-memory data which ScaleOut StateServer automatically accesses from the distributed cache. Versions of this new API are available for C#, Java, and C/C++.
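The data-parallel pattern described here (map each object where it is stored, combine results within each server, then merge across servers) can be sketched with the Python standard library. This is a generic illustration rather than the PMI API; the function names and sample data are hypothetical, and every partition is assumed to hold at least one object.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_map_reduce(partitions, map_fn, reduce_fn):
    """Apply map_fn to every object and merge all results with reduce_fn."""

    def run_partition(partition):
        # Map each object, then combine the partition's results locally.
        return reduce(reduce_fn, (map_fn(obj) for obj in partition))

    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(run_partition, partitions))
    # Merge the per-partition results into one final value.
    return reduce(reduce_fn, partials)


# Each inner list stands in for the objects held by one server (sample data).
partitions = [
    [{"qty": 10, "price": 5}, {"qty": 2, "price": 50}],
    [{"qty": 1, "price": 20}],
    [{"qty": 3, "price": 10}, {"qty": 4, "price": 25}],
]
total_value = parallel_map_reduce(
    partitions,
    map_fn=lambda position: position["qty"] * position["price"],
    reduce_fn=lambda a, b: a + b,
)
```

Only one partial result per partition is merged at the end, so the objects themselves never have to be moved to the caller.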

images/diagrams/pmi.png

To simplify and automate the deployment of application code to grid servers for parallel method invocations, ScaleOut ComputeServer enables C# and Java applications to define an invocation grid prior to running parallel method invocations on a collection of objects (called a named cache). The invocation grid specifies the application’s executable file and libraries which are needed to perform an invocation. When an invocation grid is loaded, ScaleOut ComputeServer creates a set of worker processes, one per grid server, and loads the application’s executable file and libraries in preparation for running parallel method invocations. Once a named cache is associated with an invocation grid, all parallel method invocations on this named cache automatically are sent to the invocation grid’s worker processes for execution.

ScaleOut ComputeServer also includes an API called "single method invocation" (SMI) which lets C# and Java applications invoke a method on a specified object, supply parameters to the invocation, and receive the method’s result value. Because of its highly optimized implementation which avoids all unnecessary network copies, SMI can be used to efficiently analyze a targeted set of stored objects as an alternative to the map/reduce model provided by SOSS’s Parallel Method Invocation (PMI). In addition, applications can use SMI to efficiently update stored objects without replacing their full contents in a manner similar to the use of stored procedures in database systems.

[Note] Note

ScaleOut ComputeServer is licensed separately from ScaleOut StateServer. Its features are enabled by a license key from ScaleOut Software and include all of the features of ScaleOut StateServer.

ScaleOut StreamServer™

ScaleOut StreamServer combines a scalable, stream-processing compute engine with an integrated, in-memory data grid (IMDG) into a powerful, unified software platform for stateful stream processing. Applications can perform lightning-fast event analysis using sophisticated in-memory state tracking to provide deep introspection and precise real-time feedback. Ideal for a wide range of applications, including the Internet of Things (IoT), manufacturing, logistics, and financial services, ScaleOut StreamServer introduces breakthrough technology for the next generation in stream processing.

Live systems generate streams of incoming events that need to be tracked, correlated, and analyzed to identify patterns and trends and then to generate immediate feedback and alerts to steer operations. With today’s ever more complex real-time systems, it’s not enough to just analyze patterns within data streams using conventional techniques. Applications need deeper introspection to extract full value from the telemetry they receive. They need to build dynamic models of data sources that they can continuously update and analyze. Called stateful stream processing and popularized as the “digital twin” by Gartner, this breakthrough approach can harness machine learning, neural networks, and other advanced techniques to enable deep introspection and provide precise, timely feedback for live systems.

ScaleOut StreamServer’s innovative architecture delivers both breakthrough capabilities and peak performance for stateful stream processing. It processes incoming data streams within an in-memory data grid — where the data lives — ensuring minimum latency and peak throughput. Other platforms need to pull state information from remote data stores, such as database servers and distributed caches; this creates delays and network bottlenecks. Instead, ScaleOut StreamServer delivers streamed events directly to their associated state data, enabling immediate, fully contextual processing. Its transparently scalable platform minimizes the latency required for event tracking and analysis, ensuring timely feedback and/or alerts for the largest workloads.

ScaleOut StreamServer’s capabilities are delivered as an intuitive, easy to use SDK that makes application development in C# and Java simple and straightforward. Key features and capabilities include:

  • All of the features in ScaleOut StateServer and ScaleOut ComputeServer
  • Integrated IMDG and stream-processing engine to enable digital twin models while avoiding unnecessary data motion
  • Automatic code shipping to grid servers to simplify application deployment
  • Support for Reactive Extensions APIs for fast, straightforward event processing using familiar APIs
  • Integration with Kafka connectors and producers to enable seamless connectivity to Kafka messaging pipelines
  • Transparently scalable Kafka connections that maximize messaging throughput as the workload grows
  • Comprehensive time windowing libraries that make it easy to add time windowing to digital twin models
  • Automatic event routing to associated grid objects that scales throughput as grid servers are added

[Note] Note

ScaleOut StreamServer is licensed separately from ScaleOut StateServer. Its functions are enabled by a license key from ScaleOut Software.

ScaleOut hServer®

ScaleOut hServer enables standard Hadoop MapReduce programs to access data directly from ScaleOut StateServer’s IMDG and provides a full execution engine that enables standard Hadoop MapReduce programs to execute entirely within ScaleOut hServer’s infrastructure. ScaleOut hServer runs in both Linux and Windows environments and is certified for use with both Cloudera 5 and Hortonworks 2.1.

In addition to supporting Hadoop MapReduce, applications can easily create, read, update and delete fast-changing data in the ScaleOut IMDG using straightforward Java APIs. Together, these capabilities enable you to bring the power of Hadoop’s analytics to live, operational systems.

Applications also can use ScaleOut hServer as an in-memory data cache for HDFS data sets which fit within the IMDG’s memory. In this usage model, when you run a Hadoop MapReduce job, key/value pairs pass from your HDFS record readers into the mappers, and ScaleOut hServer stores them in the IMDG. On subsequent runs, it transparently reads key/value pairs from the IMDG, providing a significant speed-up in data access time.

ScaleOut hServer’s new Java API library integrates Hadoop MapReduce with ScaleOut StateServer’s in-memory data grid (IMDG). This open source library (licensed under the Apache License, Version 2.0) consists of several components: a Hadoop MapReduce execution engine, which runs MapReduce jobs in memory without using Hadoop job trackers or task trackers, and four I/O components to pass data between the IMDG and MapReduce job. The I/O components include the Named Map Input Format, the Named Cache Input Format, and the Grid Output Format that together allow MapReduce applications to use the IMDG as a data source and/or result storage for MapReduce jobs. In addition, the Dataset Input Format accelerates the performance of MapReduce jobs by caching HDFS datasets in the IMDG.

Using ScaleOut hServer, developers can write and run standard Hadoop MapReduce applications in Java, and these applications can be executed stand-alone by ScaleOut hServer’s execution engine. The Apache Hadoop distribution does not need to be installed to run MapReduce programs; it is only needed to optionally make use of other Hadoop components, such as the Hadoop Distributed File System (HDFS). (If HDFS is used to store data sets analyzed by MapReduce, ScaleOut hServer should be installed on the same cluster of servers to minimize network overhead.) ScaleOut hServer’s execution engine offers very fast job scheduling (measured in milliseconds), highly optimized data combining and shuffling, in-memory storage of intermediate key/value pairs within the IMDG, optional use of sorting, and fast, pipelined access to in-memory data within the IMDG for analysis. In addition, ScaleOut hServer automatically sets the number of splits and partitions for IMDG-based data. Lastly, the performance of the Hadoop MapReduce engine automatically scales as servers are added to the cluster and IMDG-based data is automatically redistributed across the cluster.

Developers can use ScaleOut hServer’s Java APIs to create, read, update, and delete objects within the IMDG. This enables MapReduce applications to input "live" data sets which are stored and updated within the IMDG. Complex IMDG-based objects can be stored within a named cache, which provides comprehensive semantics, such as object timeouts, dependency relationships, pessimistic locking, and access by remote IMDGs. These objects are input to MapReduce applications using the Named Cache input format. Alternatively, large populations of small key/value pairs can be efficiently stored within a named map, which provides highly efficient memory usage and streamlined semantics following the Java concurrent map model. These objects can be input to MapReduce applications using the Named Map input format. The Grid output format can be used to output objects from MapReduce applications to either a named cache or a named map.

ScaleOut hServer is available as a free community edition for use on up to 4 servers or 256GB of data, or for unrestricted use with a commercial edition.

Management Tools

You can manage ScaleOut StateServer using three management tools:

The management console uses a centralized, graphical user interface that gives you the status and performance of ScaleOut StateServer’s distributed store and of its participating servers, called hosts. You also can configure and control individual hosts from the console, which runs on any server in the farm that runs ScaleOut StateServer. You can join all hosts to the distributed store or have all hosts leave the store with one command. In addition, you can simultaneously restart the StateServer service on all hosts to form a new distributed store. Using the Remote Client option, you can manage a remote SOSS store from a networked administrative workstation. The management console includes real-time performance charting and a "heat map" that shows activity and health of the in-memory data grid.

The soss.exe command line program provides all of the capabilities of the management console with individual commands that you can run from a command prompt. You can use soss.exe to quickly obtain status information, make configuration changes, control hosts, or wait to be notified of a status change. You also can use this tool to incorporate control of ScaleOut StateServer into your command-line scripts.

The geos.exe command line program adds support for the GeoServer option. When used in conjunction with the soss.exe command line program, it provides all of the capabilities of the management console.

Optional ScaleOut Management Pack

The ScaleOut Management Pack adds important enhancements to ScaleOut StateServer’s built-in management tools for managing, analyzing, and protecting data stored in ScaleOut StateServer’s distributed grid. The Management Pack contains two components: an object browser for visually browsing and managing objects stored in the in-memory data grid and a parallel backup and restore feature for archiving its contents in the file system.

Object Browser

The ScaleOut Object Browser lets you directly browse data stored within the in-memory data grid. This gives developers and administrators a unique new means of accessing the contents of the data grid, including both metadata and serialized data for individual C/C++, Java, and .NET objects. In addition, the object browser can load .NET assemblies so that it can deserialize .NET objects and display properties/fields from custom classes, including items within ASP.NET session objects.

images/Browsing.png

The object browser can also help you manage stored data. For example, you can browse the data grid to find specific objects by name or find all objects in a particular named cache within the grid that may be of interest. Objects can be sorted by name, size, or other attributes. The object browser also lets you clear individual objects, groups of objects, or the entire data grid.

Parallel Backup/Restore

The ScaleOut Parallel Backup and Restore Utility saves and restores all of the data within the data grid or a selected named cache to and from the file system. Backup/restore operations are allowed while the grid is active. This utility delivers extremely high performance, and its unique, fully parallel architecture ensures that backup/restore operations never become a bottleneck as the grid grows to handle large volumes of data. For maximum speed and scalability, backup/restore operations are performed in parallel on all grid servers.

Beyond archiving grid data for disaster recovery, the ability to perform backups while the data grid is active has many additional uses. For example, snapshots of the grid’s data can be captured at key times and saved for later analysis, and multiple snapshots can be taken over time to analyze trends.

Backup/restore operations are initiated and controlled from the ScaleOut Management Console. The console also reports on the status of an ongoing operation and automatically coordinates requests from multiple consoles. ScaleOut StateServer’s command-line management program also can be used to manage backup and restore requests.

By default, backups are targeted to a unique, time-stamped set of files within ScaleOut StateServer’s backup directory on each grid server. You can also specify a path name to a file share accessible by all grid servers, and all backup files will automatically be merged into that directory instead. For maximum flexibility in managing backed-up data, you can collect the backup files created in parallel on the grid servers and then restore them from a single file share.
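
The backup flow described above can be illustrated with a minimal Python sketch. It is not ScaleOut's implementation or API: the grid is modeled as a plain dictionary of per-server objects, and the function names (backup_server, parallel_backup, collect_to_share, restore_from_share), the JSON file format, and the file naming scheme are all assumptions made for illustration. The sketch shows each server writing its own time-stamped file in parallel, the files being collected into one shared directory, and a restore reading them back from that single location.

```python
import json
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone
from pathlib import Path


def backup_server(server_name, objects, backup_root, stamp):
    """Write one server's share of the grid to its own time-stamped file."""
    target_dir = Path(backup_root) / server_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"backup-{stamp}-{server_name}.json"
    target.write_text(json.dumps(objects, sort_keys=True))
    return target


def parallel_backup(grid, backup_root):
    """Back up every server concurrently; return the list of files written."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    with ThreadPoolExecutor(max_workers=max(1, len(grid))) as pool:
        futures = [
            pool.submit(backup_server, name, objects, backup_root, stamp)
            for name, objects in grid.items()
        ]
        return [future.result() for future in futures]


def collect_to_share(backup_files, share_dir):
    """Copy the per-server backup files into a single shared directory."""
    share = Path(share_dir)
    share.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(path, share)) for path in backup_files]


def restore_from_share(share_dir):
    """Rebuild the full set of objects from every backup file in the share."""
    restored = {}
    for path in sorted(Path(share_dir).glob("backup-*.json")):
        restored.update(json.loads(path.read_text()))
    return restored


# Hypothetical three-server grid; keys and values are made up for the demo.
demo_grid = {
    "server1": {"cart:17": ["book"]},
    "server2": {"cart:42": ["lamp", "desk"]},
    "server3": {"session:abc": {"user": "pat"}},
}
with tempfile.TemporaryDirectory() as root:
    files = parallel_backup(demo_grid, Path(root) / "local")
    collect_to_share(files, Path(root) / "share")
    print(sorted(restore_from_share(Path(root) / "share")))
```

Because every server writes only its own objects, the work per server stays roughly constant as servers are added, which is the property the parallel architecture relies on.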

Together, the Management Pack’s capabilities dramatically increase both the visibility and the manageability of the grid. Your data grid applications can benefit in numerous ways. For example, e-commerce applications can use the object browser to inspect customer shopping carts in real time, tracking sales and managing inventory to optimize supply chain management. Likewise, financial applications can use the backup and restore utility to quickly and repeatedly take snapshots of data for later analysis of market trends, which is particularly useful for portfolio risk analysis and algorithmic trading scenarios.

Note

The ScaleOut Management Pack is licensed separately from ScaleOut StateServer. Its functions are enabled by a license key from ScaleOut Software that includes support for this option.

Advantages

Advantages over a Database Server

ScaleOut StateServer is intended to complement and offload your database management system (DBMS) in managing mission-critical business data. Your DBMS does an outstanding job of storing and retrieving large volumes of slowly changing data (such as purchase orders, customer and employee records, and inventory) over the long term. However, ScaleOut StateServer has several advantages over the use of a DBMS (or a standalone storage server) to store the fast-changing, application-generated workload data your server farm must handle. Here’s why:

  • It eliminates the bottleneck created by a centralized DBMS by allowing simultaneous access to multiple data objects stored on different servers. ScaleOut StateServer’s fast, in-memory storage also avoids the additional overheads associated with disk access and with the use of a DBMS. The following diagram illustrates the performance bottleneck created by a centralized DBMS:

images/diagrams/dbms_bottleneck_example.png

  • Unlike a centralized DBMS, ScaleOut StateServer’s performance grows as the server farm grows. This keeps response times low, and gives you the flexibility to scale your infrastructure to meet increased business demands. Because you just add servers to the farm to increase performance, your incremental cost remains low compared to replacing a large, back-end database server. The following figure illustrates the power of ScaleOut StateServer’s scalability:

images/diagrams/web_scaling_example.png

  • ScaleOut StateServer makes use of the N-way redundancy inherent in a server farm. When a server fails, the remaining servers take on the failed server’s share of the workload data, and they maintain highly available access to stored data. This also allows you to take a server down for planned maintenance while maintaining the service’s high availability. A standalone DBMS or storage server usually cannot be taken offline without disrupting operations.

images/diagrams/failure_example.png

  • ScaleOut StateServer typically recovers much faster than a clustered DBMS because it doesn’t have to restart the service process on another server. All servers in the farm are active and can handle access requests. In most cases, ScaleOut StateServer needs no more than about ten seconds to detect that a server is unavailable and to route access requests to other servers.
  • ScaleOut StateServer’s self-discovery, self-aggregation, and self-healing technologies make it easier to manage than a complex, clustered DBMS. Keeping management overhead to a minimum lowers your costs and improves reliability.
  • Using ScaleOut StateServer, you can keep your workload data close to where it is needed by using the distributed cache as an intermediate storage tier. This lets you optimize performance, and if your DBMS is kept behind a firewall, it avoids the need to traverse the firewall to store rapidly changing data.
  • ScaleOut StateServer’s simple, straightforward semantics make it easy to store and retrieve serializable data objects in your application programs. This lets you avoid the complexity and overhead of database access for managing simple program objects.
  • ScaleOut StateServer typically is significantly less expensive (at least 3X lower cost and 10X lower cost/update) than even low-cost failover database clusters.
  • ScaleOut GeoServer’s data replication is a fast and cost-effective alternative to using DBMS replication for synchronizing cached data across remote sites.
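
The first and third points above (spreading objects across servers, and having the surviving servers absorb a failed server's share) can be illustrated with a small Python sketch. It uses rendezvous (highest-random-weight) hashing purely as a stand-in technique; this documentation does not state which placement algorithm ScaleOut StateServer uses, and the function names, server names, and keys below are hypothetical. The sketch models only which server owns each object and does not model the replicas that keep data available after a failure.

```python
import hashlib


def owner(key, servers):
    """Return the server that owns `key` under rendezvous hashing.

    Each (server, key) pair gets a pseudo-random weight; the server with
    the highest weight owns the key.
    """
    def weight(server):
        digest = hashlib.sha256(f"{server}:{key}".encode()).hexdigest()
        return int(digest, 16)

    return max(servers, key=weight)


def placement(keys, servers):
    """Group keys by owning server."""
    result = {server: [] for server in servers}
    for key in keys:
        result[owner(key, servers)].append(key)
    return result


# Hypothetical farm of four servers holding 1,000 session objects.
servers = ["web1", "web2", "web3", "web4"]
keys = [f"session-{i}" for i in range(1000)]

before = placement(keys, servers)
print({server: len(owned) for server, owned in before.items()})

# Simulate the failure of web4: only its objects change owner, and they
# are spread across the surviving servers.
survivors = [s for s in servers if s != "web4"]
after = placement(keys, survivors)
print({server: len(owned) for server, owned in after.items()})
```

Because objects are spread over all servers, requests for different objects can be served at the same time by different machines, and removing one server moves only that server's objects rather than reshuffling the whole grid.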