Scalable Performance at In-Memory Speed

  • Unlock real-time insights from fast-changing data and take immediate action.
  • Leading-edge, in-memory technology delivers both low latency and scalability.
  • Handle large workloads with peak performance just by adding servers.

In-Memory Computing that Scales

Operational systems need to track fast-changing data, and they need to quickly identify and respond to patterns and trends. To meet this need, ScaleOut combines an in-memory data grid (IMDG) with a tightly integrated compute engine. Together, these industry-leading technologies deliver blazingly fast data access and real-time analytics for immediate feedback, and they scale to maintain peak performance as the workload grows. In-memory computing unlocks the potential for operational intelligence.

Keep fast-changing data in memory.

With their fast access, built-in scalability, and high availability, IMDGs offer very low-latency storage for tracking fast-changing, mission-critical data. Unlike database servers and NoSQL stores, they are specifically designed for operational intelligence.

Extend the data warehouse to live data.

In-memory computing technology enables familiar analytics techniques, such as Hadoop MapReduce, to be applied to live data within operational systems. In-memory computing integrates operational intelligence into an overall strategy for business intelligence.

Access Data Fast

In-Memory, Object-Oriented Storage for Fast Access

Delivering fast data access requires keeping data close to where it is used while enabling sharing by multiple clients and scaling to handle large workloads. Unlike database servers and file systems, which are designed for long-term data storage and “mostly read” access, ScaleOut’s in-memory data grid (IMDG) was specifically created to handle fast-changing data with minimal latency.

To ensure fast data access — and, at the same time, simplify application design — IMDGs directly integrate with business logic using an object-oriented representation for stored data. This avoids the complex transformations needed to use data storage providers (such as relational databases), and it fully leverages the object-oriented languages (such as Java, C#, and C++) in which business logic is typically written. Straightforward APIs for creating, reading, updating, and deleting objects accessed by unique, shareable keys minimize access time and simplify integrating an IMDG into business logic. Together, these characteristics ensure fast access, and they streamline development.
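The create/read/update/delete pattern described above can be sketched in a few lines of Java. This is not the ScaleOut API — the class and method names below are purely illustrative — but it shows the shape of a key-addressed, object-oriented store that business logic can call directly, with no relational mapping layer:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: a store of objects addressed by unique, shareable
// keys. All names are hypothetical, not ScaleOut's actual API.
class ObjectStore<K, V> {
    private final Map<K, V> objects = new ConcurrentHashMap<>();

    public void create(K key, V value) { objects.put(key, value); }   // create
    public V read(K key)               { return objects.get(key); }   // read by key
    public void update(K key, V value) { objects.put(key, value); }   // update in place
    public void delete(K key)          { objects.remove(key); }       // delete
}

public class CrudExample {
    public static void main(String[] args) {
        ObjectStore<String, String> store = new ObjectStore<>();
        store.create("order:1001", "pending");
        store.update("order:1001", "shipped");
        System.out.println(store.read("order:1001")); // prints "shipped"
        store.delete("order:1001");
    }
}
```

Because objects go in and come out in the application's native types, business logic reads and writes shared state with ordinary method calls rather than query statements.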

Client Caching for Near “In Process” Latency

IMDGs use “out-of-process” storage to keep their services securely separated from client applications and to facilitate scaling by adding servers. To mitigate the serialization and network data-transfer overheads associated with out-of-process storage, ScaleOut’s IMDG incorporates client caching within its API libraries. This yields read access times that closely approach the performance of an “in-process” store and significantly outperform other “out-of-process” stores, such as database servers. For example, the chart below shows a 6X faster access time for ScaleOut’s IMDG compared to a database server.


Comparison of Read Access Time for ScaleOut’s IMDG vs. a Database Server
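The client-caching idea behind these numbers can be illustrated with a small sketch (all names hypothetical — ScaleOut's actual client cache lives inside its API libraries and also handles coherency across clients). Repeated reads are served from an in-process copy, so the serialization and network round trip is paid only on a miss:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative client-side cache in front of an out-of-process store.
// Not ScaleOut's API; remoteRead stands in for a network round trip.
class CachingClient<K, V> {
    private final Map<K, V> localCache = new ConcurrentHashMap<>();
    private final Function<K, V> remoteRead;
    long remoteReads = 0;  // counts trips to the grid

    CachingClient(Function<K, V> remoteRead) { this.remoteRead = remoteRead; }

    // Serve repeated reads in-process; go to the grid only on a miss.
    V read(K key) {
        return localCache.computeIfAbsent(key, k -> {
            remoteReads++;
            return remoteRead.apply(k);
        });
    }

    // Writes invalidate the cached copy so the next read refetches fresh data.
    void invalidate(K key) { localCache.remove(key); }
}

public class CacheExample {
    public static void main(String[] args) {
        CachingClient<String, String> client =
            new CachingClient<>(key -> "value-for-" + key); // simulated grid read
        client.read("order:1001");  // miss: one remote round trip
        client.read("order:1001");  // hit: served from the local cache
        System.out.println(client.remoteReads); // prints 1
    }
}
```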

Scale Without Bottlenecks

IMDG’s Peer-to-Peer Architecture Scales for Fast Access

Maintaining fast data access as the workload increases requires a scalable architecture. ScaleOut’s IMDG and compute engine use a fully peer-to-peer design to scale both access and computational throughput without bottlenecks. Compare the scalability of an IMDG to that of a database server, which experiences rapidly increasing access times as its maximum throughput is reached:

An IMDG scales to maintain fast data access as the workload increases.


Unlike a database server, ScaleOut’s IMDG maintains fast access times as the workload grows and as the IMDG’s throughput is scaled by adding servers. Also, notice that the IMDG’s access time is always lower than a database server’s, thanks to the use of client-side caching.

Using actual measurements from a test lab with 64 blade servers and a 20 Gbps InfiniBand network, the following graph shows how the IMDG’s throughput scales as the workload grows and servers are added to the IMDG. Because the IMDG’s peer-to-peer architecture avoids centralized bottlenecks, throughput increases linearly, and access times stay fast. Only the network’s bandwidth sets the ultimate limit on linear scalability.


Linear Throughput Scaling with Increasing Workload
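Scaling without bottlenecks rests on the fact that each key maps directly to an owning server, so no central coordinator sits on the access path. A minimal sketch of such hash-based partitioning (real IMDGs use more elaborate schemes, such as partition tables, to limit data movement when servers join or leave):

```java
// Illustrative sketch of hash-based key partitioning in a peer-to-peer grid:
// any client can compute a key's owning server locally, with no central
// directory on the access path. Not ScaleOut's actual placement algorithm.
public class Partitioner {
    static int serverFor(String key, int serverCount) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), serverCount);
    }

    public static void main(String[] args) {
        // Keys spread across 4 servers; adding a 5th rebalances the mapping.
        for (String key : new String[] {"order:1", "order:2", "order:3"}) {
            System.out.println(key + " -> server " + serverFor(key, 4));
        }
    }
}
```

Because the mapping is computed independently by every client and server, adding a server adds both storage and access throughput rather than adding load on a shared chokepoint.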


Integrated Compute Engine Scales for Fast Results

Throughput scaling extends beyond data access to computation: ScaleOut’s integrated, in-memory compute engine performs data-parallel computations using a technique called “parallel method invocation.” This compute engine delivers fast results with linearly scalable throughput by accessing local data stored within the IMDG’s servers. Integrating the IMDG and compute engine avoids data motion to and from disk and reduces network overhead in comparison to separate compute clusters.
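The parallel method invocation pattern can be sketched as follows: an eval method runs on every object in a server's local partition in parallel, and the partial results are then merged. In the grid, this runs on each server against its own in-memory data; the sketch below simulates one server's partition, with a hypothetical domain object standing in for application data:

```java
import java.util.List;

// Illustrative sketch of the parallel method invocation pattern: eval() runs
// on each locally stored object in parallel, and merge() combines partial
// results. Names and domain objects are hypothetical, not ScaleOut's API.
public class ParallelInvoke {
    // Hypothetical domain object: a position to analyze.
    record Position(String symbol, double value) {}

    static double eval(Position p) { return p.value * 1.05; }   // per-object analysis
    static double merge(double a, double b) { return a + b; }   // combine partials

    public static void main(String[] args) {
        // Stands in for the objects held on one grid server.
        List<Position> localData = List.of(
            new Position("AAPL", 100.0), new Position("MSFT", 200.0));
        double result = localData.parallelStream()
            .mapToDouble(ParallelInvoke::eval)
            .reduce(0.0, ParallelInvoke::merge);
        System.out.println(result); // prints 315.0
    }
}
```

Because eval touches only data already resident in that server's memory, adding servers adds compute capacity without adding network traffic for the analysis itself.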

To see the performance benefits of ScaleOut’s integrated compute engine, compare its performance in a financial services application to that of a separate compute cluster, which must access IMDG data across the network (denoted as “random access” in the graph). By avoiding network overhead, ScaleOut’s compute engine delivers linear throughput scaling as the workload grows, while random access quickly saturates the network.


Linear Throughput Scaling for ScaleOut’s Parallel Method Invocation


ScaleOut hServer uses ScaleOut’s in-memory computing architecture to run Hadoop MapReduce applications significantly faster than standard Hadoop platforms, while analyzing fast-changing, in-memory data within an operational environment. A recent demonstration of ScaleOut hServer in a financial services application for a hedge fund showed 40X faster analysis times than Apache Hadoop.

Read the full case study in which ScaleOut hServer tracked portfolio changes for a hedge fund and created real-time trading alerts.

See the video of the case study, which shows ScaleOut hServer completing its analysis in under 350 msec, while Apache Hadoop requires more than 15 seconds.

Handle Large Workloads

Track and Analyze a Terabyte of Fast-Changing Data

ScaleOut’s IMDG and integrated compute engine scale in-memory storage, access throughput, and computing power just by adding servers. With today’s memory and CPU technology, this architecture can hold and analyze large data sets in memory, transparently harnessing the large memory capacity and many cores in a cluster of servers. Cloud computing further enhances this value proposition by making powerful and elastic computing resources available on demand.

The following benchmark test illustrates the power of in-memory computing on a large workload. This test measured ScaleOut ComputeServer’s scalability in a financial services application on a data set that grew to one terabyte. The application modeled a continuous analysis of stock trading strategies over a large pool of stock histories while the stock data was dynamically updated by a simulated market feed. Testing was performed on a compute cluster of 75 virtual servers running in the Amazon Web Services EC2 cloud environment.

The measurements show that ScaleOut ComputeServer was able to complete a full analysis of a terabyte data set within 4.1 seconds, while the data was being updated at the rate of 1.1 GB/second. They also demonstrated linear throughput scaling, which ensures that analysis times do not increase as the workload grows.


Fast Analysis of a 1TB Data Set with Linear Throughput Scaling
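A quick back-of-envelope on the measurements above (assuming decimal units, i.e., 1 TB = 1,000 GB) gives a sense of the aggregate scan rate involved:

```java
// Back-of-envelope arithmetic on the benchmark figures cited above
// (decimal units assumed; per-server figure assumes even load across
// the 75 virtual servers in the test).
public class Throughput {
    public static void main(String[] args) {
        double dataGB = 1000.0;   // 1 TB data set
        double seconds = 4.1;     // measured analysis time
        int servers = 75;         // EC2 virtual servers in the test
        double aggregate = dataGB / seconds;     // ~244 GB/s across the grid
        double perServer = aggregate / servers;  // ~3.3 GB/s per server
        System.out.printf("aggregate %.0f GB/s, per server %.1f GB/s%n",
                          aggregate, perServer);
    }
}
```

An aggregate rate on the order of hundreds of gigabytes per second is only practical because the data is already in memory and partitioned across the servers' local DRAM.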


This benchmark test vividly demonstrates the power of ScaleOut’s in-memory computing technology to provide operational intelligence for live data sets. This technology can be applied in diverse applications, including financial services, e-commerce, retail, media, and the Internet of Things. For example, in a recent proof of concept demonstration for cable television, ScaleOut ComputeServer tracked channel change events flowing from 10M simulated set-top boxes at the rate of more than 30K events/second, correlating and enriching this data in milliseconds while analyzing the entire data set in under 5 seconds. With ScaleOut’s in-memory computing, the power of operational intelligence is now within reach.

Try ScaleOut for free

Use the power of in-memory computing in minutes on Windows or Linux.

Try for Free

Not ready to download?
CONTACT US TO LEARN MORE