ScaleOut ComputeServer® Java Simple MapReduce Programmer's Guide


This programmer’s guide describes how to use ScaleOut ComputeServer’s Simple MapReduce framework for real-time operational intelligence. The examples in this guide are intended to supplement the ScaleOut StateServer help file. For complete installation instructions, configuration parameters, and other details about the in-memory data grid (IMDG), please consult the ScaleOut StateServer help file.

Installation of the IMDG

Please refer to the ScaleOut StateServer help file for instructions on installing the IMDG service on a cluster of servers.

The following "quick start" instructions for installing the IMDG on Red Hat Linux will get you started. Perform these steps on each server in the cluster:

  1. Download the RPM file from the ScaleOut Software web site.
  2. Install the RPM: sudo rpm -ivh soss-5.4.1-253.el6.x86_64.rpm (It will be installed into /usr/local/soss5.)
  3. Verify the daemon is running: soss query
  4. Configure the network settings to bind the grid service to the desired network, for example: soss set net_interface=10.0.3.0 subnet_mask=255.255.255.0
  5. Join this server to the cluster of IMDG servers: soss join

To install ScaleOut StateServer on Windows, download the appropriate installer from the ScaleOut Software web site and follow the installation instructions. The server is installed as a Windows service and can be configured by using the SOSS Management Console.

The IMDG servers will automatically discover each other and balance the storage workload among all servers.

Sufficient physical memory should be provisioned for the IMDG to hold all data set objects and their associated replicas, following the best practices described in the ScaleOut StateServer help file. By default, each named cache object has one replica stored on a different server to ensure high availability in case a server fails. For example, if a 100GB data set is to be stored in the IMDG, it will require approximately 200GB of aggregate memory for the data set and its replicas (using the default parameters). If the cluster has four servers, this works out to approximately 50GB per server. Note that additional memory is required for object metadata and other data structures used by the IMDG.
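
The sizing arithmetic above can be sketched as a small Java helper. This is only an illustration: the class and method names (GridMemoryEstimate, aggregateGb, perServerGb) are hypothetical and are not part of the ScaleOut API, and the estimate deliberately leaves out the additional memory needed for object metadata and other IMDG data structures.

```java
public class GridMemoryEstimate {

    /**
     * Aggregate memory (in GB) needed for the data set plus its replicas.
     * Hypothetical helper; does not include metadata or other IMDG overhead.
     */
    public static double aggregateGb(double dataSetGb, int replicasPerObject) {
        if (dataSetGb < 0 || replicasPerObject < 0) {
            throw new IllegalArgumentException("sizes and replica counts must be non-negative");
        }
        // One primary copy plus the configured number of replicas.
        return dataSetGb * (1 + replicasPerObject);
    }

    /** Memory (in GB) per server, assuming the load is balanced evenly across servers. */
    public static double perServerGb(double dataSetGb, int replicasPerObject, int servers) {
        if (servers < 1) {
            throw new IllegalArgumentException("cluster must have at least one server");
        }
        return aggregateGb(dataSetGb, replicasPerObject) / servers;
    }

    public static void main(String[] args) {
        // The example from the text: 100GB data set, one replica (the default), four servers.
        double total = aggregateGb(100.0, 1);
        double perServer = perServerGb(100.0, 1, 4);
        System.out.println("Aggregate: " + total + " GB, per server: " + perServer + " GB");
    }
}
```

Treat the result as a lower bound when provisioning servers, since the metadata overhead noted above comes on top of it.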