Simple MapReduce API

ScaleOut ComputeServer includes a Hadoop-independent, in-memory MapReduce framework for running MapReduce applications on a NamedMap. The Simple MapReduce framework removes all Hadoop-specific configuration, setup, and cleanup, which makes fast MapReduce applications quick and easy to develop. To get started with Simple MapReduce, implement the following interfaces:

  1. com.scaleoutsoftware.soss.simplemr.Mapper
  2. com.scaleoutsoftware.soss.simplemr.Reducer
  3. com.scaleoutsoftware.soss.simplemr.Combiner (optional)

In the next sections, we will convert the word count MapReduce application to Simple MapReduce.
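
Before looking at the real API, it may help to see how a mapper, an optional combiner, and a reducer divide the work in a word count. The following is a minimal, library-free sketch. The SketchMapper, SketchCombiner, and SketchReducer interfaces and the run method are hypothetical stand-ins written for this illustration; they are not the com.scaleoutsoftware.soss.simplemr interfaces, and their signatures should not be assumed to match the product's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

class WordCountSketch {

    // Hypothetical stand-ins for illustration only; NOT the ScaleOut interfaces.
    interface SketchMapper<IN, K, V> {
        void map(IN input, BiConsumer<K, V> emit);
    }

    interface SketchCombiner<K, V> {
        V combine(K key, V left, V right);
    }

    interface SketchReducer<K, V, OUT> {
        OUT reduce(K key, List<V> values);
    }

    static Map<String, Integer> run(List<String> lines) {
        // Mapper: emit (word, 1) for every word in a line.
        SketchMapper<String, String, Integer> mapper = (line, emit) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word, 1);
                }
            }
        };
        // Combiner: merge two partial counts for the same key before grouping.
        SketchCombiner<String, Integer> combiner = (key, left, right) -> left + right;
        // Reducer: sum all partial counts collected for a key.
        SketchReducer<String, Integer, Integer> reducer = (key, values) -> {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        };

        // Map phase, with the combiner applied to each line's local output.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map<String, Integer> partial = new TreeMap<>();
            mapper.map(line, (k, v) -> partial.merge(k, v, (a, b) -> combiner.combine(k, a, b)));
            partial.forEach((k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }

        // Reduce phase: one call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((k, vs) -> result.put(k, reducer.reduce(k, vs)));
        return result;
    }

    public static void main(String[] args) {
        // Prints {brown=1, dog=1, end=1, fox=1, lazy=1, quick=1, the=3}
        System.out.println(run(Arrays.asList("the quick brown fox", "the lazy dog the end")));
    }
}
```

In this sketch the combiner only shrinks the intermediate data (the second line contributes a single partial count of 2 for "the" instead of two separate 1s); the final result is the same with or without it.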

Note

The ScaleOut ComputeServer Simple MapReduce API Java library can be found in soss-simplemr-5.4.jar, and the Java API library for creating, reading, updating, and deleting objects can be found in soss-jnc-5.4.jar. These JARs and their dependencies are located in the java_api and java_api/hslib/simple_map_reduce (JavaAPI and JavaApi/hslib/SimpleMapReduce on Windows) subdirectories of the ScaleOut StateServer installation directory.

Requirements

  • If a combiner is specified, the key it returns must be the same as the key passed in as a parameter, and the input/output keys must implement Writable and Serializable.
  • The input/output keys and values of the mapper and the reducer must implement Writable or Serializable.
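
As an illustration of the serialization requirement, the sketch below defines a custom key type and checks that it survives a serialization round trip. It assumes that "Serializable" above refers to java.io.Serializable. The WordKey class, its fields, and the roundTrip helper are hypothetical names introduced for this example. The value-based equals and hashCode are general Java practice for key types rather than a requirement stated above, and a Writable-based key is not shown because it would depend on the Hadoop libraries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

class SerializableKeySketch {

    // Hypothetical composite key; class and field names are illustrative only.
    static final class WordKey implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String word;
        private final int documentId;

        WordKey(String word, int documentId) {
            this.word = word;
            this.documentId = documentId;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof WordKey)) {
                return false;
            }
            WordKey that = (WordKey) other;
            return documentId == that.documentId && word.equals(that.word);
        }

        @Override
        public int hashCode() {
            return Objects.hash(word, documentId);
        }
    }

    // Serializes the key to bytes and reads it back, as a stand-in for
    // whatever transport a framework might apply to keys.
    static WordKey roundTrip(WordKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(key);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (WordKey) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Key failed to serialize", e);
        }
    }

    public static void main(String[] args) {
        WordKey original = new WordKey("fox", 7);
        WordKey copy = roundTrip(original);
        // Prints true: the copy is a distinct object that compares equal.
        System.out.println(original.equals(copy) && original != copy);
    }
}
```

A quick round-trip check like this can catch a non-serializable field in a key or value class before the application is run on the grid.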