Explicitly specifying the invocation grid

An invocation grid (IG) represents a set of JVMs attached to the grid service processes used to execute MapReduce applications within the IMDG. Each invocation grid is identified by a user-specified name and specifies the necessary dependencies (JARs, classes, folders, or files) and JVM parameters.

By default, ScaleOut hServer automatically creates an IG for each MapReduce application’s execution; the IG includes the JAR file for the MapReduce job as the only dependency. The IG is loaded and the dependencies are copied to the worker nodes and the worker JVMs before the job is run. After execution completes, the IG is unloaded, and the JVMs shut down. ScaleOut hServer supports the use of multiple IGs to independently run different jobs at the same time.

To avoid creating an IG for each job, it can be managed manually and provided as a constructor parameter to the HServerJob instance. This is advantageous for the following reasons:

  • Loading the IG can take a considerable time (up to several seconds). If a relatively short job is run several times in a short time span, each subsequent iteration can share a single invocation grid to avoid load time and maximize performance.
  • If the job’s dependencies include multiple JARs and classes, they can be specified explicitly through the invocation grid.
  • A custom IG can be used to pass parameters to the worker JVMs, such as memory or garbage collector settings.
  • An IG can be reused for NamedMap parallel invocations or queries that may be needed in conjunction with MapReduce applications.

If HServerJob creates its own IG, the job will automatically unload the IG upon completion. However, if the HServerJob is provided with a pre-existing IG, it will not be automatically unloaded after the completion of the job. In this case, the unload() method should be called on the IG object to dispose of the IG when it is no longer needed for further MapReduce jobs or other parallel invocations.

Invocation grids are created by configuring a builder object and calling load(). If the IG is intended to be used for hServer invocations, the builder should be created with HServerJob.getInvocationGridBuilder(…) instead of the InvocationGridBuilder constructor.

In the following example, a custom-built IG is configured with custom JARs, class dependencies, and JVM parameters, used to perform multiple jobs in rapid succession, and then explicitly unloaded:

public static void main(String argv[]) throws Exception {

   //Configure and load the invocation grid
   InvocationGrid grid = HServerJob.getInvocationGridBuilder("myGrid").
                                // Add JAR files as IG dependencies
                                addJar("main-job.jar").
                                addJar("first-library.jar").
                                addJar("second-library.jar").
                                // Add classes as IG dependencies
                                addClass(MyMapper.class).
                                addClass(MyReducer.class).
                                // Define custom JVM parameters
                                setJVMParameters("-Xms512M -Xmx1024M").
                                load();

   //Run 10 jobs on the same invocation grid
   for(int i=0; i<10; i++)
   {
       Configuration conf = new Configuration();

       //The preloaded invocation grid is passed as the parameter to the job
       Job job = new HServerJob(conf, "Job number "+i, false, grid);

       //.........Configure the job here.........

       //Run the job
       job.waitForCompletion(true);
   }

   //Unload the invocation grid when we are done
   grid.unload();
}