Bulk Operations

ScaleOut StateServer (SOSS) takes a different approach to bulk operations than AppFabric does with its DataCache.BulkGet() method. Rather than having a single client retrieve and evaluate a large set of objects, SOSS lets you use distributed strategies that work efficiently with large datasets while minimizing network overhead. Distributed LINQ query support and ScaleOut’s in-memory compute engine allow you to spread bulk operations across the entire server farm.

Common use cases that are addressed by the APIs include:

  • Distributed queries: If you are retrieving a set of objects only so that you can inspect their properties more closely, consider indexing your objects and using ScaleOut’s LINQ query provider instead. This approach uses the service’s built-in support for distributed queries, so all the hosts in your distributed data grid filter your objects instead of just one client.
  • Distributed analysis/updates: If you need to analyze or update a large number of objects, you can use ScaleOut’s in-memory compute engine. The compute engine deploys your custom analysis/update logic to the cache hosts where your objects reside, minimizing network overhead and taking advantage of data locality to improve processing time. The NamedCache.Invoke method provides a straightforward data-parallel programming model. Invoke() operations can be combined with ScaleOut’s LINQ provider to filter the objects being analyzed.
  • Minimizing "wall clock time": If you are simply trying to retrieve a small set of objects in the least amount of time, using the .NET Framework’s Task Parallel Library is a very effective approach (for example, use a Parallel.ForEach() loop to gather a set of objects). ScaleOut’s client libraries work well in multithreaded contexts: accesses can be performed concurrently and are load balanced across the farm’s caching hosts. Parallel operations also work well for bulk insert scenarios where a large number of objects need to be loaded by a single client.
Note

Using the NamedCache.Invoke method to perform distributed operations with the in-memory compute engine requires a ScaleOut ComputeServer™ license.
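
As a minimal sketch of the parallel-retrieval approach described under Minimizing "wall clock time", the following C# helper uses Parallel.ForEach() to fetch a set of keys concurrently. The class name ParallelBulkGet, the GetMany method, the retrieve delegate, and the maxDegreeOfParallelism default are all hypothetical and are not part of the ScaleOut API. The delegate stands in for whatever single-object read your NamedCache client exposes, and it is assumed to return null for a missing key.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelBulkGet
{
    // Retrieves every key concurrently and returns only the objects that were found.
    // 'retrieve' is a hypothetical stand-in for a single-object cache read;
    // it is assumed to be thread-safe and to return null when the key is absent.
    public static IDictionary<string, object> GetMany(
        IEnumerable<string> keys,
        Func<string, object> retrieve,
        int maxDegreeOfParallelism = 8)
    {
        // ConcurrentDictionary lets the loop bodies store results without explicit locking.
        var results = new ConcurrentDictionary<string, object>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        Parallel.ForEach(keys, options, key =>
        {
            object value = retrieve(key);
            if (value != null)
            {
                results[key] = value;
            }
        });

        return results;
    }
}
```

In practice the delegate would wrap a single-object read on your NamedCache instance; check the client library reference for the exact call. Capping the degree of parallelism is a design choice in this sketch that keeps one client from opening an unbounded number of concurrent requests, and the right value depends on your workload.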