Locking Objects

Pessimistic Locking

ScaleOut StateServer allows an object to be exclusively locked on a caller’s behalf across every server in the farm. This per-object lock can be used to ensure that other clients cannot access or alter the object while the lock is being held. All clients' code should take care to acquire a lock prior to performing an operation that could disrupt (or be disrupted by) other clients.

The following methods on the NamedProtobufCache and NamedPrimitiveCache classes can be used to acquire an exclusive lock on an object:

Method             Description
lock               Locks an object
insert_and_lock    Atomically inserts an object and locks it
get_and_lock       Atomically retrieves and acquires a lock on an object

The API blocks on these calls and polls the service until the lock is acquired. By default, a locking call makes up to 20000 attempts to acquire a lock, with attempts spaced 5 milliseconds apart. The number and frequency of polls are configurable through each method’s respective Options parameter: the options classes contain max_lock_retry_count and lock_retry_interval_ms fields that allow this behavior to be adjusted. Polling behavior can also be configured for an entire named cache instance using default policies.
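To make the polling behavior concrete, the sketch below models it outside the product. It is not the sosscli implementation: LockRetryOptions, max_lock_wait, and acquire_with_retry are hypothetical names introduced for illustration, and only the two field names and their default values (20000 attempts, 5 milliseconds apart) are taken from the text above. With those defaults, a blocked call could wait on the order of 100 seconds (20000 × 5 ms), not counting the time each request takes.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the retry fields of an Options class. Only the
// two field names and their default values come from the documentation.
struct LockRetryOptions {
    int max_lock_retry_count = 20000;  // attempts before giving up
    int lock_retry_interval_ms = 5;    // pause between attempts
};

// Approximate upper bound on how long a locking call may block,
// ignoring the time spent on each round trip to the server.
inline std::chrono::milliseconds max_lock_wait(const LockRetryOptions& opts) {
    return std::chrono::milliseconds(
        static_cast<long long>(opts.max_lock_retry_count) *
        opts.lock_retry_interval_ms);
}

// Polls try_acquire until it succeeds or the attempts are used up.
// Returns the number of attempts made on success, or -1 on exhaustion.
inline int acquire_with_retry(const std::function<bool()>& try_acquire,
                              const LockRetryOptions& opts) {
    for (int attempt = 1; attempt <= opts.max_lock_retry_count; ++attempt) {
        if (try_acquire())
            return attempt;
        if (attempt < opts.max_lock_retry_count)
            std::this_thread::sleep_for(
                std::chrono::milliseconds(opts.lock_retry_interval_ms));
    }
    return -1;
}
```

Lowering max_lock_retry_count in a model like this shortens the worst-case wait, which is the trade-off to consider when a caller would rather fail fast than block.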

If a lock cannot be acquired after the number of retries has been exhausted, the API will either throw a sosscli::exceptions::ObjectLockedException or, if the call’s throw_on_error option is set to false, return a negative server error code (-112) in the result.

The result object returned by a locking call will contain a sosscli::LockTicket object if an exclusive lock was successfully acquired. This LockTicket object is an opaque handle to a lock in the StateServer service, and it must be provided as an argument to subsequent calls that operate on the locked object, because the ticket identifies the caller to the service as the owner of the lock. The server will return an error if the supplied ticket does not match the owner of the lock.

Note

The LockTicket class is a reference-counted, RAII-style wrapper around the internal lock ticket identifier that’s returned from the server during a locking call. If the last remaining copy of a returned LockTicket object goes out of scope while the object is still locked in the server then it will automatically be unlocked by the C++ Native Client API. This reduces the likelihood that a client will inadvertently leave an object locked in the server, which would prevent any other clients/threads from being able to acquire an exclusive lock on it.
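The ownership rule described in this note can be modeled with a shared pointer and a custom deleter. The sketch below is not sosscli::LockTicket; TicketModel and its release callback are hypothetical names, used only to show that the release action runs exactly once, when the last copy of the handle is destroyed.

```cpp
#include <functional>
#include <memory>

// Hypothetical model of a reference-counted, RAII-style lock handle.
// Copies share one control block; the release action fires only when
// the final copy goes out of scope.
class TicketModel {
public:
    TicketModel(int ticket_id, std::function<void(int)> release)
        : state_(new int(ticket_id),
                 [release](int* id) {
                     release(*id);  // last copy is gone: unlock
                     delete id;
                 }) {}

    int id() const { return *state_; }

private:
    std::shared_ptr<int> state_;
};
```

Copying such a handle into another scope or thread keeps the lock alive until every copy is gone, so a copy stored in a long-lived container would, in this model, keep the object locked until the container releases it.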

Once a lock has been acquired, the owner can use the LockTicket with "locked" variations of the named cache’s methods. These methods are able to access the locked object and will result in the lock being either refreshed or released atomically:

Method                       Description
update_locked_and_release    Atomically updates and releases the lock on an object
update_locked_and_retain     Atomically updates and refreshes the lock on an object
get_locked                   Atomically retrieves and refreshes the lock on an object
remove_locked                Removes a locked object
lock_refresh                 Refreshes a client’s/thread’s lock on an object
unlock                       Unlocks a locked object

Note

If a lock is not released or refreshed by a client application then the server will automatically release the lock after 90 seconds. If a lock needs to be held for more than 90 seconds, use lock_refresh or get_locked to refresh the lock and reset the timeout in the service.
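For work that may outlast the timeout, one option is to refresh the lock between units of work once a safety margin has elapsed. The helper below is a sketch under stated assumptions: LockKeepAlive is a hypothetical class, the refresh callback stands in for a call such as lock_refresh, and the 30-second threshold is an arbitrary value chosen to stay well below the 90-second timeout mentioned above.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper that remembers when the lock was last refreshed and
// invokes a caller-supplied refresh action once a threshold has elapsed.
class LockKeepAlive {
public:
    using Clock = std::chrono::steady_clock;

    LockKeepAlive(std::function<void()> refresh,
                  std::chrono::seconds refresh_after = std::chrono::seconds(30),
                  std::function<Clock::time_point()> now =
                      [] { return Clock::now(); })
        : refresh_(std::move(refresh)),
          refresh_after_(refresh_after),
          now_(std::move(now)),
          last_refresh_(now_()) {}

    // Call between units of work. Refreshes only when the threshold has
    // been reached and returns true if a refresh was performed.
    bool poll() {
        const Clock::time_point t = now_();
        if (t - last_refresh_ < refresh_after_)
            return false;
        refresh_();
        last_refresh_ = t;
        return true;
    }

private:
    std::function<void()> refresh_;
    std::chrono::seconds refresh_after_;
    std::function<Clock::time_point()> now_;  // injectable for testing
    Clock::time_point last_refresh_;
};
```

The clock is injected so the timing logic can be exercised without waiting. A single unit of work that itself runs longer than the timeout would still lose the lock, so this pattern only helps when the work can be split into shorter steps.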

For example, a common access pattern when doing locked updates is to first perform an atomic read-and-lock operation on an object using the named cache’s get_and_lock method. After the retrieved object has been changed, a corresponding update-and-release atomic operation can be performed using the update_locked_and_release method:

  // Read & lock the object in the server:
  auto get_result = quote_cache.get_and_lock("GOOG");

  // Change the retrieved object's state to contain the latest info:
  auto quote = get_result.object_ptr();
  quote->set_price(1053.26f);
  quote->set_volume(1657662);

  // Atomically update & unlock the object in the server. The lock ticket
  // returned by the earlier locking get operation must be provided or else
  // an ObjectLockedException will be thrown:
  quote_cache.update_locked_and_release("GOOG", quote, get_result.lock_ticket());

The non-locking calls on the named cache (get, update, put, etc…) do not honor locks held by other clients/threads and should be avoided if other callers expect to have exclusive access to an object.

Optimistic Locking

In situations where collisions are infrequent (for example, in applications that rarely perform updates in the distributed cache, or in applications that rarely update the same objects at the same time), an optimistic locking approach can be taken with updates.

Instead of preventing other clients from accessing the object, the optimistic strategy returns an error only if a collision is detected, allowing the client application to resolve the collision (typically by retrying the operation). If the odds of an update collision are too high then optimistic updates should be avoided, since resolving the collision could become expensive if done frequently.

ScaleOut StateServer supports optimistic locking by performing a version check when an object is updated in the named cache using the update_optimistic method. If the version in the StateServer is newer than the instance being used in the update then a sosscli::exceptions::VersionMismatchException will be thrown (or a negative PutResult::return_code of -120 will be returned to indicate failure if the throw_on_error argument is false), indicating that another client/thread performed an update in the interval between the object’s retrieval and the update.
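The version check itself can be pictured with a small in-memory model. This is not how the StateServer service is implemented; VersionedStoreModel and its members are hypothetical, and the model simply rejects any write whose version differs from the stored one. The -120 return value is the one quoted above; -109 is the object-not-found code used elsewhere in this section.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical in-memory model of a versioned store, showing only the
// compare-then-write rule behind an optimistic update.
class VersionedStoreModel {
public:
    enum ReturnCode { kOk = 0, kNotFound = -109, kVersionMismatch = -120 };

    // Inserts or overwrites unconditionally and returns the new version.
    std::uint64_t put(const std::string& key, const std::string& value) {
        Entry& e = entries_[key];
        e.value = value;
        return ++e.version;
    }

    // Writes only if the caller's version still matches the stored version.
    int update_optimistic(const std::string& key, const std::string& value,
                          std::uint64_t expected_version) {
        auto it = entries_.find(key);
        if (it == entries_.end())
            return kNotFound;
        if (it->second.version != expected_version)
            return kVersionMismatch;  // someone else updated in between
        it->second.value = value;
        ++it->second.version;
        return kOk;
    }

private:
    struct Entry {
        std::string value;
        std::uint64_t version = 0;
    };
    std::map<std::string, Entry> entries_;
};
```

In this model, two callers that read the same version can both attempt an update, but only the first succeeds; the second receives the mismatch code and has to re-read before trying again.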

The version of an object in the StateServer service is always returned to client code when reading, updating, or inserting an object (through the GetResult or PutResult objects that are returned by these calls). Use the result object’s version() accessor to get the version of an object that you have retrieved, and then provide this version to a subsequent update_optimistic call. A version mismatch can be addressed by re-retrieving the object and then retrying the update—the simplified example below illustrates one possible approach:

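// Headers used by this fragment (the ScaleOut client header is omitted):
#include <iostream>
#include <stdexcept>
#include <boost/make_shared.hpp>

const int max_update_attempts = 100;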
int update_attempts = 0;
bool quote_updated = false;
sosscli::NamedProtobufCache<StockQuote> quote_cache("Stock Quotes");

do
{
  if (update_attempts >= max_update_attempts)
    throw std::runtime_error("Max optimistic update attempts exceeded.");

  update_attempts++;

  try
  {
    // Retrieve the quote and change its price. The returned GetResult contains the
    // version number of the object in the server, which will be used later for the update:
    auto get_result = quote_cache.get("GOOG");

    std::cout << get_result.version() << std::endl;

    auto quote = get_result.object_ptr();
    quote->set_price(1052.42f);

    // Perform optimistic update:
    auto update_result = quote_cache.update_optimistic("GOOG", quote, get_result.version());

    // If we get this far without throwing an exception then the update succeeded:
    quote_updated = true;

    // Note that the update result object contains the object's new version, which could be
    // used for a subsequent optimistic update to our local quote object:
    std::cout << update_result.version() << std::endl;
  }
  catch (const sosscli::exceptions::VersionMismatchException&)
  {
    // A collision occurred because another client/thread updated the object in the
    // period between our get() and our update_optimistic() calls. We don't want
    // to overwrite the other thread's changes, so we fall through and allow
    // the do..while loop to retry the entire get/update_optimistic sequence.
  }
  catch (const sosscli::exceptions::ObjectNotFoundException&)
  {
    // Another client or thread must have removed our object.
    // Attempt to recreate it with our desired value.
    auto quote = boost::make_shared<StockQuote>();
    quote->set_ticker("GOOG");
    quote->set_price(1052.42f);
    quote->set_volume(-1);
    quote_cache.insert("GOOG", quote);

    quote_updated = true;
  }

} while (!quote_updated);

Note

If you’d rather not use exceptions to handle errors as in the example above, set the NamedProtobufCache’s throw_on_error policy to false (see Named Cache Options). You can then use the return_code() accessor on the GetResult and PutResult objects to check for success. Error codes are always negative and are defined in the soss_svccli.h header that ships with the product. In this case, one could check return_code() for SOSSLIB_ERR_VER_MISMATCH (-120) and SOSSLIB_ERR_OBJ_NOT_FOUND (-109) instead of catching VersionMismatchException and ObjectNotFoundException, respectively.
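A return-code version of the retry loop might look like the sketch below. It does not use the sosscli classes: ReadOutcome, WriteOutcome, UpdateStatus, and update_price_with_retry are hypothetical, the read and write steps are passed in as callbacks so the control flow can be shown on its own, and the two constants merely mirror the values quoted in this note (real code would use the definitions from soss_svccli.h).

```cpp
#include <cstdint>
#include <functional>

// Values quoted in the note; stand-ins for the soss_svccli.h constants.
const int kErrObjNotFound = -109;
const int kErrVerMismatch = -120;

// Hypothetical, simplified result shapes (not the sosscli result classes).
struct ReadOutcome  { int return_code; std::uint64_t version; float price; };
struct WriteOutcome { int return_code; std::uint64_t version; };

enum class UpdateStatus { Updated, NotFound, GaveUp, Failed };

// Repeats the read/modify/write cycle while the write reports a version
// mismatch, and stops on success, on any other error, or after max_attempts.
inline UpdateStatus update_price_with_retry(
    const std::function<ReadOutcome()>& read,
    const std::function<WriteOutcome(float, std::uint64_t)>& write_optimistic,
    float new_price, int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        const ReadOutcome r = read();
        if (r.return_code == kErrObjNotFound) return UpdateStatus::NotFound;
        if (r.return_code < 0) return UpdateStatus::Failed;

        const WriteOutcome w = write_optimistic(new_price, r.version);
        if (w.return_code >= 0) return UpdateStatus::Updated;
        if (w.return_code == kErrObjNotFound) return UpdateStatus::NotFound;
        if (w.return_code != kErrVerMismatch) return UpdateStatus::Failed;
        // Version mismatch: another writer got in first, so re-read and retry.
    }
    return UpdateStatus::GaveUp;
}
```

Returning a status instead of throwing leaves it to the caller to decide what to do in the not-found case, for example recreating the object as the exception-based sample does.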

Optimistic Locking and the Client Cache

The NamedCache may return a shared pointer to the same instance of an object to two different threads that retrieve the object from the distributed data grid. This can occur because the SOSS C++ Native Client API maintains a near cache of recently-accessed objects in order to cut down on network and serialization overhead (see client cache for details).

This client-side caching can have repercussions for applications that use optimistic locking—one thread’s changes to an object that was retrieved from the client cache will be visible to all other threads referencing that same client cache instance, even if the changes have not yet been pushed to the authoritative SOSS server. This can cause trouble if an optimistic update fails: the copy of the object in the client cache will be out of sync with the authoritative version in the server, so all threads in that client app would continue to see the changed object, even after the update failure.

One of three approaches should be taken to avoid the risk of putting the client cache in a state that is inconsistent with the server:

  • Make changes to a deep copy of the object instead of the retrieved instance and then attempt the optimistic update with the copy.
  • If a VersionMismatchException is thrown from the update_optimistic operation, immediately re-retrieve the object from the SOSS server so as to re-synchronize the client cache with the latest version of the object (as illustrated in the sample above). The instance of the object in the client cache will still be briefly out of sync prior to the re-retrieval, so this approach may not be desirable, depending on the data consistency requirements of your use case.
  • Disable the client cache for the named cache in question using NamedCache::default_cache_policy()::set_use_client_cache(false)
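The first approach, working on a deep copy, can be sketched as follows. QuoteModel and update_via_copy are hypothetical names; QuoteModel stands in for a cached message type (protobuf-generated classes are copyable, so the same idea should carry over), and try_update stands in for the optimistic update call. The shared instance is taken as a pointer to const, so the helper cannot modify what other threads see.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a cached message type.
struct QuoteModel {
    float price;
    long long volume;
};

// Applies a change to a private copy and hands only that copy to try_update.
// shared_instance models the pointer returned from the client cache; it is
// never modified here, so a failed update leaves it untouched.
inline bool update_via_copy(
    const std::shared_ptr<const QuoteModel>& shared_instance,
    const std::function<void(QuoteModel&)>& mutate,
    const std::function<bool(const QuoteModel&)>& try_update) {
    QuoteModel working_copy = *shared_instance;  // deep copy via copy constructor
    mutate(working_copy);
    return try_update(working_copy);
}
```

The cost of this approach is one extra copy per update attempt, which is usually small next to a network round trip but worth keeping in mind for large objects.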