5.4. Garbage Collector

All stand-alone managed objects are garbage collected. The application does not manage memory allocation and object lifetime itself; the Core does so automatically. The programmer simply creates objects with a new()-like construct and lets the Core delete them when they are no longer used. The Core periodically scans for unused objects, that is, objects that cannot be reached through strong pointers, and deletes them. The Garbage Collector is not conservative: during the mark phase, when the collector marks "reachable" objects, it uses the meta-information of each examined object to determine which references that object holds. The Garbage Collector's activity is fully transparent and requires no assistance from the user.

5.4.1. The Model

This section describes the garbage collector model implemented in the Core.

Basic single-process garbage collectors usually work as follows. When the underlying run-time determines that free memory is low, it triggers a new GC run. Its purpose is to find and delete "lost" objects, that is, objects that cannot be reached from the program stack and global variables through pointers. During this process all threads are stopped so that the GC sees a consistent memory image. The implementation is often conservative and does not use any meta-information (everything is treated as if it could be a pointer). As a consequence, the GC may fail to detect all unreachable objects, may be inefficient, or may block the program for a noticeable time (GC runs do not go unnoticed).

In a distributed environment the problem is much more complex, and the conventional algorithms used by local garbage collectors cannot be applied for various reasons. GC runs may also be quite lengthy, and threads cannot be blocked for their duration. Because of this, most distributed systems have no garbage collector at all, require non-trivial assistance from the user, or are too conservative (the GC cannot tell whether an object is unreachable, which may cause serious memory leaks).

Massiv uses a special form of a local garbage collector. There are no global variables; the whole simulation state is kept in managed objects. Instead of determining which objects are reachable from global variables (which would not make sense, because objects can transparently migrate whereas global variables cannot), the garbage collector determines which objects are reachable from special GC root objects. An object can be a GC root either statically (if its class has the IDL gc_root flag set) or dynamically, by being promoted to a root object. Objects with scheduled migrations are automatically promoted to GC roots until the migration takes place. Apart from GC roots, the scanning process also originates in pointers residing on the program stack. During the scan the Garbage Collector is interested in strong pointers only. Weak pointers are ignored, which is consistent with other GC-based systems that support weak references.

Note

To conclude, the scanning process originates in GC root objects and stack-strong-referenced objects. Strong pointers keep the referenced objects alive. One might wonder where the local semantics of the GC comes from. Remember that strong pointers are the only pointers the GC processes when it scans for reachable objects, and that strong pointers cannot point to remote objects.

When the Core starts a new GC run, it first scans for objects reachable from GC roots and the program stack (the mark phase) and then deletes the unreachable objects (the sweep phase). Only stand-alone non-replica objects are garbage collected.
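
The two phases can be illustrated with a minimal, self-contained sketch. It is not the Core's implementation: ToyObject, mark() and sweep() are hypothetical names, objects are plain indices into a vector, and the sketch assumes only what the text above states, namely that marking starts in GC roots and stack-referenced objects and follows strong pointers only.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy heap: an object is identified by its index in a vector.
// All names are hypothetical and unrelated to the real Massiv classes.
struct ToyObject
  {
  bool gc_root = false;               // static or dynamic GC root
  std::vector< std::size_t > strong;  // strong pointers (traced)
  std::vector< std::size_t > weak;    // weak pointers (ignored by the GC)
  };

// Mark phase: collects the ids reachable through strong pointers
// from GC roots and from the ids referenced from the stack.
std::unordered_set< std::size_t > mark
  (
  const std::vector< ToyObject > &   objects,
  const std::vector< std::size_t > & stack_refs
  )
  {
  std::vector< std::size_t > work( stack_refs );
  for ( std::size_t id = 0; id < objects.size(); ++id )
    if ( objects[ id ].gc_root ) work.push_back( id );

  std::unordered_set< std::size_t > marked;
  while ( !work.empty() )
    {
    const std::size_t id = work.back();
    work.pop_back();
    assert( id < objects.size() );
    if ( !marked.insert( id ).second ) continue;  // already visited
    for ( std::size_t next : objects[ id ].strong ) work.push_back( next );
    }
  return marked;
  }

// Sweep phase: returns the ids that this GC run would delete.
std::vector< std::size_t > sweep
  (
  const std::vector< ToyObject > &          objects,
  const std::unordered_set< std::size_t > & marked
  )
  {
  std::vector< std::size_t > dead;
  for ( std::size_t id = 0; id < objects.size(); ++id )
    if ( marked.count( id ) == 0 ) dead.push_back( id );
  return dead;
  }
```

In this toy model an object that is referenced only by a weak pointer is never marked, so sweep() reports it as dead even though its referrer survives.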

Note

Active objects are not garbage collected either, although they are not considered to be GC roots. Likewise, their migration groups will not be collected if the active objects are referenced by chains of strong pointers originating on the program stack.

5.4.2. GC Roots

GC root objects are stand-alone objects that the Garbage Collector handles in a special way. They are privileged over non-root objects because the Garbage Collector never collects them unless it is explicitly instructed to do so. Moreover, the scan for reachable objects originates in these objects. As a consequence, objects reachable from GC root objects through strong pointers are not collected either. In other words, GC root objects keep other objects alive. For completeness, stack strong pointers have a similar function, because the scan also originates in stack-strong-referenced objects.

The Garbage Collector semantics can be expressed in the language of migration groups as well.

  • Migration groups are deleted at the next GC run if they do not have GC root objects and are not referenced from the stack by strong pointers.

  • Weak-referenced portions of migration groups (objects referenced by weak pointers only) are also deleted.

There are two kinds of GC root objects:

  • Permanent GC roots

    A stand-alone object is a permanent GC root if its class defines the IDL class attribute gc_root = true. All stand-alone instances of such a class are GC roots.

    A permanent GC root object can be turned into a non-root object at the end of its life. This ensures that its migration group will then be collected. Use System::dispose_gc_root().

  • Dynamic GC roots

    A stand-alone object instance can be promoted to a GC root at run time. The promotion is done on a per-instance basis.

    Object instances are automatically promoted to GC roots while they have pending migrations. They are automatically demoted back to non-root objects when the migrations finish. This makes it possible to form and migrate a migration group without a permanent GC root object. Such migration groups can be used to implement messaging:

    class Message;
    class Receiver;
    
    class Sender : public Object /* gc_root = true */
      {
    public:
      void send
        (
        WeakPointer< Receiver > receiver,
        Pointer< Message >      message,
        const STime &           delivery_time
        );
      ...
      };
    
    class Receiver : public Object /* gc_root = true */
      {
    public:
      void message_delivered
        (
        Pointer< Message > message
        );
      ...
      PPointer< Message > last_message;
      };
    
    class Message : public Object /* gc_root = false */
      {
    protected: /// Object interface.
      void delivered_to
        (
        WeakPointer< Object > object,
        const STime &         delivery_time
        );
      ...
      };
    
    void Sender::send
      (
      WeakPointer< Receiver > receiver,
      Pointer< Message >      message,
      const STime &           delivery_time
      )
      {
      message->migrate_to( receiver, delivery_time ); // 1
      }
    
    void Message::delivered_to
      (
      WeakPointer< Object > object,
      const STime &         delivery_time
      )
      { // 2
      Pointer< Receiver > receiver = object.convert();
      receiver->message_delivered( this );
      }
    
    void Receiver::message_delivered
      (
      Pointer< Message > message
      )
      {
      last_message = message; // 3
      }

    1

    Requesting a migration promotes the message to a dynamic GC root object.

    2

    The Message object has been delivered to the migration addressee. The Core calls this method when the migration finishes. The object has already been demoted back to a non-root object; however, it remains stack-strong-referenced by the Core until the method exits.

    3

    If message were not assigned to last_message, the Message object would be garbage collected at the next GC run.
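
The promotion and demotion rule described above can be modelled in a few lines. This is only a sketch under stated assumptions: ToyManaged and its methods are hypothetical and are not part of the Massiv API, and the pending-migration counter is an illustrative way to express "root while migrations are pending", not a description of how the Core tracks it.

```cpp
#include <cassert>

// Hypothetical stand-in for a managed object; not the Massiv Object class.
class ToyManaged
  {
public:
  explicit ToyManaged( bool permanent_root )
    : permanent_root_( permanent_root ) {}

  // Scheduling a migration promotes the object to a dynamic GC root.
  void schedule_migration() { ++pending_migrations_; }

  // Finishing a migration demotes it again once nothing is pending.
  void finish_migration()
    {
    assert( pending_migrations_ > 0 );
    --pending_migrations_;
    }

  // An object is a root either permanently (gc_root = true in this model)
  // or dynamically, while at least one migration is pending.
  bool is_gc_root() const
    {
    return permanent_root_ || pending_migrations_ > 0;
    }

private:
  bool permanent_root_;
  int  pending_migrations_ = 0;
  };
```

In terms of the messaging example, a Message would correspond to ToyManaged( false ): it counts as a root only between schedule_migration() and finish_migration(), after which something else (the stack or last_message) has to keep it alive.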

5.4.3. The API

This section describes the public and semi-public APIs of the Garbage Collector. The application mostly need not care, because the Garbage Collector works automatically and does not have to be controlled by the application at all; still, knowing the API can be advantageous. Working with the GC directly, however, requires non-trivial knowledge of the Core implementation. Reading the Massiv Core Programmer's Documentation is highly recommended.

The Garbage Collector can be controlled either through the System class, which defines a public API that abstracts all Core subsystems, or directly through the GarbageCollector class. The direct access is semi-public and should be used for diagnostics and debugging purposes only.

The garbage-collector-related API provided by the System class is summarized in the following table. It covers how to explicitly force garbage collection, explicitly delete an object, or dispose of a migration group containing a GC root object:

Table 5.7. Public API to the Garbage Collector

Method | Description
System::force_gc() | Forces immediate garbage collection. Although the GC decides itself when to perform a collection, a way to force it explicitly might be useful.
System::dispose_gc_root( local_pointer ) | Demotes the permanent GC root object referenced by local_pointer to a non-root object. The object and its migration group will eventually be garbage collected together.
System::collect_object( local_pointer ) | Instructs the GC to collect the object referenced by local_pointer as soon as possible. This method makes it possible to delete objects explicitly. However, the rest of the migration group remains alive at least until the next GC run.

System::collect_object() cannot always delete the referenced object immediately, for example because the object may be active. If this is the case, the object is tagged and will be deleted at the next GC tick:

class X : public Object
  {
public:
  void f()
    {
    Pointer< Object > self = this;
    System::collect_object( self );
    }
  ...
  };

There is no need to worry about undetected dangling pointer dereferences. The delete operation is equivalent to an immediate migration of the object (without the rest of its migration group, of course) to a "trash can". Any attempt to access such an object will fail because the object is no longer local and cannot be located on remote nodes (ObjectIds are not recycled). Thus, pointers with either local or remote dereference semantics that point to a deleted object will always fail to dereference. Pointer validity can be tested in the usual way.
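
Why non-recycled identifiers make dangling dereferences detectable can be shown with a small stand-alone model. It is a sketch only: ToyNode, ToyId and their methods are hypothetical, the real Core resolves ObjectIds across nodes rather than in a single map, and throwing std::runtime_error is an assumption about how a failed dereference might surface.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

using ToyId = std::uint64_t;  // hypothetical stand-in for an ObjectId

// Hypothetical single-node object table; not a Massiv class.
class ToyNode
  {
public:
  // Every new object gets a fresh id; ids are never handed out twice.
  ToyId create( std::string value )
    {
    const ToyId id = next_id_++;
    const bool inserted = objects_.emplace( id, std::move( value ) ).second;
    assert( inserted );
    (void) inserted;
    return id;
    }

  // Explicit delete: the object disappears, existing ids are left untouched.
  void collect_object( ToyId id ) { objects_.erase( id ); }

  // Validity test that does not dereference.
  bool is_valid( ToyId id ) const { return objects_.count( id ) != 0; }

  // Dereference fails loudly instead of touching freed memory.
  const std::string & dereference( ToyId id ) const
    {
    const auto it = objects_.find( id );
    if ( it == objects_.end() )
      throw std::runtime_error( "object cannot be located" );
    return it->second;
    }

private:
  ToyId next_id_ = 1;
  std::unordered_map< ToyId, std::string > objects_;
  };
```

Because a stale id can never be reassigned to a newer object in this model, a pointer that still holds it fails on every dereference rather than silently reaching an unrelated object.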

Warning

When deleting an object explicitly, keep in mind that other objects from the same migration group remain alive and that their strong pointers to the deleted object become invalid (they fail to dereference but are not reset to NULL_ID). The same care must be taken as if the application were written in plain C++. However, invalid dereferences will be caught.

Suppose that we have a migration group with a single GC root object. If the root object were explicitly deleted with System::collect_object(), the rest of the migration group would risk staying alive until the next GC run, which may cause the problems explained in the previous warning. However, if the root object were disposed of with System::dispose_gc_root(), the whole migration group would be deleted atomically at a future GC run. If the group has multiple GC roots, all of them have to be disposed of.

Note

This holds provided that, basically, none of the objects in the migration group is active at the time of the collection. The condition can be fulfilled simply by preventing the GC from running while there might be active objects. This is the default behavior and can be changed via the registry settings.

The direct API to the Garbage Collector is provided by the GarbageCollector global object. The API is semi-public and gives access to features ranging from diagnostics and debugging functions, statistics, settings, and the Garbage Collector state to migration and replication group enumeration. The complete documentation can be found in the Massiv Core Programmer's Documentation and requires knowledge of the Core internals. Some of the features are explained in the following section.

5.4.4. Running And Configuring GC

Garbage Collector runs are triggered either automatically by the Core logic, when an object count limit (adapted to recent memory use) is reached, or explicitly by calling System::force_gc(). A variety of settings can be used to set up the Garbage Collector. See Section 27.7, “Garbage Collector”.
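
One way to picture an adaptive object count limit is sketched below. The policy shown (after each run the limit becomes twice the number of surviving objects, but never less than a configured minimum) is purely an assumption chosen for illustration; the text above does not specify how the Core adapts its limit, and ToyGcTrigger is a hypothetical class, not part of Massiv.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical trigger policy; the real Core's rule may differ entirely.
class ToyGcTrigger
  {
public:
  explicit ToyGcTrigger( std::size_t min_limit )
    : min_limit_( min_limit ), limit_( min_limit )
    {
    assert( min_limit > 0 );
    }

  // Automatic trigger: run the GC once the live object count hits the limit.
  bool should_run( std::size_t live_objects ) const
    {
    return live_objects >= limit_;
    }

  // Adapt the limit to recent use (assumed rule: twice the survivors,
  // clamped from below by the configured minimum).
  void run_finished( std::size_t survivors )
    {
    limit_ = std::max( min_limit_, 2 * survivors );
    }

  std::size_t limit() const { return limit_; }

private:
  std::size_t min_limit_;
  std::size_t limit_;
  };
```

An explicit System::force_gc() call would correspond to running the collector regardless of what should_run() returns.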