Extensible, faster ModelArrayRefs

Tim Spain edited this page Mar 3, 2023 · 5 revisions

String-keyed ModelArrayRef

Name: Tim Spain

Affiliation: NERSC

Target/Actual Release Date/Project Milestone (delete as appropriate): Dynamics Merge

Reviewers: Kacper Kornet, Alex Smith

GitHub Issue: #249


  1. Introduction
  2. Requirements
    1. Architectural
    2. Functional
    3. Other
  3. Functional
  4. Architectural
    1. CMake and Build
    2. Design Details
  5. Integration
  6. Test Specification
  7. Documentation Impact

Introduction

The current implementation of ModelArrayRef has two outstanding problems:

  1. The set of possible referents is limited to whatever is listed in the ProtectedArray and SharedArray enums.
  2. There is an additional dereference to get the position of the referenced array in the big array of pointers.

The main issue with problem 1 is that it limits the ability of future researchers to add new fields that can be shared by the ModelArrayRef mechanism. If a future researcher wishes to add young ice parameters, then they would have to add HICE_YOUNG, CICE_YOUNG &c. to the enum in order to use the ModelArrayRef class to share the data throughout the model.

If the key used to access the referenced data were a string, then the set of possible keys would be open-ended, albeit at the risk of name clashes. Also, since changing the key to a string means changing the underlying store anyway, the additional dereference could be eliminated at the same time.

Requirements

The new implementation should work as similarly to the current implementation as possible, though if the amount of boilerplate code could be reduced, then all the better.

It must be possible to create a reference either before or after the corresponding data array is registered.

The key for referencing an array should be a string or similar text type, set at the declaration of the variable. This allows rapid inspection of what data a given ModelComponent will have access to.
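As a minimal sketch of how a text key can be fixed at the declaration, the following assumes that TextTag is an alias for a constant C string and that the key is passed as a reference non-type template parameter. The names Tags::H_ICE_YOUNG, Tags::C_ICE_YOUNG, KeyedRef and YoungIceComponent are hypothetical and only illustrate the idea; the actual TextTag definition may differ.

```cpp
#include <string>

// Assumption: TextTag is a constant pointer to constant characters.
typedef const char* const TextTag;

namespace Tags {
// Hypothetical keys for new fields; no central enum has to be edited.
static constexpr TextTag H_ICE_YOUNG = "H_ICE_YOUNG";
static constexpr TextTag C_ICE_YOUNG = "C_ICE_YOUNG";
}

// Stand-in for ModelArrayRef that only records its key.
template <const TextTag& fieldName>
struct KeyedRef {
    // The key is part of the type, so it is visible at the declaration.
    static std::string key() { return fieldName; }
};

// Reading the member declarations shows which fields the component uses.
struct YoungIceComponent {
    KeyedRef<Tags::H_ICE_YOUNG> hiceYoung;
    KeyedRef<Tags::C_ICE_YOUNG> ciceYoung;
};
```

Because the template argument is a reference to a named object with static storage, two tags with the same text but different objects would still produce different types; only the text is used for the lookup.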

The data structure should provide access directly to a pointer to the target data, rather than relying directly on a central store of pointers, as the current implementation does.

Functional

Register an array, data, as the data source for the TextTag key field, with read-write permission RW, in the MARStore object store:

store.registerArray(field, &data, RW);

Alternatively, read-only permission can be specified either explicitly as RO or implicitly by default:

store.registerArray(field, &data, RO);
store.registerArray(field, &data);

To reference some data, declare a ModelArrayRef, dataRef, whose key is the matching TextTag, field, here with read-write access RW:

ModelArrayRef<field, RW> dataRef;

As with the registration, read-only access can be explicitly specified as RO, or left unspecified as the default:

ModelArrayRef<field, RO> dataRef;
ModelArrayRef<field> dataRef;

Before the data can be accessed, the reference needs to obtain the pointer from the store. This is done by passing a reference to a MARStore object to the constructor. For the above object dataRef and store object store the constructor is:

dataRef(store);

or

ModelArrayRef<field, RW> dataRef(store);

when the reference is declared other than as a member variable of an object.
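The two constructor forms can be sketched as follows. This is only an illustration under stated assumptions: Array stands in for ModelArray, MiniStore and MiniRef are hypothetical simplifications of MARStore and ModelArrayRef, and the store here has no waiting list, so the array must be registered before the reference is constructed.

```cpp
#include <string>
#include <unordered_map>

struct Array { double value = 0.0; }; // hypothetical stand-in for ModelArray
typedef const char* const TextTag;    // assumption about the TextTag type

// Hypothetical minimal store: field name to array pointer only.
struct MiniStore {
    std::unordered_map<std::string, Array*> arrays;
    void registerArray(const std::string& field, Array* ptr) { arrays[field] = ptr; }
};

template <const TextTag& fieldName>
class MiniRef {
public:
    // The reference fetches its pointer from the store once, on construction.
    explicit MiniRef(MiniStore& store)
        : ptr(nullptr)
    {
        auto found = store.arrays.find(fieldName);
        if (found != store.arrays.end())
            ptr = found->second;
    }
    const Array& data() const { return *ptr; }

private:
    const Array* ptr;
};

static constexpr TextTag hiceTag = "H_ICE";

class Component {
public:
    // Member form: the reference is constructed in the initializer list.
    explicit Component(MiniStore& store)
        : hice(store)
    {
    }
    double thickness() const { return hice.data().value; }

private:
    MiniRef<hiceTag> hice; // the key is visible at the declaration
};
```

Outside a class, the declaration and the constructor call collapse into a single statement such as MiniRef<hiceTag> standalone(store);.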

In common with the old implementation, the ModelArrayRef class should provide indexing and basic arithmetic operators to act directly on the data. The indexing can be 1-dimensional using the [] operator, multi-dimensional using several arguments to the () operator (at least three dimensions supported), or multi-dimensional using a ModelArray::MultiDim object. Direct access to the ModelArray data will be possible through the data() function. Finally, for all other uses, a reference to the underlying data is obtained using the user-defined conversion operator to ModelArray& (either const or non-const as appropriate).

The template class should respect the read-only or read-write data access by providing only const reference access through ModelArrayRefs declared as RO and non-const references when declared as RW. This could be achieved by partial template specialization.
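One way the partial specialization could look is sketched below. It is not the actual class: Array is a hypothetical stand-in for ModelArray, Ref stands in for ModelArrayRef and takes a raw pointer instead of a store, and the TextTag typedef and the bool values of RO and RW are assumptions. Only operator[], data() and the conversion operator are shown.

```cpp
#include <cstddef>
#include <vector>

typedef const char* const TextTag; // assumption about the TextTag type
constexpr bool RO = false;         // assumed values of the access flags
constexpr bool RW = true;

// Hypothetical stand-in for ModelArray: a flat array of doubles.
struct Array {
    std::vector<double> values;
    double& operator[](std::size_t i) { return values[i]; }
    const double& operator[](std::size_t i) const { return values[i]; }
};

// Primary template: read-only, every accessor returns a const reference.
template <const TextTag& fieldName, bool access = RO>
class Ref {
public:
    explicit Ref(const Array* p) : ptr(p) { }
    const double& operator[](std::size_t i) const { return (*ptr)[i]; }
    const Array& data() const { return *ptr; }
    operator const Array&() const { return *ptr; }

private:
    const Array* ptr;
};

// Partial specialization for RW: same interface, non-const references.
template <const TextTag& fieldName>
class Ref<fieldName, RW> {
public:
    explicit Ref(Array* p) : ptr(p) { }
    double& operator[](std::size_t i) const { return (*ptr)[i]; }
    Array& data() const { return *ptr; }
    operator Array&() const { return *ptr; }

private:
    Array* ptr;
};

static constexpr TextTag hiceTag = "H_ICE";
```

With these definitions, an assignment such as ro[0] = 1.0 through a Ref<hiceTag> fails to compile, while the same statement through a Ref<hiceTag, RW> is accepted.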

Build

The class is templated, and so must be defined in a header. As long as the header is included correctly, the building of executables will change transparently.

Architectural

The ModelArrayRef class itself is rather lightweight. It stores the reference to the data in the form of a pointer to a ModelArray. It also holds a reference to the instance of MARStore that it uses for its references. This is held to allow the removal of dead references within the destructor.

Most of the detail is handled by the MARStore class, which holds the references and can assign them if a new data source is registered. A MARStore object consists of four std::unordered_maps. Two are of the type std::unordered_map<std::string, ModelArray*>, named storeRO and storeRW. These hold a mapping from the text name to the pointer to the ModelArray object which holds that data. The RO or RW access specifier depends on which access type the data was registered with.

The second pair of unordered_maps are of a type that allows multiple entries for a given key, and have a data type of a pointer to a pointer to a ModelArray. These hold a list of all references referring to a given field name key. When a new array is registered under that key, every reference listed for that key has its pointer replaced by the address of the new ModelArray.

The generally accessible interface of MARStore is the registerArray() function. This has signature

void registerArray(const std::string& field, ModelArray* ptr, bool isReadWrite = false)

Calling this function registers the array at the address ptr with all waiting and future ModelArrayRefs associated with that key. The access specification controls what any reference can do with the pointed-to array. An array registered as RO is only accessible through ModelArrayRefs that are themselves RO. A ModelArrayRef designated RW that refers to an array registered as RO is never assigned an array pointer, so any attempt to access the data through it results in a segmentation fault.
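The registration behaviour described above can be sketched as follows. This is a simplified illustration, not the MARStore source: Array stands in for ModelArray, StoreSketch for MARStore, and the names refsRO, refsRW, update() and getFieldAddr() are hypothetical. The use of std::unordered_multimap for the maps that allow several entries per key is an assumption, as is storing every registered array in storeRO so that RW arrays remain readable.

```cpp
#include <string>
#include <unordered_map>

struct Array { double value = 0.0; }; // hypothetical stand-in for ModelArray

class StoreSketch {
public:
    void registerArray(const std::string& field, Array* ptr, bool isReadWrite = false)
    {
        // Assumption: every registered array is readable by RO references.
        storeRO[field] = ptr;
        update(refsRO, field, ptr);
        if (isReadWrite) {
            storeRW[field] = ptr;
            update(refsRW, field, ptr);
        }
    }

    // Hypothetical hook for a reference's constructor: remember where the
    // reference keeps its pointer and fill it in if the array is known.
    void getFieldAddr(const std::string& field, Array*& slot, bool isReadWrite = false)
    {
        auto& store = isReadWrite ? storeRW : storeRO;
        auto& refs = isReadWrite ? refsRW : refsRO;
        auto found = store.find(field);
        slot = (found != store.end()) ? found->second : nullptr;
        refs.emplace(field, &slot);
    }

private:
    // Assumption: a multimap provides "multiple entries for a given key".
    typedef std::unordered_multimap<std::string, Array**> RefMap;

    // Point every reference listed under this key at the new array.
    static void update(RefMap& refs, const std::string& field, Array* ptr)
    {
        auto range = refs.equal_range(field);
        for (auto it = range.first; it != range.second; ++it)
            *(it->second) = ptr;
    }

    std::unordered_map<std::string, Array*> storeRO;
    std::unordered_map<std::string, Array*> storeRW;
    RefMap refsRO;
    RefMap refsRW;
};
```

In this sketch a reference created before registration starts out null and is filled in when registerArray() is called, while an RW request for a field that was only registered as RO stays null, which is the situation that leads to the segmentation fault on access.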

To prevent memory problems, including memory leaks, the store class provides a method that removes the address of a ModelArrayRef from the list of references to update when a new array is registered.
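A hedged sketch of that clean-up is shown below. The names RefListSketch, RefSketch, addReference(), removeReference() and waiting() are hypothetical, Array stands in for ModelArray, and the container choice is an assumption. To stay short, this store keeps only the list of references and does not remember registered arrays.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

struct Array { double value = 0.0; }; // hypothetical stand-in for ModelArray

class RefListSketch {
public:
    void addReference(const std::string& field, Array** slot) { refs.emplace(field, slot); }

    // Forget one slot so later registrations never write to a dead address.
    void removeReference(const std::string& field, Array** slot)
    {
        auto range = refs.equal_range(field);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == slot) {
                refs.erase(it);
                return;
            }
        }
    }

    // Update every slot still listed under this key.
    void registerArray(const std::string& field, Array* ptr)
    {
        auto range = refs.equal_range(field);
        for (auto it = range.first; it != range.second; ++it)
            *(it->second) = ptr;
    }

    std::size_t waiting(const std::string& field) const { return refs.count(field); }

private:
    std::unordered_multimap<std::string, Array**> refs;
};

class RefSketch {
public:
    RefSketch(const std::string& fieldName, RefListSketch& owner)
        : field(fieldName), ptr(nullptr), store(owner)
    {
        store.addReference(field, &ptr);
    }
    // The destructor removes this reference's slot from the store.
    ~RefSketch() { store.removeReference(field, &ptr); }
    // Copying would leave a slot unlisted, so it is disabled in this sketch.
    RefSketch(const RefSketch&) = delete;
    RefSketch& operator=(const RefSketch&) = delete;
    Array* get() const { return ptr; }

private:
    std::string field;
    Array* ptr;
    RefListSketch& store;
};
```

Holding a reference to the store inside each RefSketch is what makes the destructor call possible, matching the reason given above for ModelArrayRef keeping a reference to its MARStore.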

Integration

The largest integration concern is changing existing code from the old ModelArrayRef to the new. Each line of code relating to the old class has an equivalent with the new class.

Registering an array was

store[static_cast<size_t>(enum)] = &array;

and becomes

store.registerArray(tag, &array);

Defining a ModelArrayRef was

ModelArrayRef<enum, oldStore> ref;

and becomes

ModelArrayRef<tag> ref;

The constructor in both cases remains

ref(store);

though the store class is very different.

The largest change in integrating the new version of the ModelArrayRef is converting from the enum to the TextTag for indicating which array is to be registered or referenced.

Test Specification

The ModelArrayRef_test also needs to be converted from the old class to the new.

Documentation Impact

The new system of referencing data through the model needs to be documented.