Polymorphic ArrayHandle
A major issue with the current version of VTK-m is that template programming madness is causing major inconveniences with the compilation. First, compiling VTK-m code takes a looooooong time. Second, even the simplest programs generate remarkably large executables.
The most direct cause of both these issues is the DynamicArrayHandle (along with its sibling DynamicCellSet) and the dynamic cast and call that happens within the dispatcher's Invoke.
The basic idea of the DynamicArrayHandle is that you can hold an ArrayHandle without knowing its type and storage at compile time. This allows you to, for example, store fields in a DataSet without carrying around a list of template arguments specifying each field array type.
In the current design of VTK-m all types have to be known when you get to the execution environment. That means when using one of these dynamic storage classes when executing a worklet, the code must figure out what type of arrays are being used by trying every combination of every possible type. These combinations grow very quickly.
Take for example the execution of everyone's favorite, Marching Cubes. At a minimum, this algorithm requires 3 arguments: a cell set, a field to contour, and the point coordinates. If just considering the "default" types that VTK-m expects, there are 4 common types of cell sets (2D structured, 3D structured, explicit, and explicit single type). The field is probably one of 2 types (Float32 or Float64). The point coordinates also have 2 possible types (Vec 3 of Float32 or Float64) but additionally 5 (!) storage mechanisms (standard array, uniform coordinates, 2 types of composite vectors, and cartesian product). This is a total of 4 x 2 x 2 x 5 = 80 possible combinations.
So, yes, for a simple worklet operating on a cell set with one field and point coordinates, the template mechanism creates 80 versions of the worklet functions that must be compiled. Furthermore, the templating to create all these versions is quite slow to resolve. Add to that that several compilers we use for VTK-m require multiple passes for different architectures (e.g. CUDA) or optimizations.
Contents
How polymorphic array handles would work
The proposed solution is to introduce a polymorphic array handle where values can be retrieved without knowing how they are stored. The implementation starts with an array portal with virtual methods.
template<typename T>
class ArrayPortalPolymorphic
{
public:
virtual vtkm::Id GetNumberOfValues() const = 0;
virtual T Get(vtkm::Id index) const = 0;
virtual void Set(vtkm::Id index, const T &value) = 0;
virtual ~ArrayPortalPolymorphic() { }
};
A templated subclass pulls values out of the source array portal where the entire type is known.
template<typename T, typename SrcArrayPortalType>
class ArrayPortalPolymorphicImpl : public ArrayPortalPolymorphic<T>
{
public:
ArrayPortalPolymorphicImpl(const SrcArrayPortalType &portal)
: SourceArrayPortal(portal) { }
vtkm::Id GetNumberOfValues() const
{
return this->SourceArrayPortal.GetNumberOfValues();
}
T Get(vtkm::Id index) const
{
return T(this->SourceArrayPortal.Get(index));
}
void Set(vtkm::Id index, const T &value)
{
this->SourceArrayPortal.Set(index, value);
}
private:
SrcArrayPortalType SourceArrayPortal;
};
- The actual implementation should actually have a wrapper class that holds a pointer to the actual polymorphic class. This is to make sure that the virtual methods point to the right class. --Kmorel (talk) 10:12, 9 June 2016 (EDT)
Once this array portal is designed, an array handle can be built around it that takes an array of any type and wraps it in this one unified type. This should be a straightforward implementation of a derived storage, which is document in the User's Guide draft.
The difference between dynamic array handles and polymorphic array handles
Although they are both designed to hide the type at compile time, the existing dynamic array handles and the new polymorphic array handles resolve the type very differently. To access data in the dynamic array handle, you have to cast it back to an array handle of the same type. That means, to access the data you have to try every storage possibility and generate separate code paths to each. If you are accessing data from many dynamic array handles, the code paths explode combinatorially.
In contrast, the polymorphic array handles defer everything to run time. So there is exactly one code path generated and a function pointer jumps the execution to the right code. This is basically how the zero copy VTK data arrays work.
Changes to dynamic array handle
Once the polymorphic array handle is available, the dynamic array handle should be changed to take advantage of it. It needs one or more methods that will return a polymorphic array handle regardless of the underlying type. This could then be used in place of the template mechanisms to get array handles for the worklet.
Pros and Cons
Pros:
- Much faster compilation
- Much smaller executables and libraries
- More potential for precompiling into libraries
- Do not need to guess types in type lists
Cons:
- Memory fetches require virtual method calls
- Cannot be inlined
- Might interfere with vectorization
Open Issues
What should the exports be?
VTK-m is currently a header-only template library that requires everything to be declared. As such, the "export" macros attempt to inline everything. What is the right "export" for these virtual methods?
Read only array portals
Array portals that are not meant to be read might not implement a Set method. When used in templates this is OK because the Set method simply should not be used in read-only contexts. But ArrayPortalPolymorphic does not know what context it is in, so always uses the source's Set method whether it is needed or not.
Either we will have to be more vigilant about implementing Set in all array portals or have a special polymorphic version for read only.
Managing value types
Implementing automatic casting
The polymorphic array handle (and portal) can effectively hide the array storage, but it cannot hide the type of value in the array because it is part of the Get/Set interface.
In our experience with writing worklets, supporting different types (such as Float32 vs Float64 for fields is more of a hassle than a benefit. Filters will probably want to choose one type and cast everything to that base type. But it can be tricky to set up the polymorphic array handle to support any type to cast to. All possible casts have to be implemented in advanced.
What about cell sets?
This implementation does not help with dynamic cell set. In the end, we might just have the filter iterate over all possibilities. It might be the case that each filter has special paths for the different cell sets.
Optimization for voxels
Accessing data through virtual methods could slow down execution, but likely not by much. However, one place where it will matter is in the specialization of some functions for "voxel" type elements (hexahedral cells with axis aligned point coordinates). Things like interpolation is significantly faster and world-to-parametric coordinate computation. We cannot do this specialization if all arrays are hidden behind polymeric classes. We need a mechanism to reintroduce this specialization.