Making Data Set Creation more User Friendly
Goals
The goal is to create an API to hide the details of creating a VTK-m dataset. This would be useful for developers and users in general, and make it easier to write importers, tests, etc.
Idea #1: Data Set Builder
A "data set builder" class, which can do several things:
- allows adding cells, coordinate values, etc. one at a time, which is harder to do with array handles directly
- this may be easier for e.g. loading data sets from disk
- (note that we should not enforce ONLY one-at-a-time adding of these values, should also allow e.g. raw pointers to data; have options)
- defines a logical structure for you and helps ensure you've done everything necessary for a proper data set
- has convenience functions for creating e.g. Cartesian rectilinear grids and other multi-step things
- when you're done, it converts e.g. internal STL vectors to array handles and gives you a data set
Question: used for new data sets only, or can you use it to augment a data set? Concrete example: adding a new cell set to an existing data set.
- One solution: not just a "data set builder" but other builders as well, like a "cell set builder", which can be used after having created a data set. Presumably the data set builder would encapsulate/use/interact-with the cell set builder, and so on.
- For the first iteration, I recommend focusing on just creating new data sets. My experience tells me that this is the big use case, and we can always add the modifiers later. --Kmorel (talk) 14:25, 13 August 2015 (EDT)
Question: are there ExplicitDataSetBuilders vs RegularDataSetBuilders, or is it one DataSetBuilder with different helper routines?
- One thought/answer: might be too hard to handle e.g. hybrid cases with the former, so possibly the latter is a better idea
- I'm actually leaning to the former. The point of the data set builders is to make the process easier. I fear that trying to have one class that handles all possibilities will create an inconsistent mess of the interface. I'm also OK with having some of the hybrid cases dropped on the floor. It's not like you won't be able to build hybrid cases, it will just take a few more calls. I suggest creating a mockup of the data builder interface, and that will help us judge the efficacy of the approach. (BTW, the names really should be DataSetBuilderExplicit and DataSetBuilderRegular to match VTK-m's naming convention.) --Kmorel (talk) 14:25, 13 August 2015 (EDT)
Regular DataSet builder
Here's a first pass at some data set builder classes.
template <typename T>
class DataSetBuilderRegular
{
public:
  DataSetBuilderRegular() {}

  //2D regular grids.
  vtkm::cont::DataSet Create(int nx, int ny, const std::string &coordNm);
  vtkm::cont::DataSet Create(int nx, int ny,
                             const T *xRange, const T *yRange,
                             const std::string &coordNm);
  vtkm::cont::DataSet Create(const std::vector<T> &xVals, const std::vector<T> &yVals,
                             const std::string &coordNm);
  vtkm::cont::DataSet Create(int nx, const T *xVals, int ny, const T *yVals,
                             const std::string &coordNm);
  //Add more for Vec, ArrayHandle, (X,Y) arrays, etc.

  //3D regular grids.
  vtkm::cont::DataSet Create(int nx, int ny, int nz, const std::string &coordNm);
  vtkm::cont::DataSet Create(int nx, int ny, int nz,
                             const T *xRange, const T *yRange, const T *zRange,
                             const std::string &coordNm);
  vtkm::cont::DataSet Create(const std::vector<T> &xVals, const std::vector<T> &yVals, const std::vector<T> &zVals,
                             const std::string &coordNm);
  vtkm::cont::DataSet Create(int nx, const T *xVals,
                             int ny, const T *yVals,
                             int nz, const T *zVals,
                             const std::string &coordNm);
  //Add more for Vec, ArrayHandle, (X,Y,Z) arrays, etc.

private:
};
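The range-based overloads above do not say how xRange is interpreted, or whether nx and ny count points or cells. As a minimal sketch, assuming each range is a {min, max} pair and that nx and ny count points along each axis, the bookkeeping such a Create() would need might look like the following. ExpandRange, NumberOfPoints2D, NumberOfCells2D, and FlatPointIndex are hypothetical names used only for illustration; they are not part of VTK-m.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: expand a {min, max} pair into n evenly spaced values.
// Assumes n counts points along the axis, so at least two are required.
template <typename T>
std::vector<T> ExpandRange(int n, const T range[2])
{
  assert(n >= 2);
  std::vector<T> vals(static_cast<std::size_t>(n));
  const T spacing = (range[1] - range[0]) / static_cast<T>(n - 1);
  for (int i = 0; i < n; ++i)
  {
    vals[static_cast<std::size_t>(i)] = range[0] + spacing * static_cast<T>(i);
  }
  return vals;
}

// An nx-by-ny grid of points has nx*ny points and (nx-1)*(ny-1) quad cells.
inline long NumberOfPoints2D(int nx, int ny)
{
  return static_cast<long>(nx) * ny;
}

inline long NumberOfCells2D(int nx, int ny)
{
  return static_cast<long>(nx - 1) * (ny - 1);
}

// Flat index of point (i, j), assuming x varies fastest.
inline long FlatPointIndex(int i, int j, int nx)
{
  return static_cast<long>(j) * nx + i;
}
```

Whichever interpretation is chosen, it is probably worth stating in the interface, since the (nx, xRange) and (nx, xVals) overloads take the same argument types and differ only in meaning.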
Explicit DataSet builder
This first class is for building an explicit data set from a fully defined set of coordinates and connectivity.
template <typename T>
class DataSetBuilderExplicit
{
public:
  DataSetBuilderExplicit() {}

  //2D coordinates.
  vtkm::cont::DataSet Create(const std::vector<T> &xVals, const std::vector<T> &yVals,
                             const std::vector<vtkm::Id> &shapes,
                             const std::vector<vtkm::Id> &numIndices,
                             const std::vector<vtkm::Id> &connectivity,
                             std::string coordName="",
                             std::string cellName="");
  //3D coordinates.
  vtkm::cont::DataSet Create(const std::vector<T> &xVals, const std::vector<T> &yVals, const std::vector<T> &zVals,
                             const std::vector<vtkm::Id> &shapes,
                             const std::vector<vtkm::Id> &numIndices,
                             const std::vector<vtkm::Id> &connectivity,
                             std::string coordName="",
                             std::string cellName="");
  //Add more for T/Id *, Vec, ArrayHandle, etc.

private:
};
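One of the stated goals is that the builder helps ensure you have done everything necessary for a proper data set. For the explicit case, that could include checking that shapes, numIndices, and connectivity agree with each other before anything is converted. The sketch below is one way such a check might look; CheckExplicitCells is a hypothetical helper, and Id is a local stand-in for vtkm::Id so that the example compiles without VTK-m.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef long long Id; // local stand-in for vtkm::Id (assumption for this sketch)

// Hypothetical consistency check for the three explicit cell arrays.
// Returns an empty string when the arrays are consistent, otherwise a
// short description of the first problem found.
inline std::string CheckExplicitCells(std::size_t numPoints,
                                      const std::vector<Id>& shapes,
                                      const std::vector<Id>& numIndices,
                                      const std::vector<Id>& connectivity)
{
  // One shape and one index count per cell.
  if (shapes.size() != numIndices.size())
  {
    return "shapes and numIndices differ in length";
  }

  // The connectivity array is the concatenation of every cell's point ids.
  std::size_t total = 0;
  for (std::size_t c = 0; c < numIndices.size(); ++c)
  {
    if (numIndices[c] <= 0)
    {
      return "cell with no points";
    }
    total += static_cast<std::size_t>(numIndices[c]);
  }
  if (total != connectivity.size())
  {
    return "connectivity length does not match sum of numIndices";
  }

  // Every point id must refer to an existing point.
  for (std::size_t i = 0; i < connectivity.size(); ++i)
  {
    if (connectivity[i] < 0 ||
        static_cast<std::size_t>(connectivity[i]) >= numPoints)
    {
      return "connectivity refers to a nonexistent point";
    }
  }
  return "";
}
```

A real builder might throw or return an error code instead of a string; the string is used here only to keep the sketch small.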
This second class is for building an explicit DataSet one step at a time, in the style of VTK's InsertNextPoint()/InsertNextCell() methods.
template <typename T>
class DataSetIterativeBuilderExplicit
{
public:
  DataSetIterativeBuilderExplicit();

  void Begin(std::string coordNm="", std::string cellNm="");
  vtkm::cont::DataSet Create();

  //Returns the id of the newly added point.
  vtkm::Id AddPoint(const T &px, const T &py, const T &pz);
  //Add more for T*, vec, vector, etc.

  void AddCell(vtkm::Id shape, const std::vector<vtkm::Id> &conn);
  //Add more for Id*, vec etc.

private:
};
The usage is as follows:
DataSetIterativeBuilderExplicit<vtkm::Float32> dsBuilder;
dsBuilder.Begin("coords", "cells");
vtkm::Id id0 = dsBuilder.AddPoint(0,0,0);
vtkm::Id id1 = dsBuilder.AddPoint(1,0,0);
vtkm::Id id2 = dsBuilder.AddPoint(0,1,0);
//AddCell takes a vector of point ids, so wrap each id in a one-element vector.
std::vector<vtkm::Id> conn0(1, id0), conn1(1, id1), conn2(1, id2);
dsBuilder.AddCell(vtkm::CELL_SHAPE_VERTEX, conn0);
dsBuilder.AddCell(vtkm::CELL_SHAPE_VERTEX, conn1);
dsBuilder.AddCell(vtkm::CELL_SHAPE_VERTEX, conn2);
vtkm::cont::DataSet ds = dsBuilder.Create();
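To make the "collect in STL vectors, convert at the end" idea concrete, here is a rough sketch of what the inside of such an iterative builder might hold. It is not the proposed implementation: IterativeBuilderSketch and ExplicitArrays are hypothetical names, Id stands in for vtkm::Id, and Create() returns the raw arrays rather than a vtkm::cont::DataSet so that the example compiles without VTK-m.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef long long Id; // local stand-in for vtkm::Id (assumption for this sketch)

// Stand-in for the arrays a real builder would convert to array handles.
template <typename T>
struct ExplicitArrays
{
  std::string coordName, cellName;
  std::vector<T> x, y, z;
  std::vector<Id> shapes, numIndices, connectivity;
};

template <typename T>
class IterativeBuilderSketch
{
public:
  // Discard any previous state and record the names to use.
  void Begin(const std::string& coordNm = "", const std::string& cellNm = "")
  {
    data = ExplicitArrays<T>();
    data.coordName = coordNm;
    data.cellName = cellNm;
  }

  // Append one point and return its id, which is its position in the arrays.
  Id AddPoint(const T& px, const T& py, const T& pz)
  {
    data.x.push_back(px);
    data.y.push_back(py);
    data.z.push_back(pz);
    return static_cast<Id>(data.x.size()) - 1;
  }

  // Append one cell: its shape, its point count, and its point ids.
  void AddCell(Id shape, const std::vector<Id>& conn)
  {
    assert(!conn.empty());
    data.shapes.push_back(shape);
    data.numIndices.push_back(static_cast<Id>(conn.size()));
    data.connectivity.insert(data.connectivity.end(), conn.begin(), conn.end());
  }

  // A real builder would convert the vectors to array handles here.
  ExplicitArrays<T> Create() const { return data; }

private:
  ExplicitArrays<T> data;
};
```

Because AddPoint() returns the new point's id, callers can build connectivity lists without tracking indices themselves, which is the main convenience over filling the arrays by hand.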
Field adder class
Adding fields is relevant to both regular and unstructured grids, so perhaps it makes sense to put the field-adding functions in a separate class.
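template <typename T>
class DataSetFieldAdder
{
public:
  DataSetFieldAdder() {}

  //Select the coordinate system by index (defaults to the first one).
  void AddPointField(vtkm::cont::DataSet &dataSet,
                     const std::string &fieldName,
                     const std::vector<T> &field,
                     int whichCoords=0);
  //Select the coordinate system by name. No default here: defaulting both
  //overloads would make a three-argument call ambiguous.
  void AddPointField(vtkm::cont::DataSet &dataSet,
                     const std::string &fieldName,
                     const std::vector<T> &field,
                     const std::string &whichCoords);
  //Add more for T*, Vec, ArrayHandle, etc.

  //Select the cell set by index (defaults to the first one).
  void AddCellField(vtkm::cont::DataSet &dataSet,
                    const std::string &fieldName,
                    const std::vector<T> &field,
                    int whichCells=0);
  //Select the cell set by name. No default, for the same reason as above.
  void AddCellField(vtkm::cont::DataSet &dataSet,
                    const std::string &fieldName,
                    const std::vector<T> &field,
                    const std::string &cellName);
  //Add more for T*, Vec, ArrayHandle, etc.

private:
};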
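A separate field adder is also a natural place to catch a common mistake: a field whose length does not match the number of points or cells. The sketch below illustrates that check under stated assumptions. DataSetSketch is a hypothetical stand-in for vtkm::cont::DataSet that just records counts and named fields, and AddPointFieldSketch and AddCellFieldSketch are illustrative names, not proposed API.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for vtkm::cont::DataSet: only what the check needs.
template <typename T>
struct DataSetSketch
{
  std::size_t numPoints;
  std::size_t numCells;
  std::map<std::string, std::vector<T> > pointFields;
  std::map<std::string, std::vector<T> > cellFields;
};

// Attach a point field, rejecting it if it does not have one value per point.
template <typename T>
void AddPointFieldSketch(DataSetSketch<T>& ds,
                         const std::string& fieldName,
                         const std::vector<T>& field)
{
  if (field.size() != ds.numPoints)
  {
    throw std::invalid_argument("point field length must equal the number of points");
  }
  ds.pointFields[fieldName] = field;
}

// Attach a cell field, rejecting it if it does not have one value per cell.
template <typename T>
void AddCellFieldSketch(DataSetSketch<T>& ds,
                        const std::string& fieldName,
                        const std::vector<T>& field)
{
  if (field.size() != ds.numCells)
  {
    throw std::invalid_argument("cell field length must equal the number of cells");
  }
  ds.cellFields[fieldName] = field;
}
```

Whether a mismatch should throw, return an error, or be left to the caller is a design question for the real class; the exception here only keeps the sketch short.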