# Computing on Tensors

## Specifying Tensor Algebra Computations

Tensor algebra computations can be expressed in TACO with tensor index notation, which at a high level describes how each element in the output tensor can be computed from elements in the input tensors. As an example, matrix addition can be expressed in index notation as

```
A(i,j) = B(i,j) + C(i,j)
```

where A, B, and C denote order-2 tensors (i.e., matrices), while i and j are index variables that represent abstract indices into the corresponding dimensions of the tensors. In words, the example above states that, for every i and j, the element in the i-th row and j-th column of A is assigned the sum of the corresponding elements in B and C. Similarly, element-wise multiplication of three order-3 tensors can be expressed in index notation as follows:

```
A(i,j,k) = B(i,j,k) * C(i,j,k) * D(i,j,k)
```

The syntax shown above corresponds exactly to what you would write in C++ with TACO to define tensor algebra computations. Note, however, that all index variables have to be declared before a tensor algebra computation is defined. This can be done as shown below:

```c++
IndexVar i, j, k;  // Declare index variables for previous example
```

### Expressing Reductions

In both of the previous examples, all of the index variables are used to index into both the output and the inputs. However, it is possible for an index variable to be used to index into the inputs only, in which case the index variable is reduced (summed) over. For instance, the following example

```
y(i) = A(i,j) * x(j)
```

can be rewritten with the summation more explicit as $y(i) = \sum_{j} A(i,j) \cdot x(j)$ and demonstrates how matrix-vector multiplication can be expressed in index notation.

Note that, in TACO, reductions are assumed to be over the smallest subexpression that captures all uses of the corresponding reduction variable. For instance, the following computation

```
y(i) = A(i,j) * x(j) + z(i)
```

can be rewritten with the summation more explicit as $y(i) = \big(\sum_{j} A(i,j) \cdot x(j)\big) + z(i)$, since the multiplication is the smallest subexpression that captures all uses of $j$,

whereas the following computation

```
y(i) = A(i,j) * x(j) + z(j)
```

can be rewritten with the summation more explicit as $y(i) = \sum_{j} \big(A(i,j) \cdot x(j) + z(j)\big)$, since the entire right-hand side is now the smallest subexpression that captures all uses of $j$.

## Performing the Computation

Once a tensor algebra computation has been defined (and all of the inputs have been initialized), you can simply invoke the output tensor's evaluate method to perform the actual computation:

```c++
A.evaluate();  // Perform the computation defined previously for output tensor A
```

Under the hood, when you invoke the evaluate method, TACO first invokes the output tensor's compile method to generate kernels that assemble the output indices (if the tensor contains any sparse dimensions) and perform the actual computation. TACO then calls the two generated kernels by invoking the output tensor's assemble and compute methods. You can manually invoke these methods instead of calling evaluate, as demonstrated below:

```c++
A.compile();   // Generate output assembly and compute kernels
A.assemble();  // Invoke the output assembly kernel to assemble the output indices
A.compute();   // Invoke the compute kernel to perform the actual computation
```

This can be useful if you want to perform the same computation multiple times, in which case it suffices to invoke compile once before the first time the computation is performed.

### Lazy Execution

It is also possible to compute on tensors without explicitly invoking compile, assemble, or compute. When you attempt to modify or view the output of a computation, TACO will automatically invoke those methods as necessary in order to compute the values in the output tensor. If the input to a computation is itself the output of another computation, TACO will also automatically ensure that the other computation is fully executed first.