Mapping and Reduction

Coalesced Reduction

#include <raft/linalg/coalesced_reduction.cuh>

namespace raft::linalg

template<typename InValueType, typename LayoutPolicy, typename OutValueType, typename IdxType, typename MainLambda = raft::identity_op, typename ReduceLambda = raft::add_op, typename FinalLambda = raft::identity_op> void coalesced_reduction(raft::resources const &handle, raft::device_matrix_view<const InValueType, IdxType, LayoutPolicy> data, raft::device_vector_view<OutValueType, IdxType> dots, OutValueType init, bool inplace = false, MainLambda main_op = raft::identity_op(), ReduceLambda reduce_op = raft::add_op(), FinalLambda final_op = raft::identity_op())

Compute the reduction of the input matrix along the leading dimension. Use this API when the desired reduction is along the dimension of the memory layout. For example, a row-major matrix is reduced along the columns, whereas a column-major matrix is reduced along the rows.

Template Parameters:

InValueType – the input data-type of underlying raft::matrix_view
LayoutPolicy – The layout of Input/Output (row or col major)
OutValueType – the output data-type of underlying raft::matrix_view and reduction
IdxType – Integer type used for addressing
MainLambda – Unary lambda applied during accumulation (e.g. L1 or L2 norm). It must be a callable.
ReduceLambda – Binary lambda applied for reduction (e.g. addition (+) for L2 norm). It must be a callable.
FinalLambda – The final lambda applied before the result is stored (STG) (e.g. sqrt for L2 norm). It must be a callable.

Parameters:

handle – raft::resources
data – [in] Input of type raft::device_matrix_view
dots – [out] Output of type raft::device_vector_view
init – [in] initial value to use for the reduction
inplace – [in] whether the reduction result is added in place to the existing output values (true) or overwrites them (false)
main_op – [in] fused elementwise operation to apply before reduction
reduce_op – [in] fused binary reduction operation
final_op – [in] fused elementwise operation to apply before storing results
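
To make the roles of init, main_op, reduce_op, final_op and inplace concrete, here is a CPU-only sketch of the semantics for a row-major matrix. It is not RAFT code: coalesced_reduction_ref is a hypothetical name, main_op is taken as a plain unary callable, and the way inplace combines with the previous output is an assumption based on the parameter description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU-only reference for a row-major matrix stored in a flat vector.
// Each row is contiguous in memory, so reducing one row reads consecutive
// elements; that is the access pattern the coalesced variant targets.
// coalesced_reduction_ref is a hypothetical name, not part of RAFT.
template <typename In, typename Out, typename MainOp, typename ReduceOp, typename FinalOp>
void coalesced_reduction_ref(const std::vector<In>& data,
                             std::size_t n_rows,
                             std::size_t n_cols,
                             std::vector<Out>& dots,
                             Out init,
                             bool inplace,
                             MainOp main_op,
                             ReduceOp reduce_op,
                             FinalOp final_op)
{
  assert(data.size() == n_rows * n_cols);
  assert(dots.size() == n_rows);
  for (std::size_t r = 0; r < n_rows; ++r) {
    Out acc = init;
    for (std::size_t c = 0; c < n_cols; ++c) {
      acc = reduce_op(acc, static_cast<Out>(main_op(data[r * n_cols + c])));
    }
    // Assumption: inplace == true folds the previous output into the result.
    if (inplace) { acc = reduce_op(dots[r], acc); }
    dots[r] = final_op(acc);
  }
}
```

With a squaring main_op, an additive reduce_op and a square-root final_op, this yields one L2 norm per row.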

Map

#include <raft/linalg/map.cuh>

namespace raft::linalg

template<typename OutType, typename Func, typename ...InTypes, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InTypes...>> void map(const raft::resources &res, OutType out, Func f, InTypes... ins)

Map a function over zero or more input mdspans of the same size.

The algorithm applied to k inputs can be described by the following pseudo-code:

for (auto i: [0 ... out.size()]) {
  out[i] = f(in_0[i], in_1[i], ..., in_k[i])
}

Performance note: when possible, this function loads the argument arrays and stores the output array using vectorized CUDA load/store instructions. The vectorization width depends on the size of the largest input/output element type and on the alignment of all pointers.

Usage example:

#include <raft/core/device_mdarray.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/operators.hpp>
#include <raft/linalg/map.cuh>

auto input = raft::make_device_vector<int>(res, n);
// ... fill input ...
auto squares = raft::make_device_vector<int>(res, n);
raft::linalg::map(res, squares.view(), raft::sq_op{}, input.view());

Template Parameters:

OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation
InTypes – data-types of the inputs (device_mdspan)

Parameters:

res – [in] raft::resources
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (InTypes::value_type xs…) -> OutType::value_type
ins – [in] the inputs (each of the same size as the output) (device_mdspan)
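
The pseudo-code above can also be written as a small host-side sketch. This is only an illustration of the element-wise contract, assuming flat std::vector storage in place of device mdspans; map_ref is a hypothetical helper and does not exist in RAFT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration of the map contract: out[i] = f(in_1[i], ..., in_k[i]).
// map_ref is a hypothetical name; std::vector stands in for device mdspans.
template <typename Out, typename Func, typename... Ins>
void map_ref(std::vector<Out>& out, Func f, const std::vector<Ins>&... ins)
{
  // Every input must have the same number of elements as the output.
  assert(((ins.size() == out.size()) && ...));
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = f(ins[i]...);
  }
}
```

With zero inputs the callable takes no arguments, so the same helper covers the "fill with a computed constant" case.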

template<typename InType1, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1>> void map(const raft::resources &res, InType1 in1, OutType out, Func f)

Map a function over one mdspan.

Template Parameters:

InType1 – data-type of the input (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (InType1::value_type x) -> OutType::value_type

template<typename InType1, typename InType2, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1, InType2>> void map(const raft::resources &res, InType1 in1, InType2 in2, OutType out, Func f)

Map a function over two mdspans.

Template Parameters:

InType1 – data-type of the input (device_mdspan)
InType2 – data-type of the input (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input (the same size as the output) (device_mdspan)
in2 – [in] the input (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (InType1::value_type x1, InType2::value_type x2) -> OutType::value_type

template<typename InType1, typename InType2, typename InType3, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1, InType2, InType3>> void map(const raft::resources &res, InType1 in1, InType2 in2, InType3 in3, OutType out, Func f)

Map a function over three mdspans.

Template Parameters:

InType1 – data-type of the input 1 (device_mdspan)
InType2 – data-type of the input 2 (device_mdspan)
InType3 – data-type of the input 3 (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input 1 (the same size as the output) (device_mdspan)
in2 – [in] the input 2 (the same size as the output) (device_mdspan)
in3 – [in] the input 3 (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (InType1::value_type x1, InType2::value_type x2, InType3::value_type x3) -> OutType::value_type

template<typename OutType, typename Func, typename ...InTypes, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InTypes...>> void map_offset(const raft::resources &res, OutType out, Func f, InTypes... ins)

Map a function over zero-based flat index (element offset) and zero or more inputs.

The algorithm applied to k inputs can be described by the following pseudo-code:

for (auto i: [0 ... out.size()]) {
  out[i] = f(i, in_0[i], in_1[i], ..., in_k[i])
}

Usage example:

#include <raft/core/device_mdarray.hpp>
#include <raft/core/resources.hpp>
#include <raft/core/operators.hpp>
#include <raft/linalg/map.cuh>

auto squares = raft::make_device_vector<int>(res, n);
raft::linalg::map_offset(res, squares.view(), raft::sq_op{});

Template Parameters:

OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation
InTypes – data-types of the inputs (device_mdspan)

Parameters:

res – [in] raft::resources
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (auto offset, InTypes::value_type xs…) -> OutType::value_type
ins – [in] the inputs (each of the same size as the output) (device_mdspan)
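
A typical use of the flat offset is to recover multi-dimensional coordinates inside the callable. The host-side sketch below illustrates this by building a row-major identity matrix from the offset alone. Both map_offset_ref and make_identity_ref are hypothetical helpers written for this example, with std::vector standing in for device mdspans.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration of the map_offset contract:
// out[i] = f(i, in_1[i], ..., in_k[i]). Hypothetical helper, not RAFT API.
template <typename Out, typename Func, typename... Ins>
void map_offset_ref(std::vector<Out>& out, Func f, const std::vector<Ins>&... ins)
{
  assert(((ins.size() == out.size()) && ...));
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = f(i, ins[i]...);
  }
}

// Build an n x n row-major identity matrix from the flat offset alone:
// offset i maps to row i / n and column i % n.
inline std::vector<float> make_identity_ref(std::size_t n)
{
  std::vector<float> eye(n * n);
  map_offset_ref(eye, [n](std::size_t i) { return (i / n == i % n) ? 1.0f : 0.0f; });
  return eye;
}
```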

template<typename InType1, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1>> void map_offset(const raft::resources &res, InType1 in1, OutType out, Func f)

Map a function over zero-based flat index (element offset) and one mdspan.

Template Parameters:

InType1 – data-type of the input (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (auto offset, InType1::value_type x) -> OutType::value_type

template<typename InType1, typename InType2, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1, InType2>> void map_offset(const raft::resources &res, InType1 in1, InType2 in2, OutType out, Func f)

Map a function over zero-based flat index (element offset) and two mdspans.

Template Parameters:

InType1 – data-type of the input (device_mdspan)
InType2 – data-type of the input (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input (the same size as the output) (device_mdspan)
in2 – [in] the input (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (auto offset, InType1::value_type x1, InType2::value_type x2) -> OutType::value_type

template<typename InType1, typename InType2, typename InType3, typename OutType, typename Func, typename = raft::enable_if_output_device_mdspan<OutType>, typename = raft::enable_if_input_device_mdspan<InType1, InType2, InType3>> void map_offset(const raft::resources &res, InType1 in1, InType2 in2, InType3 in3, OutType out, Func f)

Map a function over zero-based flat index (element offset) and three mdspans.

Template Parameters:

InType1 – data-type of the input 1 (device_mdspan)
InType2 – data-type of the input 2 (device_mdspan)
InType3 – data-type of the input 3 (device_mdspan)
OutType – data-type of the result (device_mdspan)
Func – the device-lambda performing the actual operation

Parameters:

res – [in] raft::resources
in1 – [in] the input 1 (the same size as the output) (device_mdspan)
in2 – [in] the input 2 (the same size as the output) (device_mdspan)
in3 – [in] the input 3 (the same size as the output) (device_mdspan)
out – [out] the output of the map operation (device_mdspan)
f – [in] device lambda (auto offset, InType1::value_type x1, InType2::value_type x2, InType3::value_type x3) -> OutType::value_type

Map Reduce

#include <raft/linalg/map_reduce.cuh>

namespace raft::linalg

template<typename InValueType, typename MapOp, typename ReduceLambda, typename IndexType, typename OutValueType, typename ScalarIdxType, typename ...Args> void map_reduce(raft::resources const &handle, raft::device_vector_view<const InValueType, IndexType> in, raft::device_scalar_view<OutValueType, ScalarIdxType> out, OutValueType neutral, MapOp map, ReduceLambda op, Args... args)

CUDA version of map followed by a generic reduction operation.

Template Parameters:

InValueType – the data-type of the input
MapOp – the device-lambda performing the actual map operation
ReduceLambda – the device-lambda performing the actual reduction
IndexType – the index type
OutValueType – the data-type of the output
ScalarIdxType – index type of scalar
Args – additional parameters

Parameters:

handle – [in] raft::resources
in – [in] the input of type raft::device_vector_view
neutral – [in] The neutral element of the reduction operation. For example: 0 for sum, 1 for multiply, +Inf for Min, -Inf for Max
out – [out] the output reduced value assumed to be a raft::device_scalar_view
map – [in] the fused device-lambda
op – [in] the fused reduction device lambda
args – [in] additional input arrays
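
The interplay of map, op, neutral and the additional arrays can be illustrated with a sequential host-side sketch. It assumes that the additional arrays are read element-wise alongside in, and it returns the result instead of writing to a scalar view; map_reduce_ref is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential illustration: acc = op(acc, map(in[i], args[i]...)), starting
// from the neutral element. Hypothetical helper, not RAFT API.
template <typename In, typename Out, typename MapOp, typename ReduceOp, typename... Args>
Out map_reduce_ref(const std::vector<In>& in,
                   Out neutral,
                   MapOp map,
                   ReduceOp op,
                   const std::vector<Args>&... args)
{
  // Assumption: every additional array has the same length as the input.
  assert(((args.size() == in.size()) && ...));
  Out acc = neutral;
  for (std::size_t i = 0; i < in.size(); ++i) {
    acc = op(acc, static_cast<Out>(map(in[i], args[i]...)));
  }
  return acc;
}
```

A dot product, for instance, uses multiplication as map, addition as op, 0 as neutral, and the second vector as the single additional array.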

Mean Squared Error

#include <raft/linalg/mean_squared_error.cuh>

namespace raft::linalg

template<typename InValueType, typename IndexType, typename OutValueType> void mean_squared_error(raft::resources const &handle, raft::device_vector_view<const InValueType, IndexType> A, raft::device_vector_view<const InValueType, IndexType> B, raft::device_scalar_view<OutValueType, IndexType> out, OutValueType weight)

CUDA version of the mean squared error function: mean((A-B)**2)

Template Parameters:

InValueType – Input data-type
IndexType – Input/Output index type
OutValueType – Output data-type
TPB – threads-per-block

Parameters:

handle – [in] raft::resources
A – [in] input raft::device_vector_view
B – [in] input raft::device_vector_view
out – [out] the output mean squared error value of type raft::device_scalar_view
weight – [in] weight to apply to every term in the mean squared error calculation
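
As a reading aid, the formula mean((A-B)**2) with a per-term weight can be sketched on the host as follows. The sketch assumes the weight multiplies each squared difference before averaging, as the parameter description suggests; mean_squared_error_ref is a hypothetical name and the function returns the value rather than writing to a scalar view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration of weight * (a[i] - b[i])^2 averaged over all i.
// Hypothetical helper, not RAFT API.
template <typename In, typename Out>
Out mean_squared_error_ref(const std::vector<In>& a, const std::vector<In>& b, Out weight)
{
  assert(a.size() == b.size());
  assert(!a.empty());
  Out acc = Out(0);
  for (std::size_t i = 0; i < a.size(); ++i) {
    const Out diff = static_cast<Out>(a[i]) - static_cast<Out>(b[i]);
    acc += weight * diff * diff;  // the weight is applied to every term
  }
  return acc / static_cast<Out>(a.size());
}
```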

Norm

#include <raft/linalg/norm.cuh>

namespace raft::linalg

template<typename ElementType, typename LayoutPolicy, typename IndexType, typename Lambda = raft::identity_op> void norm(raft::resources const &handle, raft::device_matrix_view<const ElementType, IndexType, LayoutPolicy> in, raft::device_vector_view<ElementType, IndexType> out, NormType type, Apply apply, Lambda fin_op = raft::identity_op())

Compute norm of the input matrix and perform fin_op.

Template Parameters:

ElementType – Input/Output data type
LayoutPolicy – the layout of input (raft::row_major or raft::col_major)
IndexType – Integer type used for addressing
Lambda – device final lambda

Parameters:

handle – [in] raft::resources
in – [in] the input raft::device_matrix_view
out – [out] the output raft::device_vector_view
type – [in] the type of norm to be applied
apply – [in] Whether to apply the norm along rows (raft::linalg::Apply::ALONG_ROWS) or along columns (raft::linalg::Apply::ALONG_COLUMNS)
fin_op – [in] the final lambda op

Normalize

#include <raft/linalg/normalize.cuh>

namespace raft::linalg

template<typename ElementType, typename IndexType, typename MainLambda, typename ReduceLambda, typename FinalLambda> void row_normalize(raft::resources const &handle, raft::device_matrix_view<const ElementType, IndexType, row_major> in, raft::device_matrix_view<ElementType, IndexType, row_major> out, ElementType init, MainLambda main_op, ReduceLambda reduce_op, FinalLambda fin_op, ElementType eps = ElementType(1e-8))

Divide rows by their norm defined by main_op, reduce_op and fin_op.

Template Parameters:

ElementType – Input/Output data type
IndexType – Integer type used for addressing
MainLambda – Type of main_op
ReduceLambda – Type of reduce_op
FinalLambda – Type of fin_op

Parameters:

handle – [in] raft::resources
in – [in] the input raft::device_matrix_view
out – [out] the output raft::device_matrix_view
init – [in] Initialization value, i.e. the identity element for the reduction operation
main_op – [in] Operation to apply to the elements before reducing them (e.g. square for L2)
reduce_op – [in] Operation to reduce a pair of elements (e.g. sum for L2)
fin_op – [in] Operation to apply once to the reduction result to finalize the norm computation (e.g. sqrt for L2)
eps – [in] If the norm is below eps, the row is considered zero and no division is applied
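
The following host-side sketch shows how the three operations and eps fit together for a row-major matrix. It is an illustration under stated assumptions rather than the library's implementation: row_normalize_ref is a hypothetical name, and rows whose norm is below eps are copied to the output unchanged, which is one possible reading of "no division is applied".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration: norm = fin_op(reduce over main_op(x)) per row,
// then every element of the row is divided by that norm.
// Hypothetical helper, not RAFT API.
template <typename T, typename MainOp, typename ReduceOp, typename FinalOp>
void row_normalize_ref(const std::vector<T>& in,
                       std::size_t n_rows,
                       std::size_t n_cols,
                       std::vector<T>& out,
                       T init,
                       MainOp main_op,
                       ReduceOp reduce_op,
                       FinalOp fin_op,
                       T eps = T(1e-8))
{
  assert(in.size() == n_rows * n_cols);
  assert(out.size() == in.size());
  for (std::size_t r = 0; r < n_rows; ++r) {
    const std::size_t base = r * n_cols;
    T acc = init;
    for (std::size_t c = 0; c < n_cols; ++c) {
      acc = reduce_op(acc, main_op(in[base + c]));
    }
    const T norm = fin_op(acc);
    for (std::size_t c = 0; c < n_cols; ++c) {
      // Assumption: a (near-)zero row is passed through without division.
      out[base + c] = (norm < eps) ? in[base + c] : in[base + c] / norm;
    }
  }
}
```

For an L1 normalization, main_op is the absolute value, reduce_op is addition and fin_op is the identity.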

template<typename ElementType, typename IndexType> void row_normalize(raft::resources const &handle, raft::device_matrix_view<const ElementType, IndexType, row_major> in, raft::device_matrix_view<ElementType, IndexType, row_major> out, NormType norm_type, ElementType eps = ElementType(1e-8))

Divide rows by their norm.

Template Parameters:

ElementType – Input/Output data type
IndexType – Integer type used for addressing

Parameters:

handle – [in] raft::resources
in – [in] the input raft::device_matrix_view
out – [out] the output raft::device_matrix_view
norm_type – [in] the type of norm to be applied
eps – [in] If the norm is below eps, the row is considered zero and no division is applied

Reduction

#include <raft/linalg/reduce.cuh>

namespace raft::linalg

template<typename InElementType, typename LayoutPolicy, typename OutElementType = InElementType, typename IdxType = std::uint32_t, typename MainLambda = raft::identity_op, typename ReduceLambda = raft::add_op, typename FinalLambda = raft::identity_op> void reduce(raft::resources const &handle, raft::device_matrix_view<const InElementType, IdxType, LayoutPolicy> data, raft::device_vector_view<OutElementType, IdxType> dots, OutElementType init, Apply apply, bool inplace = false, MainLambda main_op = raft::identity_op(), ReduceLambda reduce_op = raft::add_op(), FinalLambda final_op = raft::identity_op())

Compute the reduction of the input matrix along the requested dimension. This API computes a reduction of a matrix whose underlying storage is either row-major or column-major, while allowing the caller to choose the dimension for reduction. Depending on the dimension chosen, the memory accesses may be coalesced or strided.

Template Parameters:

InElementType – the input data-type of underlying raft::matrix_view
LayoutPolicy – The layout of Input/Output (row or col major)
OutElementType – the output data-type of underlying raft::matrix_view and reduction
IdxType – Integer type used for addressing
MainLambda – Unary lambda applied during accumulation (e.g. L1 or L2 norm). It must be a callable.
ReduceLambda – Binary lambda applied for reduction (e.g. addition (+) for L2 norm). It must be a callable.
FinalLambda – The final lambda applied before the result is stored (STG) (e.g. sqrt for L2 norm). It must be a callable.

Parameters:

handle – [in] raft::resources
data – [in] Input of type raft::device_matrix_view
dots – [out] Output of type raft::device_vector_view
init – [in] initial value to use for the reduction
apply – [in] whether to reduce along rows or along columns (using raft::linalg::Apply)
main_op – [in] fused elementwise operation to apply before reduction
reduce_op – [in] fused binary reduction operation
final_op – [in] fused elementwise operation to apply before storing results
inplace – [in] whether the reduction result is added in place to the existing output values (true) or overwrites them (false)

Reduce Cols By Key

#include <raft/linalg/reduce_cols_by_key.cuh>

namespace raft::linalg

template<typename ElementType, typename KeyType = ElementType, typename IndexType = std::uint32_t> void reduce_cols_by_key(raft::resources const &handle, raft::device_matrix_view<const ElementType, IndexType, raft::row_major> data, raft::device_vector_view<const KeyType, IndexType> keys, raft::device_matrix_view<ElementType, IndexType, raft::row_major> out, IndexType nkeys = 0, bool reset_sums = true)

Computes the sum-reduction of matrix columns for each given key. TODO: Support generic reduction lambdas (rapidsai/raft#860).

Template Parameters:

ElementType – the input data type (as well as the output reduced matrix)
KeyType – data type of the keys
IndexType – indexing arithmetic type

Parameters:

handle – [in] raft::resources
data – [in] the input data (dim = nrows x ncols). This is assumed to be in row-major layout of type raft::device_matrix_view
keys – [in] keys raft::device_vector_view (len = ncols). Each key in this array is assumed to lie in [0, nkeys). If that is not the case, the caller is expected to call the make_monotonic primitive first to prepare such a contiguous, monotonically increasing keys array.
out – [out] the output reduced raft::device_matrix_view along columns (dim = nrows x nkeys). This will be assumed to be in row-major layout
nkeys – [in] Number of unique keys in the keys array. By default, inferred from the number of columns of out
reset_sums – [in] Whether to reset the output sums to zero before reducing
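
The shapes involved (nrows x ncols in, nrows x nkeys out) are easiest to see in a sequential sketch. The version below is a host-side illustration with flat row-major std::vector storage; reduce_cols_by_key_ref is a hypothetical name, and the number of keys is always passed explicitly instead of being inferred.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration: out[r][keys[c]] += data[r][c] for every element.
// Hypothetical helper, not RAFT API.
template <typename T, typename Key>
void reduce_cols_by_key_ref(const std::vector<T>& data,
                            std::size_t n_rows,
                            std::size_t n_cols,
                            const std::vector<Key>& keys,
                            std::vector<T>& out,
                            std::size_t n_keys,
                            bool reset_sums = true)
{
  assert(data.size() == n_rows * n_cols);
  assert(keys.size() == n_cols);
  assert(out.size() == n_rows * n_keys);
  if (reset_sums) { std::fill(out.begin(), out.end(), T(0)); }
  for (std::size_t r = 0; r < n_rows; ++r) {
    for (std::size_t c = 0; c < n_cols; ++c) {
      const auto k = static_cast<std::size_t>(keys[c]);
      assert(k < n_keys);  // keys must lie in [0, n_keys)
      out[r * n_keys + k] += data[r * n_cols + c];
    }
  }
}
```

Calling it a second time with reset_sums set to false accumulates on top of the previous sums.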

Reduce Rows By Key

#include <raft/linalg/reduce_rows_by_key.cuh>

namespace raft::linalg

template<typename ElementType, typename KeyType, typename WeightType, typename IndexType> void reduce_rows_by_key(raft::resources const &handle, raft::device_matrix_view<const ElementType, IndexType, raft::row_major> d_A, raft::device_vector_view<const KeyType, IndexType> d_keys, raft::device_matrix_view<ElementType, IndexType, raft::row_major> d_sums, IndexType n_unique_keys, raft::device_vector_view<char, IndexType> d_keys_char, std::optional<raft::device_vector_view<const WeightType, IndexType>> d_weights = std::nullopt, bool reset_sums = true)

Computes the weighted sum-reduction of matrix rows for each given key. TODO: Support generic reduction lambdas (rapidsai/raft#860).

Template Parameters:

ElementType – data-type of input and output
KeyType – data-type of keys
WeightType – data-type of weights
IndexType – index type

Parameters:

handle – [in] raft::resources
d_A – [in] Input raft::device_matrix_view (ncols * nrows)
d_keys – [in] Keys for each row raft::device_vector_view (1 x nrows)
d_sums – [out] Row sums by key raft::device_matrix_view (ncols x d_keys)
n_unique_keys – [in] Number of unique keys in d_keys
d_keys_char – [out] Scratch memory for conversion of keys to char, raft::device_vector_view
d_weights – [in] Weights for each observation in d_A raft::device_vector_view optional (1 x nrows)
reset_sums – [in] Whether to reset the output sums to zero before reducing
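
A sequential sketch of the weighted row reduction may help when reading the parameter list. Several simplifications are assumptions of this example and not statements about RAFT: the input is treated as nrows x ncols row-major, the sums as n_unique_keys x ncols row-major, the weights share the element type, an empty weight vector means "unweighted", and the d_keys_char scratch buffer is omitted. reduce_rows_by_key_ref is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration: sums[keys[r]][c] += w[r] * a[r][c].
// Hypothetical helper, not RAFT API.
template <typename T, typename Key>
void reduce_rows_by_key_ref(const std::vector<T>& a,
                            std::size_t n_rows,
                            std::size_t n_cols,
                            const std::vector<Key>& keys,
                            std::vector<T>& sums,
                            std::size_t n_unique_keys,
                            const std::vector<T>& weights = {},
                            bool reset_sums = true)
{
  assert(a.size() == n_rows * n_cols);
  assert(keys.size() == n_rows);
  assert(sums.size() == n_unique_keys * n_cols);
  assert(weights.empty() || weights.size() == n_rows);
  if (reset_sums) { std::fill(sums.begin(), sums.end(), T(0)); }
  for (std::size_t r = 0; r < n_rows; ++r) {
    const auto k = static_cast<std::size_t>(keys[r]);
    assert(k < n_unique_keys);
    // Assumption: an empty weight vector stands for a weight of 1 per row.
    const T w = weights.empty() ? T(1) : weights[r];
    for (std::size_t c = 0; c < n_cols; ++c) {
      sums[k * n_cols + c] += w * a[r * n_cols + c];
    }
  }
}
```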

Strided Reduction

#include <raft/linalg/strided_reduction.cuh>

namespace raft::linalg

template<typename InValueType, typename LayoutPolicy, typename OutValueType, typename IndexType, typename MainLambda = raft::identity_op, typename ReduceLambda = raft::add_op, typename FinalLambda = raft::identity_op> void strided_reduction(raft::resources const &handle, raft::device_matrix_view<const InValueType, IndexType, LayoutPolicy> data, raft::device_vector_view<OutValueType, IndexType> dots, OutValueType init, bool inplace = false, MainLambda main_op = raft::identity_op(), ReduceLambda reduce_op = raft::add_op(), FinalLambda final_op = raft::identity_op())

Compute the reduction of the input matrix along the strided dimension. Use this API when the desired reduction is NOT along the dimension of the memory layout. For example, a row-major matrix is reduced along the rows, whereas a column-major matrix is reduced along the columns.

Template Parameters:

InValueType – the input data-type of underlying raft::matrix_view
LayoutPolicy – The layout of Input/Output (row or col major)
OutValueType – the output data-type of underlying raft::matrix_view and reduction
IndexType – Integer type used for addressing
MainLambda – Unary lambda applied during accumulation (e.g. L1 or L2 norm). It must be a callable.
ReduceLambda – Binary lambda applied for reduction (e.g. addition (+) for L2 norm). It must be a callable.
FinalLambda – The final lambda applied before the result is stored (STG) (e.g. sqrt for L2 norm). It must be a callable.

Parameters:

handle – [in] raft::resources
data – [in] Input of type raft::device_matrix_view
dots – [out] Output of type raft::device_vector_view
init – [in] initial value to use for the reduction
main_op – [in] fused elementwise operation to apply before reduction
reduce_op – [in] fused binary reduction operation
final_op – [in] fused elementwise operation to apply before storing results
inplace – [in] whether the reduction result is added in place to the existing output values (true) or overwrites them (false)
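
For a row-major matrix, the strided case produces one output per column, and consecutive elements of a column sit a full row length apart in memory. The host-side sketch below makes that access pattern explicit. It is an illustration only: strided_reduction_ref is a hypothetical name, main_op is taken as a plain unary callable, and the handling of inplace is an assumption based on the parameter description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side illustration for a row-major matrix in a flat vector:
// dots[c] = final_op(reduce over r of main_op(data[r][c])).
// Hypothetical helper, not RAFT API.
template <typename In, typename Out, typename MainOp, typename ReduceOp, typename FinalOp>
void strided_reduction_ref(const std::vector<In>& data,
                           std::size_t n_rows,
                           std::size_t n_cols,
                           std::vector<Out>& dots,
                           Out init,
                           bool inplace,
                           MainOp main_op,
                           ReduceOp reduce_op,
                           FinalOp final_op)
{
  assert(data.size() == n_rows * n_cols);
  assert(dots.size() == n_cols);
  for (std::size_t c = 0; c < n_cols; ++c) {
    Out acc = init;
    // Consecutive elements of one column are n_cols apart in memory.
    for (std::size_t r = 0; r < n_rows; ++r) {
      acc = reduce_op(acc, static_cast<Out>(main_op(data[r * n_cols + c])));
    }
    // Assumption: inplace == true folds the previous output into the result.
    if (inplace) { acc = reduce_op(dots[c], acc); }
    dots[c] = final_op(acc);
  }
}
```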