CAGRA#

CAGRA is a graph-based nearest neighbors algorithm that was built from the ground up for GPU acceleration. CAGRA demonstrates state-of-the art index build and query performance for both small- and large-batch sized search.

#include <cuvs/neighbors/cagra.hpp>

namespace cuvs::neighbors::cagra

Index build parameters#

enum class graph_build_algo#

ANN algorithm used by CAGRA to build knn graph.

Values:

enumerator IVF_PQ#
enumerator NN_DESCENT#
struct vpq_params#
#include <cagra.hpp>

Parameters for VPQ compression.

Public Functions

inline operator raft::neighbors::vpq_params() const#

Build a raft CAGRA index params from an existing cuvs CAGRA index params.

Public Members

uint32_t pq_bits = 8#

The bit length of the vector element after compression by PQ.

Possible values: [4, 5, 6, 7, 8].

Hint: the smaller the ‘pq_bits’, the smaller the index size and the better the search performance, but the lower the recall.

uint32_t pq_dim = 0#

The dimensionality of the vector after compression by PQ. When zero, an optimal value is selected using a heuristic.

TODO: at the moment dim must be a multiple pq_dim.

uint32_t vq_n_centers = 0#

Vector Quantization (VQ) codebook size - number of “coarse cluster centers”. When zero, an optimal value is selected using a heuristic.

uint32_t kmeans_n_iters = 25#

The number of iterations searching for kmeans centers (both VQ & PQ phases).

double vq_kmeans_trainset_fraction = 0#

The fraction of data to use during iterative kmeans building (VQ phase). When zero, an optimal value is selected using a heuristic.

double pq_kmeans_trainset_fraction = 0#

The fraction of data to use during iterative kmeans building (PQ phase). When zero, an optimal value is selected using a heuristic.

struct index_params : public cuvs::neighbors::ann::index_params#
#include <cagra.hpp>

Public Functions

inline operator raft::neighbors::cagra::index_params() const#

Build a raft CAGRA index params from an existing cuvs CAGRA index params.

Public Members

size_t intermediate_graph_degree = 128#

Degree of input graph for pruning.

size_t graph_degree = 64#

Degree of output graph.

graph_build_algo build_algo = graph_build_algo::IVF_PQ#

ANN algorithm to build knn graph.

size_t nn_descent_niter = 20#

Number of Iterations to run if building with NN_DESCENT

std::optional<vpq_params> compression = std::nullopt#

Specify compression parameters if compression is desired.

NOTE: this is experimental new API, consider it unsafe.

Index search parameters#

enum class search_algo#

Values:

enumerator SINGLE_CTA#

For large batch sizes.

enumerator MULTI_CTA#

For small batch sizes.

enumerator MULTI_KERNEL#
enumerator AUTO#
enum class hash_mode#

Values:

enumerator HASH#
enumerator SMALL#
enumerator AUTO#
struct search_params : public cuvs::neighbors::ann::search_params#
#include <cagra.hpp>

Public Functions

inline operator raft::neighbors::cagra::search_params() const#

Build a raft CAGRA search params from an existing cuvs CAGRA search params.

Public Members

size_t max_queries = 0#

Maximum number of queries to search at the same time (batch size). Auto select when 0.

size_t itopk_size = 64#

Number of intermediate search results retained during the search.

This is the main knob to adjust trade off between accuracy and search speed. Higher values improve the search accuracy.

size_t max_iterations = 0#

Upper limit of search iterations. Auto select when 0.

search_algo algo = search_algo::AUTO#

Which search implementation to use.

size_t team_size = 0#

Number of threads used to calculate a single distance. 4, 8, 16, or 32.

size_t search_width = 1#

Number of graph nodes to select as the starting point for the search in each iteration. aka search width?

size_t min_iterations = 0#

Lower limit of search iterations.

size_t thread_block_size = 0#

Thread block size. 0, 64, 128, 256, 512, 1024. Auto selection when 0.

hash_mode hashmap_mode = hash_mode::AUTO#

Hashmap type. Auto selection when AUTO.

size_t hashmap_min_bitlen = 0#

Lower limit of hashmap bit length. More than 8.

float hashmap_max_fill_rate = 0.5#

Upper limit of hashmap fill rate. More than 0.1, less than 0.9.

uint32_t num_random_samplings = 1#

Number of iterations of initial random seed node selection. 1 or more.

uint64_t rand_xor_mask = 0x128394#

Bit mask used for initial random seed node selection.

Index#

template<typename T, typename IdxT>
struct index : public cuvs::neighbors::ann::index#
#include <cagra.hpp>

CAGRA index.

The index stores the dataset and a kNN graph in device memory.

Template Parameters:
  • T – data element type

  • IdxT – type of the vector indices (represent dataset.extent(0))

Public Functions

inline index(raft::neighbors::cagra::index<T, IdxT> &&raft_idx)#

Build a cuvs CAGRA index from an existing RAFT CAGRA index.

inline constexpr auto metric() const noexcept -> cuvs::distance::DistanceType#

Distance metric used for clustering.

inline constexpr auto size() const noexcept -> IdxT#

Total length of the index (number of vectors).

inline constexpr auto dim() const noexcept -> uint32_t#

Dimensionality of the data.

inline constexpr auto graph_degree() const noexcept -> uint32_t#

Graph degree

inline auto dataset() const noexcept -> raft::device_matrix_view<const T, int64_t, raft::layout_stride>#

Dataset [size, dim]

inline auto graph() const noexcept -> raft::device_matrix_view<const IdxT, int64_t, raft::row_major>#

neighborhood graph [size, graph-degree]

inline index(raft::resources const &res, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded)#

Construct an empty index.

template<typename data_accessor, typename graph_accessor>
inline index(raft::resources const &res, cuvs::distance::DistanceType metric, raft::mdspan<const T, raft::matrix_extent<int64_t>, raft::row_major, data_accessor> dataset, raft::mdspan<const IdxT, raft::matrix_extent<int64_t>, raft::row_major, graph_accessor> knn_graph)#

Construct an index from dataset and knn_graph arrays

If the dataset and graph is already in GPU memory, then the index is just a thin wrapper around these that stores a non-owning a reference to the arrays.

The constructor also accepts host arrays. In that case they are copied to the device, and the device arrays will be owned by the index.

In case the dasates rows are not 16 bytes aligned, then we create a padded copy in device memory to ensure alignment for vectorized load.

Usage examples:

  • Cagra index is normally created by the cagra::build

    using namespace cuvs::neighbors::experimental;
    auto dataset = raft::make_host_matrix<float, int64_t>(n_rows, n_cols);
    load_dataset(dataset.view());
    // use default index parameters
    cagra::index_params index_params;
    // create and fill the index from a [N, D] dataset
    auto index = cagra::build(res, index_params, dataset);
    // use default search parameters
    cagra::search_params search_params;
    // search K nearest neighbours
    auto neighbors = raft::make_device_matrix<uint32_t, int64_t>(res, n_queries, k);
    auto distances = raft::make_device_matrix<float, int64_t>(res, n_queries, k);
    cagra::search(res, search_params, index, queries, neighbors, distances);
    
    In the above example, we have passed a host dataset to build. The returned index will own a device copy of the dataset and the knn_graph. In contrast, if we pass the dataset as a raft::device_mdspan to build, then it will only store a reference to it.

  • Constructing index using existing knn-graph

    using namespace cuvs::neighbors::experimental;
    
    auto dataset = raft::make_device_matrix<float, int64_t>(res, n_rows, n_cols);
    auto knn_graph = raft::make_device_matrix<uint32_n, int64_t>(res, n_rows, graph_degree);
    
    // custom loading and graph creation
    // load_dataset(dataset.view());
    // create_knn_graph(knn_graph.view());
    
    // Wrap the existing device arrays into an index structure
    cagra::index<T, IdxT> index(res, metric, raft::make_const_mdspan(dataset.view()),
                                raft::make_const_mdspan(knn_graph.view()));
    
    // Both knn_graph and dataset objects have to be in scope while the index is used because
    // the index only stores a reference to these.
    cagra::search(res, search_params, index, queries, neighbors, distances);
    

inline void update_dataset(raft::resources const &res, raft::device_matrix_view<const T, int64_t, raft::row_major> dataset)#

Replace the dataset with a new dataset.

If the new dataset rows are aligned on 16 bytes, then only a reference is stored to the dataset. It is the caller’s responsibility to ensure that dataset stays alive as long as the index.

inline void update_dataset(raft::resources const &res, raft::host_matrix_view<const T, int64_t, raft::row_major> dataset)#

Replace the dataset with a new dataset.

We create a copy of the dataset on the device. The index manages the lifetime of this copy.

inline void update_graph(raft::resources const &res, raft::device_matrix_view<const IdxT, int64_t, raft::row_major> knn_graph)#

Replace the graph with a new graph.

Since the new graph is a device array, we store a reference to that, and it is the caller’s responsibility to ensure that knn_graph stays alive as long as the index.

inline void update_graph(raft::resources const &res, raft::host_matrix_view<const IdxT, int64_t, raft::row_major> knn_graph)#

Replace the graph with a new graph.

We create a copy of the graph on the device. The index manages the lifetime of this copy.

Index build#

auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::device_matrix_view<const float, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<float, uint32_t>#
auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::host_matrix_view<const float, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<float, uint32_t>#
auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<int8_t, uint32_t>#
auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<int8_t, uint32_t>#
auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<uint8_t, uint32_t>#
auto build(raft::resources const &handle, const cuvs::neighbors::cagra::index_params &params, raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset) -> cuvs::neighbors::cagra::index<uint8_t, uint32_t>#