libkvikio
23.12.00
KvikIO is a Python and C++ library for high-performance file IO. It provides C++ and Python bindings to cuFile, which enables GPUDirect Storage (GDS). KvikIO also works efficiently when GDS isn't available and can read/write both host and device data seamlessly.
KvikIO C++ is a header-only library that is part of the RAPIDS suite of open-source software libraries for GPU-accelerated data science.
Note that this is the documentation for the C++ library. For the Python documentation, see kvikio.
KvikIO is a header-only library and as such doesn't need installation. However, for convenience we release Conda packages that make it easy to include KvikIO in your CMake projects.
We strongly recommend using mamba in place of conda, which we will do throughout the documentation.
Install the stable release from the rapidsai channel with the following:
Install the nightly release from the rapidsai-nightly channel with the following:
Note that if the nightly install doesn't work, set channel_priority: flexible in your .condarc.
An example of how to include KvikIO in an existing CMake project can be found here: https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/downstream/.
To build the C++ example run:
Then run the example:
When KvikIO is running in compatibility mode, it doesn't load libcufile.so. Instead, reads and writes are done using POSIX. Note that this is not the same as the compatibility mode in cuFile. That is, cuFile can run in its own compatibility mode even when KvikIO is not in compatibility mode.
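To make "reads and writes are done using POSIX" concrete, here is a minimal sketch of offset-based host IO built on pread(2) and pwrite(2). The helper names (posix_read_all, posix_write_all, roundtrip_demo) are hypothetical and not part of the KvikIO API; the sketch only illustrates the kind of loop a POSIX backend needs, not how KvikIO implements it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>        // std::tmpfile, fileno
#include <string>
#include <system_error>
#include <unistd.h>      // pread, pwrite

// Hypothetical helper (not a KvikIO function): write `size` bytes from `buf`
// to `fd` at `file_offset`, retrying on partial writes and on EINTR.
inline std::size_t posix_write_all(int fd, const void* buf, std::size_t size, off_t file_offset)
{
  assert(buf != nullptr || size == 0);
  const char* src  = static_cast<const char*>(buf);
  std::size_t done = 0;
  while (done < size) {
    ssize_t n = ::pwrite(fd, src + done, size - done, file_offset + static_cast<off_t>(done));
    if (n < 0) {
      if (errno == EINTR) { continue; }  // interrupted by a signal, retry
      throw std::system_error(errno, std::generic_category(), "pwrite failed");
    }
    if (n == 0) { break; }  // no progress, give up and report what was written
    done += static_cast<std::size_t>(n);
  }
  return done;
}

// Hypothetical helper (not a KvikIO function): read up to `size` bytes from
// `fd` at `file_offset` into `buf`. Returns fewer bytes only at end of file.
inline std::size_t posix_read_all(int fd, void* buf, std::size_t size, off_t file_offset)
{
  assert(buf != nullptr || size == 0);
  char* dst        = static_cast<char*>(buf);
  std::size_t done = 0;
  while (done < size) {
    ssize_t n = ::pread(fd, dst + done, size - done, file_offset + static_cast<off_t>(done));
    if (n < 0) {
      if (errno == EINTR) { continue; }  // interrupted by a signal, retry
      throw std::system_error(errno, std::generic_category(), "pread failed");
    }
    if (n == 0) { break; }  // end of file
    done += static_cast<std::size_t>(n);
  }
  return done;
}

// Usage: write a payload to a temporary file, then read 5 bytes at offset 6.
inline std::string roundtrip_demo()
{
  std::FILE* tmp = std::tmpfile();
  if (tmp == nullptr) {
    throw std::system_error(errno, std::generic_category(), "tmpfile failed");
  }
  const int fd = fileno(tmp);
  const std::string payload = "hello world";
  posix_write_all(fd, payload.data(), payload.size(), 0);
  std::string out(5, '\0');
  const std::size_t n = posix_read_all(fd, &out[0], out.size(), 6);
  out.resize(n);
  std::fclose(tmp);
  return out;
}
```

Because pread and pwrite take an explicit offset, independent calls on the same file descriptor do not share a file position, which is what makes this style suitable for issuing several IO requests in parallel.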
Set the environment variable KVIKIO_COMPAT_MODE to enable/disable compatibility mode. By default, compatibility mode is enabled when libcufile.so cannot be found, or when /run/udev isn't readable, which typically happens when running inside a docker image not launched with --volume /run/udev:/run/udev:ro.
KvikIO can use multiple threads for IO automatically. Set the environment variable KVIKIO_NTHREADS to the number of threads in the thread pool. If not set, the default value is 1.
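The two settings above follow a common pattern: read an environment variable, and fall back to a default (or to auto-detection) when it is unset. The sketch below shows that pattern with the standard library only. The function names are hypothetical, and the accepted spellings for the on/off flag (on, true, yes, 1 and off, false, no, 0) are an assumption made for illustration; this page does not list the values KvikIO accepts.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a positive integer setting such as KVIKIO_NTHREADS,
// returning `default_value` when the variable is unset or empty.
inline unsigned long env_or_default(const char* name, unsigned long default_value)
{
  assert(name != nullptr);
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') { return default_value; }
  const std::string text(raw);
  const bool all_digits = std::all_of(
    text.begin(), text.end(), [](unsigned char c) { return std::isdigit(c) != 0; });
  if (!all_digits) {
    throw std::invalid_argument(std::string(name) + " must be a positive integer");
  }
  const unsigned long value = std::stoul(text);
  if (value == 0) {
    throw std::invalid_argument(std::string(name) + " must be a positive integer");
  }
  return value;
}

// Auto means "not set": fall back to the default detection described above.
enum class CompatMode { Auto, On, Off };

// Hypothetical helper: interpret an on/off flag such as KVIKIO_COMPAT_MODE.
// The accepted spellings are an assumption, not taken from the KvikIO docs.
inline CompatMode compat_mode_from_env(const char* name = "KVIKIO_COMPAT_MODE")
{
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') { return CompatMode::Auto; }
  std::string text(raw);
  std::transform(text.begin(), text.end(), text.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  if (text == "on" || text == "true" || text == "yes" || text == "1") { return CompatMode::On; }
  if (text == "off" || text == "false" || text == "no" || text == "0") { return CompatMode::Off; }
  throw std::invalid_argument(std::string(name) + " has an unrecognized value: " + raw);
}
```

For example, with KVIKIO_NTHREADS unset, env_or_default("KVIKIO_NTHREADS", 1) returns the documented default of 1. Rejecting malformed values loudly, rather than silently falling back, is a design choice of this sketch.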
KvikIO splits parallel IO operations into multiple tasks. Set the environment variable KVIKIO_TASK_SIZE to the maximum task size (in bytes). If not set, the default value is 4194304 (4 MiB).
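As an illustration of what splitting by a maximum task size means, the following sketch divides one request into consecutive chunks. The Task struct and split_into_tasks function are hypothetical names used only for this example; how KvikIO schedules its tasks internally is not described on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One chunk of a larger IO request (hypothetical type, for illustration only).
struct Task {
  std::size_t offset;  // byte offset of the chunk within the request
  std::size_t size;    // chunk length in bytes, never larger than task_size
};

// Split `total_size` bytes into consecutive chunks of at most `task_size`
// bytes. The default mirrors the documented KVIKIO_TASK_SIZE default (4 MiB).
inline std::vector<Task> split_into_tasks(std::size_t total_size,
                                          std::size_t task_size = 4194304)
{
  assert(task_size > 0);
  std::vector<Task> tasks;
  tasks.reserve(total_size / task_size + (total_size % task_size != 0 ? 1 : 0));
  for (std::size_t offset = 0; offset < total_size; offset += task_size) {
    // The last chunk may be shorter than task_size.
    tasks.push_back(Task{offset, std::min(task_size, total_size - offset)});
  }
  return tasks;
}
```

With the default task size, a 10 MiB request yields three tasks of 4 MiB, 4 MiB, and 2 MiB. Each task could then be handed to a separate thread of the pool.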
To improve the performance of small IO, .pread() and .pwrite() implement a shortcut that circumvents the thread pool and uses the POSIX backend directly. Set the environment variable KVIKIO_GDS_THRESHOLD to the minimum size (in bytes) to use GDS. If not set, the default value is 1048576 (1 MiB).
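The threshold rule can be sketched as a simple size comparison. The IoPath enum and choose_io_path function below are hypothetical and exist only to show the decision; treating a request of exactly the threshold size as large enough for GDS is this sketch's reading of "minimum size".

```cpp
#include <cassert>
#include <cstddef>

// Which route a request takes (hypothetical names, for illustration only).
enum class IoPath {
  PosixDirect,   // small IO: skip the thread pool, use the POSIX backend
  GdsThreadPool  // large IO: go through the thread pool and GDS
};

// Pick a route from the request size. The default mirrors the documented
// KVIKIO_GDS_THRESHOLD default of 1048576 bytes (1 MiB).
inline IoPath choose_io_path(std::size_t size, std::size_t gds_threshold = 1048576)
{
  return size < gds_threshold ? IoPath::PosixDirect : IoPath::GdsThreadPool;
}

// Usage: a 4 KiB request takes the shortcut, a 1 MiB request does not.
inline void check_default_threshold()
{
  assert(choose_io_path(4096) == IoPath::PosixDirect);
  assert(choose_io_path(1048576) == IoPath::GdsThreadPool);
}
```

The shortcut exists because, for small requests, the fixed cost of dispatching work to other threads can outweigh the transfer itself.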
For a full runnable example see https://github.com/rapidsai/kvikio/blob/HEAD/cpp/examples/basic_io.cpp.