API Reference
- class infinistore.DisableTorchCaching
Context manager to disable PyTorch CUDA memory caching.
When this context manager is entered, it sets the environment variable “PYTORCH_NO_CUDA_MEMORY_CACHING” to “1”, which disables CUDA memory caching in PyTorch. When the context manager is exited, the environment variable is deleted, restoring the default behavior.
- Usage:
```python
with infinistore.DisableTorchCaching():
    # Your code here
    ...
```
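The documented behavior can be sketched as a minimal re-implementation (illustrative only; use `infinistore.DisableTorchCaching` in practice):

```python
import os

# Minimal re-sketch of the documented behavior, for illustration only;
# the real class is infinistore.DisableTorchCaching.
class DisableTorchCaching:
    def __enter__(self):
        # Entering sets the env var, disabling PyTorch's CUDA caching allocator.
        os.environ["PYTORCH_NO_CUDA_MEMORY_CACHING"] = "1"
        return self

    def __exit__(self, exc_type, exc, tb):
        # Exiting deletes the env var, restoring the default caching behavior.
        del os.environ["PYTORCH_NO_CUDA_MEMORY_CACHING"]
        return False

with DisableTorchCaching():
    inside = os.environ.get("PYTORCH_NO_CUDA_MEMORY_CACHING")
after = os.environ.get("PYTORCH_NO_CUDA_MEMORY_CACHING")
```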
- exception infinistore.InfiniStoreException
- exception infinistore.InfiniStoreKeyNotFound
- class infinistore.InfinityConnection
A class to manage connections and data transfers with an Infinistore instance using either local or RDMA connections.
- conn
The connection object to the Infinistore instance.
- Type:
_infinistore.Connection
- local_connected
Indicates if connected to a local instance.
- Type:
bool
- rdma_connected
Indicates if connected to a remote instance via RDMA.
- Type:
bool
- config
Configuration object for the connection.
- Type:
ClientConfig
- allocate_rdma(keys: List[str], page_size_in_bytes: int) List[Tuple]
Allocates RDMA memory for the given keys. For RDMA writes, the user must first allocate RDMA memory, and then use the allocated RDMA memory addresses to write data to the remote memory.
- Parameters:
keys (List[str]) – A list of keys for which RDMA memory is to be allocated.
page_size_in_bytes (int) – The size of each page in bytes.
- Returns:
A list of allocated RDMA memory addresses.
- Return type:
List
- Raises:
Exception – If RDMA is not connected.
Exception – If memory allocation fails.
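The allocate-then-write flow might look like the following sketch. The `_FakeConn` stand-in only records calls so the example runs without a live server; with a real `InfinityConnection` the same call sequence applies, and the shape of the returned block descriptors is an assumption here:

```python
# Sketch of the allocate-then-write pattern described above. _FakeConn is a
# stand-in that records calls; the returned ("remote_addr", i) descriptors are
# placeholders, not the real descriptor format.
class _FakeConn:
    def __init__(self):
        self.calls = []

    def allocate_rdma(self, keys, page_size_in_bytes):
        self.calls.append(("allocate_rdma", keys, page_size_in_bytes))
        # Pretend each key received one remote block descriptor.
        return [("remote_addr", i) for i, _ in enumerate(keys)]

def reserve_pages(conn, keys, page_size_in_bytes=4096):
    # Step 1 of an RDMA write: reserve remote memory for every key.
    remote_blocks = conn.allocate_rdma(keys, page_size_in_bytes)
    # Step 2 (not shown): pass remote_blocks to rdma_write_cache(...).
    return remote_blocks

conn = _FakeConn()
blocks = reserve_pages(conn, ["layer0.k", "layer0.v"])
```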
- async allocate_rdma_async(keys: List[str], page_size_in_bytes: int)
Asynchronously allocate RDMA (Remote Direct Memory Access) resources for the given keys.
This function initiates an asynchronous RDMA allocation request and returns a future that will be completed when the allocation is done. The allocation is performed by invoking a callback function from the C++ code.
- Parameters:
keys (List[str]) – A list of keys for which RDMA resources are to be allocated.
page_size_in_bytes (int) – The size of each page in bytes.
- Raises:
Exception – If the RDMA connection is not established.
- Returns:
A future that will be set with the remote addresses once the allocation is complete.
- Return type:
Awaitable
- check_exist(key: str)
Check if a given key exists in the store.
- Parameters:
key (str) – The key to check for existence.
- Returns:
True if the key exists, False otherwise.
- Return type:
bool
- Raises:
Exception – If there is an error checking the key’s existence.
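A common pattern is to probe a key with `check_exist` before issuing a read. The sketch below uses a stand-in connection backed by a dict so it runs offline:

```python
# Hedged sketch: guard a read with check_exist. The dict-backed _FakeConn is a
# stand-in for a connected infinistore.InfinityConnection.
class _FakeConn:
    def __init__(self, store):
        self.store = store

    def check_exist(self, key):
        # Mirrors the documented contract: True if the key exists.
        return key in self.store

def would_hit(conn, key):
    return conn.check_exist(key)

conn = _FakeConn({"prompt:0": b"..."})
hit = would_hit(conn, "prompt:0")
miss = would_hit(conn, "prompt:1")
```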
- close()
Closes the connection to the Infinistore instance.
- connect()
Establishes a connection to the Infinistore instance based on the configuration.
- Raises:
Exception – If already connected to a local instance.
Exception – If already connected to a remote instance.
Exception – If failed to initialize remote connection.
Exception – If local GPU connection is not to localhost.
Exception – If failed to setup RDMA connection.
- async connect_async()
Asynchronously establishes a connection based on the configuration.
- Raises:
Exception – If the connection type is local GPU, as it is not supported in async mode.
Exception – If the initialization of the remote connection fails.
Exception – If the setup of the RDMA connection fails.
- Logs:
A warning indicating that the async connect may have bugs.
This method runs the blocking connection setup in an executor to avoid blocking the event loop.
- delete_keys(keys: List[str])
Delete a list of keys
- Parameters:
keys (List[str]) – The list of string keys to delete
- Returns:
The count of the deleted keys
- Return type:
int
- Raises:
Exception – If the operation fails (return value is -1).
- get_match_last_index(keys: List[str])
Retrieve the last index of a match for the given keys.
- Parameters:
keys (List[str]) – A list of string keys to search for matches.
- Returns:
The last index of a match.
- Return type:
int
- Raises:
Exception – If no match is found (i.e., if the return value is negative).
- local_gpu_write_cache(cache: Tensor, blocks: List[Tuple[str, int]], page_size: int)
Writes a tensor to the local GPU cache.
- Parameters:
cache (torch.Tensor) – The tensor to be written to the cache.
blocks (List[Tuple[str, int]]) – A list of tuples where each tuple contains a key and an offset.
page_size (int) – The size of each page in the cache.
- Raises:
Exception – If writing to infinistore fails.
- Returns:
Returns 0 on success.
- Return type:
int
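The blocks argument is a list of (key, offset) pairs, one per fixed-size page. A sketch of assembling it for a tensor split into consecutive pages (the key naming scheme here is an arbitrary choice for illustration):

```python
# Building the blocks argument for local_gpu_write_cache / read_cache: one
# (key, offset) pair per page. The "kvcache:{i}" naming is illustrative only.
def make_blocks(prefix, num_pages, page_size):
    # Offsets advance one page at a time.
    return [(f"{prefix}:{i}", i * page_size) for i in range(num_pages)]

blocks = make_blocks("kvcache", num_pages=3, page_size=256)
```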
- local_gpu_write_cache_single(key: str, ptr: int, size: int, **kwargs)
Writes data to the local GPU cache.
This function writes data to the local GPU cache using the provided key, pointer, and size. It requires a connected local GPU and a valid device ID.
- Parameters:
key (str) – The key associated with the data to be written.
ptr (int) – The pointer to the data in memory.
size (int) – The size of the data to be written.
**kwargs – Additional keyword arguments. device_id (int): The ID of the GPU device to use.
- Raises:
Exception – If the local GPU is not connected.
Exception – If the key is empty.
Exception – If the size is 0.
Exception – If the pointer is 0.
Exception – If the device_id is not provided in kwargs.
Exception – If writing to infinistore fails.
- rdma_write_cache(cache: Tensor, offsets: List[int], page_size, remote_blocks: List)
Writes the given cache tensor to remote memory using RDMA (Remote Direct Memory Access).
- Parameters:
cache (torch.Tensor) – The tensor containing the data to be written to remote memory.
offsets (List[int]) – A list of offsets (in elements) where the data should be written.
page_size (int) – The size of each page to be written, in elements.
remote_blocks (List) – A list of remote memory blocks where the data should be written.
- Raises:
AssertionError – If RDMA is not connected.
Exception – If the RDMA write operation fails.
- Returns:
Returns 0 on success.
- Return type:
int
- async rdma_write_cache_async(cache: Tensor, offsets: List[int], page_size, remote_blocks: List)
Asynchronously writes a cache tensor to remote memory using RDMA.
- Parameters:
cache (torch.Tensor) – The tensor to be written to remote memory.
offsets (List[int]) – List of offsets where the tensor data should be written.
page_size (int) – The size of each page in the remote memory.
remote_blocks (List) – List of remote memory blocks where the data will be written.
- Raises:
Exception – If RDMA is not connected.
- Returns:
A future that will be set to 0 when the write operation is complete.
- Return type:
asyncio.Future
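The awaiting pattern might look like the following sketch. The stand-in async connection lets the example run offline; per the documentation above, a real `rdma_write_cache_async` resolves to 0 on success:

```python
import asyncio

# Stand-in async connection so the await pattern can run without a server;
# with a real InfinityConnection the awaited call resolves to 0 on success.
class _FakeAsyncConn:
    async def rdma_write_cache_async(self, cache, offsets, page_size, remote_blocks):
        await asyncio.sleep(0)  # yield to the event loop, like a real I/O op
        return 0

async def write_pages(conn, cache, offsets, page_size, remote_blocks):
    rc = await conn.rdma_write_cache_async(cache, offsets, page_size, remote_blocks)
    return rc

rc = asyncio.run(write_pages(_FakeAsyncConn(), cache=None,
                             offsets=[0, 256], page_size=256, remote_blocks=[]))
```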
- rdma_write_cache_single(key: str, ptr: int, size: int, **kwargs)
Perform an RDMA write operation to cache a single item in the remote memory.
- Parameters:
key (str) – The key associated with the data to be written.
ptr (int) – The local memory pointer to the data to be written.
size (int) – The size of the data to be written.
**kwargs – Additional keyword arguments.
- Raises:
Exception – If the key is empty.
Exception – If the size is 0.
Exception – If the ptr is 0.
Exception – If the RDMA write operation fails.
- Returns:
None
- async rdma_write_cache_single_async(key: str, ptr: int, size: int, **kwargs)
Asynchronously writes data to the RDMA cache.
This function writes data to the RDMA cache using the provided key, pointer, and size. It ensures that the RDMA connection is established and the input parameters are valid.
- Parameters:
key (str) – The key associated with the data to be written.
ptr (int) – The memory address of the data to be written.
size (int) – The size of the data to be written.
**kwargs – Additional keyword arguments.
- Raises:
Exception – If the RDMA connection is not established.
Exception – If the key is empty.
Exception – If the size is 0.
Exception – If the pointer is 0.
Exception – If writing to Infinistore fails.
- Returns:
A future that resolves to 0 upon successful completion of the write operation.
- Return type:
int
- read_cache(cache: Tensor, blocks: List[Tuple[str, int]], page_size: int)
Reads data from the cache using either local or RDMA connection.
- Parameters:
cache (torch.Tensor) – The tensor containing the cache data.
blocks (List[Tuple[str, int]]) – A list of tuples where each tuple contains a key and an offset; each pair identifies a fixed-size page to read, with the size given by the page_size parameter.
page_size (int) – The size of the page to read.
- Raises:
Exception – If the read operation fails or if not connected to any instance.
- async read_cache_async(cache: Tensor, blocks: List[Tuple[str, int]], page_size: int)
Asynchronously reads data from the RDMA cache into the provided tensor.
- Parameters:
cache (torch.Tensor) – The tensor to read data into.
blocks (List[Tuple[str, int]]) – A list of tuples where each tuple contains a key and an offset.
page_size (int) – The size of each page to read.
- Raises:
Exception – If RDMA is not connected or if reading from Infinistore fails.
Exception – If the tensor is not contiguous.
- Returns:
Returns None; the underlying future completes when the read operation finishes.
- Return type:
None
- read_cache_single(key: str, ptr: int, size: int, **kwargs)
Reads a single cache entry from the infinistore.
- Parameters:
key (str) – The key of the cache entry to read.
ptr (int) – The pointer to the memory location where the data should be read.
size (int) – The size of the data to read.
**kwargs – Additional keyword arguments.
- Keyword Arguments:
device_id (int) – The ID of the device to use for local GPU connection (required if local_connected is True).
- Raises:
Exception – If the key is empty.
Exception – If the size is 0.
Exception – If the ptr is 0.
Exception – If device_id is not provided when local_connected is True.
Exception – If not connected to any instance.
Exception – If the read operation fails.
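A sketch of the local-GPU single-read call, where device_id must be supplied via kwargs. The stand-in connection only validates arguments the way the documented errors describe, so the example runs offline:

```python
# Sketch: device_id is required via kwargs when local_connected is True. The
# stand-in connection mimics only the documented argument checks.
class _FakeConn:
    local_connected = True

    def read_cache_single(self, key, ptr, size, **kwargs):
        if not key:
            raise Exception("key is empty")
        if size == 0 or ptr == 0:
            raise Exception("invalid ptr or size")
        if self.local_connected and "device_id" not in kwargs:
            raise Exception("device_id required for local GPU reads")
        return 0  # success

conn = _FakeConn()
ok = conn.read_cache_single("prompt:0", ptr=0xDEAD, size=64, device_id=0)
try:
    conn.read_cache_single("prompt:0", ptr=0xDEAD, size=64)
    missing_raised = False
except Exception:
    missing_raised = True
```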
- async read_cache_single_async(key: str, ptr: int, size: int, **kwargs)
Asynchronously reads a single cache entry from the InfiniStore.
- Parameters:
key (str) – The key of the cache entry to read.
ptr (int) – The pointer to the memory location where the data should be read.
size (int) – The size of the data to read.
**kwargs – Additional keyword arguments.
- Raises:
Exception – If the key is empty.
Exception – If the size is 0.
Exception – If the ptr is 0.
Exception – If async read for local GPU is not supported.
InfiniStoreKeyNotFound – If the key is not found in the InfiniStore.
Exception – If there is a failure in reading from the InfiniStore.
- Returns:
The result code of the read operation.
- Return type:
int
- register_mr(arg: Tensor | int, size: int | None = None)
- register_mr(ptr: int, size)
- register_mr(cache: Tensor, size: int | None = None)
Registers a memory region (MR) for the given argument.
- Parameters:
arg (Union[torch.Tensor, int]) – The argument for which the memory region is to be registered. It can be either a torch.Tensor or an integer pointer.
size (Optional[int], optional) – The size of the memory region to be registered. Defaults to None.
- Raises:
NotImplementedError – If the type of the argument is not supported.
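The overloads above suggest a type-based dispatch: a tensor registers its storage, while a raw integer pointer needs an explicit size. A hedged sketch of that dispatch (the returned tuples are placeholders, not the real registration result):

```python
# Illustrative dispatch matching the register_mr overload signatures: a raw
# int pointer requires an explicit size; a tensor's size can be derived from
# its storage. Return values are placeholders, not the real API's.
def register_mr(arg, size=None):
    if isinstance(arg, int):
        if size is None:
            raise ValueError("size is required when registering a raw pointer")
        return ("ptr", arg, size)
    if hasattr(arg, "data_ptr"):  # duck-typed torch.Tensor
        nbytes = size if size is not None else arg.element_size() * arg.numel()
        return ("tensor", arg.data_ptr(), nbytes)
    raise NotImplementedError(f"unsupported type: {type(arg)!r}")

mr = register_mr(0x1000, size=4096)
```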
- sync()
Synchronizes the current instance with the connected Infinistore instance. This method attempts to synchronize using either a local connection or an RDMA connection. If neither connection is available, it raises an exception.
- Raises:
Exception – If not connected to any instance.
Exception – If synchronization fails with a negative return code.