The RAPIDS v24.10 release takes another step toward a seamless accelerated computing experience for data scientists. This post highlights the new features in the release:
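- Accelerated NetworkX with zero code changes: now generally available (GA)
- Polars GPU engine: now in open beta
- UMAP support for datasets larger than GPU memory
- Improved cuDF compatibility with NumPy and PyArrow
- A guide to using GPUs in GitHub-hosted CI runners
- RAPIDS-wide support for Python 3.12 and NumPy 2.x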
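Accelerated NetworkX with zero code changes
NetworkX accelerated by cuGraph is generally available (GA) as of the v24.10 release, together with NetworkX 3.4. The release also brings GPU-accelerated graph creation, a new user experience, and expanded documentation.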
??? ?? ???? NetworkX ????? ??? ??? ???, ?? CPU? GPU ?? ???? ??? ??? ? ?? ??? ??? ??????? ?????.
Set the NX_CUGRAPH_AUTOCONFIG environment variable to True to accelerate NetworkX workflows:
%env NX_CUGRAPH_AUTOCONFIG=True
import pandas as pd
import networkx as nx
url = "https://data.rapids.ai/cugraph/datasets/cit-Patents.csv"
df = pd.read_csv(url, sep=" ", names=["src", "dst"], dtype="int32")
G = nx.from_pandas_edgelist(df, source="src", target="dst")
%time result = nx.betweenness_centrality(G, k=10)
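The `%env` magic only works in Jupyter or IPython. In a plain Python script, one option is to set the variable through `os.environ` before NetworkX is imported, on the understanding that the setting is read when the backend is loaded at import time. The following is a minimal sketch under that assumption; the helper name `centrality_of_small_path` and the three-node path graph are illustrative only, and the script simply reports when NetworkX is not installed.

```python
import os

# Set the variable before importing NetworkX; %env is notebook-only.
os.environ["NX_CUGRAPH_AUTOCONFIG"] = "True"

try:
    import networkx as nx
except ImportError:  # NetworkX is optional for this sketch
    nx = None


def centrality_of_small_path():
    """Return betweenness centrality for a 3-node path, or None without NetworkX."""
    if nx is None:
        return None
    graph = nx.path_graph(3)  # illustrative graph: 0 - 1 - 2
    return nx.betweenness_centrality(graph)


result = centrality_of_small_path()
print(result if result is not None else "NetworkX is not installed")
```

The same effect can be had from a shell by exporting the variable before launching the script, which avoids touching the code at all.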
Depending on the algorithm and graph size, common algorithms such as betweenness centrality and PageRank can run 10x, 50x, or even 500x faster.


To learn more about NetworkX accelerated by cuGraph and how to get started, see the official documentation.
Accelerated Polars with zero code changes
The Polars GPU engine, built on cuDF, was released in open beta in September. It lets Polars users who work with hundreds of millions of rows run workflows up to 13x faster than on CPU.

PDS-H benchmark | Scale factor 80 | GPU: NVIDIA H100 | CPU: Intel Xeon W9-3495X (Sapphire Rapids) | Storage: local NVMe. Note: PDS-H is derived from TPC-H, but these results are not comparable to TPC-H results.
Keep using the Polars Lazy API as usual. When you are ready to execute a query, pass the `engine` keyword to `collect` to tell Polars to run it on the GPU:
import polars as pl
df = pl.LazyFrame({"a": [1.242, 1.535]})
q = df.select(pl.col("a").round(1))
result = q.collect(engine="gpu")
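On a machine where the GPU engine cannot be used (for example, because the GPU package is not installed), a small wrapper can retry on the default engine. This is a sketch rather than an official Polars pattern: `collect_prefer_gpu` is a hypothetical helper, and catching every exception is a deliberate simplification that a real project would likely narrow.

```python
try:
    import polars as pl
except ImportError:  # Polars is optional for this sketch
    pl = None


def collect_prefer_gpu(lazy_frame):
    """Collect with engine="gpu" if that works, otherwise with the default engine."""
    try:
        return lazy_frame.collect(engine="gpu")
    except Exception:  # deliberately broad: missing GPU package, no device, etc.
        return lazy_frame.collect()


if pl is not None:
    query = pl.LazyFrame({"a": [1.242, 1.535]}).select(pl.col("a").round(1))
    print(collect_prefer_gpu(query))
```

The wrapper only relies on the object having a `collect` method, so the same query definition can be shared between GPU and CPU-only environments.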
For more information, see the NVIDIA and Polars announcement blog posts, and try the Polars GPU engine yourself in a Google Colab notebook.
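UMAP support for datasets larger than GPU memory
In the v24.10 release, cuML's UMAP algorithm adds a new batching algorithm that makes it possible to process datasets larger than GPU memory. With the new approximate nearest neighbor batching algorithm, the full dataset stays in CPU memory and only part of it is processed on the GPU at a time to build the KNN graph.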
To use this feature, set the `nnd_n_clusters` parameter to a value greater than 1 (1 is the default) and pass the new `data_on_host=True` keyword to `fit` or `fit_transform`:
from cuml.manifold import UMAP
import numpy as np
# Generate synthetic data using numpy (random float32 matrix).
# The sizes below are illustrative placeholders; adjust them to your data.
n_samples, n_features = 100_000, 64
X = np.random.rand(n_samples, n_features).astype(np.float32)
# UMAP parameters
num_clusters = 4  # Number of clusters for NN Descent batching; 1 means no clustering
data_on_host = True  # Whether the data is stored on the host (CPU)
# UMAP model configuration
reducer = UMAP(
    n_neighbors=10,
    min_dist=0.01,
    build_algo="nn_descent",
    build_kwds={"nnd_n_clusters": num_clusters},
)
# Fit and transform the data
embeddings = reducer.fit_transform(X, data_on_host=data_on_host)
In general, setting n_clusters higher (for example, 4) lowers GPU memory usage, which allows larger datasets to be processed. A higher value can come at a cost, though, so raise it only as far as needed for the data to fit in GPU memory.
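A quick way to judge whether batching is worth considering is to compare the raw size of the input matrix with the memory available on the GPU. The sketch below does only that arithmetic. It is not a cuML API: `dataset_bytes` and `exceeds_budget` are hypothetical helpers, the 40 GB budget is an assumed figure, and UMAP needs working memory beyond the input itself, so treat the result as a lower bound.

```python
def dataset_bytes(n_samples, n_features, bytes_per_value=4):
    """Size of a dense matrix in bytes (4 bytes per value for float32)."""
    return n_samples * n_features * bytes_per_value


def exceeds_budget(n_samples, n_features, gpu_budget_bytes):
    """True when the raw input alone is larger than the given GPU budget."""
    return dataset_bytes(n_samples, n_features) > gpu_budget_bytes


# Example: 50 million rows x 256 float32 features vs. an assumed 40 GB budget.
size = dataset_bytes(50_000_000, 256)
print(f"{size / 1e9:.1f} GB", exceeds_budget(50_000_000, 256, 40 * 10**9))
# prints: 51.2 GB True
```

When the check returns True, the input cannot sit on the GPU in one piece, which is the situation the batching option is meant for.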
cuDF pandas compatibility improvements
Array compatibility improvements
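cuDF's pandas accelerator mode now works better with NumPy arrays. Previously, a Python isinstance check on a NumPy array produced under cuDF pandas returned False, whereas the same check returned True with plain pandas. Code that depends on such checks, whether in user scripts or third-party libraries, could therefore behave differently once acceleration was enabled.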
In v24.10, cudf.pandas now returns true NumPy arrays in these cases, so isinstance checks behave as they do with pandas. Here is a short example:
%load_ext cudf.pandas
import pandas as pd
import numpy as np
arr = pd.Series([1, 2, 3]).values # now returns a true numpy array
isinstance(arr, np.ndarray) # returns True
With this change, cuDF pandas can now be used with code that relies on the NumPy C API.
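Why an isinstance check can fail for a proxy object is easiest to see with a toy example. The sketch below is an analogy only: the built-in `list` stands in for `numpy.ndarray`, both classes are hypothetical, and it does not show how cudf.pandas is actually implemented. A proxy that merely wraps and forwards is not an instance of the wrapped type, while a subclass is.

```python
class WrappingProxy:
    """Holds a list and forwards attribute access, but is not itself a list."""

    def __init__(self, data):
        self._data = list(data)

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the inner list.
        return getattr(self._data, name)


class SubclassProxy(list):
    """Subclasses list, so isinstance checks against list succeed."""


wrapped = WrappingProxy([1, 2, 3])
subclassed = SubclassProxy([1, 2, 3])

print(isinstance(wrapped, list))     # False: forwarding alone is not enough
print(isinstance(subclassed, list))  # True: a subclass passes the check
print(wrapped.count(2))              # 1: the method call is still forwarded
```

The forwarding proxy behaves like a list for method calls but still fails the type check, which is the kind of mismatch described above.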
Arrow compatibility improvements
cuDF now supports a wider range of PyArrow versions. Arrow compatibility is a core part of the cuDF ecosystem. In the past, cuDF relied on the Arrow C++ API, which tied each cuDF release to a specific Arrow version.
Over recent releases, cuDF has been refactored to use the Arrow C Data Interface, removing the dependency on the Arrow C++ library. As a result, cuDF Python now works with any PyArrow version from PyArrow 14 onward.
A guide to using GPUs in GitHub-hosted CI runners
?????? GPU? GitHub ?? CI ???? ???? ???? ???? ??? ?? ?? ???? ??? ?????. scikit-learn ?? ??? ???? ?? ????? ???? ?? ??? ??? RAPIDS Deployment ??? ???????.
GitHub Actions now supports hosted GPU runners. This means that any project on GitHub can use NVIDIA GPUs in its CI workflows. For the community, projects that build on RAPIDS libraries can now test their code on GPU hardware.
GPU ??? ??? GitHub Action ?? ??? ???? ?? ????. GPU? ???? ??? ????? ?? ? ??? ??? ??, ????? ??? ???? ?? ?? ?? ??? ??? ? ????.
Getting started with GPU runners requires a few changes to your GitHub Actions settings. For example, you select the NVIDIA partner image and a GPU-powered VM to add a GPU to the runner.

Then use the runs-on option to configure a workflow to run on the new runner.
name: GitHub Actions GPU Demo
run-name: ${{ github.actor }} is testing out GPU GitHub Actions
on: [push]
jobs:
  gpu-workflow:
    runs-on: linux-nvidia-gpu
    steps:
      - name: Check GPU is available
        run: nvidia-smi
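Beyond a bare `nvidia-smi` step, a test suite may want to detect the GPU from Python so that GPU-only tests are skipped on ordinary runners. The following is a minimal sketch of such a check; `gpu_visible` is a hypothetical helper, and it assumes that a successful `nvidia-smi -L` listing at least one device is a good enough signal.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        completed = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Each device is listed on a line that starts with "GPU <index>:".
    return completed.returncode == 0 and "GPU" in completed.stdout


print("GPU visible:", gpu_visible())
```

A test runner could call this once at startup and mark GPU tests as skipped when it returns False, so the same suite runs on both CPU-only and GPU runners.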
For details on setting up GitHub Actions GPU-hosted runners, see the RAPIDS deployment documentation. The page also covers recommended practices for GPU CI.
scikit-learn ????? ?? GitHub Actions? GPU ??? ???? ???? ???? ??? PR?? GPU ?????? ???? ???????. ??? ???? ???? ??? ??? ?????.
RAPIDS platform updates
RAPIDS packages received several platform updates in October 2024. The packages now support Python 3.10-3.12 and NumPy 1.x and 2.x, and they also support fmt 11 and spdlog 1.14. These updates keep RAPIDS in step with conda-forge. As previously announced, this release drops support for Python 3.9 and for NCCL versions earlier than 2.19.
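The supported ranges above can be turned into a small preflight check for an environment. This is only a sketch: the constants are copied from this post, the function names are hypothetical, and the check does not replace the official installation requirements.

```python
import sys

SUPPORTED_PYTHON = ((3, 10), (3, 12))  # inclusive range, from this post
SUPPORTED_NUMPY_MAJORS = {1, 2}        # NumPy 1.x and 2.x, from this post


def python_supported(version_info=sys.version_info):
    """True when the (major, minor) pair falls inside the supported range."""
    low, high = SUPPORTED_PYTHON
    return low <= tuple(version_info[:2]) <= high


def numpy_supported(version):
    """True when the major component of a version string is supported."""
    return int(version.split(".")[0]) in SUPPORTED_NUMPY_MAJORS


print(python_supported(), numpy_supported("2.1.0"))
```

Comparing `(major, minor)` tuples avoids the string-comparison trap where "3.9" would sort after "3.10".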
Conclusion
RAPIDS 24.10 ???? ??? ???? ????? ?? ???? ?? ?? ??? ? ??? ?? NVIDIA? ??? ? ?? ? ??? ? ?? ?????.
If you are new to RAPIDS, check out these resources to get started.
Related resources
- DLI ??: RAPIDS cuDF? ??? ?????? ?? ?? ??
- GTC ??: RAPIDS cuDF? ??? ?? ?? ???? ??? ???
- GTC ??: NetworkX ???: ??? ??? ??? ??
- NGC Containers: NVIDIA RAPIDS PB October 2024 (PB 24h2)
- NGC Containers: NVIDIA RAPIDS PB May 2024 (PB 24h1)
- SDK: RAPIDS