The NVIDIA RAPIDS Accelerator for Apache Spark software plug-in pioneered a zero code change user experience (UX) for GPU-accelerated data processing. It accelerates existing Apache Spark SQL and DataFrame-based applications on NVIDIA GPUs by over 9x without requiring a change to your queries or source code. This led to the new Spark RAPIDS ML Python library, which can speed up…
Spark RAPIDS ML is an open-source Python package enabling NVIDIA GPU acceleration of PySpark MLlib. It offers PySpark MLlib DataFrame API compatibility and speedups when training with the supported algorithms. See New GPU Library Lowers Compute Costs for Apache Spark ML for more details. PySpark MLlib DataFrame API compatibility means easier incorporation into existing PySpark ML applications…
Spark MLlib is a key component of Apache Spark for large-scale machine learning and provides built-in implementations of many popular machine learning algorithms. These implementations were created a decade ago and do not leverage modern computing accelerators, such as NVIDIA GPUs. To address this gap, we have recently open-sourced Spark RAPIDS ML (NVIDIA/spark-rapids-ml)…