  • RAPIDS Accelerator for Apache Spark v21.10


    ?? RAPIDS Accelerator for Apache Spark v21.10? ???? ? ????! ?? ?? ??????, NVIDIA? ????? ???, ??? ??? ????? ? ??? GPU ???? ?? ????? ??? ???? ???????.

    ? ??? ?? ???:

    • Speed up – ?? ?? ? ?? ??
    • New Functionality – ??? I/O ? ??? ??? ?? ?? ? ?? ?? ??
    • Community Updates – spark-examples repository ????.

    Speed up

    RAPIDS Accelerator for Apache Spark? ??? ?? ??? ?? ??? ???? ????. ?? ?? ????? ?? ?? ??? ??? ???? ?? ?????, ??? ???? ? ?? ????? ??? ??? ??? ??? ??? ???? ?? ???? ??? ???? ????.

    ?? ??? ??? ? ?? ??? ???????.

    • Count Distinct: ?? ??? ???? ??? ?? ??? ? ?? ?? ??? ?? ???? ? ???? ?????.
    • Window: ???? ?? ?? ???? ?????? ?? ??? ???? ???? ? ??? ????? ?? ???? ? ??? ??? ??????.
    • Intersect: ??? ????? ?? ??? ???? ??????.
    • Cross-join: ?????? ???? ??? ????? ?? ??? ???? ????.
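
    The four operators named in the list above correspond to standard SQL constructs. The following is a minimal sketch of what each one computes, using Python's built-in sqlite3 module as a stand-in so that it runs without a cluster. It is not Spark or RAPIDS code and says nothing about GPU performance. The table names (sales, vip), the sample rows, and the function name run_operator_examples are hypothetical and exist only for illustration. The window query assumes a bundled SQLite of version 3.25 or later.

```python
import sqlite3


def run_operator_examples():
    """Run the four operator types on tiny in-memory tables and return the results."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data, for illustration only.
    conn.executescript(
        """
        CREATE TABLE sales(region TEXT, customer TEXT, amount INTEGER);
        INSERT INTO sales VALUES
            ('east', 'a', 10), ('east', 'b', 20), ('east', 'a', 30),
            ('west', 'b', 40), ('west', 'c', 50);
        CREATE TABLE vip(customer TEXT);
        INSERT INTO vip VALUES ('a'), ('c'), ('d');
        """
    )
    results = {}

    # Count Distinct: number of unique customers across all rows.
    results["count_distinct"] = conn.execute(
        "SELECT COUNT(DISTINCT customer) FROM sales"
    ).fetchone()[0]

    # Window: running total of amount within each region, ordered by amount.
    results["window"] = conn.execute(
        """
        SELECT region, amount,
               SUM(amount) OVER (PARTITION BY region ORDER BY amount) AS running_total
        FROM sales
        ORDER BY region, amount
        """
    ).fetchall()

    # Intersect: customers that appear in both tables.
    results["intersect"] = conn.execute(
        """
        SELECT customer FROM sales
        INTERSECT
        SELECT customer FROM vip
        ORDER BY customer
        """
    ).fetchall()

    # Cross-join: every sales row paired with every vip row (5 x 3 rows).
    results["cross_join_rows"] = conn.execute(
        "SELECT COUNT(*) FROM sales CROSS JOIN vip"
    ).fetchone()[0]

    conn.close()
    return results


if __name__ == "__main__":
    for name, value in run_operator_examples().items():
        print(name, value)
```

    The cross-join row count grows as the product of the two input sizes, which is one reason such queries become expensive on large inputs.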

    ? ??? ?? 104GB RAM? ??? 2xT4 GPU? ??? ?? ???? ???(GCP) ???? ?????. ??? ??? ??? ??? 3TB?? ?? ??? ??? ????. ?? ? ??? ?? ??? ??? GitHub? spark-rapids-examples ????? ???? ? ????. ? 4?? ??? ??? ???? ???? ??? ??? ??? ?? ?? ?? ??(27?~1.5?)? ????? ?? ?????. ??? ??? ??? ?? ??? ?? ?? ??? ???? ?? ? ???? ???? ?????.

    A bar chart showing GPU vs. CPU runtime for four microbenchmarks (Apache Spark operators): 1. Cross-join, 2. Intersect, 3. Windowing (with and without data skew), 4. Count Distinct. The preceding graph is a sneak peek at the speed-up one can expect when using Spark-Rapids. A detailed performance analysis will be provided in the next release blog.
    ?? 1: Google ???? ??? Dataproc ????? ???????? ??? ???? ??: GPU ? CPU.

    ??? ??

    ????

    ???? Apache Spark ????? Spark 3.2? ?? 10?? ?????? ?? ?? ????. v21.10 ???? Spark 3.2 ? CUDA 11.4? ?????. ?? ?????? I/O, ?? ??? ?? ? ?? ?? ??? ?? ????? ??? ??????. RAPIDS Accelerator for Apache Spark v21.10? ???? ????? ???? ?? ??? ???? jar? ??????.

    ?? ? jar? ??? ?? ????? ??? ?????. ETL jar? ?Parquet ? ORC? ?? ?? ?? ??? ??????. ?? ??? ???? ?? HashAggregate, Sort, JoinSHJ? Join BHJ? ??? ? ?? ??? ????? ?????. ??? ??? ??? ?? ?? ??? ?? ???? ?? ???????.

    ?? ????? ??? ??? ?? ??? ???? ? ??? ?? ?? ??? ???? ??? ? ????. v21.10? ??? ?? ???? ????? pos_explode, create_map ?? ????. ??? ??? ?? ??? ??? RAPIDS Accelerator for Apache Spark ???? ?????.

    A bar chart showing GPU vs CPU runtime for two microbenchmarks (Apache Spark Operators) 1. Count Distinct 2. Windowing.
    ??2: Microbenchmark? ??? ? Google ???? ??? Dataproc ????? ??? ??? ??? ?????: GPU ? CPU.

    ?? ? ?? ?

    ???? ??? RAPIDS Accelerator for Apache Spark? ?? ? ?? ?? ?? ??? ??? ???????. ?? ?? ?? ?? ??? ?? ??? ??? ?? ??? ??? ??? ? ????. ?? ?? ? ?? ??, ?? ?? ?? ???? ??? ??? ??? ? ????.

    ?? ?? ?? ???? ?? ??? ??? ??? ??? ???? ??? ? ?? ??? ?????.

    ???? ????

    Azure? ?? ???? ???????. Azure ????? ?? Azure Synapse?? RAPIDS Accelerator for Apache Spark ??? ??????.

    ?? 2021? 11? 8??? 11??? ??? ????? ?? ??? GTC? ?? AI? ??? ??? ????? ??? ??? ???. RAPIDS Accelerator ?? ? ?? ??? ?????, Accelerated Apache Spark??? ??? ???? ?? ??? ??? ?? ???? ?????. ?? RAPIDS ? NVIDIA GPU? ??? Discover Common Apache Spark Operations Turbocharge? Apache Spark? ?? ????????? ?? ??? ???? ????.

    Coming soon

    ? ??? ???? 128?? 10? ??? ??? ?? ??, ?? ???? ?? ????? ?? ?? ??, ?? ??? ?? ?? ???? ?? ??? ?? ??? ?????.

    ?? A100?? ?? ?? ??? ??? ???? ? ??? ???? ??? ? ?? NVIDIA Ampere Architecture ?? GPU(A100/A30)? ?? MIG ??? ??? ???. ????? Apache Spark? RAPIDS Accelerator? ??? ?? ??? ???? ?????, ??? ??? ???????. GitHub? ???? Apache Spark?? RAPIDS Accelerator? ???? ??? ??? ????? ??? ? ?? ??? ?????.
