JSON is a widely adopted format for text-based information working interoperably between systems, most commonly in web applications and large language models (LLMs). While the JSON format is human-readable, it is complex to process with data science and data engineering tools. JSON data often takes the form of newline-delimited JSON Lines (also known as NDJSON) to represent multiple records…
]]>Nested data types are a convenient way to represent hierarchical relationships within columnar data. They are frequently used as part of extract, transform, load (ETL) workloads in business intelligence, recommender systems, cybersecurity, geospatial, and other applications. List types can be used to easily attach multiple transactions to a user without creating a new lookup table…
]]>JSON is a widely adopted format for text-based information working interoperably between systems, most commonly in web applications. While the JSON format is human-readable, it is complex to process with data science and data engineering tools. To bridge that gap, RAPIDS cuDF provides a GPU-accelerated JSON reader (cudf.read_json) that is efficient and robust for many JSON data structures.
]]>