I would like to create a large Polars `DataFrame` using Rust, building it up row by row from data scraped from web pages. What is an efficient way to do this?

It looks like the `DataFrame` should be created from a `Vec` of `Series` rather than by adding rows to an empty `DataFrame`. However, how should a `Series` be built up efficiently? I could create a `Vec` and then create a `Series` from the `Vec`, but that sounds like it will end up copying all the elements. Is there a way to build up a `Series` element by element, and then build a `DataFrame` from those?
I will actually be building up several `DataFrame`s in parallel using Rayon and then combining them; it looks like `vstack` does what I want there. It's the creation of the individual `DataFrame`s that I can't figure out how to do efficiently.
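Roughly what I have in mind for the parallel part, again only a sketch (it reuses the hypothetical `Page` and `pages_to_df` from the snippet above, and I haven't verified the exact `vstack` signatures):

```rust
use polars::prelude::*;
use rayon::prelude::*;

// Build one DataFrame per batch of scraped pages in parallel with Rayon,
// then combine them with vstack. `pages_to_df` and `Page` are the
// hypothetical helpers from the previous sketch; error handling and the
// batch shape are simplified.
fn build_all(batches: Vec<Vec<Page>>) -> PolarsResult<DataFrame> {
    let dfs: Vec<DataFrame> = batches
        .into_par_iter()
        .map(pages_to_df)
        .collect::<PolarsResult<Vec<_>>>()?;

    let mut iter = dfs.into_iter();
    let mut acc = iter.next().expect("at least one batch");
    for df in iter {
        // vstack_mut appends the other frame's chunks rather than copying values.
        acc.vstack_mut(&df)?;
    }
    Ok(acc)
}
```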
I did look at the source of the CSV parser, but it is very complicated and probably highly optimised. Is there a simpler approach that is still reasonably efficient?
There is a `Utf8ChunkedBuilder`, which looks like it can. – Quartziferous
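A minimal sketch of the element-by-element approach using those builder types, assuming the 0.2x-era polars API (`Utf8ChunkedBuilder` was later renamed `StringChunkedBuilder`, and constructor arguments differ between versions); the column names and row shape are placeholders:

```rust
use polars::prelude::*;

// Append each scraped value (or a null for a missing field) directly to a
// ChunkedBuilder, then finish() each builder into a Series with no
// intermediate Vec.
fn rows_to_df(rows: &[(Option<&str>, i64)]) -> PolarsResult<DataFrame> {
    // (column name, row capacity, estimated total string bytes)
    let mut titles = Utf8ChunkedBuilder::new("title", rows.len(), rows.len() * 16);
    let mut counts = PrimitiveChunkedBuilder::<Int64Type>::new("word_count", rows.len());

    for (title, count) in rows {
        match title {
            Some(t) => titles.append_value(*t),
            None => titles.append_null(), // field missing from the scraped page
        }
        counts.append_value(*count);
    }

    DataFrame::new(vec![
        titles.finish().into_series(),
        counts.finish().into_series(),
    ])
}
```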