I am trying to find out the size/shape of a DataFrame in PySpark. I do not see a single function that can do this.
In Python, I can do this:
data.shape()
Is there a similar function in PySpark? This is my current solution, but I am looking for a more elegant one:
row_number = data.count()
column_number = len(data.dtypes)
The computation of the number of columns is not ideal...
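For reference, here is a minimal sketch that wraps the two calls into a pandas-like helper. The name df_shape is just an illustration, not a built-in PySpark function, and len(df.columns) is used in place of len(df.dtypes) since both only read metadata:

from pyspark.sql import SparkSession, DataFrame

def df_shape(df: DataFrame) -> tuple:
    """Return (row_count, column_count), mimicking pandas' .shape."""
    # count() triggers a full Spark job; len(df.columns) is metadata-only and cheap
    return (df.count(), len(df.columns))

spark = SparkSession.builder.getOrCreate()
data = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
print(df_shape(data))  # (2, 2)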
data.shape for NumPy and Pandas? shape is not a function. – Tailored