I have a large dataframe with 1739 rows and 1455 columns. I want to find the 150 lowest values for each row (not the 150th value, but 150 values).
I iterate over the rows with a basic for loop.
I tried df.min(axis=1), but it only returns a single minimum per row. I also tried the rolling_min function, without success.
Is there an existing function, similar to .min, where I can specify the number of values I want to find?
My ultimate goal is to take the 150 lowest values and create a slope then calculate the area under the curve. Do this for each row and add the areas to obtain a volume.
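To make the area step concrete, here is a minimal sketch of the trapezoidal rule written with plain NumPy. The function name trapezoid_area is hypothetical, and it assumes the x coordinates of each curve are already sorted in increasing order. Whether a plain sum of the row areas is the volume you want depends on the spacing between rows, which is not given here.

```python
import numpy as np

def trapezoid_area(x, y):
    """Trapezoidal-rule area under y(x), computed along the last axis.

    x and y must have the same shape; 2-D input gives one area per row.
    Assumes x is increasing along the last axis.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Width of each segment times the mean height of its two end points
    widths = np.diff(x, axis=-1)
    mean_heights = 0.5 * (y[..., 1:] + y[..., :-1])
    return np.sum(widths * mean_heights, axis=-1)
```

With 2-D x and y arrays (one row per scan line), trapezoid_area(x, y).sum() adds up the per-row areas.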
Example of the dataframe. I have a df that looks like this:
-218.7 -218.4 ... 217.2 217.5
0 56.632706 13.638315 ... 76.543000 76.543000
1 56.633455 13.576762 ... 76.543000 76.543000
2 -18.432203 -18.384091 ... 76.543000 76.543000
3 -18.476594 -18.439804 ... 76.543000 76.543000
The header ('-218.7 ...') holds the coordinates along the x axis of a scan. The data is the height of the scan, i.e. the y axis. What I need is the 150 lowest values for each row and their associated column headers, as I want to make a curve for each row and then calculate the area under the curve.
So for each row I need something like this:
-218.7 -218.4 ... for 150 columns
4 -18.532035 -18.497517 ... for 150 values
I don't think I need to store the header info for each row; a for loop would go through each row one at a time.
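For illustration, here is one possible way to get the k lowest values of every row together with their column headers, without a Python loop. It is only a sketch: the function name lowest_k_per_row is hypothetical, it assumes the column headers can be converted to floats, and it assumes k is not larger than the number of columns.

```python
import numpy as np
import pandas as pd

def lowest_k_per_row(df, k):
    """Return (x, y), each of shape (n_rows, k).

    y holds the k smallest values of each row; x holds the matching
    column headers as floats. Points are ordered by column position,
    so each row's curve runs left to right along the x axis.
    """
    values = df.to_numpy()
    x_all = df.columns.astype(float).to_numpy()
    # argpartition places the indices of the k smallest entries of
    # each row in the first k positions (in no particular order)
    idx = np.argpartition(values, k - 1, axis=1)[:, :k]
    # Restore column order within each row
    idx.sort(axis=1)
    y = np.take_along_axis(values, idx, axis=1)
    x = x_all[idx]
    return x, y
```

For the dataframe described above, the call would be x, y = lowest_k_per_row(df, 150). np.argpartition avoids fully sorting all 1455 columns of each row, although at this size a plain np.argsort would also be fast enough.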
np.sort(df.values, 1)[:, 0:150]
– Plaice