BaselineRemoval package for background fluorescence/noise removal

I'm trying to use the BaselineRemoval package to remove background fluorescence from some Raman spectra. The code documentation gives the preferred input format as: "input_array: A pandas dataframe column provided in input as dataframe['input_df_column']. It can also be a Python list object".

My example:

import numpy as np
import pandas as pd
from BaselineRemoval import BaselineRemoval

df = pd.DataFrame(
    {'Patient': [1, 2, 3, 4, 5, 6],
     'Group': [1, 1, 1, 2, 2, 2],
     'Samples': [list(np.random.randn(3).round(2)) for i in range(6)]
    }
)

input_array = df['Samples']
polynomial_degree = 2

baseObj = BaselineRemoval(input_array)
Modpoly_output = baseObj.ModPoly(polynomial_degree)

However, this gives the error ValueError: setting an array element with a sequence.
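
The same message can be reproduced with NumPy alone, which hints at where it comes from. The sketch below assumes (without having checked the package source) that the library converts its input to a numeric array at some point. It builds an object array holding one list per cell, the way a DataFrame column of lists does, and compares converting the whole column with converting a single row. The helper `to_float_error` is a hypothetical name used only for this illustration.

```python
import numpy as np

# Three "spectra", each a list of amplitudes, stored one list per cell,
# which is how a DataFrame column of lists holds them (dtype=object).
rows = [[-0.89, 0.09, 1.23], [0.5, -0.2, 0.1], [1.1, 0.3, -0.7]]
column = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    column[i] = row


def to_float_error(values):
    """Return the ValueError message from a float conversion, or None."""
    try:
        np.asarray(values, dtype=float)
    except ValueError as exc:
        return str(exc)
    return None


whole_column_error = to_float_error(column)    # every element is a list
single_row_error = to_float_error(column[0])   # one flat list of numbers

print("whole column:", whole_column_error)
print("single row:  ", single_row_error)
```

Converting the whole column fails because each element is itself a sequence, while a single row converts cleanly.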

Not sure how to proceed.

Upshot answered 28/8, 2020 at 12:46 Comment(4)
I checked the first row of the Samples column. It is a list object, [-0.89, 0.09, 1.23]. A Raman spectrum can be computed over an array of values, for example temperature, pressure, or wavelength, but separately for each. What this pandas column holds instead is [temperature, pressure, wavelength]. Baseline removal works on a one-dimensional array, not on a multi-dimensional matrix. If you store these values in separate arrays and process each array separately, you will get the baseline-removed spectra you want. - Crenation
Check and fix the dimensions of the array, split it into multiple arrays, and process each one separately. - Crenation
Hmmm. The data frame above is just a dummy example to be taken at face value. The values are all amplitudes recorded by a probe across the Raman scale, which does not output a multi-dimensional matrix. So the column is [amplitude_at_wavenumber1, amplitude_at_wavenumber2, amplitude_at_wavenumber3]. In my original dataset, the Samples column has over 2000 fluorescence amplitudes in a list per patient. I'm not sure how storing them as separate values will help. - Upshot
I see that the problem you are facing is about how to organize the data. The library can do the baseline removal for you if the data is in array form. In that case, a simple for loop should do the job, as you rightly identified. Wish you all the best for your project. - Crenation

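A simple for loop should do it: pass each row's list to BaselineRemoval separately instead of passing the whole column at once.

import numpy as np
import pandas as pd
from BaselineRemoval import BaselineRemoval

df = pd.DataFrame(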
    {'Patient': [1, 2, 3, 4, 5, 6],
     'Group': [1, 1, 1, 2, 2, 2],
     'Samples': [list(np.random.randn(3).round(2)) for i in range(6)]
    }
)

input_array = df['Samples']
polynomial_degree = 2

for row in input_array:
    print(BaselineRemoval(row).ModPoly(polynomial_degree))
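
To keep the corrected spectra instead of printing them, the per-row results can be stacked into one 2-D array with one row per patient. The sketch below is only an illustration: `correct_each_row` and `polynomial_baseline_removed` are hypothetical helpers, and the stand-in corrector subtracts a single least-squares polynomial fit, which is not the iterative ModPoly algorithm. It is there so the example runs without the package installed. Stacking assumes every spectrum has the same length.

```python
import numpy as np


def polynomial_baseline_removed(spectrum, degree=2):
    """Stand-in corrector: subtract one least-squares polynomial fit.

    This is NOT ModPoly; it only mimics the call shape (list in, array out).
    """
    y = np.asarray(spectrum, dtype=float)
    x = np.arange(y.size)
    coefficients = np.polyfit(x, y, degree)
    return y - np.polyval(coefficients, x)


def correct_each_row(rows, correct):
    """Apply `correct` to each spectrum and stack the results row by row."""
    return np.vstack([correct(row) for row in rows])


# Dummy data: 6 patients, 10 amplitudes each.
rng = np.random.default_rng(0)
spectra = [list(rng.normal(size=10).round(2)) for _ in range(6)]

corrected = correct_each_row(spectra, polynomial_baseline_removed)
print(corrected.shape)  # (6, 10)
```

With the package installed, the same helper should accept the call from the loop above as the corrector, for example `correct_each_row(df['Samples'], lambda row: BaselineRemoval(row).ModPoly(polynomial_degree))`.
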
Upshot answered 28/8, 2020 at 14:51 Comment(0)
