Apologies. Whenever I try to format these as tables rather than as code, the editor seems to think I have code embedded and won't let me post.
Here's an example of what I have:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Recovery | 68500 | 158000 | qfin | plethh | plethi |
A01 | longname2.zip | Baseline | 2000 | 43000 | qfin | plethh | plethi |
A01 | longname2.zip | Run | 45000 | 135000 | qfin | plethh | plethi |
A01 | longname2.zip | Recovery | 135000 | 305000 | qfin | plethh | plethi |
Here's an example of what I want:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 6000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 1000 | 7000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 2000 | 8000 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 3000 | 9000 | qfin | plethh | plethi |
...and so on to the end of the period, after which the next period continues in the same way:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 24000 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 36500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 31500 | 37500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 32500 | 38500 | qfin | plethh | plethi |
I've managed to filter by the unique file names and duplicate the required rows.
What I can't seem to do is change the Begin and End values and segment them by Period. What I currently end up with, likely because of my row duplication, is something like this:
ID | File | Period | Begin | End | Laser1 | Laser2 | Lead |
---|---|---|---|---|---|---|---|
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Baseline | 0 | 30500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
A01 | longname.zip | Run | 30500 | 68500 | qfin | plethh | plethi |
In both Python and R I get stuck in the same place: I can't seem to fix the numbers in the Begin and End columns. I'm more comfortable with R at the moment, but I started out trying with Python.
In R, it seems to think I want to loop over 1000 columns, which I don't have, rather than add 1000 to every row. Unfortunately, not all files start at 0, and there may be a gap between one period's End and the next period's Begin.
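
To make the intended arithmetic concrete, here is a minimal Python sketch of how a single period could be cut into overlapping windows. It assumes a window width of 6000 and a step of 1000 (both read off the desired table above), and that the last window is stretched to the period's End, as in the `24000 | 30500` row. The function name `sliding_windows` and its parameters are hypothetical and not part of any library.

```python
def sliding_windows(begin, end, width=6000, step=1000):
    """Return a list of (begin, end) pairs covering one period.

    Assumptions (taken from the desired output, not from any library):
    - windows are `width` long and start every `step` units;
    - only windows that fit entirely inside the period are kept;
    - the last kept window is stretched so it ends at the period's end;
    - a period no longer than `width` yields a single window.
    """
    if end - begin <= width:
        return [(begin, end)]

    # Every start for which a full-width window still fits in the period.
    starts = range(begin, end - width + 1, step)
    windows = [(start, start + width) for start in starts]

    # Stretch the final window so no tail of the period is dropped.
    last_start, _ = windows[-1]
    windows[-1] = (last_start, end)
    return windows


baseline = sliding_windows(0, 30500)
print(baseline[:3])   # first three Baseline windows
print(baseline[-1])   # last Baseline window
```

Because the starts are computed from each period's own Begin, this does not depend on the file starting at 0 or on consecutive periods being contiguous.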
In R:

    Period = dupdf$Period
    for (period in Period) {
      End_Final = max(dupdf$End)
      dupdf_period <- dupdf%>%
        filter(Period == period)
      for (i in 2:nrow(dupdf_period)){
        dupdf_period[i,Begin ] <- dupdf_period[i,Begin ] + 1000
        dupdf_period[i,End ] <- dupdf_period[i,Begin ] + 6000
        if (dupdf_period$End < End_Final){
          dupdf_period$End
        } else {
          End_Final
          break
        }
      }
      dupdf_period[1,End ] <- dupdf_period[1,Begin ] + 6000
      dupdf <- rbind(dupdf_period)
    }
    write.csv(dupdf, filename)
    }
In Python:

    for period in Period:
        row_index = 2
        for row_index in concat_df.index:
            #for row in concat_df.itertuples:
            concat_df.at[row_index , "Begin"] += 1000
        row_index2 = 1
        for row_index2 in concat_df.index:
            concat_df.at[row_index2, "End"] += (Begin + 6000)
        concat_df['End'] = np.where((concat_df.End >= End_Final), concat_df.End.replace(End_Final), concat_df.End)
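
For comparison, here is a rough sketch of one way the whole expansion might be done without duplicating rows first: each original row is replaced by one row per window, with the other columns copied unchanged. It uses plain Python dictionaries rather than a data frame so that it runs on its own. The helper name `expand_rows` is hypothetical, and the 6000 width, 1000 step, and "stretch the last window to End" rule are assumptions inferred from the desired table, not confirmed behaviour.

```python
def expand_rows(rows, width=6000, step=1000):
    """Replace each row by one row per sliding window inside [Begin, End].

    Assumed rule: keep only windows that fit completely in the period,
    then stretch the last one to the period's End. A period shorter than
    `width` produces a single row spanning the whole period.
    """
    expanded = []
    for row in rows:
        begin, end = int(row["Begin"]), int(row["End"])
        # Starts of all full-width windows; fall back to one window if none fit.
        starts = list(range(begin, end - width + 1, step)) or [begin]
        for i, start in enumerate(starts):
            is_last = i == len(starts) - 1
            new_row = dict(row)  # copy ID, File, Period, Laser1, ... unchanged
            new_row["Begin"] = start
            new_row["End"] = end if is_last else start + width
            expanded.append(new_row)
    return expanded


# The three longname.zip rows from the "what I have" table.
common = {"ID": "A01", "File": "longname.zip",
          "Laser1": "qfin", "Laser2": "plethh", "Lead": "plethi"}
rows = [
    {**common, "Period": "Baseline", "Begin": 0, "End": 30500},
    {**common, "Period": "Run", "Begin": 30500, "End": 68500},
    {**common, "Period": "Recovery", "Begin": 68500, "End": 158000},
]

expanded = expand_rows(rows)
for r in expanded[:4]:
    print(r["Period"], r["Begin"], r["End"])
```

Since every window is derived from its own row's Begin and End, Period boundaries are respected automatically and no `End_Final` bookkeeping is needed. If a data frame is wanted afterwards, the resulting list of dictionaries can be passed to `pandas.DataFrame(expanded)`.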