How do I change column content based on previous row?
Apologies for the formatting. Whenever I try to post these as tables rather than as code, the site thinks I have embedded code and won't let me post.

Here's an example of what I have:

ID File Period Begin End Laser1 Laser2 Lead
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi
A01 longname.zip Recovery 68500 158000 qfin plethh plethi
A01 longname2.zip Baseline 2000 43000 qfin plethh plethi
A01 longname2.zip Run 45000 135000 qfin plethh plethi
A01 longname2.zip Recovery 135000 305000 qfin plethh plethi

Here's an example of what I want:

ID File Period Begin End Laser1 Laser2 Lead
A01 longname.zip Baseline 0 6000 qfin plethh plethi
A01 longname.zip Baseline 1000 7000 qfin plethh plethi
A01 longname.zip Baseline 2000 8000 qfin plethh plethi
A01 longname.zip Baseline 3000 9000 qfin plethh plethi

etc.

ID File Period Begin End Laser1 Laser2 Lead
A01 longname.zip Baseline 24000 30500 qfin plethh plethi
A01 longname.zip Run 30500 36500 qfin plethh plethi
A01 longname.zip Run 31500 37500 qfin plethh plethi
A01 longname.zip Run 32500 38500 qfin plethh plethi

I've managed to filter by the unique file names and duplicate the rows as required.

What I can't seem to do is change the Begin and End values and segment them by Period. What I currently end up with, likely because of my row duplication, is something like this:

ID File Period Begin End Laser1 Laser2 Lead
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Baseline 0 30500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi
A01 longname.zip Run 30500 68500 qfin plethh plethi

I get stuck in the same place in both Python and R. I'm more comfortable with R at the moment, but I started out trying with Python.

I can't seem to fix the numbers in the Begin and End columns.

In R, it thinks I want to loop over 1000 columns, which I don't have, rather than add 1000 to every row. Unfortunately, not all files start at 0, and there may be a gap between the End of one period and the Begin of the next.

In R:

 Period = dupdf$Period
 
 for (period in Period) {
   
   End_Final = max(dupdf$End)
   
   dupdf_period <- dupdf%>%
     filter(Period == period)
   
   for (i in 2:nrow(dupdf_period)){
   
     dupdf_period[i,Begin ] <- dupdf_period[i,Begin ] + 1000
     dupdf_period[i,End ] <- dupdf_period[i,Begin ] + 6000
     
     if (dupdf_period$End < End_Final){
       dupdf_period$End
     } else {
       End_Final
       break
       }
     } 
   dupdf_period[1,End ] <- dupdf_period[1,Begin ] + 6000
   
   dupdf <-  rbind(dupdf_period)
   }
 write.csv(dupdf, filename)
 }

In Python:

for period in Period:

    row_index = 2

    for row_index in concat_df.index:
        #for row in concat_df.itertuples:
        concat_df.at[row_index , "Begin"] += 1000

        row_index2 = 1
        for row_index2 in concat_df.index:
            concat_df.at[row_index2, "End"] += (Begin + 6000)

            concat_df['End'] = np.where((concat_df.End >= End_Final), concat_df.End.replace(End_Final), concat_df.End)

Symphysis answered 1/9, 2021 at 9:52
Comments (4):
Entrench: This can help: #69001396
Symphysis: Thanks. Helps with shifting the first one but not in creating the new columns.
Popple: How do you get from (1) Begin = 0 to End = 6000 (2) Begin = 1000 to End = 7000 and finally (3) Begin = 7000 to End = 8000? (1) and (2) have difference 6000 and (3)+ have 1000? Could you explain that?
Symphysis: Thanks for pointing that out. Begin should be 2000, End 8000 then 3000, 9000. If I try to change it on post it tells me I have code embedded again.

Edit: Thanks to r2evans, now without rowwise().

Perhaps this is what you are looking for:

library(dplyr)
library(tidyr)

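df %>% 
  # one window start per 1000 units, from Begin up to End - 6000 (list-column)
  mutate(Begin_New = Map(seq, Begin, End - 6000, list(by = 1000))) %>% 
  # expand the list-column: one row per window start
  unnest(Begin_New) %>% 
  group_by(ID, File, Period) %>% 
  # windows are 6000 wide; the last window of a segment is stretched to End
  mutate(End_New = ifelse(Begin_New + 7000 > End, End, Begin_New + 6000))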

returns

# A tibble: 428 x 10
   ID    File         Period   Begin   End Laser1 Laser2 Lead   Begin_New End_New
   <chr> <chr>        <chr>    <dbl> <dbl> <chr>  <chr>  <chr>      <dbl>   <dbl>
 1 A01   longname.zip Baseline     0 30500 qfin   plethh plethi         0    6000
 2 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      1000    7000
 3 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      2000    8000
 4 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      3000    9000
 5 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      4000   10000
 6 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      5000   11000
 7 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      6000   12000
 8 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      7000   13000
 9 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      8000   14000
10 A01   longname.zip Baseline     0 30500 qfin   plethh plethi      9000   15000
11 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     10000   16000
12 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     11000   17000
13 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     12000   18000
14 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     13000   19000
15 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     14000   20000
16 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     15000   21000
17 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     16000   22000
18 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     17000   23000
19 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     18000   24000
20 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     19000   25000
21 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     20000   26000
22 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     21000   27000
23 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     22000   28000
24 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     23000   29000
25 A01   longname.zip Baseline     0 30500 qfin   plethh plethi     24000   30500
26 A01   longname.zip Run      30500 68500 qfin   plethh plethi     30500   36500
27 A01   longname.zip Run      30500 68500 qfin   plethh plethi     31500   37500
28 A01   longname.zip Run      30500 68500 qfin   plethh plethi     32500   38500
29 A01   longname.zip Run      30500 68500 qfin   plethh plethi     33500   39500
30 A01   longname.zip Run      30500 68500 qfin   plethh plethi     34500   40500

I named the new columns Begin_New and End_New; you can easily rename them to Begin and End.
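
Since the question also mentions Python, here is a minimal sketch of the same windowing idea in plain Python. It is not part of the original answer: the function names `sliding_windows` and `expand_rows` are hypothetical, rows are represented as plain dicts, and Begin and End are assumed to be integers with End - Begin of at least 6000.

```python
def sliding_windows(begin, end, width=6000, step=1000):
    """Return (begin_new, end_new) pairs for one Begin/End segment.

    Mirrors the R pipeline above: window starts run from begin to
    end - width in steps of `step`, and a window is stretched to `end`
    when no further full step fits.
    Assumption: integer inputs and end - begin >= width
    (shorter segments simply yield no windows here).
    """
    windows = []
    for start in range(begin, end - width + 1, step):
        stop = end if start + width + step > end else start + width
        windows.append((start, stop))
    return windows


def expand_rows(rows):
    """Replace each row by one copy per window, overwriting Begin and End."""
    expanded = []
    for row in rows:
        for start, stop in sliding_windows(row["Begin"], row["End"]):
            expanded.append({**row, "Begin": start, "End": stop})
    return expanded


# First two rows of the sample data from the question (other columns omitted)
rows = [
    {"ID": "A01", "File": "longname.zip", "Period": "Baseline", "Begin": 0, "End": 30500},
    {"ID": "A01", "File": "longname.zip", "Period": "Run", "Begin": 30500, "End": 68500},
]
expanded = expand_rows(rows)
print(expanded[0]["Begin"], expanded[0]["End"])    # 0 6000
print(expanded[24]["Begin"], expanded[24]["End"])  # 24000 30500
print(expanded[25]["Begin"], expanded[25]["End"])  # 30500 36500
```

Because each source row is processed on its own, files that do not start at 0 and gaps between one period's End and the next period's Begin need no special handling.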

Popple answered 1/9, 2021 at 11:11
Comments (5):
Fruma: You can remove rowwise() if you change the first mutate to mutate(Begin_New = Map(seq, Begin, End - 6000, list(by = 1000))).
Popple: Ah, thank you! The rowwise() bothered me, but I couldn't find a way to get rid of it. :-)
Fruma: (... or mapply works too ... over to you :-)
Symphysis: Thanks! That was really helpful. I'm having a problem though with some files. Error: Problem with mutate() input Begin_New. x wrong sign in 'by' argument ℹ Input Begin_New is Map(seq, Begin, End - 6000, list(by = 1000)). This is probably when End - Begin is < 6000.
Symphysis: Fixed it with an ifelse statement: Map(seq, Begin, ifelse(Diff >= 6000, End - 6000, Begin), list(by = 1000))
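
The short-segment case from the last two comments (End - Begin below 6000) can be illustrated with a small Python sketch. The function name `sliding_windows_safe` is hypothetical, and the clamp follows the ifelse fix quoted above: when the segment is shorter than the window width, the only window start is Begin itself, so the segment comes back as a single window.

```python
def sliding_windows_safe(begin, end, width=6000, step=1000):
    """Return (begin_new, end_new) pairs, tolerating short segments.

    Assumption: integer inputs with end >= begin.
    """
    span = end - begin
    # Same idea as ifelse(Diff >= 6000, End - 6000, Begin) in the comment above
    last_start = end - width if span >= width else begin
    windows = []
    for start in range(begin, last_start + 1, step):
        # Stretch the window to `end` when no further full step fits
        stop = end if start + width + step > end else start + width
        windows.append((start, stop))
    return windows


print(sliding_windows_safe(1000, 4000))      # [(1000, 4000)]
print(len(sliding_windows_safe(0, 30500)))   # 25
```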
