I've been trying to resolve an exercise from a textbook where I am faced with the challenge of count different events between sequential stages of an industrial process.
Information related to the process: A test Subject is made to go through a 3 stage process with stages A, B and C respectively the first one being A, second B and finally C; a test Subject may abandon the process at stages A or B and then start again from point A, each time the process takes place a dataset is created with the IDENTIFICATION of the test subject, the TIMESTAMP in which the stage took place and a unique VISIT_CODE During any stage, a test subject may trigger an "ALERT" and this will be recorded with the TIMESTAMP, ALERT_CODE and test subject IDENTIFICATION.
What is to be calculated: I have to create a code in R to count how many ALERTS where generated by the test subjects between stages A and B, between stages B and C and finally how many ALERTS where generated after C. Please note that a test subject may at some point abandon the process to later on start again from point A.
The textbook gives a HINT: "Take a closer look at the stage a test subject is currently to then determine if the ALERT is generated from stage A and before a stage B and also if the test subject triggers an ALERT after stage B and before stage C, but keep in mind that if a test subject abandons at stage A and triggers an ALERT if the TIMESTAMP of that ALERT is smaller than their next attempt at stage A then it should be credited as an ALERT after a stage A"
as another hint, the textbook reveals that the ALERTS after Stage C where only 1 and that it was trigger by test subject W-6 with ALTER_CODE AYUJ-3915716168. The datasets are:
Stage Process
TableA<-tribble(~STAGE, ~TEST_SUBJECT,~TIMESTAMP,~VISIT_CODE,
"A", "XYU-1", "10", "BKO",
"A", "XYU-1", "15", "JUJD",
"B", "XYU-1", "20", "DUDH",
"A", "FF-09", "25", "KSIWJD",
"B", "FF-09", "30", "AJAKAM",
"C", "FF-09", "35", "ZISKS",
"A", "UU-89", "40", "NNXJD",
"B", "UU-89", "45", "DDUWO",
"A", "I-44", "50", "JIWIW",
"A", "W-6", "55", "SHDN",
"B", "W-6", "60", "IWOLS",
"C", "W-6", "65", "JDDD",
"A", "U-90", "70", "DJDKSMS",
"B", "U-90", "75", "NDJSM",
"A", "T-87", "80", "DNDJDK")
Alerts dataset
TableB<-tribble(~TEST_SUBJECT,~TIMESTAMP,~ALERT_CODE,
"XYU-1", "11", "AYUJ-151571406",
"XYU-1", "12", "AYUJ-487008829",
"XYU-1", "28", "AYUJ-211990388",
"FF-09", "32", "AYUJ-4177221842",
"W-6", "56", "AYUJ-1300211351",
"W-6", "63", "AYUJ-3014305494",
"I-44", "67", "AYUJ-4454800551",
"U-90", "73", "AYUJ-1079921935",
"U-90", "76", "AYUJ-3348911727",
"U-90", "79", "AYUJ-2381219626",
"T-87", "82", "AYUJ-4778326278",
"W-6", "89", "AYUJ-3915716168")
SOLUTION:
The textbook states that the proper solution for this problem is:
Alerts between Stages A & B including alerts from test subjects that abandoned the process in the attempt nth at stage A | Alerts between Stages B & C including alerts from test subjects that abandoned the process in the attempt nth at stage B | Alerts after stage C |
---|---|---|
AYUJ-151571406 | AYUJ-211990388 | AYUJ-3915716168 |
AYUJ-487008829 | AYUJ-3014305494 | |
AYUJ-1300211351 | AYUJ-3348911727 | |
AYUJ-1079921935 | AYUJ-4177221842 | |
AYUJ-4778326278 | AYUJ-2381219626 | |
AYUJ-4454800551 |
What have I done? : I have tried grouping and "joining" the data by TEST_SUBJECT and stage and timestamp to make the counts but I and finding a really hard to set the conditions to make the code count events between stages because I am finding difficult to corelate which events took plave before and after each timespan by test subject and stage.
Thank you so much to all of you talented and cool people for your help or recomandations