I have searched a lot and want to run arules on sales data. I just need to get the data into the right format: set up with the correct "factors" or "variables" and in basket form.
Right now I have sales data with the Order# and the items in each order. Each order is unique (every new order gets a new Order#, listed alongside its Part#s), but the same items can obviously appear in many orders.
Currently, my data is set up like this:
Order# Part# PartDescription
1 A PartA
1 B PartB
1 G PartG
2 R PartR
3 A PartA
3 B PartB
4 E PartE
5 Y PartY
6 A PartA
6 B PartB
6 F PartF
6 V PartV
R doesn't like it in this form, so I have to get it into a form that arules will accept.
I have saved it as a text file and have also tried a .csv file, but if I could get step-by-step instructions on how to prep or manipulate it in RStudio, that would be great.
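For reference, here is a minimal sketch of loading data shaped like mine into a data frame with base R. The temporary file only stands in for my real file, and the column names `Order`, `Part` and `PartDescription` are my own assumption (I dropped the `#` so they are valid R names).

```r
# Write the sample rows to a temporary CSV so the example is reproducible.
# In practice, csv_path would point at the real sales file instead.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Order,Part,PartDescription",   # assumed header names, not the originals
  "1,A,PartA", "1,B,PartB", "1,G,PartG",
  "2,R,PartR",
  "3,A,PartA", "3,B,PartB",
  "4,E,PartE",
  "5,Y,PartY",
  "6,A,PartA", "6,B,PartB", "6,F,PartF", "6,V,PartV"
), csv_path)

# One row per (order, part) pair: the "long" layout shown above.
sales <- read.csv(csv_path, stringsAsFactors = FALSE)

str(sales)          # check the column types
table(sales$Order)  # number of parts in each order
```

This only gets the file into R; it is still one row per item, not one basket per order.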
I read that it's supposed to be in basket form, such as:
1 (A, B, G)
2 (R)
3 (A, B)
4 (E)
5 (Y)
6 (A, B, F, V)
If that's not accurate, please correct me. I get the idea, but I need step-by-step instructions, which I can't seem to find anywhere. I've tried using dplyr and tidyr. I have a good understanding of data analysis but need more direct help with RStudio, so step-by-step instructions would help me understand this further.
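As far as I can tell, the conversion I am after could look like the sketch below. It assumes the arules package is installed and uses the assumed column names `Order` and `Part`. Route 1 groups the parts by order with `split()` and coerces the resulting list to a `transactions` object. Route 2 lets `read.transactions()` read the long layout directly with `format = "single"`. I have not confirmed that this is the recommended approach.

```r
library(arules)

# Sample data in the long layout (one row per order/part pair).
# Column names are assumptions; the description column is not needed here.
sales <- data.frame(
  Order = c(1, 1, 1, 2, 3, 3, 4, 5, 6, 6, 6, 6),
  Part  = c("A", "B", "G", "R", "A", "B", "E", "Y", "A", "B", "F", "V"),
  stringsAsFactors = FALSE
)

# Route 1: build one character vector of parts per order, then coerce.
baskets <- split(sales$Part, sales$Order)   # named list, one element per order
trans <- as(baskets, "transactions")
inspect(trans)                              # prints each basket with its order ID

# Route 2: read the long layout straight from a file.
# format = "single" means each line holds one (transaction ID, item) pair;
# cols names the ID column first, then the item column.
csv_path <- tempfile(fileext = ".csv")
write.csv(sales, csv_path, row.names = FALSE, quote = FALSE)
trans_from_file <- read.transactions(
  csv_path,
  format = "single",
  header = TRUE,
  sep    = ",",
  cols   = c("Order", "Part")
)
summary(trans_from_file)
```

Either object could then be passed to `apriori()`; the support and confidence thresholds would need to be chosen for the real data.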
data <- read.csv("myfile.csv", comment.char="") – Foliar
data.frame. When you import your data using the RStudio import tool, the command to redo it shows up in the console; it should be something similar to what I had above. – Foliar