In R it could be done by using findInterval
.
classA = data.frame(id = c("id1", "id2", "id3", "id4", "id5"),
mark = c(50, 69.5, 45.5, 88.0, 98.5))
grades = data.frame(mark = c(0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5),
grade = c("F", "P", "C", "B", "A-", "A", "A+"))
cbind(classA, grade = grades$grade[findInterval(classA$mark, grades$mark)])
# id mark grade
#1 id1 50.0 P
#2 id2 69.5 B
#3 id3 45.5 F
#4 id4 88.0 A-
#5 id5 98.5 A+
cbind(classA, grade = grades$grade[findInterval(classA$mark, c(-Inf, grades$mark), all.inside = TRUE, left.open = TRUE)])
id mark grade
#1 id1 50.0 C
#2 id2 69.5 B
#3 id3 45.5 P
#4 id4 88.0 A
#5 id5 98.5 A+
In Julia you can use searchsortedlast
and searchsortedfirst
.
using DataFrames
classA = DataFrame(id = ["id1", "id2", "id3", "id4", "id5"],
mark = [50, 69.5, 45.5, 88.0, 98.5]);
grades = DataFrame(mark = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5],
grade = ["F", "P", "C", "B", "A-", "A", "A+"]);
classA[!, "Grade"] = grades.grade[[searchsortedlast(grades.mark, x) for x in classA.mark]]
classA
#5×3 DataFrame
# Row │ id mark Grade
# │ String Float64 String
#─────┼─────────────────────────
# 1 │ id1 50.0 P
# 2 │ id2 69.5 B
# 3 │ id3 45.5 F
# 4 │ id4 88.0 A-
# 5 │ id5 98.5 A+
classA[!, "Grade"] = grades.grade[min.(length(grades.grade), [searchsortedfirst(grades.mark, x) for x in classA.mark])]
classA
#5×3 DataFrame
# Row │ id mark Grade
# │ String Float64 String
#─────┼─────────────────────────
# 1 │ id1 50.0 C
# 2 │ id2 69.5 B
# 3 │ id3 45.5 P
# 4 │ id4 88.0 A
# 5 │ id5 98.5 A+
The same with InMemoryDatasets
in Julia as given in the question including results for comparison.
using InMemoryDatasets
classA = Dataset(id = ["id1", "id2", "id3", "id4", "id5"],
mark = [50, 69.5, 45.5, 88.0, 98.5]);
grades = Dataset(mark = [0, 49.5, 59.5, 69.5, 79.5, 89.5, 95.5],
grade = ["F", "P", "C", "B", "A-", "A", "A+"]);
closejoin(classA, grades, on = :mark)
#5×3 Dataset
# Row │ id mark grade
# │ identity identity identity
# │ String? Float64? String?
#─────┼──────────────────────────────
# 1 │ id1 50.0 P
# 2 │ id2 69.5 B
# 3 │ id3 45.5 F
# 4 │ id4 88.0 A-
# 5 │ id5 98.5 A+
closejoin(classA, grades, on = :mark, direction=:forward, border=:nearest)
#5×3 Dataset
# Row │ id mark grade
# │ identity identity identity
# │ String? Float64? String?
#─────┼──────────────────────────────
# 1 │ id1 50.0 C
# 2 │ id2 69.5 B
# 3 │ id3 45.5 P
# 4 │ id4 88.0 A
# 5 │ id5 98.5 A+