Let me start by saying I have no experience with R, KNN or data science in general. I recently found Kaggle and have been playing around with the Digit Recognition competition/tutorial.
In this tutorial they provide some sample code to get you started with a basic submission:
# makes the KNN submission
library(FNN)
train <- read.csv("c:/Development/data/digits/train.csv", header=TRUE)
test <- read.csv("c:/Development/data/digits/test.csv", header=TRUE)
labels <- train[,1]
train <- train[,-1]
results <- (0:9)[knn(train, test, labels, k = 10, algorithm="cover_tree")]
write(results, file="knn_benchmark.csv", ncolumns=1)
My questions are:
- How can I view the nearest neighbors that have been selected for a particular test row?
- How can I modify which of those ten is selected for my results?
These questions may be too broad. If so, I would welcome any links that could point me down the right road.
It is very possible that I have said something that doesn't make sense here. If this is the case, please correct me.
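One possible way to approach both questions, sketched on a tiny made-up data set rather than the Kaggle files: FNN's get.knnx() returns the row indices and distances of the k nearest training rows for each test row, so the neighbors can be inspected directly and then combined with any voting rule. The matrices train_x and test_x, the two-class labels, and the weighted_vote() helper are all assumptions made for illustration; inverse-distance weighting is just one example rule, not what the benchmark code does.

```r
library(FNN)

# Made-up stand-ins for the pixel columns of train.csv / test.csv (assumption)
train_x <- matrix(c(0, 0,
                    0, 1,
                    1, 0,
                    5, 5,
                    5, 6,
                    6, 5), ncol = 2, byrow = TRUE)
labels  <- c(0, 0, 0, 1, 1, 1)
test_x  <- matrix(c(0.2, 0.1,
                    5.4, 5.2), ncol = 2, byrow = TRUE)

# For every test row: indices into train_x and distances of the k nearest rows
nn <- get.knnx(train_x, test_x, k = 3)

nn$nn.index[1, ]           # training rows nearest to the first test row
nn$nn.dist[1, ]            # their distances
labels[nn$nn.index[1, ]]   # their labels

# Hypothetical custom rule: each neighbor votes with weight 1 / distance
weighted_vote <- function(idx, dist) {
  w <- 1 / (dist + 1e-8)                   # small constant avoids division by zero
  scores <- tapply(w, labels[idx], sum)    # total weight per label
  as.integer(names(scores)[which.max(scores)])
}

results <- sapply(seq_len(nrow(test_x)), function(i) {
  weighted_vote(nn$nn.index[i, ], nn$nn.dist[i, ])
})
```

Replacing the body of weighted_vote() is where a different selection rule would go; a plain majority vote corresponds to giving every neighbor the same weight.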
indices returns null. Should I be doing anything different from your example? Can you recommend any resources for researching more on creating a custom weighting scheme, or examples of someone creating one that I can look at? – Suntan
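A guess at why the lookup might come back NULL, assuming the attribute was read from results as built in the question: indexing (0:9)[...] produces a plain integer vector, so any attributes attached to the factor returned by knn() are lost. The sketch below uses a small made-up data set and assumes the neighbor indices are stored in an attribute named "nn.index", which is how FNN's knn() appears to expose them; check attributes() on the knn() result to confirm for the installed version.

```r
library(FNN)

# Made-up two-class data, only to illustrate the attribute behaviour (assumption)
train_x <- matrix(c(0, 0,  0, 1,  1, 0,
                    5, 5,  5, 6,  6, 5), ncol = 2, byrow = TRUE)
labels  <- factor(c(0, 0, 0, 1, 1, 1))
test_x  <- matrix(c(0.2, 0.1,
                    5.4, 5.2), ncol = 2, byrow = TRUE)

# Keep the factor returned by knn() before converting it to digits
pred    <- knn(train_x, test_x, labels, k = 3)
indices <- attr(pred, "nn.index")   # neighbor indices, one row per test row

# Converting to plain integers drops the attributes
results <- (0:1)[pred]
attr(results, "nn.index")           # NULL
```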