Function to calculate Euclidean distance in R
I am trying to implement a kNN classifier in R from scratch on the iris data set, and as part of this I have written a function to calculate the Euclidean distance. Here is my code.
    known_data <- iris[1:15, c("Sepal.Length", "Petal.Length", "class")]
    unknown_data <- iris[16, c("Sepal.Length", "Petal.Length")]

    # Euclidean distance
    euclidean_dist <- function(k, unk) {
      distance <- 0
      for (i in 1:nrow(k))
        distance[i] <- sqrt((k[, 1][i] - unk[, 1][i])^2 + (k[, 2][i] - unk[, 2][i])^2)
      return(distance)
    }

    euclidean_dist(known_data, unknown_data)
However, when I call the function, it returns the first value correctly and the rest as NA. Can anyone show me where I have gone wrong with the code? Thanks in advance.
The aim is to calculate the distance between the ith row of known_data and the single unknown_data point.
How to fix your code
When you calculate distance[i], you're trying to access the ith row of the unknown data point, which doesn't exist for any i greater than 1, and hence you get NA. I believe your code should run fine if you make the following edits:
    # "class" is the column name from the question; the built-in iris dataset calls it "Species"
    known_data <- iris[1:15, c("Sepal.Length", "Petal.Length", "class")]
    unknown_data <- iris[16, c("Sepal.Length", "Petal.Length")]

    # Euclidean distance
    euclidean_dist <- function(k, unk) {
      # Make distance a vector [although not technically required]
      distance <- rep(0, nrow(k))
      for (i in 1:nrow(k))
        # Change unk[,1][i] to unk[1,1], and likewise unk[,2][i] to unk[1,2]
        distance[i] <- sqrt((k[, 1][i] - unk[1, 1])^2 + (k[, 2][i] - unk[1, 2])^2)
      return(distance)
    }

    euclidean_dist(known_data, unknown_data)
One final note: in the version of R I'm using, the iris dataset names this column Species rather than class.
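The NA values can be reproduced in isolation. This is a minimal sketch, assuming the built-in iris dataset: indexing a length-one vector past its first element returns NA in R rather than raising an error, and any arithmetic on NA stays NA. The variable names first, second, and propagated are only for illustration.

```r
# One-row data frame, as in the question (built-in iris assumed)
unknown_data <- iris[16, c("Sepal.Length", "Petal.Length")]

x <- unknown_data[, 1]   # a numeric vector of length 1
first  <- x[1]           # the actual Sepal.Length value
second <- x[2]           # out of range: R returns NA instead of an error

# NA propagates through arithmetic, so every distance after the first is NA
propagated <- sqrt((5.1 - second)^2)

print(c(first = first, second = second, propagated = propagated))
```

This is why only the first element of the result was correct: for i = 1 the index is valid, and for every later i it silently yields NA.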
an alternative method
As suggested by @Roman Luštrik, the entire aim of getting the Euclidean distances can be achieved with a simple one-liner:
    sqrt((known_data[, 1] - unknown_data[, 1])^2 + (known_data[, 2] - unknown_data[, 2])^2)
This is very similar to the function you wrote, but in vectorised form rather than through a loop, which is often the preferable way of doing things in R.
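The same vectorised idea extends to any number of feature columns. The sketch below is one possible generalisation, not part of the original suggestion: the function name euclidean_dist_vec is hypothetical, and it assumes both inputs hold the same numeric columns in the same order, with unk having exactly one row. It also assumes the built-in iris dataset.

```r
# Hypothetical helper: distance from every row of k to the single row in unk.
# Assumes k and unk hold the same numeric columns in the same order.
euclidean_dist_vec <- function(k, unk) {
  k <- as.matrix(k)
  unk <- as.numeric(unlist(unk))
  stopifnot(ncol(k) == length(unk))
  # Subtract the unknown point from each row, then square, sum, and root
  diffs <- sweep(k, 2, unk, FUN = "-")
  sqrt(rowSums(diffs^2))
}

features <- c("Sepal.Length", "Petal.Length")
known_data <- iris[1:15, features]
unknown_data <- iris[16, features]

distances <- euclidean_dist_vec(known_data, unknown_data)
print(distances)
```

Selecting only the feature columns before calling the function keeps the label column out of the arithmetic, so adding a third or fourth feature only means extending the features vector.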