k-means: Normalising Data to use Cosine Distance in KMeans (Python)
I am solving a problem where I have to use cosine distance as the similarity measure for k-means clustering. However, the standard k-means clustering package (from sklearn) uses Euclidean distance as standard, and does not allow you to change this.
Therefore, it is my understanding that by normalising my original dataset through the code below, I can then run the k-means package (using Euclidean distance), and the result will be the same as if I had changed the distance metric to cosine distance?
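from sklearn import cluster, preprocessing

# Normalise existing x (each row is scaled to unit L2 norm)
x_norm = preprocessing.normalize(x)

# Run standard (Euclidean) k-means on the normalised data
km2 = cluster.KMeans(n_clusters=5, init='random').fit(x_norm)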
Please let me know if my mathematical understanding of this is incorrect.
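For what it is worth, the identity I am relying on can be checked numerically. For rows scaled to unit length, the squared Euclidean distance between two rows should equal 2 * (1 - cosine similarity), so ranking pairs of points by one is the same as ranking them by the other. Below is a minimal sketch of that check; the random toy data, the seed, and the names `cos_sim`, `sq_euclid` and `identity_holds` are my own assumptions for illustration and are not part of my real dataset.

```python
import numpy as np
from sklearn import preprocessing

# Toy data standing in for the real dataset (assumption: 6 rows, 4 features)
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))

# L2-normalise each row so that every row has unit length
x_norm = preprocessing.normalize(x)

# Cosine similarity between rows: dot products of the unit-length rows
cos_sim = x_norm @ x_norm.T

# Squared Euclidean distance between every pair of normalised rows
diff = x_norm[:, None, :] - x_norm[None, :, :]
sq_euclid = (diff ** 2).sum(axis=-1)

# Check the identity ||a - b||^2 = 2 * (1 - cos(a, b)) for unit vectors
identity_holds = np.allclose(sq_euclid, 2.0 * (1.0 - cos_sim))
print(identity_holds)
```

One caveat I am unsure about: the identity only applies between unit-length vectors, and the centroids that k-means computes are means of unit vectors, which are generally not unit length themselves. So this may only approximate clustering with a true cosine metric rather than reproduce it exactly.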