I have two CSV files, one with training data and one with testing data. Both have the same format and the same attribute names, so I'll show just one of them:

full,id,id & ppdb,id & words sequence,id & synonyms,id & hypernyms,id & hyponyms,gold standard
1.667,0.476,0.952,0.476,1.429,0.952,0.476,2.345
3.056,1.111,1.667,1.111,3.056,1.389,1.111,1.9
1.765,1.176,1.176,1.176,1.765,1.176,1.176,2.2
0.714,0.714,0.714,0.714,0.714,0.714,0.714,0.0
1.538,0.769,0.769,0.769,1.538,0.769,0.769,2.586
2.188,1.875,1.875,1.875,1.875,2.188,1.875,1.667
3.333,1.333,1.333,1.333,3.333,2.0,1.333,2.8
2.5,1.667,1.667,1.667,2.222,1.944,1.667,2.481

I'm a newbie in scikit-learn. The example I learned from takes training data with labels and testing data with targets as input, like this:

x_train = np.array(["new york hell of town", "new york dutch", "the big apple great", "new york called big apple", "nyc nice", ...