python - NLTK classify based on single parameter


I am trying to use NaiveBayesClassifier to classify the times spent in areas of a smart home.

My training data looks like this:

[[{'time': '00:00'}, 'in'], [{'time': '00:01'}, 'in'], [{'time': '00:02'}, 'out'], [{'time': '00:03'}, 'out'], [{'time': '00:04'}, 'out'], [{'time': '00:05'}, 'out'], [{'time': '00:06'}, 'out'], ......,  [{'time': '08:06'}, 'in'], [{'time': '08:07'}, 'in'], [{'time': '08:08'}, 'in'], ... ] 

This is the code:

import nltk
from datetime import datetime, timedelta

classifier = nltk.NaiveBayesClassifier.train(training_data)

start_date = datetime.strptime('2010-11-19 00:00', '%Y-%m-%d %H:%M')
end_date = datetime.strptime('2010-11-19 23:59', '%Y-%m-%d %H:%M')

# Build one feature dict per minute of the day
test_data = []
while start_date < end_date:
    test_data.append(dict(time=start_date.strftime('%H:%M')))
    start_date += timedelta(0, 60)

test = classifier.classify_many(test_data)
print(test)

The result looks like this:

['out', 'out', 'out', 'out', 'out', 'out', 'out', 'out', 'out',....] 

I never get 'in' as a result. Can you see what is wrong with my classifier?

As medali suggested, the problem is in my dataset: only 11% of the samples are labeled 'in', so the classes are heavily imbalanced. I had to adjust the dataset according to: http://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
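
A quick way to confirm this kind of imbalance is to count the labels before training. The following is only a minimal sketch: the helper name `label_distribution` and the four-item sample are made up for illustration, and it assumes the data is a list of `[features, label]` pairs as shown above.

```python
from collections import Counter


def label_distribution(training_data):
    """Return the fraction of samples carrying each label."""
    counts = Counter(label for _features, label in training_data)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Tiny made-up sample in the same shape as the training data
# (not the real sensor data).
sample = [
    [{'time': '00:00'}, 'in'],
    [{'time': '00:01'}, 'out'],
    [{'time': '00:02'}, 'out'],
    [{'time': '00:03'}, 'out'],
]

distribution = label_distribution(sample)
print(distribution)
```

If one label accounts for most of the samples, as 'out' does here, a classifier trained on a single feature will tend to favor that label.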

I changed the dataset to use hourly data instead (if the sensor was activated at any point during an hour, that hour was added as 'in').

This is not a perfect solution, but it is good enough for my case.
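
The hourly aggregation described above could be sketched roughly as follows. This is not the exact code I used: the function name `to_hourly` and the `'hour'` feature key are assumptions for illustration, and the sketch assumes the input covers a single day (data spanning several days would need to be grouped by day first).

```python
def to_hourly(minute_data):
    """Collapse per-minute samples into one sample per hour.

    An hour is labeled 'in' if any minute within it was 'in',
    otherwise 'out'.
    """
    active_hours = {}
    for features, label in minute_data:
        hour = features['time'].split(':')[0]  # '08:06' -> '08'
        active_hours[hour] = active_hours.get(hour, False) or label == 'in'
    return [
        [{'hour': hour}, 'in' if active else 'out']
        for hour, active in sorted(active_hours.items())
    ]


# Made-up per-minute sample in the same shape as the training data.
minute_sample = [
    [{'time': '00:00'}, 'in'],
    [{'time': '00:01'}, 'out'],
    [{'time': '01:00'}, 'out'],
    [{'time': '01:30'}, 'out'],
    [{'time': '08:06'}, 'in'],
]

hourly_sample = to_hourly(minute_sample)
print(hourly_sample)
```

The resulting list has the same `[features, label]` shape, so it can be passed to the classifier's `train` method in place of the per-minute data. The test data then has to use the same `'hour'` feature instead of `'time'`.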

