Binary classification with Softmax
I am training a binary classifier using a sigmoid activation function with binary crossentropy, which gives an accuracy of around 98%.
The same model, when I train it using softmax with categorical_crossentropy, gives very low accuracy (< 40%).
I am passing the targets for binary_crossentropy as a list of 0s and 1s, e.g. [0,1,1,1,0].
Any idea why this is happening?
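Right now, your second model always answers "class 0", as it can only choose between 1 class (the number of outputs of your last layer). A softmax over a single output always returns a probability of 1 for that one class.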
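To make this concrete, here is a minimal sketch in plain Python. The `softmax` helper is written out by hand for illustration and is not Keras code; the raw output values are made up. Whatever the single unit produces, the normalised result is 1.0:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw outputs."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# A one-unit softmax layer: the only probability it can emit is 1.0,
# regardless of the raw output value, so nothing is being discriminated.
single_unit_outputs = [softmax([z])[0] for z in (-5.0, 0.0, 3.2)]
print(single_unit_outputs)
```

Since the output never changes, the network has nothing to adjust during training.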
As you have 2 classes, you need to compute softmax + categorical_crossentropy on 2 outputs to pick the most probable one.
Hence, your last layer should be:
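    model.add(Dense(2, activation='softmax'))
    model.compile(...)

Note that with 2 softmax outputs and categorical_crossentropy, the targets have to be one-hot encoded (e.g. [1, 0] and [0, 1]) rather than passed as a flat list of 0s and 1s.

Your sigmoid + binary_crossentropy model, which computes the probability of one class from a single output number, is already correct.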
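The two setups are closely related, which a small plain-Python sketch can show. All helper names and the raw output values below are assumptions for illustration, not Keras code. With two outputs, the softmax probability of the second class equals the sigmoid of the difference between the two raw outputs, and categorical crossentropy on one-hot targets matches binary crossentropy on 0/1 targets:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw outputs."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def one_hot(label, num_classes=2):
    """Turn an integer label (0 or 1) into a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(target, probs):
    return -sum(t * math.log(p) for t, p in zip(target, probs))

def binary_crossentropy(label, p):
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

z0, z1 = 0.3, 1.7            # hypothetical raw outputs of a 2-unit layer
probs = softmax([z0, z1])    # two-output softmax
p1 = sigmoid(z1 - z0)        # single-output sigmoid on the difference

labels = [0, 1, 1, 1, 0]     # same target format as in the question
cce = [categorical_crossentropy(one_hot(y), probs) for y in labels]
bce = [binary_crossentropy(y, p1) for y in labels]

print(probs[1], p1)          # the two probabilities agree
print(cce)
print(bce)                   # the two losses agree per example
```

So for two classes neither formulation is more expressive than the other; what matters is that the number of outputs and the target encoding match the loss.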
Edit: here is a small explanation of the sigmoid function.
Sigmoid can be viewed as a mapping between the space of real numbers and a probability space.
Notice that:
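    sigmoid(-infinity) = 0
    sigmoid(0) = 0.5
    sigmoid(+infinity) = 1

So if the real number output by your network is very low, the sigmoid will decide that the probability of the positive class (the one labelled 1 in the targets) is close to 0, and the model will predict the other class.
On the contrary, if the output of your network is very high, the sigmoid will decide that the probability of the positive class is close to 1, and the model will predict that class.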
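These limits are easy to check numerically. The following is a small sketch in plain Python; the sample raw outputs are made up, and the 0.5 threshold is the usual convention, assumed here:

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Very negative inputs map close to 0, zero maps to 0.5,
# very positive inputs map close to 1.
for z in (-30.0, -2.0, 0.0, 2.0, 30.0):
    print(z, round(sigmoid(z), 4))

# Thresholding the probability at 0.5 gives the same decision
# as looking at the sign of the raw output.
raw_outputs = [-3.1, -0.2, 0.4, 5.0]
by_threshold = [sigmoid(z) > 0.5 for z in raw_outputs]
by_sign = [z > 0 for z in raw_outputs]
print(by_threshold == by_sign)
```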
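Its decision is similar to deciding the class only by looking at the sign of the output. However, training directly on that hard decision would not allow your model to learn! Indeed, the gradient of this binary loss is zero almost everywhere, making it impossible for your model to learn from its errors, as they are not quantified properly.
That's why sigmoid and "binary_crossentropy" are used:
together they are a surrogate for the binary loss, which has nice smooth properties and enables learning.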
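The difference in gradients can be illustrated with a short plain-Python sketch. The function names, the example values and the finite-difference step are all assumptions chosen for illustration. For a misclassified example, the hard binary loss gives a zero gradient, while sigmoid + binary crossentropy gives a gradient of sigmoid(z) - y that points toward the correct answer:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def zero_one_loss(z, y):
    """Hard binary loss: 1 if the sign-based decision is wrong, else 0."""
    predicted = 1 if z > 0 else 0
    return 0.0 if predicted == y else 1.0

def bce_loss(z, y):
    """Binary crossentropy applied to sigmoid(z)."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def numerical_gradient(loss, z, y, eps=1e-5):
    """Central finite difference of the loss with respect to z."""
    return (loss(z + eps, y) - loss(z - eps, y)) / (2 * eps)

z, y = -1.5, 1   # hypothetical raw output for an example whose label is 1

grad_hard = numerical_gradient(zero_one_loss, z, y)
grad_bce = numerical_gradient(bce_loss, z, y)
analytic_bce = sigmoid(z) - y   # closed-form gradient of bce_loss w.r.t. z

print(grad_hard)               # no signal: the loss is flat around z
print(grad_bce, analytic_bce)  # non-zero, pushes z upward
```

A gradient step on the crossentropy loss therefore increases z for this example, whereas the hard loss gives the optimiser no direction at all.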
Also, more information is available on the softmax function and cross entropy.

