Sampling Metrics
Data Sampling
"Not all data is equal"
Expert anecdote by Jennifer Prendki, founder and CEO of Alectio
If you care about your nutrition, you don't go to the supermarket and randomly select items from the shelves. You might eventually get the nutrients you need by eating random items from supermarket shelves, but you will eat a lot of junk food in the process. I think it is weird that in machine learning, people still think it's better to sample the supermarket randomly than figure out what they need and focus their efforts there.
Credits: Human-in-the-Loop Machine Learning by Robert Munro
The assumption is that some data points are more valuable for the model than others. The focus is on how to identify these valuable data points.
Uncertainty Sampling Techniques
Least Confidence Sampling
Marginal Confidence
Ratio Confidence
Entropy Confidence
Least Confidence Sampling
Least confidence sampling is the most common method for uncertainty sampling. It takes the difference between 100% confidence and the confidence of the most confidently predicted label for each item. Least confidence is sensitive to the base used for the softmax algorithm. Scores fall in the range 0-1, where 1 is most uncertain.
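As a minimal sketch of the score described above: the function name `least_confidence` is hypothetical, and the `n / (n - 1)` rescaling is an assumption made here so that the result spans the stated 0-1 range (without it, the largest possible value for `n` labels is `(n - 1) / n`).

```python
def least_confidence(probs):
    """Least-confidence score in [0, 1]; 1 is most uncertain.

    `probs` is assumed to be a probability distribution over labels
    (non-negative values that sum to 1), e.g. the output of softmax.
    """
    n = len(probs)
    if n < 2:
        raise ValueError("need at least two labels")
    most_confident = max(probs)
    # Difference between 100% confidence and the top prediction.
    raw = 1.0 - most_confident
    # Assumed normalization: raw tops out at (n - 1) / n for a uniform
    # distribution, so rescale to make that case score exactly 1.
    return raw * n / (n - 1)
```

A fully confident prediction such as `[1.0, 0.0, 0.0]` scores 0, and a uniform distribution scores 1.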
Marginal Confidence
The most intuitive form of uncertainty sampling is margin of confidence: the difference between the two most confident predictions. The smaller that difference, the more uncertain the model is. Margin of confidence is less sensitive than least confidence sampling to the base used for the softmax algorithm, but it is still sensitive. Scores fall in the range 0-1, where 1 is most uncertain.
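A possible sketch of this score, assuming it is reported as one minus the gap between the top two probabilities so that 1 means most uncertain. The name `margin_confidence` is hypothetical.

```python
def margin_confidence(probs):
    """Margin-of-confidence score in [0, 1]; 1 is most uncertain.

    `probs` is assumed to be a probability distribution over labels.
    """
    if len(probs) < 2:
        raise ValueError("need at least two labels")
    # Two most confident predictions, largest first.
    first, second = sorted(probs, reverse=True)[:2]
    # A small gap between the top two means high uncertainty, so
    # subtract the gap from 1 to make 1 the most uncertain value.
    return 1.0 - (first - second)
```

When the top two labels are tied, as in `[0.5, 0.5, 0.0]`, the score is 1. When one label takes all the probability mass, the score is 0.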
Ratio Confidence
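Ratio of confidence is a variation on margin of confidence, looking at the ratio between the top two scores instead of the difference. The ranking it produces is invariant across any base used in softmax, because the ratio of the top two softmax outputs depends only on the gap between their raw scores (the ratio value itself still changes with the base). Scores fall in the range 0-1, where 1 is most uncertain.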
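The following sketch assumes the ratio is taken as the second-highest probability divided by the highest, which keeps it in the 0-1 range with 1 as most uncertain. Both `ratio_confidence` and the `softmax` helper with a configurable `base` are hypothetical names, added here only to try out different bases on made-up raw scores.

```python
import math


def softmax(scores, base=math.e):
    """Softmax over raw scores with a configurable base (illustrative)."""
    top = max(scores)
    # Subtracting the max keeps the exponents small; it does not
    # change the resulting probabilities.
    exps = [base ** (s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ratio_confidence(probs):
    """Ratio-of-confidence score in [0, 1]; 1 is most uncertain."""
    if len(probs) < 2:
        raise ValueError("need at least two labels")
    first, second = sorted(probs, reverse=True)[:2]
    # Second-best over best: close to 1 when the top two are nearly tied.
    return second / first


# Made-up raw scores for two items: item_b has the smaller top-two gap.
item_a = [1.0, 2.0, 0.5]
item_b = [3.0, 3.2, 0.1]
for base in (2, math.e, 10):
    ratio_a = ratio_confidence(softmax(item_a, base))
    ratio_b = ratio_confidence(softmax(item_b, base))
    print(base, round(ratio_a, 3), round(ratio_b, 3))
```

The printed values differ from base to base, but `item_b` scores as more uncertain than `item_a` under every base tried, so the order in which items would be sampled stays the same.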
Entropy Confidence
Entropy measures the information (surprise) in the model's predicted distribution. Entropy is high when the probabilities of all labels are almost equally likely. So in our case, a high entropy close to 1 means the model is most confused.
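One way to sketch this score is Shannon entropy divided by its maximum possible value for the number of labels. That division is an assumption made here so that the result lands in the 0-1 range described above. The name `entropy_score` is hypothetical.

```python
import math


def entropy_score(probs):
    """Normalized entropy in [0, 1]; 1 is most uncertain.

    `probs` is assumed to be a probability distribution over labels.
    """
    n = len(probs)
    if n < 2:
        raise ValueError("need at least two labels")
    # Shannon entropy in bits; zero-probability labels contribute nothing.
    raw = -sum(p * math.log2(p) for p in probs if p > 0)
    # Assumed normalization: the maximum entropy for n labels is log2(n),
    # reached when all labels are equally likely.
    return raw / math.log2(n)
```

A uniform distribution such as `[0.25, 0.25, 0.25, 0.25]` scores 1, while a distribution with all mass on one label scores 0.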