oasis.stratification.stratify_by_scores
- oasis.stratification.stratify_by_scores(scores, goal_n_strata='auto', method='cum_sqrt_F', n_bins='auto')
Stratify by binning the items based on their scores
- Parameters
- scores : array-like, shape=(n_items,)
ordered array of scores which quantify the classifier confidence for the items in the pool. High scores indicate a high confidence that the true label is a “1” (and vice versa for label “0”).
- goal_n_strata : int or ‘auto’, optional, default ‘auto’
desired number of strata. If set to ‘auto’, the number is selected using the Freedman-Diaconis rule. Note that for the ‘cum_sqrt_F’ method this number is a goal – the actual number of strata created may be less than the goal.
- method : {‘cum_sqrt_F’ or ‘equal_size’}, optional, default ‘cum_sqrt_F’
stratification method to use. ‘equal_size’ aims to create s
- Returns
- Strata instance
- Other Parameters
- n_bins : int or ‘auto’, optional, default ‘auto’
specify the number of bins to use when estimating the distribution of the score function. This is used when goal_n_strata = 'auto' and/or when method = 'cum_sqrt_F'. If set to ‘auto’, the number is selected using the Freedman-Diaconis rule.
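To make the parameters above more concrete, here is a minimal NumPy-only sketch of how the Freedman-Diaconis rule and a cumulative square-root frequency stratification can fit together. It is an illustration under stated assumptions, not the oasis implementation: the helper names fd_n_bins and cum_sqrt_f_labels are hypothetical, the sketch returns plain integer labels rather than a Strata instance, and the library may choose bins and boundaries differently.

```python
import numpy as np


def fd_n_bins(scores):
    """Bin count from the Freedman-Diaconis rule: width = 2 * IQR / n**(1/3)."""
    scores = np.asarray(scores, dtype=float)
    q25, q75 = np.percentile(scores, [25, 75])
    width = 2.0 * (q75 - q25) / len(scores) ** (1.0 / 3.0)
    span = scores.max() - scores.min()
    if width == 0 or span == 0:
        return 1  # degenerate pool: fall back to a single bin
    return int(np.ceil(span / width))


def cum_sqrt_f_labels(scores, goal_n_strata, n_bins=None):
    """Return one stratum label per item (hypothetical helper, not the oasis API)."""
    scores = np.asarray(scores, dtype=float)
    if n_bins is None:
        n_bins = fd_n_bins(scores)

    # Histogram the scores on equal-width bins.
    edges = np.linspace(scores.min(), scores.max(), n_bins + 1)
    bin_idx = np.searchsorted(edges[1:-1], scores, side="right")  # 0 .. n_bins - 1
    counts = np.bincount(bin_idx, minlength=n_bins)

    # Cumulate sqrt(bin frequency) and cut that scale into equal steps.
    csf = np.cumsum(np.sqrt(counts))
    step = csf[-1] / goal_n_strata
    bin_stratum = np.clip(np.ceil(csf / step).astype(int) - 1, 0, goal_n_strata - 1)

    # Relabel consecutively; strata that received no items disappear,
    # so the result can have fewer strata than the goal.
    _, labels = np.unique(bin_stratum[bin_idx], return_inverse=True)
    return labels


# Example with synthetic, skewed scores (assumed data, for illustration only).
rng = np.random.default_rng(0)
scores = rng.beta(0.5, 2.0, size=1000)
labels = cum_sqrt_f_labels(scores, goal_n_strata=5)
print("bins:", fd_n_bins(scores), "items per stratum:", np.bincount(labels))
```

Because whole bins are merged into strata, one very populous bin can span several steps of the cumulative scale. That is one way the number of strata can end up below goal_n_strata, which matches the note on that parameter.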