sklearn.utils.class_weight.compute_sample_weight

sklearn.utils.class_weight.compute_sample_weight(class_weight, y, *, indices=None)

Estimate sample weights by class for unbalanced datasets.

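Parameter	Description
class_weight	dict, list of dicts, "balanced", or None, optional Weights associated with classes in the form {class_label: weight}. If not given, all classes are assumed to have weight one. For multi-output problems, a list of dicts can be provided in the same order as the columns of y. Note that for multi-output (including multilabel), weights should be defined for each class of every column in its own dict. For example, for four-class multilabel classification the weights should be [{0: 1, 1: 1}, {0: 1, 1: 5}, {0: 1, 1: 1}, {0: 1, 1: 1}] instead of [{1: 1}, {2: 5}, {3: 1}, {4: 1}]. The "balanced" mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data: n_samples / (n_classes * np.bincount(y)). For multi-output, the weights of each column of y are multiplied together.
y	array-like of shape (n_samples,) or (n_samples, n_outputs) Array of original class labels per sample.
indices	array-like, shape (n_subsample,), or None Array of indices to be used in a subsample. Its length can be less than n_samples in the case of a subsample, or equal to n_samples in the case of a bootstrap subsample with repeated indices. If None, the sample weights are calculated over the full sample. If indices is provided, "balanced" is the only supported value of class_weight.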
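
The "balanced" formula and the indices parameter described above can be illustrated with a small sketch. The toy label array and the variable names (y, manual_class_weights, sub_idx) are assumptions made for this illustration, not part of the scikit-learn documentation.

```python
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# Toy, imbalanced labels (illustrative assumption): three samples of class 0, one of class 1.
y = np.array([0, 0, 0, 1])

# "balanced" mode: each sample receives the weight of its class.
balanced = compute_sample_weight("balanced", y)

# The same weights computed by hand from the documented formula
# n_samples / (n_classes * np.bincount(y)), then looked up per sample.
manual_class_weights = len(y) / (len(np.unique(y)) * np.bincount(y))
manual = manual_class_weights[y]

# An explicit dict assigns a fixed weight to each class label.
explicit = compute_sample_weight({0: 1, 1: 5}, y)

# With indices, class frequencies are estimated on the subsample only,
# but the returned array still has one entry per original sample.
sub_idx = np.array([0, 1, 3])
subsampled = compute_sample_weight("balanced", y, indices=sub_idx)

print(balanced)
print(explicit)
print(subsampled)
```

In this sketch the minority class receives the larger weight, and passing anything other than "balanced" together with indices would be rejected, as noted in the indices row.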

Returns	Description
sample_weight_vect	ndarray, shape (n_samples,) Array with sample weights as applied to the original y.
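
For multi-output targets, the following minimal sketch shows a list of dicts (one per column of y) and the shape of the returned array. The two-column label matrix and the weight values are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# Hypothetical two-output target: each column is a separate binary label.
y_multi = np.array([
    [0, 1],
    [0, 0],
    [1, 1],
])

# One dict per column, each defining a weight for every class in that column.
per_column_weights = [{0: 1, 1: 2}, {0: 1, 1: 5}]

sample_weight = compute_sample_weight(per_column_weights, y_multi)

# Per-sample weights are the product across columns:
# column 0 gives [1, 1, 2], column 1 gives [5, 1, 5].
by_hand = np.array([1, 1, 2]) * np.array([5, 1, 5])

print(sample_weight.shape)
print(sample_weight)
```

The result is a one-dimensional array with one weight per row of y_multi, regardless of the number of output columns.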