mlreco.utils.metrics module

Various metrics for evaluating clustering.

mlreco.utils.metrics.unique_with_batch(label, bid)

Merge 1D arrays of labels and batch ids into an array of new labels, one for each unique (label, bid) pair.

Parameters
  • label (array_like) – input labels

  • bid (array_like) – input batch ids

Returns

labels2 – new unique labels

Return type

ndarray
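
The merging idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `unique_pairs_to_labels` is a hypothetical name, it returns a list rather than an ndarray, and the order in which new labels are assigned (first appearance) may differ from what the library does.

```python
def unique_pairs_to_labels(label, bid):
    """Map each distinct (label, bid) pair to its own integer label.

    Hypothetical helper for illustration; not the mlreco implementation.
    """
    mapping = {}
    new_labels = []
    for pair in zip(label, bid):
        # Assign the next unused integer the first time a pair is seen.
        if pair not in mapping:
            mapping[pair] = len(mapping)
        new_labels.append(mapping[pair])
    return new_labels


# The same label value in two different batches gets two different new labels.
label = [0, 1, 0, 1]
bid = [0, 0, 1, 1]
new_labels = unique_pairs_to_labels(label, bid)
```

The point of the merge is that a cluster id reused across batch entries no longer collides once the batch id is folded into the label.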

mlreco.utils.metrics.unique_label(label)

Transform a label array into a new label array whose labels are between 0 and nlabels.
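
A minimal sketch of such a relabeling, assuming the new labels are the consecutive integers 0, 1, ..., k-1 assigned in sorted order of the original values. Both the exact range and the ordering are assumptions, and `relabel_consecutive` is a hypothetical name.

```python
def relabel_consecutive(label):
    """Relabel so that the k distinct input labels become 0, 1, ..., k-1.

    Hypothetical helper for illustration; not the mlreco implementation.
    """
    # The sorted distinct values determine the new integer for each label.
    index_of = {value: i for i, value in enumerate(sorted(set(label)))}
    return [index_of[value] for value in label]


relabelled = relabel_consecutive([10, 42, 10, 7])
```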

mlreco.utils.metrics.ARI(pred, truth, bid=None)

Compute the Adjusted Rand Index (ARI) score for two clusterings
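
For reference, the standard pair-counting form of the ARI can be sketched in plain Python. This is an independent illustration, not the library's code: `adjusted_rand_index` is a hypothetical name, the `bid` argument is not handled, and returning 1.0 in the degenerate cases is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(pred, truth):
    """Adjusted Rand Index of two labelings of the same points (sketch)."""
    n = len(pred)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two points counts as perfect agreement

    # Pairs of points that share a cluster in both, in pred, and in truth.
    sum_both = sum(comb(c, 2) for c in Counter(zip(pred, truth)).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred).values())
    sum_truth = sum(comb(c, 2) for c in Counter(truth).values())

    expected = sum_pred * sum_truth / total_pairs
    maximum = 0.5 * (sum_pred + sum_truth)
    if maximum == expected:
        return 1.0  # assumption: degenerate case, e.g. one cluster in both labelings
    return (sum_both - expected) / (maximum - expected)


# Identical partitions up to a renaming of the cluster ids.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Two partitions that split the points in orthogonal ways.
ari_cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The score does not depend on the label values themselves, only on which points are grouped together.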

mlreco.utils.metrics.AMI(pred, truth, bid=None)

Compute the Adjusted Mutual Information (AMI) score for two clusterings
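
The AMI corrects mutual information for chance agreement. The sketch below follows the textbook definition with the expected mutual information taken under a hypergeometric model and with arithmetic-mean normalization. The choice of normalization, the name `adjusted_mutual_information`, and the degenerate-case return value are assumptions; the `bid` argument is not handled, and the loops are only practical for small inputs.

```python
from collections import Counter
from math import comb, log


def entropy(counts, n):
    """Shannon entropy (natural log) of a partition given its cluster sizes."""
    return -sum(c / n * log(c / n) for c in counts)


def adjusted_mutual_information(pred, truth):
    """Adjusted Mutual Information of two labelings (sketch)."""
    n = len(pred)
    pred_counts = Counter(pred)
    truth_counts = Counter(truth)
    pair_counts = Counter(zip(pred, truth))

    # Mutual information of the two labelings.
    mi = sum(
        nij / n * log(n * nij / (pred_counts[p] * truth_counts[t]))
        for (p, t), nij in pair_counts.items()
    )

    # Expected mutual information for random labelings with the same cluster sizes.
    emi = 0.0
    for a in pred_counts.values():
        for b in truth_counts.values():
            for nij in range(max(1, a + b - n), min(a, b) + 1):
                # Hypergeometric probability of an overlap of size nij.
                prob = comb(b, nij) * comb(n - b, a - nij) / comb(n, a)
                emi += prob * nij / n * log(n * nij / (a * b))

    mean_entropy = 0.5 * (
        entropy(pred_counts.values(), n) + entropy(truth_counts.values(), n)
    )
    denominator = mean_entropy - emi
    if abs(denominator) < 1e-15:
        return 1.0  # assumption: degenerate case, e.g. one cluster in both labelings
    return (mi - emi) / denominator


ami_same = adjusted_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
ami_cross = adjusted_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```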

mlreco.utils.metrics.BD(data_sum, clusters_sum, clusters_sum_counts, data_fixed, clusters_fixed, clusters_fixed_counts)

Helper function for SBD function.

mlreco.utils.metrics.SBD(pred, truth, bid=None)

Compute the Symmetric Best Dice (SBD) Score for Instance Segmentation.
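
As usually defined for instance segmentation, the Best Dice of A against B averages, over the clusters of A, the highest Dice coefficient achieved with any cluster of B, and the SBD is the smaller of the two directions. The sketch below follows that definition. The function names are hypothetical, inputs are assumed non-empty, `bid` is not handled, and the library's `BD` helper takes different arguments.

```python
from collections import Counter


def best_dice(a, b):
    """Mean over clusters of `a` of the best Dice score against clusters of `b`."""
    overlap = Counter(zip(a, b))
    size_a = Counter(a)
    size_b = Counter(b)
    best = {cluster: 0.0 for cluster in size_a}
    for (ca, cb), n in overlap.items():
        # Dice coefficient: 2 |A ∩ B| / (|A| + |B|)
        dice = 2.0 * n / (size_a[ca] + size_b[cb])
        best[ca] = max(best[ca], dice)
    return sum(best.values()) / len(best)


def symmetric_best_dice(pred, truth):
    """Minimum of the Best Dice computed in both directions."""
    return min(best_dice(pred, truth), best_dice(truth, pred))


sbd_perfect = symmetric_best_dice([0, 0, 1, 1], [5, 5, 9, 9])
# One predicted cluster covering two true clusters of equal size.
sbd_merged = symmetric_best_dice([0, 0, 0, 0], [0, 0, 1, 1])
```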

mlreco.utils.metrics.contingency_table(a, b, na=None, nb=None)

Build the contingency table for a and b. Assumes that a and b have labels between 0 and na and between 0 and nb, respectively.
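
A contingency table counts how many points carry each pair of labels. The sketch below assumes integer labels in 0..na-1 and 0..nb-1 and infers `na` and `nb` from the data when they are not given. Both the inclusive/exclusive reading of the bounds and the defaulting behaviour are assumptions, and `build_contingency_table` is a hypothetical name that returns nested lists rather than an array.

```python
def build_contingency_table(a, b, na=None, nb=None):
    """Return table[i][j] = number of points with a == i and b == j (sketch)."""
    # Assumption: when sizes are omitted, use the largest label seen plus one.
    if na is None:
        na = max(a) + 1
    if nb is None:
        nb = max(b) + 1
    table = [[0] * nb for _ in range(na)]
    for i, j in zip(a, b):
        table[i][j] += 1
    return table


table = build_contingency_table([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
```

Row sums give the cluster sizes of `a` and column sums those of `b`, which is why this table is a convenient starting point for the other metrics on this page.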

mlreco.utils.metrics.purity(pred, truth, bid=None)

Cluster purity: intersection(pred, truth)/pred. A number in [0,1]; 1 indicates that everything in the cluster is in the same ground-truth cluster.
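
One way to read this definition is per predicted cluster: the largest overlap with any ground-truth cluster, divided by the size of the predicted cluster. The sketch below computes that value for every predicted cluster. How the library aggregates the per-cluster values and how it uses `bid` are not stated here, so this is an assumption-laden illustration with a hypothetical function name.

```python
from collections import Counter


def per_cluster_purity(pred, truth):
    """Purity of each predicted cluster: best overlap / predicted size (sketch)."""
    overlap = Counter(zip(pred, truth))
    pred_size = Counter(pred)
    best = Counter()
    for (p, t), n in overlap.items():
        # Keep the largest overlap of predicted cluster p with any true cluster.
        best[p] = max(best[p], n)
    return {p: best[p] / pred_size[p] for p in pred_size}


purities = per_cluster_purity([0, 0, 0, 1, 1], [0, 0, 1, 1, 1])
```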

mlreco.utils.metrics.global_purity(pred, truth, bid=None)

Cluster purity as defined in https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html: intersection(pred, truth)/pred. A number in [0,1]; 1 indicates that everything in the cluster is in the same ground-truth cluster.
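
In the reference linked above, purity is a single number: each predicted cluster is assigned to its most frequent ground-truth class, and the correctly assigned points are counted and divided by the total number of points. A minimal sketch of that formula follows, with a hypothetical function name, non-empty input assumed, and no handling of `bid`.

```python
from collections import Counter


def global_purity_sketch(pred, truth):
    """Sum over predicted clusters of the best overlap, divided by N."""
    overlap = Counter(zip(pred, truth))
    best = Counter()
    for (p, t), n in overlap.items():
        best[p] = max(best[p], n)
    return sum(best.values()) / len(pred)


global_purity_value = global_purity_sketch([0, 0, 0, 1, 1], [0, 0, 1, 1, 1])
```

Because the sum is divided by the total number of points, large clusters weigh more here than in an unweighted average of per-cluster purities.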

mlreco.utils.metrics.efficiency(pred, truth, bid=None)

Cluster efficiency: intersection(pred, truth)/truth. A number in [0,1]; 1 indicates that everything is found in the cluster.
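
Efficiency mirrors purity with the roles of the two labelings swapped: for each ground-truth cluster, take the largest overlap with any predicted cluster and divide by the size of the ground-truth cluster. The sketch below returns one value per ground-truth cluster. The aggregation used by the library and its treatment of `bid` are not shown here, and the function name is hypothetical.

```python
from collections import Counter


def per_cluster_efficiency(pred, truth):
    """Efficiency of each true cluster: best overlap / true size (sketch)."""
    overlap = Counter(zip(truth, pred))
    truth_size = Counter(truth)
    best = Counter()
    for (t, p), n in overlap.items():
        # Keep the largest overlap of true cluster t with any predicted cluster.
        best[t] = max(best[t], n)
    return {t: best[t] / truth_size[t] for t in truth_size}


efficiencies = per_cluster_efficiency([0, 0, 0, 1, 1], [0, 0, 1, 1, 1])
```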

mlreco.utils.metrics.global_efficiency(pred, truth, bid=None)

Cluster efficiency as defined in https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html: intersection(pred, truth)/truth. A number in [0,1]; 1 indicates that everything is found in the cluster.

mlreco.utils.metrics.purity_efficiency(pred, truth, bid=None, mean=True)

Combines the purity and efficiency calculations in one go.
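
Both quantities can be derived from a single pass over the (pred, truth) overlaps, which is presumably the motivation for a combined function. The sketch below assumes that `mean=True` returns the unweighted averages over clusters and that `mean=False` returns the per-cluster values. That reading of `mean` is an assumption, the function name is hypothetical, and `bid` is not handled.

```python
from collections import Counter


def purity_and_efficiency(pred, truth, mean=True):
    """Compute purity and efficiency from one overlap count (sketch)."""
    overlap = Counter(zip(pred, truth))
    pred_size = Counter(pred)
    truth_size = Counter(truth)
    best_for_pred = Counter()
    best_for_truth = Counter()
    for (p, t), n in overlap.items():
        # Each overlap updates the best match in both directions.
        best_for_pred[p] = max(best_for_pred[p], n)
        best_for_truth[t] = max(best_for_truth[t], n)

    purities = [best_for_pred[p] / pred_size[p] for p in sorted(pred_size)]
    efficiencies = [best_for_truth[t] / truth_size[t] for t in sorted(truth_size)]
    if mean:
        # Assumption: unweighted average over clusters.
        return sum(purities) / len(purities), sum(efficiencies) / len(efficiencies)
    return purities, efficiencies


mean_purity, mean_efficiency = purity_and_efficiency(
    [0, 0, 0, 1, 1], [0, 0, 1, 1, 1]
)
```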