mlreco.iotools.datasets module
-
class mlreco.iotools.datasets.LArCVDataset(data_schema, data_keys, limit_num_files=0, limit_num_samples=0, event_list=None, skip_event_list=None)
Bases: Generic[torch.utils.data.dataset.T_co]
A generic interface for LArCV data files.
This Dataset is designed to produce a batch made of an arbitrary number of data chunks (e.g. input data matrix, segmentation label, point proposal target, clustering labels). Each data chunk is processed by a parser function defined in the iotools.parsers module. A LArCVDataset object can be configured with an arbitrary number of parser functions, and each function can take an arbitrary number of LArCV event data objects. The assumption is that each data chunk respects the LArCV event boundary.
-
__init__(data_schema, data_keys, limit_num_files=0, limit_num_samples=0, event_list=None, skip_event_list=None)
Instantiates the LArCVDataset.
- Parameters
data_schema (dict) –
A dictionary of (string, dictionary) pairs. The key is a unique name of a data chunk in a batch and the associated dictionary must include:
parser: name of the parser
args: (key, value) pairs that correspond to parser argument names and their values
The nested dictionaries can be replaced by lists, in which case the list items are treated as parser argument values, in order.
data_keys (list) – a list of strings that must be present in the file paths
limit_num_files (int) – an integer limiting the number of files to be taken per data directory
limit_num_samples (int) – an integer limiting the number of samples to be taken per data
event_list (list) – a list of integers specifying which events (ttree index) to process
skip_event_list (list) – a list of integers specifying which events (ttree index) to skip
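The parameters above can be sketched as plain Python data. This is a minimal illustration, not the library's own configuration: the parser name, parser argument names, and path fragment are hypothetical placeholders, and the helper `incomplete_chunks` is not part of mlreco. It only checks the documented requirement that each dictionary-style entry carries `parser` and `args`.

```python
# Hypothetical parser and argument names, for illustration only; substitute
# parser functions that actually exist in iotools.parsers.
data_schema = {
    "input_data": {
        "parser": "example_sparse_parser",
        "args": {"sparse_event": "sparse3d_example"},
    },
    "segment_label": {
        "parser": "example_sparse_parser",
        "args": {"sparse_event": "sparse3d_example_label"},
    },
}


def incomplete_chunks(schema):
    """Return names of dictionary-style entries missing 'parser' or 'args'."""
    required = {"parser", "args"}
    return [
        name
        for name, spec in schema.items()
        if isinstance(spec, dict) and not required.issubset(spec)
    ]


# Keyword arguments mirroring the documented __init__ signature.
dataset_kwargs = dict(
    data_schema=data_schema,
    data_keys=["example_dir/example_file"],  # hypothetical path fragment
    limit_num_files=0,
    limit_num_samples=0,
    event_list=None,
    skip_event_list=None,
)
```

Assuming mlreco is installed, these keyword arguments would be passed as `LArCVDataset(**dataset_kwargs)`; running the check first catches an entry that lacks `parser` or `args` before any file is opened.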
-
__module__ = 'mlreco.iotools.datasets'
-
__parameters__ = ()
-