io
- deimos.io._load_hdf(path, level='ms1')[source]
Deprecated version. Loads data frame from HDF5 container.
- Parameters:
path (str) – Path to input HDF5 file.
level (str) – Access this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively.
- Returns:
Feature coordinates and intensities for the specified level.
- Return type:
DataFrame
- deimos.io._save_hdf(path, data, dtype={}, compression_level=5)[source]
Deprecated version. Saves dictionary of DataFrame to HDF5 container.
- Parameters:
path (str) – Path to output file.
data (dict of DataFrame) – Dictionary of feature coordinates and intensities to be saved. Dictionary keys are saved as “groups” (e.g., MS level) and data frame columns are saved as “datasets” in the HDF5 container.
dtype (dict) – Specifies the data type used to save each column, provided as column:dtype pairs. Defaults to 32-bit float if unspecified.
compression_level (int) – A value from 0-9 signaling the number of compression operations to apply. Higher values result in greater compression at the expense of computational overhead.
- deimos.io.build_factors(data, dims='detect')[source]
Determine sorted unique elements (factors) for each dimension in data.
- Parameters:
data (DataFrame) – Feature coordinates and intensities.
dims (str or list) – Dimensions to determine factors for. Attempts to autodetect by default.
- Returns:
Unique sorted values per dimension.
- Return type:
dict of array
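As an illustration of what "sorted unique elements per dimension" means, here is a minimal pure-Python sketch. It is not the DEIMoS implementation: the helper name `build_factors_sketch` is hypothetical, a dict of equal-length lists stands in for the DataFrame, and the column names `mz` and `drift_time` are assumed for the example.

```python
def build_factors_sketch(data, dims):
    """Return the sorted unique values found in each requested dimension.

    Hypothetical stand-in for illustration only; `data` is a dict of
    equal-length lists rather than a DataFrame.
    """
    return {dim: sorted(set(data[dim])) for dim in dims}


# Toy table: two coordinate columns and one intensity column (assumed names).
data = {
    "mz": [100.5, 200.25, 100.5, 300.0],
    "drift_time": [1.0, 2.0, 2.0, 1.0],
    "intensity": [10.0, 20.0, 30.0, 40.0],
}

factors = build_factors_sketch(data, dims=["mz", "drift_time"])
print(factors)
# {'mz': [100.5, 200.25, 300.0], 'drift_time': [1.0, 2.0]}
```

The intensity column is deliberately excluded: factors describe the coordinate axes, not the measured values.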
- deimos.io.build_index(data, factors)[source]
Construct data index from precomputed factors.
- Parameters:
data (DataFrame) – Feature coordinates and intensities.
factors (dict) – Per-dimension arrays of unique values.
- Returns:
Index per dimension.
- Return type:
dict of array
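One plausible reading of "index per dimension" is the position of each row's value within that dimension's sorted factors. The sketch below shows that idea in pure Python using a binary search. It is an assumption about the behavior, not the library's code: `build_index_sketch` is a hypothetical name, and a dict of lists stands in for the DataFrame.

```python
from bisect import bisect_left


def build_index_sketch(data, factors):
    """Map each value to its position in the sorted factors of its dimension.

    Hypothetical stand-in for illustration only. Assumes every value in
    `data[dim]` is present in `factors[dim]`, which must be sorted.
    """
    return {
        dim: [bisect_left(values, x) for x in data[dim]]
        for dim, values in factors.items()
    }


# Toy inputs (assumed column names); factors are sorted unique values.
data = {"mz": [100.5, 200.25, 100.5, 300.0], "drift_time": [1.0, 2.0, 2.0, 1.0]}
factors = {"mz": [100.5, 200.25, 300.0], "drift_time": [1.0, 2.0]}

index = build_index_sketch(data, factors)
print(index)
# {'mz': [0, 1, 0, 2], 'drift_time': [0, 1, 1, 0]}
```

With such an index, `factors[dim][i]` recovers the original coordinate for each row, so the pair acts as a compact integer encoding of the data.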
- deimos.io.get_accessions(path)[source]
Determines accession fields available in the mzML file.
- Parameters:
path (str) – Path to mzML file.
- Returns:
Dictionary of accession fields.
- Return type:
dict of str
- deimos.io.load(path, key='ms1', columns=None, chunksize=10000000.0, meta=None, accession={}, dtype=<class 'numpy.float32'>)[source]
Loads data from HDF5 or mzML file.
- Parameters:
path (str or list of str) – Path to input file (or files if HDF5).
key (str) – Access this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively. HDF5 format only.
columns (list) – List of column names to return. HDF5 format only.
chunksize (int) – Dask partition chunksize. HDF5 format only. Unused when loading a single file.
meta (dict) – Dictionary of metadata per path. HDF5 format only. Unused when loading a single file.
accession (dict) – Key-value pairs signaling which features to parse for in the mzML file. mzML format only. See get_accessions() to obtain available values.
dtype (data type) – Data type to encode values. mzML format only.
- Returns:
Feature coordinates and intensities for the specified level. Pandas is used when loading a single file, Dask for multiple files. Loading an mzML file returns a dictionary with keys per MS level.
- Return type:
DataFrame or dict of DataFrame
- deimos.io.load_hdf(path, key='ms1', columns=None, chunksize=10000000.0, meta=None)[source]
Loads data frame from HDF5 container(s).
- Parameters:
path (str or list of str) – Path to input HDF5 file or files.
key (str) – Access this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively.
columns (list) – List of column names to return.
chunksize (int) – Dask partition chunksize. Unused when loading a single file.
meta (dict) – Dictionary of metadata per path. Unused when loading a single file.
- Returns:
Feature coordinates and intensities for the specified level. Pandas is used when loading a single file, Dask for multiple files.
- Return type:
DataFrame
- deimos.io.load_hdf_multi(paths, key='ms1', columns=None, chunksize=10000000.0, meta=None)[source]
Loads data frame from HDF5 containers using Dask. Appends column to indicate source filenames.
- Parameters:
paths (list of str) – Paths to input HDF5 files.
key (str) – Access this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively.
columns (list) – List of column names to return.
chunksize (int) – Dask partition chunksize.
meta (dict) – Dictionary of metadata per path.
- Returns:
Feature coordinates and intensities for the specified level.
- Return type:
DataFrame
- deimos.io.load_hdf_single(path, key='ms1', columns=None)[source]
Loads data frame from HDF5 container.
- Parameters:
path (str) – Path to input HDF5 file.
key (str) – Access this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively.
columns (list) – List of column names to return.
- Returns:
Feature coordinates and intensities for the specified level.
- Return type:
DataFrame
- deimos.io.load_mzml(path, accession={}, dtype=<class 'numpy.float32'>)[source]
Loads in an mzML file, parsing for accession values, to yield a DataFrame.
- Parameters:
path (str) – Path to input mzML file.
accession (dict) – Key-value pairs signaling which features to parse for in the mzML file. See get_accessions() to obtain available values. Scan, frame, m/z, and intensity are parsed by default.
dtype (data type) – Data type to encode values.
- Returns:
Dictionary containing parsed feature coordinates and intensities, indexed by keys per MS level.
- Return type:
dict of DataFrame
- deimos.io.save(path, data, key='ms1', **kwargs)[source]
Saves DataFrame to HDF5 or MGF container.
- Parameters:
path (str) – Path to output file.
data (DataFrame) – Feature coordinates and intensities to be saved. Precursor m/z and intensities should be paired to MS2 spectra for MGF format.
key (str) – Save to this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively. HDF5 format only.
kwargs – Keyword arguments exposed by to_hdf() or save_mgf().
- deimos.io.save_hdf(path, data, key='ms1', complevel=5, **kwargs)[source]
Saves DataFrame to HDF5 container.
- Parameters:
path (str) – Path to output file.
data (DataFrame) – Feature coordinates and intensities to be saved.
key (str) – Save to this level (group) of the HDF5 container. E.g., “ms1” or “ms2” for MS levels 1 or 2, respectively.
kwargs – Keyword arguments exposed by to_hdf().
- deimos.io.save_mgf(path, features, groupby='index_ms1', precursor_mz='mz_ms1', fragment_mz='mz_ms2', fragment_intensity='intensity_ms2', precursor_metadata=None, sample_metadata=None)[source]
Saves data to MGF format.
- Parameters:
path (str) – Path to output file.
features (DataFrame) – Precursor m/z and intensities paired to MS2 spectra.
groupby (str or list of str) – Column(s) to group fragments by.
precursor_mz (str) – Column containing precursor m/z values.
fragment_mz (str) – Column containing fragment m/z values.
fragment_intensity (str) – Column containing fragment intensity values.
precursor_metadata (dict) – Precursor metadata key:value pairs of {MGF entry name}:{column name}.
sample_metadata (dict) – Sample metadata key:value pairs of {MGF entry name}:{value}.
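To show how these parameters could map onto MGF text, here is a minimal pure-Python sketch. It is not the DEIMoS implementation and its output may differ from what `save_mgf()` writes: the helper `to_mgf_text` is hypothetical, rows are plain dicts instead of a DataFrame, only a single `groupby` column is supported, and the `charge` column and metadata values are invented for the example. The only MGF conventions relied on are the `BEGIN IONS` / `END IONS` block delimiters, `KEY=value` header lines such as `PEPMASS`, and one "m/z intensity" pair per peak line.

```python
from collections import defaultdict


def to_mgf_text(rows, groupby="index_ms1", precursor_mz="mz_ms1",
                fragment_mz="mz_ms2", fragment_intensity="intensity_ms2",
                precursor_metadata=None, sample_metadata=None):
    """Render rows (list of dicts) as MGF text, one ion block per group.

    Hypothetical helper for illustration only.
    """
    precursor_metadata = precursor_metadata or {}
    sample_metadata = sample_metadata or {}

    # Collect fragment rows that belong to the same precursor.
    groups = defaultdict(list)
    for row in rows:
        groups[row[groupby]].append(row)

    lines = []
    for _, members in sorted(groups.items()):
        first = members[0]  # precursor fields are shared within a group
        lines.append("BEGIN IONS")
        lines.append(f"PEPMASS={first[precursor_mz]}")
        # {MGF entry name}: {column name} pairs, read from the data.
        for entry, column in precursor_metadata.items():
            lines.append(f"{entry}={first[column]}")
        # {MGF entry name}: {value} pairs, identical for every block.
        for entry, value in sample_metadata.items():
            lines.append(f"{entry}={value}")
        for row in members:
            lines.append(f"{row[fragment_mz]} {row[fragment_intensity]}")
        lines.append("END IONS")
        lines.append("")
    return "\n".join(lines)


rows = [
    {"index_ms1": 0, "mz_ms1": 301.1, "charge": 1, "mz_ms2": 120.0, "intensity_ms2": 50.0},
    {"index_ms1": 0, "mz_ms1": 301.1, "charge": 1, "mz_ms2": 150.5, "intensity_ms2": 75.0},
    {"index_ms1": 1, "mz_ms1": 455.2, "charge": 2, "mz_ms2": 210.0, "intensity_ms2": 20.0},
]

mgf_text = to_mgf_text(
    rows,
    precursor_metadata={"CHARGE": "charge"},
    sample_metadata={"TITLE": "example sample"},
)
print(mgf_text)
```

The distinction between the two metadata arguments is the point of the sketch: `precursor_metadata` names columns whose values vary per precursor, while `sample_metadata` supplies constants repeated in every block.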
- deimos.io.save_msp(path, features, groupby='index_ms1', precursor_mz='mz_ms1', fragment_mz='mz_ms2', fragment_intensity='intensity_ms2', precursor_metadata=None, sample_metadata=None)[source]
Saves data to MSP format.
- Parameters:
path (str) – Path to output file.
features (DataFrame) – Precursor m/z and intensities paired to MS2 spectra.
groupby (str or list of str) – Column(s) to group fragments by.
precursor_mz (str) – Column containing precursor m/z values.
fragment_mz (str) – Column containing fragment m/z values.
fragment_intensity (str) – Column containing fragment intensity values.
precursor_metadata (dict) – Precursor metadata key:value pairs of {MSP entry name}:{column name}.
sample_metadata (dict) – Sample metadata key:value pairs of {MSP entry name}:{value}.