Untargeted high-resolution paired mass distance data mining for retrieving general chemical relationships
ABSTRACT

Untargeted metabolomics analysis captures chemical reactions among small molecules. Common mass spectrometry-based metabolomics workflows first identify the small molecules
significantly associated with the outcome of interest, then begin exploring their biochemical relationships to understand biological fate or impact. We suggest an alternative by which
general chemical relationships including abiotic reactions can be directly retrieved through untargeted high-resolution paired mass distance (PMD) analysis without a priori knowledge of the
identities of participating compounds. PMDs calculated from the mass spectrometry data are linked to chemical reactions obtained via data mining of small molecule and reaction databases,
i.e. ‘PMD-based reactomics’. We demonstrate applications of PMD-based reactomics including PMD network analysis, source appointment of unknown compounds, and biomarker reaction discovery as
complements to compound discovery analyses used in traditional untargeted workflows. An R implementation of reactomics analysis and the reaction/PMD databases is available as the pmd
package.

INTRODUCTION

Untargeted metabolomics or nontargeted analysis using high resolution mass spectrometry (HRMS) is one of the most
popular analysis methods for unbiased measurement of organic compounds1,2. A typical metabolomics sample analysis workflow will follow a detection, annotation, MS/MS validation, and/or
standards validation process, from which interpretation of the relationships between these annotated or identified compounds can then be linked to biological pathways or disease development,
for example. However, difficulty annotating or identifying unknown compounds always limits the interpretation of findings3. One practical solution to this is matching experimentally
obtained fragment ions to a mass spectral database4, but many compounds are absent from such databases, which prevents annotation. Rule-based or data mining-based prediction of in silico fragment
ions is successful in many applications2,5, but these approaches are prone to overfitting the known compounds, leading to false positives. Ultimately, unequivocal identification requires final validation
with commercially available or synthesized analytical standards, which may not be available. Potential molecular structures could be discerned
using biochemical knowledge, through the integration of known relationships between biochemical reactions (e.g., pathway analysis)3. Such methods are readily used to annotate compounds by
chemical class. For example, the referenced Kendrick mass defect (RKMD) was able to predict lipid class using specific mass distances for lipids and heteroatoms6, and isotope patterns in
combination with specific mass distances characteristic of halogenated compounds such as +Cl/−H, +Br/−H were used to screen halogenated chemical compounds in environmental samples7. For
these examples, known relationships among compounds were used to annotate unknown compounds, as a complementary approach to obtaining compound identifications. The most common relationships
among compounds are chemical reactions. Substrate-product pairs in a reaction form by exchanging functional groups or atoms. Almost all organic compounds originate from biochemical
processes, such as carbon fixation8,9. Like base pairing in DNA10, organic compounds follow biochemical reaction rules, resulting in characteristic mass differences between the paired
substrates and their products. Here, we build on our paired mass distance (PMD) concept11, which reflects such reaction rules by calculating the mass differences between two compounds or
charged ions. The expanded PMD framework can be used to draw biological inferences without identifying unknown compounds. Exploiting mass differences for compound identification is
not new. Mass distances have been used to reveal isotopologue information when peaks show a PMD of 1 Da12, to identify adducts of the same compound11, such as the PMD of 21.98 Da between the adducts
[M+Na]+ and [M+H]+, and to identify adducts formed via complex in-source reactions13 from mass spectrometry data. Such between-compound information has also been used to make annotations of unknown
compounds4,14, to classify compounds15, or to perform pathway-independent metabolomic network analysis16. However, these calculations of PMD were used to identify the compounds or pathways
and ultimately facilitate interpretations of the relationships between these predefined important compounds. Here, we propose that PMD can be used directly, skipping the step for annotation
or identification of individual compounds, to aggregate information at the reaction level, an approach we call “PMD-based reactomics”. Because “reactomics” has been used in previous studies based on
chromatic patterns17 or NMR spectroscopy18, reactomics in this work always refers to PMD-based reactomics. HRMS can directly measure PMDs with the mass accuracy needed to provide reaction-level
specificity. Therefore, HRMS has the potential to be used as a reaction detector that enables reaction-level investigations. Here, we use multiple databases and experimental data to
provide a proof-of-concept for using mass spectrometry in PMD-based reactomics. We also discuss potential applications such as PMD network analysis, biomarker reaction discovery, and source
appointment of unknown compounds. We envision that these applications will reveal the measurable reaction level changes without the need to assign molecular structure to unknown compounds.
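
The idea of using HRMS as a reaction detector can be illustrated with a minimal Python sketch: it computes the mass distance of every peak pair and reports the pairs that match a reference PMD within a tolerance. The function name `match_pmds`, the three reference PMDs, the 0.002 Da tolerance, and the unrelated peak at _m_/_z_ 300.0500 are assumptions made for illustration; this is not the algorithm or the database of the pmd package. The two other peaks are the [M−H]− ions of FMN and reduced FMN given in the “Methods” section.

```python
from itertools import combinations

# Reference PMDs in Da, computed from standard monoisotopic masses.
# The labels and this short list are illustrative, not the pmd package database.
REFERENCE_PMDS = {
    "13C isotopologue": 1.00335,
    "+2H": 2.01565,
    "+O": 15.99491,
}


def match_pmds(mzs, reference=REFERENCE_PMDS, tol=0.002):
    """Return (low_mz, high_mz, label) for each peak pair whose mass
    distance matches a reference PMD within +/- tol Da."""
    hits = []
    for low, high in combinations(sorted(mzs), 2):
        distance = high - low
        for label, pmd in reference.items():
            if abs(distance - pmd) <= tol:
                hits.append((low, high, label))
    return hits


# [M-H]- ions of FMN and reduced FMN, plus one hypothetical unrelated peak.
peaks = [455.0968, 457.1124, 300.0500]
pairs = match_pmds(peaks)
print(pairs)
```

Only the FMN/reduced FMN pair is reported, labeled as +2H. Because the match depends on the tolerance `tol`, widening it toward unit mass resolution would let unrelated pairs match the same reference PMD, which is one way to see why high mass accuracy is needed.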
Though the applications demonstrated here focused on biological processes, databases or reactions, abiotic reactions such as photochemical or pyrolysis reactions could also be studied using
PMD-based reactomics as long as the compounds can be measured by HRMS.

RESULTS AND DISCUSSION

Definitions of the concepts used in PMD-based reactomics, together with qualitative and relative quantitative PMD analysis, are provided in the “Methods” section.

PMD NETWORK ANALYSIS

Using the proposed PMD network analysis (see “Methods” section and Supplementary Methods for details), we can identify
metabolites associated with a known biomarker of interest. In fact, PMD network analysis can also be used in combination with classic identification techniques to enhance associated networks
with targeted biomarkers. As a proof of concept, we re-analyzed data from a published study to detect the biological metabolites of exposure to Tetrabromobisphenol A (TBBPA) in pumpkin19,20
using a local, recursive search strategy (see Fig. 1). Using TBBPA as a target of interest, we searched for PMDs linked with the debromination process, glycosylation, malonylation,
methylation, and hydroxylation, which are phase II reactions (e.g., primary metabolites) found in the original paper. Using this PMD network analysis, we identified 22 unique _m_/_z_ ions of
potential TBBPA metabolites, confirmed by the presence of brominated isotopologue mass spectral patterns (Supplementary Fig. 1). This total was 15 more than the seven unique ions that were
described in the original publication. Such a network was built based on the experimental data and our local fast recursive search algorithms as shown in Supplementary Methods: PMD network
analysis. As shown in Fig. 1, most of the potential metabolites of TBBPA were found as higher-generation TBBPA metabolites, which are too computationally intensive to be identified using in
silico prediction and matching protocols21. Similar applications and methodology have been reported for Fourier transform mass spectrometry data to build a metabolic network in biological
samples22,23. However, based on accuracy analysis (see Supplementary Results and Discussion: PMD requires HRMS), we show that quadrupole time-of-flight mass spectrometry also has the
capability to perform such analysis for small molecules. In addition, our analysis considers the relationship among all paired ions to screen all of the possible metabolites of metabolites,
while the previous study only considers the peaks correlated with the parent compounds22. Furthermore, PMD-based reactomics as described here can be implemented beyond biochemical analysis
to explore abiotic reactions such as photochemical or pyrolysis reactions. Using the same workflow, network analysis can be used to track the environmental/abiotic fate of chemical compounds
as long as their corresponding PMDs show high frequency in the data (such a feature is also available in the pmd package).

SOURCE APPOINTMENT OF UNKNOWN COMPOUNDS

When an unknown compound
is identified as a potential biomarker, determining whether it is associated with endogenous biochemical pathways or exogenous exposures can provide important information toward
identification. High frequency PMDs from the Human Metabolome Database (HMDB) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) are dominated by reactions involving carbon, hydrogen, and oxygen,
suggesting links to metabolic pathways (see Supplementary Tables 1, 2, and 3). Therefore, if an unknown biomarker is mapped using a PMD network, connection to these high frequency PMDs
would suggest an endogenous link. However, separation from this network is expected for an exogenous biomarker whose metabolizing enzyme is not in the database. Such an exogenous compound is
secreted in its parent form or undergoes changes in functional groups, for example during phase I and phase II xenobiotic metabolism. In this case, endogenous and exogenous compounds
should be separated by their PMD network in samples. Topological differences in PMD networks for endogenous and exogenous compounds were explored using compounds from The Toxin and Toxin
Target Database (T3DB)24. As shown in Fig. 2, the PMD network of compounds was generated based on the top ten high frequency PMDs of 255 endogenous compounds with 223 unique masses, and 705
exogenous compounds with 394 unique masses and carcinogenic 1, 2A, or 2B classifications. Most endogenous compounds (Fig. 2, orange) were connected into a large network, while the exogenous
compounds’ networks were much smaller (Fig. 2, blue). Interestingly, most carcinogenic compounds were not connected by high frequency PMDs. Expanding this beyond just carcinogenic compounds,
we randomly sampled 255 exogenous compounds from a total of 2491 exogenous compounds available in T3DB, and built a PMD network with the top ten high frequency PMDs of those 510 compounds
(255 exogenous compounds and 255 endogenous compounds). This step was repeated 1000 times, and the average degree of connection with other nodes was calculated as 4.5 (95% Confidence
Interval, CI [4.3, 4.8]) for endogenous compounds and 1.7 (95% CI [1.2, 2.2]) for exogenous compounds. Similar findings were observed for known compounds. For demonstration, we selected
caffeine, glucose, bromophenol, and 5-cholestene as well characterized chemicals that are commonly observed with mass spectrometry, and paired them with other metabolites in the KEGG
reaction database using the top ten high frequency PMDs from Supplementary Table 1. As shown in Fig. 3, different topological properties (e.g., number of nodes, average distances, degree,
communities, etc.) of the compounds’ PMD networks were observed for each selected target metabolite. Endogenous compounds such as glucose or 5-cholestene were highly connected (average node degree
of 3.4 and 3.2, respectively), while exogenous compounds such as caffeine and bromophenol had simpler networks (average node degree of 2.2 and 2.4, respectively). Further, the
average PMD edge numbers between all nodes (edges end-to-end) in the glucose and 5-cholestene networks are 9.7 and 6.6, respectively, while the average PMD edge numbers for caffeine and
bromophenol are 3.3 and 1.8, respectively. Larger average PMD edge numbers indicate a complex network structure with many nodes, while smaller average PMD edge numbers indicate a simple network
structure with few nodes. Based on these estimates, we propose that unknown metabolites with an average network node degree greater than three are likely endogenous compounds. Similarly,
if an unknown compound belongs to a network with a larger average PMD edge number, it might also be of endogenous origin. The R code to generate compound networks for any
compound in the KEGG database is available in the Supplementary Methods.

BIOMARKER REACTIONS

PMD-based reactomics can be used to discover biomarker “reactions” instead of biomarker
“compounds”. Unlike a typical biomarker, which is a specific chemical compound, a biomarker reaction comprises all peaks within a fixed PMD relationship and correlation cutoff. Thus, relative
quantitative PMD analysis (see “Methods” section for details) can be used to determine whether there are differences between groups (e.g., control or treatment, exposed or not-exposed) on a
reaction level. Such differences are described as a biomarker “reaction”. We used publicly available metabolomics data (MetaboLights ID: MTBLS28) collected on urine from a study on lung
cancer in adults25. Four peaks out of 1807 features from 1005 urine samples (469 cases and 536 controls) generated the quantitative responses of PMD 2.02 Da. This biomarker reaction (e.g.,
+2H from our annotated database) was significantly decreased in case samples compared with the control group (_t_-test, _p_ < 0.05, see Fig. 4). The original publication associated with
this dataset did not report any molecular biomarker associated with this reaction25, or the metabolites linked with this reaction, suggesting that relative quantitative PMD analysis offers
additional information on biological differences between the groups on the reaction level that may be lost when focused on analysis at the chemical level. PMD-level investigations directly
reduce the high dimensional analysis typically performed on a peaks or features level into low dimensional analysis on the chemical reaction level with explainable elemental compositions.
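
The reduction from feature-level to reaction-level data can be sketched in Python as follows. The sketch applies the two static PMD criteria suggested in the “Methods” section (RSD of the pair ratios below 30% and correlation between paired peak intensities above 0.6) and then sums the paired peaks in each sample. The function names and the intensity values are hypothetical and chosen only for illustration; the pmd package may implement these steps differently.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def is_static_pmd(low, high, max_rsd=30.0, min_cor=0.6):
    """Apply the suggested criteria: RSD of pair ratios < 30% and
    correlation between the paired peak intensities > 0.6."""
    ratios = [h / l for l, h in zip(low, high)]
    rsd = 100.0 * stdev(ratios) / mean(ratios)
    return rsd < max_rsd and pearson(low, high) > min_cor


def reaction_response(low, high):
    """Sum the intensities of the paired peaks in each sample."""
    return [l + h for l, h in zip(low, high)]


# Hypothetical intensities of one peak pair sharing a PMD, in six samples.
low = [100.0, 120.0, 90.0, 200.0, 210.0, 190.0]
high = [52.0, 58.0, 47.0, 98.0, 108.0, 93.0]

static = is_static_pmd(low, high)
response = reaction_response(low, high)
print(static, response)
```

The per-sample `response` values are what a group comparison such as a _t_-test would then operate on, so several features collapse into a single reaction-level variable.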
Furthermore, these results suggest that follow-up analysis in this population should include targeted analysis of proteins or enzymes linked with +2H changes. In summary, we provide the
theoretical basis and empirical evidence that high resolution mass spectrometry can be used as a reaction detector through calculation of high resolution paired mass distances and linkage to
reaction databases such as KEGG. PMD-based reactomics, as a new concept in bioinformatics, can be used to find biomarker reactions or develop PMD networks. The major limitation of PMD-based
reactomics analysis is that mass spectrometry software is designed for the analysis of compounds instead of reactions. As a result, the uncertainty in PMD measurements cannot be captured
directly from the instrument and is instead calculated after data acquisition. Furthermore, while PMD-based reactomics can be applied to analyze environmental samples, the absence of
publicly available reaction databases for environmental processes currently limits PMD-based applications to the analysis of biological samples. Nevertheless, PMD-based reactomics techniques
provide information on biological changes that supports new biological inferences and that may not be observed through classic chemical biomarker discovery strategies.

METHODS

DEFINITIONS

We first
define a reaction PMD (PMDR) using a theoretical framework. Then we demonstrate how a PMDR can be calculated using KEGG reaction R00025 as an example (see Eq. (1)). There are three KEGG
reaction classes (RC00126, RC02541, and RC02759) associated with this reaction, which is catalyzed by enzyme 1.13.12.16.

$$\mathrm{Ethylnitronate} + \mathrm{Oxygen} + \mathrm{Reduced}\,\mathrm{FMN} \rightleftharpoons \mathrm{Acetaldehyde} + \mathrm{Nitrite} + \mathrm{FMN} + \mathrm{Water}$$ (1)

In general, we define a chemical reaction as in Eq. (2):

$$S_1 + S_2 + \ldots + S_n \rightleftharpoons P_1 + P_2 + \ldots + P_m \quad (n \geq 1,\ m \geq 1),$$ (2)

where _S_ denotes substrates and _P_ denotes products, and _n_ and _m_ are the numbers of substrates and products, respectively. A PMD matrix (Eq. (3)) for this reaction is generated:

$$\begin{array}{ccccc}
 & S_1 & S_2 & \ldots & S_n \\
P_1 & \left| S_1 - P_1 \right| & \left| S_2 - P_1 \right| & \ldots & \left| S_n - P_1 \right| \\
P_2 & \left| S_1 - P_2 \right| & \left| S_2 - P_2 \right| & \ldots & \left| S_n - P_2 \right| \\
\ldots & \ldots & \ldots & \ldots & \ldots \\
P_m & \left| S_1 - P_m \right| & \left| S_2 - P_m \right| & \ldots & \left| S_n - P_m \right|
\end{array}$$ (3)

For each substrate, _S_k, and each product, _P_i, we calculate a PMD (|_S_k − _P_i|). Assuming that the substrate-product pair with the minimum PMD shares a similar structure or molecular framework, we select the
minimum numeric PMD for each substrate as the substrate PMD (PMDSk) of the reaction (Eq. (4)).

$$\mathrm{PMD}_{S_k} = \min\left( \left| S_k - P_1 \right|, \left| S_k - P_2 \right|, \ldots, \left| S_k - P_m \right| \right) \quad (1 \leq k \leq n)$$ (4)

Then, the PMDR, or overall reaction PMD, is defined as the set of the substrates’ PMDs (Eq. (5)):

$$\mathrm{PMD}_{R} = \left\{ \mathrm{PMD}_{S_1},\, \mathrm{PMD}_{S_2}, \ldots, \mathrm{PMD}_{S_n} \right\}$$ (5)

For KEGG reaction R00025, _S_1 is ethylnitronate, _S_2 is oxygen,
_S_3 is reduced FMN, _P_1 is acetaldehyde, _P_2 is nitrite, _P_3 is FMN, _P_4 is water, _n_ = 3, and _m_ = 4. A PMD matrix (6) for this reaction can be seen below (each cell gives the absolute value
calculation followed by the resulting PMD; the corresponding formula matrix can be found in Supplementary Note 1), where we define PMDEthylnitronate = 27.023 Da, PMDOxygen = 12.036 Da, and PMDReduced
FMN = 2.016 Da.

$$\begin{array}{lccc}
 & \mathrm{Ethylnitronate} & \mathrm{Oxygen} & \mathrm{Reduced}\,\mathrm{FMN} \\
\mathrm{Acetaldehyde} & \left| 74.0242 - 44.0262 \right| = 29.998\,\mathrm{Da} & \left| 31.9898 - 44.0262 \right| = 12.036\,\mathrm{Da} & \left| 458.1202 - 44.0262 \right| = 414.094\,\mathrm{Da} \\
\mathrm{Nitrite} & \left| 74.0242 - 47.0007 \right| = 27.024\,\mathrm{Da} & \left| 31.9898 - 47.0007 \right| = 15.011\,\mathrm{Da} & \left| 458.1202 - 47.0007 \right| = 411.120\,\mathrm{Da} \\
\mathrm{FMN} & \left| 74.0242 - 456.1046 \right| = 382.080\,\mathrm{Da} & \left| 31.9898 - 456.1046 \right| = 424.115\,\mathrm{Da} & \left| 458.1202 - 456.1046 \right| = 2.016\,\mathrm{Da} \\
\mathrm{H_2O} & \left| 74.0242 - 18.0105 \right| = 56.014\,\mathrm{Da} & \left| 31.9898 - 18.0105 \right| = 13.979\,\mathrm{Da} & \left| 458.1202 - 18.0105 \right| = 440.110\,\mathrm{Da}
\end{array}$$ (6)

In our example,
there are three PMDR values calculated from three PMDS: 27.023 Da, which is equivalent to the mass of two carbon atoms and three hydrogen atoms; 12.036 Da, for the
addition of two carbon atoms and four hydrogen atoms and the loss of one oxygen atom; and 2.016 Da, for the addition of two hydrogen atoms. However, other reactions may have multiple
PMDS that generate the same PMDR value, such as certain combination reactions or replacement reactions. In this case, only one value will be kept as the reaction PMD, as long as it is the minimum
PMD for all of the involved substrates. In addition, each PMDR has two notations. One is the absolute mass difference between the exact (monoisotopic)
masses of the substrate-product pair, in units of Da. The other uses elemental compositions, expressed as the difference between two chemical formulas. Here, we describe it as an elemental composition instead of a chemical
formula because it also describes the gain and loss of elements, and therefore the net mass change. In our example reaction, the PMDR can also be written as +2C3H, +2C4H/−O, and +2H,
respectively. This elemental composition can be linked to known chemical processes retrieved from a reaction database, i.e., KEGG. For example, +2H represents the elemental composition
change of a reaction involving the breaking of a double bond, such as KEGG reaction class RC00126, and +2C3H indicates a reaction with nitronate monooxygenase (EC:1.13.12.16) or reaction class RC02541.
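
The calculation in Eqs. (4) and (5) can be reproduced for R00025 with a short Python sketch. The compound masses are those listed in the PMD matrix (6); the atomic masses are standard monoisotopic values; the function names are hypothetical and are not taken from the pmd package. The last function converts an elemental composition such as +2C4H/−O into its net mass change, which links the two notations.

```python
# Monoisotopic masses in Da, as used in the PMD matrix for KEGG reaction R00025.
SUBSTRATES = {"Ethylnitronate": 74.0242, "Oxygen": 31.9898, "Reduced FMN": 458.1202}
PRODUCTS = {"Acetaldehyde": 44.0262, "Nitrite": 47.0007, "FMN": 456.1046, "Water": 18.0105}

# Standard monoisotopic atomic masses in Da.
ATOM_MASS = {"C": 12.0, "H": 1.007825, "O": 15.994915}


def substrate_pmds(substrates, products):
    """Eq. (4): for each substrate, keep the smallest |S_k - P_i| over all products."""
    return {
        name: min(abs(mass - p_mass) for p_mass in products.values())
        for name, mass in substrates.items()
    }


def reaction_pmd(substrates, products, digits=2):
    """Eq. (5): the reaction PMD is the set of substrate PMDs (rounded for display)."""
    return {round(pmd, digits) for pmd in substrate_pmds(substrates, products).values()}


def composition_mass(gained, lost=None):
    """Net mass change of an elemental composition, e.g. +2C4H/-O."""
    lost = lost or {}
    gain = sum(ATOM_MASS[element] * count for element, count in gained.items())
    loss = sum(ATOM_MASS[element] * count for element, count in lost.items())
    return gain - loss


pmd_s = substrate_pmds(SUBSTRATES, PRODUCTS)
pmd_r = reaction_pmd(SUBSTRATES, PRODUCTS)
print(pmd_s)
print(pmd_r)
print(composition_mass({"C": 2, "H": 4}, {"O": 1}))
```

Each substrate PMD agrees with the mass of its elemental composition (+2C3H, +2C4H/−O, and +2H) to within 0.001 Da, which is the sense in which the two notations are interchangeable.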
However, some elemental compositions, such as +2C4H/−O in our example, might not have a clear mechanism (e.g., no suggested KEGG reaction selection). By this definition, PMDR can be
generated automatically in terms of elemental compositions or mass units in Da. We used these definitions to establish reference databases of PMDs. We used KEGG as a “reaction database”
representing common reactions in human endogenous pathways, and we used HMDB19 as the “compound database” representing common reactions between chemicals measured in human biofluids (see
Supplementary Methods for data mining details).

QUALITATIVE AND RELATIVE QUANTITATIVE PMD ANALYSIS

PMD can be determined in biological or environmental samples from peaks observed in mass
spectrometry. Mathematically, a PMD of uncharged compounds is equivalent to the PMD of their charged species observed with a mass spectrometer, as long as both compounds share the same
adducts, neutral losses, and charges. In the example reaction (Eq. (1)), reduced FMN has a monoisotopic mass of 458.1203 Da, while FMN has a monoisotopic mass of 456.1046 Da. Spectra from HMDB19
showed that the common ions for reduced FMN and FMN using liquid chromatography (LC)-HRMS in negative mode are typically [M−H]− with _m_/_z_ 457.1124 and 455.0968, respectively. The mass
distance between the monoisotopic masses is 2.016 Da, and the mass distance between the observed ions is also 2.016 Da. In cases such as this, mass spectrometry can be used to detect the PMD of
paired compounds, but only with HRMS (see Supplementary Results and Discussion: redundant peaks and fragments in PMD-based reactomics). In addition to qualitative analysis, peaks that share
the same PMD can be summed and used as a relative quantitative group measure of that specific “reaction” in the sample, thereby providing a description of chemical reaction level changes
across samples without annotating individual compounds. We define two types of PMD across samples: static PMD in which intensity ratios between the pairs are stable across samples, and
dynamic PMD in which the intensity ratios between pairs change across samples. Only static PMDs, those with similar instrument response, can be used for relative quantitative analysis (see
Supplementary Table 4 for a theoretical example). Similar to other nontargeted analyses20, we suggest that a pair be considered a static PMD when the relative standard deviation (RSD) of the quantitative pair ratios is <30% and the correlation
between the paired peaks’ intensities is high (>0.6).

DATA AVAILABILITY

All of the datasets (Supplementary Data 1 for HMDB, Supplementary Data 2 for KEGG,
Supplementary Data 3 for T3DB and Supplementary Data 4 for MTBLS28) and the reproducible R script (Supplementary Data 5) for all of the figures, tables and calculations are supplied in the
Supplementary Information.

CODE AVAILABILITY

An R implementation of PMD-based reactomics analysis and the reaction/PMD databases is available as the pmd package
(https://yufree.github.io/pmd/). The stable version of the pmd package can also be accessed from CRAN (https://cran.r-project.org/web/packages/pmd/index.html).

CHANGE HISTORY

* 15 March 2021: A Correction to this paper has been published: https://doi.org/10.1038/s42004-021-00479-1

REFERENCES

* Zhang, A., Sun, H., Wang, P., Han, Y. & Wang, X. Modern analytical techniques in metabolomics analysis. _Analyst_ 137, 293–300 (2012).
* van der Hooft, J. J. J., Wandy, J., Barrett, M. P., Burgess, K. E. V. & Rogers, S. Topic modeling for untargeted substructure exploration in metabolomics. _Proc. Natl Acad. Sci._ 113, 13738–13743 (2016).
* Domingo-Almenara, X., Montenegro-Burke, J. R., Benton, H. P. & Siuzdak, G. Annotation: a computational solution for streamlining metabolomics analysis. _Anal. Chem._ 90, 480–489 (2018).
* Guijas, C. et al. METLIN: a technology platform for identifying knowns and unknowns. _Anal. Chem._ 90, 3156–3164 (2018).
* Wolf, S., Schmidt, S., Müller-Hannemann, M. & Neumann, S. In silico fragmentation for computer assisted identification of metabolite mass spectra. _BMC Bioinform._ 11, 148 (2010).
* Lerno, L. A., German, J. B. & Lebrilla, C. B. Method for the identification of lipid classes based on referenced Kendrick mass analysis. _Anal. Chem._ 82, 4236–4245 (2010).
* Jobst, K. J. et al. The use of mass defect plots for the identification of (novel) halogenated contaminants in the environment. _Anal. Bioanal. Chem._ 405, 3289–3297 (2013).
* Bar-Even, A., Noor, E., Lewis, N. E. & Milo, R. Design and analysis of synthetic carbon fixation pathways. _Proc. Natl Acad. Sci._ 107, 8889–8894 (2010).
* Normile, D. Round and round: a guide to the carbon cycle. _Science_ 325, 1642–1643 (2009).
* Donohue, J. & Trueblood, K. N. Base pairing in DNA. _J. Mol. Biol._ 2, 363–371 (1960).
* Yu, M., Olkowicz, M. & Pawliszyn, J. Structure/reaction directed analysis for LC-MS based untargeted analysis. _Anal. Chim. Acta_ 1050, 16–24 (2019).
* Chokkathukalam, A. et al. mzMatch–ISO: an R tool for the annotation and relative quantification of isotope-labelled mass spectrometry data. _Bioinformatics_ 29, 281–283 (2013).
* Mahieu, N. G. & Patti, G. J. Systems-level annotation of a metabolomics data set reduces 25,000 features to fewer than 1000 unique metabolites. _Anal. Chem._ 89, 10397–10406 (2017).
* Shen, X. et al. Metabolic reaction network-based recursive metabolite annotation for untargeted metabolomics. _Nat. Commun._ 10, 1–14 (2019).
* Burgess, K. E. V., Borutzki, Y., Rankin, N., Daly, R. & Jourdan, F. MetaNetter 2: a cytoscape plugin for ab initio network analysis and metabolite feature classification. _J. Chromatogr. B_ 1071, 68–74 (2017).
* Grapov, D., Wanichthanarak, K. & Fiehn, O. MetaMapR: pathway independent metabolomic network analysis incorporating unknowns. _Bioinformatics_ 31, 2757–2760 (2015).
* Kolusheva, S. et al. A novel ‘reactomics’ approach for cancer diagnostics. _Sensors_ 12, 5572–5585 (2012).
* Sundekilde, U. K., Jarno, L., Eggers, N. & Bertram, H. C. Real-time monitoring of enzyme-assisted animal protein hydrolysis by NMR spectroscopy – an NMR reactomics concept. _LWT_ 95, 9–16 (2018).
* Hou, X. et al. Glycosylation of tetrabromobisphenol A in pumpkin. _Environ. Sci. Technol._ https://doi.org/10.1021/acs.est.9b02122 (2019).
* Yu, M. et al. Evaluation and reduction of the analytical uncertainties in GC-MS analysis using a boundary regression model. _Talanta_ 164, 141–147 (2017).
* Djoumbou-Feunang, Y. et al. BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification. _J. Cheminform._ 11, 2 (2019).
* Breitling, R., Ritchie, S., Goodenowe, D., Stewart, M. L. & Barrett, M. P. Ab initio prediction of metabolic networks using Fourier transform mass spectrometry data. _Metabolomics_ 2, 155–164 (2006).
* Breitling, R., Pitt, A. R. & Barrett, M. P. Precision mapping of the metabolome. _Trends Biotechnol._ 24, 543–548 (2006).
* Wishart, D. et al. T3DB: the toxic exposome database. _Nucleic Acids Res._ 43, D928–D934 (2015).
* Mathé, E. A. et al. Noninvasive urinary metabolomic profiling identifies diagnostic and prognostic markers in lung cancer. _Cancer Res._ 74, 3259–3270 (2014).

ACKNOWLEDGEMENTS

This research was financially supported by
NIEHS grants P30ES23515, 1U2CES030859, R21ES030882, and R01ES031117. We thank the Sanchez group (Gordon Luu, Alanna Condren, Jessica Cleary, Katherine Zink, Cynthia Grim, and Laura Sanchez)
for their comments in open review for the preprint of this manuscript.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA: Miao Yu & Lauren Petrick
* Institute for Exposomic Research, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA: Lauren Petrick

CONTRIBUTIONS

Miao Yu: conceptualization, software development, data curation, visualization, writing-original draft, writing-review and editing. Lauren Petrick: writing-review and editing, supervision, project administration, funding acquisition. All authors read, reviewed, and accepted the final manuscript.

CORRESPONDING AUTHOR

Correspondence to Lauren Petrick.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing interests.

ADDITIONAL INFORMATION

PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

RIGHTS AND PERMISSIONS

OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Yu, M., Petrick, L. Untargeted high-resolution paired mass distance data mining for retrieving general chemical relationships. _Commun Chem_ 3, 157 (2020). https://doi.org/10.1038/s42004-020-00403-z

* Received: 06 July 2020
* Accepted: 05 October 2020
* Published: 06 November 2020
* DOI: https://doi.org/10.1038/s42004-020-00403-z