Beilstein J. Org. Chem. 2024, 20, 1614–1622, doi:10.3762/bjoc.20.144
Furthermore, we apply our model to 1043 pKa-dependent reactions (aldol, Claisen, and Michael) and successfully identify the reaction sites with a Matthews correlation coefficient (MCC) of 0.82.
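For reference, the MCC for a binary site classification (reaction site vs. non-reaction site) can be computed from the confusion-matrix counts as MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The following is a minimal sketch of that standard definition, not the evaluation code used in this work; the function names and the example labels are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = reaction site, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc_from_labels(y_true, y_pred):
    """Matthews correlation coefficient for two equal-length binary label lists."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal sum is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical labels for six candidate C-H sites (illustration only).
    true_sites = [1, 0, 0, 1, 0, 0]
    predicted_sites = [1, 0, 0, 0, 0, 1]
    print(f"MCC = {mcc_from_labels(true_sites, predicted_sites):.2f}")
```

The MCC ranges from −1 (complete disagreement) through 0 (no better than chance) to +1 (perfect agreement). Unlike accuracy, it stays informative when non-reactive sites greatly outnumber reactive ones.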
Keywords: C–H pKa values; pKa predictor
Introduction
Over the years, the ability to selectively break a C–H
, for example, steric strain and charge delocalisation. We discuss this further in Supporting Information File 1, section “Outliers for the test set”.
We then compare our ML model with previously reported ML models for predicting pKa values, namely, the GCNN C–H pKa predictor by Roszak et al. [3] and
the XGBoost pKa predictor by Yang et al. [5]. Roszak et al. [3] used a mix of experimental data (414 compounds) [7], manually curated DFT data (212 compounds), and previously reported DFT data (194 C–H sites) [11]; they obtained an MAE of 2.18 pKa units for their test set. Yang et al. [5] used filtered
Figure 1: Correlating computed values and experimental pKa values for 695 compounds. r: Pearson correlation ...