COMPUTATIONAL QSAR-BASED MACHINE LEARNING APPROACH FOR PREDICTING ACTIVITY OF SGLT2 INHIBITORS USING THE KNIME PLATFORM

Authors

  • ADHA DASTU ILLAHI Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok-16424, Jawa Barat, Indonesia https://orcid.org/0009-0004-9940-7640
  • GATOT FATWANTO HERTONO Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok, Jawa Barat, Indonesia
  • ARRY YANUAR Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok-16424, Jawa Barat, Indonesia

DOI:

https://doi.org/10.22159/ijap.2025v17i1.51726

Keywords:

QSAR, SGLT2 inhibitor, Machine Learning, KNIME, Artificial Intelligence, In silico

Abstract

Objective: This study aims to identify the best-performing predictive models and the key molecular fragments of SGLT2 inhibitors by curating a dataset and applying machine learning techniques within the Konstanz Information Miner (KNIME) platform.

Methods: The human Sodium-glucose Cotransporter 2 (SGLT2) target dataset was obtained from the ChEMBL database and curated by removing salts, incomplete or incorrect records, and duplicates. The compounds were labeled as active or inactive, and molecular fingerprints and descriptors were calculated. Christian Borgelt's Molecular Substructure Miner (MoSS) was used to identify frequent molecular fragments. After data partitioning, machine learning (ML)-based Quantitative Structure-Activity Relationship (QSAR) models were developed for both classification and regression and evaluated with several metrics, including sensitivity and Mean Squared Error (MSE).
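The curation and evaluation steps described in the Methods can be illustrated outside KNIME with a minimal Python sketch. It is not the authors' workflow: the pIC50 cutoff of 6.0, the exact-string duplicate check, and the helper names are assumptions made only for illustration, and salt removal and fingerprint calculation are not covered.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


def curate(records, active_threshold=6.0):
    """Drop incomplete rows and duplicate structures, then label each compound.

    `records` is an iterable of (smiles, ic50_nm) pairs. The threshold is a
    hypothetical choice; the abstract does not state the cutoff that was used.
    """
    seen = set()
    curated = []
    for smiles, ic50_nm in records:
        # Skip rows with a missing structure or a missing/invalid activity value.
        if not smiles or ic50_nm is None or ic50_nm <= 0:
            continue
        # Naive duplicate check on the raw string (no canonicalization).
        if smiles in seen:
            continue
        seen.add(smiles)
        pic50 = pic50_from_ic50_nm(ic50_nm)
        curated.append(
            {"smiles": smiles, "pic50": pic50, "active": pic50 >= active_threshold}
        )
    return curated


def sensitivity(y_true, y_pred):
    """True-positive rate: TP / (TP + FN), for boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this sketch, sensitivity applies to the active/inactive labels of the classification models, while MSE applies to the continuous activity values predicted by the regression models.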

Results: In QSAR classification, the Support Vector Machine (SVM) model performed best, with an accuracy of 81.66%. In QSAR regression, the Extreme Gradient Boosting (XGB) model performed best, with a coefficient of determination (R²) of 0.69 and a Mean Absolute Error (MAE) of 0.47. The frequent molecular fragments identified highlight characteristics common to active SGLT2 inhibitors.
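For readers unfamiliar with the reported metrics, the following Python sketch shows how accuracy, MAE, and R² are conventionally computed. The function names are illustrative, and the definitions are the standard textbook ones; the abstract does not specify how KNIME computed them or the units of the MAE.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed class labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Under these definitions, an R² of 0.69 would mean the model accounts for roughly 69% of the variance in the observed activity values of the evaluation set.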

Conclusion: These QSAR models indicate that machine learning methods can be used effectively to predict the activity of candidate SGLT2 inhibitors in silico, thereby expediting the drug discovery process.

Published

07-01-2025

How to Cite

ILLAHI, A. D., HERTONO, G. F., & YANUAR, A. (2025). COMPUTATIONAL QSAR-BASED MACHINE LEARNING APPROACH FOR PREDICTING ACTIVITY OF SGLT2 INHIBITORS USING THE KNIME PLATFORM. International Journal of Applied Pharmaceutics, 17(1), 328–333. https://doi.org/10.22159/ijap.2025v17i1.51726

Section

Original Article(s)
