COMPUTATIONAL QSAR-BASED MACHINE LEARNING APPROACH FOR PREDICTING ACTIVITY OF SGLT2 INHIBITORS USING THE KNIME PLATFORM

Authors

  • ADHA DASTU ILLAHI Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok-16424, Jawa Barat, Indonesia https://orcid.org/0009-0004-9940-7640
  • GATOT FATWANTO HERTONO Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Indonesia, Depok, Jawa Barat, Indonesia
  • ARRY YANUAR Laboratory of Biomedical Computation and Drug Design, Faculty of Pharmacy, Universitas Indonesia, Depok-16424, Jawa Barat, Indonesia

DOI:

https://doi.org/10.22159/ijap.2025v17i1.51726

Keywords:

QSAR, SGLT2 inhibitor, Machine Learning, KNIME, Artificial Intelligence, In silico

Abstract

Objective: This study aims to identify optimal predictive models and key molecular fragments for SGLT2 inhibitor activity by curating a dataset and applying machine learning techniques within the Konstanz Information Miner (KNIME) platform.

Methods: The dataset for the human sodium-glucose cotransporter 2 (SGLT2) target was obtained from the ChEMBL database and curated by removing salts, incomplete or incorrect records, and duplicates. Compounds were classified as active or inactive, and molecular fingerprints and descriptors were calculated. Christian Borgelt's Molecular Substructure Miner (MoSS) was used to identify frequent molecular fragments. After data partitioning, machine learning (ML)-based Quantitative Structure-Activity Relationship (QSAR) classification and regression models were developed and evaluated with several metrics, including sensitivity and Mean Squared Error (MSE).
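The curation, labelling, and partitioning steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' KNIME workflow. The pIC50 cutoff of 6.0, the 80/20 split, the rule that drops any multi-component SMILES (containing "."), and all function names and toy records are assumptions made for illustration only; the abstract does not state these details.

```python
import math
import random


def pic50_from_nm(ic50_nm):
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


def curate(records, active_cutoff=6.0):
    """Drop incomplete, multi-component, and duplicate records, then label activity.

    active_cutoff is a hypothetical threshold, not a value taken from the study.
    """
    seen = set()
    curated = []
    for smiles, ic50_nm in records:
        if not smiles or ic50_nm is None or ic50_nm <= 0:
            continue  # incomplete or invalid measurement
        if "." in smiles:
            continue  # assumed rule: discard salts and mixtures outright
        if smiles in seen:
            continue  # keep only the first occurrence of a structure
        seen.add(smiles)
        p = pic50_from_nm(ic50_nm)
        curated.append({"smiles": smiles, "pIC50": p, "active": p >= active_cutoff})
    return curated


def split(rows, train_fraction=0.8, seed=0):
    """Shuffle reproducibly and partition into training and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy records (SMILES, IC50 in nM); these are not SGLT2 data.
raw = [
    ("CCO", 100.0),
    ("CCO", 50.0),                # duplicate structure
    ("CC(=O)[O-].[Na+]", 10.0),   # salt
    ("c1ccccc1", None),           # missing activity value
    ("CCN", 10000.0),
]
curated = curate(raw)
train, test = split(curated)
```

In practice the salt step would more likely strip counter-ions and keep the parent structure, and duplicates would be detected on canonicalised structures rather than raw strings; both need a cheminformatics toolkit and are left out of this sketch.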

Results: In QSAR classification, the Support Vector Machine (SVM) model performed best, with an accuracy of 81.66%. In QSAR regression, the Extreme Gradient Boosting (XGB) model gave the best coefficient of determination (R²) and Mean Absolute Error (MAE), 0.69 and 0.47, respectively. Frequent molecular fragment analysis highlighted structural features common to active SGLT2 inhibitors.
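For readers unfamiliar with the metrics named in this abstract, the following Python sketch gives their standard definitions: accuracy and sensitivity for classification, and MAE, MSE, and R² for regression. The function names and the toy values are illustrative assumptions; they do not reproduce the study's data or its KNIME scorer nodes.

```python
def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """True-positive rate: actives correctly predicted as active."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn)


def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mse(obs, pred):
    """Mean Squared Error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
labels_true = [True, True, False, False]
labels_pred = [True, False, False, False]
act_obs = [5.0, 6.0, 7.0]
act_pred = [5.5, 6.0, 6.5]

acc = accuracy(labels_true, labels_pred)
sens = sensitivity(labels_true, labels_pred)
err_abs = mae(act_obs, act_pred)
err_sq = mse(act_obs, act_pred)
fit = r2(act_obs, act_pred)
```

Accuracy alone can mislead when actives and inactives are imbalanced, which is one reason sensitivity is usually reported alongside it.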

Conclusion: These QSAR models indicate that machine learning methods can effectively predict SGLT2 inhibitor activity in silico, which can expedite the drug discovery process.

Published

16-11-2024

How to Cite

ILLAHI, A. D., HERTONO, G. F., & YANUAR, A. (2024). COMPUTATIONAL QSAR-BASED MACHINE LEARNING APPROACH FOR PREDICTING ACTIVITY OF SGLT2 INHIBITORS USING THE KNIME PLATFORM. International Journal of Applied Pharmaceutics, 17(1). https://doi.org/10.22159/ijap.2025v17i1.51726

Section

Original Article(s)
