Developing Predictive Models for Rare Diseases Clinical Trials

Dr. Dinesh Yadav

Abstract

Rare disorders, most of which are inherited and many of which present with neurological symptoms, remain severely under-represented in research. This study presents a gait classification methodology for distinguishing individuals with hereditary cerebellar ataxia from healthy subjects, as a step toward a predictive modelling framework for clinical trials in rare diseases. An inertial measurement unit (IMU) captures gait metrics and trunk acceleration from thirty diagnosed individuals and one hundred healthy participants. The dataset is preprocessed, including outlier correction based on the interquartile range (IQR). A Random Forest (RF) classifier ranks features by importance and identifies CV-step-length, sLLEAP, and HRAP as significant predictors. To address the class imbalance, several data balancing and augmentation techniques are compared: under-sampling, over-sampling, SMOTE, GAN, and CT-GAN. Of these, CT-GAN is the most effective at producing balanced data and yields the best classification performance. With CT-GAN augmentation, the model achieved 90% accuracy, 88% recall, an 88% F1-score, and a ROC-AUC of 0.90. The results underscore the advantages of CT-GAN for improving model performance on imbalanced clinical data, especially in rare disease prediction.
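The preprocessing, balancing, and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not say how outliers are corrected, so clipping to the Tukey fences (1.5 × IQR) is an assumption, and plain random over-sampling stands in for the more elaborate SMOTE and CT-GAN generators. The function names `clip_outliers_iqr`, `random_oversample`, and `binary_metrics` are hypothetical, and ROC-AUC is omitted because it requires predicted scores rather than labels.

```python
import random
import statistics
from collections import Counter


def clip_outliers_iqr(values, k=1.5):
    """Clip values outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: 'outlier correction' is taken to mean clipping;
    the source does not specify the exact rule.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all
    classes match the size of the largest one.

    This is simple random over-sampling, not SMOTE or CT-GAN.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical feature column with one extreme reading.
    print(clip_outliers_iqr([1, 2, 3, 4, 5, 6, 7, 8, 100]))

    # Same class ratio as the study: 30 patients (1), 100 controls (0).
    rows = [[float(i)] for i in range(130)]
    labels = [1] * 30 + [0] * 100
    _, balanced = random_oversample(rows, labels)
    print(Counter(balanced))
```

In practice, any resampling should be applied to the training split only, so that duplicated or synthetic minority samples do not leak into the data used to compute the reported metrics.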

Article Details

Section

Research Paper

Author Biography

Dr. Dinesh Yadav, Associate Professor, CSE Department, St. Andrews Institute of Technology & Management, Gurugram, Haryana, India. Email: dinesh.yadav@saitm.ac.in

How to Cite

Developing Predictive Models for Rare Diseases Clinical Trials. (2025). Journal of Global Research in Multidisciplinary Studies (JGRMS), 1(1), 28-36. https://doi.org/10.5281/zenodo.14741411
