Predicting Risk of Diabetes Using Summarization Techniques

P. SathyaPriya, A. Shamili, M. Revathy

Published in International Journal of Advanced Research in Computer Science Engineering and Information Technology

ISSN: 2321-3337    Impact Factor: 1.521    Volume: 4    Issue: 3    Year: 31 March, 2016    Pages: 629-633


Abstract

In data mining, association rule learning is a popular and well-researched method for discovering interesting relations between variables in large databases. We apply association rule mining to electronic medical records (EMRs) to discover sets of risk factors and their corresponding subpopulations that represent patients at particularly high risk of developing diabetes. An EMR is an evolving concept defined as a systematic collection of electronic health information about individual patients or populations. Because of the high dimensionality of EMRs, association rule mining generates a very large set of rules, which must be summarized for easy clinical use. We applied four association rule set summarization techniques and conducted a comparative evaluation to provide guidance regarding their applicability, strengths, and weaknesses. We found that all four methods produced summaries that described subpopulations at high risk of diabetes, with each method having its own clear strength. For our purpose, our extension to the Bottom-Up Summarization (BUS) algorithm produced the most suitable summary.
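To make the idea of a rule of the form "set of risk factors implies diabetes" concrete, the following is a minimal sketch and not the method evaluated in the paper. It enumerates small itemsets by brute force rather than with an Apriori-style miner, and it performs no summarization. The records, the factor names, the function `mine_rules`, and its thresholds are all hypothetical and invented purely for illustration.

```python
from itertools import combinations

# Hypothetical toy records: (set of risk factors, developed diabetes?).
# These values are invented for illustration only; they are not EMR data.
PATIENTS = [
    ({"obesity", "hypertension"}, True),
    ({"obesity", "hypertension"}, True),
    ({"obesity"}, False),
    ({"hypertension"}, False),
    ({"obesity", "hypertension", "high_triglycerides"}, True),
    ({"high_triglycerides"}, False),
    (set(), False),
    ({"obesity", "high_triglycerides"}, True),
]


def mine_rules(patients, min_support=2, max_size=2):
    """Score every rule 'itemset -> diabetes' for itemsets up to max_size.

    Returns (itemset, support, confidence) tuples, where support is the
    number of patients presenting all factors in the itemset and confidence
    is the fraction of those patients who developed diabetes.
    """
    factors = sorted(set().union(*(f for f, _ in patients)))
    rules = []
    for size in range(1, max_size + 1):
        for itemset in combinations(factors, size):
            # Outcomes of the subpopulation covered by this itemset.
            covered = [outcome for f, outcome in patients if set(itemset) <= f]
            if len(covered) < min_support:
                continue  # too few patients to report the rule
            confidence = sum(covered) / len(covered)
            rules.append((itemset, len(covered), confidence))
    # Highest-risk subpopulations first, larger subpopulations breaking ties.
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules


if __name__ == "__main__":
    for itemset, support, confidence in mine_rules(PATIENTS):
        print(f"{' & '.join(itemset)} -> diabetes "
              f"(support={support}, confidence={confidence:.2f})")
```

Even with three factors the sketch reports five rules. With the hundreds of variables found in real EMRs, the number of candidate itemsets grows combinatorially, which is why the rule set needs to be summarized before clinical use.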

Keywords

Electronic Medical Record (EMR), APRX-Collection, Censoring, Bottom-Up Summarization (BUS), Association Rules, Association Rule Summarization
