Published in International Journal of Advanced Research in Computer Science Engineering and Information Technology
ISSN: 2321-3337 Impact Factor: 1.521 Volume: 6 Issue: 3 Year: 31 March, 2021 Pages: 1488-1499
Molecular biomarkers are molecules or sets of molecules that can aid the diagnosis or prognosis of diseases or disorders. Over the past decades, advances in high-throughput technologies have led to the accumulation of a huge amount of molecular ‘omics’ data, e.g. transcriptomics and proteomics. The availability of these omics data makes it possible to screen for biomarkers of diseases or disorders. Accordingly, a number of computational approaches have been developed to identify biomarkers by exploring the omics data. In this review, we present a comprehensive survey of recent progress in the identification of molecular biomarkers with machine learning approaches. Specifically, we categorize the machine learning approaches into supervised, unsupervised and recommendation approaches, where the biomarkers include single genes, gene sets and small gene networks. In addition, we discuss potential problems underlying biomedical data that may pose challenges for machine learning, and provide possible directions for future biomarker identification.
Molecular biomarker, machine learning, precision medicine, disease diagnosis, gene prioritization