Efficient Data Processing in the Cloud Using Transpose-Minify Framework

M. Florence Dayana, G. Nalin

Published in International Journal of Advanced Research in Computer Science Engineering and Information Technology

ISSN: 2321-3337   Impact Factor: 1.521   Volume: 1   Issue: 3   Year: 08 January 2014   Pages: 102-109

Abstract

With the advent of cloud computing, users can work with data and perform computations anywhere in the world, at any time. Cloud computing provides highly scalable services that are easily consumed over the Internet on an as-needed basis. Interest in cloud computing has been motivated by many factors, such as the low cost of system hardware, the increase in computing power and storage capacity, and the massive growth in the volume of data generated by digital media, Web authoring, scientific instruments, physical simulations, etc. Even so, the main challenge in the cloud remains how to effectively store, query, analyse, and utilize these immense datasets. To address this problem, this paper uses a novel, highly decentralized software framework called the Transpose-Minify Framework to manage the data effectively.

Keywords

map, reduce, data processing, transpose, minify.
