Document Type: Original Article
Author
Department of Computer Engineering, Ayandegan Institute of Higher Education, Tonekabon, Iran.
Abstract
Big data analysis is an advanced analytical technology for large-scale, complex applications. In this paper, we review the general background of big data, focusing on data generation and data analysis. We then examine several representative applications of big data, including enterprise management, the Internet of Things, and online social networks. These discussions aim to give readers a comprehensive overview of this area.