Quarterly Publication

Document Type: Original Article

Author

Department of Computer Engineering, Ayandegan Institute of Higher Education, Tonekabon, Iran.

Abstract

Big data analysis is an advanced analytical technology for large-scale, complex applications. In this paper, we review the general background of big data, focusing on data generation and data analysis. We then examine several representative applications of big data, including enterprise management, the Internet of Things, and online social networks. These discussions aim to give readers a comprehensive overview of this exciting area.
