A Mathematical Theory of Big Data


  • Zhaohao Sun

    Department of Business Studies, Papua New Guinea University of Technology, Lae 411, Morobe, Papua New Guinea




This article presents a cardinality approach to big data, a fuzzy logic-based approach to big data, a similarity-based approach to big data, and a logical approach to the marketing strategy of social networking services. Together, these constitute a mathematical theory of big data. The article also examines databases with infinite attributes. The research results reveal that relativity and infinity are two characteristics of big data. The relativity of big data is based on the theory of fuzzy sets; it leads to a continuum from small data to big data and allows big data-driven small data analytics to become statistically significant. The infinity of big data is based on calculus and cardinality theory; it leads to the infinite similarity of big data. The proposed theory might facilitate the mathematical research and development of big data, big data analytics, big data computing, and data science, with applications in intelligent business analytics and business intelligence.


Big data, Big data analytics, Fuzzy logic, Similarity, Discrete mathematics




How to Cite

Sun, Z. (2022). A Mathematical Theory of Big Data. Journal of Computer Science Research, 4(2), 13–23. https://doi.org/10.30564/jcsr.v4i2.4646

