CLUSTER AND LINEAR REGRESSION ANALYSIS OF CHINESE AND MALAYSIAN RESEARCH COLLABORATION BASED ON BIG DATA AND SCIBERT

Authors

  • GUANSU WANG Asia-Europe Institute, Universiti Malaya, Kuala Lumpur, Malaysia.
  • ZHIHONG HUANG Zhuhai College of Jilin University, Zhuhai, Guangdong, China.
  • SAMEER KUMAR Asia-Europe Institute, Universiti Malaya, Kuala Lumpur, Malaysia.

DOI:

https://doi.org/10.59277/RRST-EE.2025.1.18

Keywords:

Research collaboration, Cluster analysis, Linear regression, "Belt and Road" initiative

Abstract

This study uses big data analytics and SciBERT to conduct cluster and linear regression analyses of research collaboration between China and Malaysia. Originating from China's "Belt and Road" initiative, this collaboration has evolved into a comprehensive strategic partnership that fosters advances in economics, agriculture, technology, education, and other fields. Building multi-dimensional strategic relationships is in line with global trends, in which technological innovation plays a pivotal role. Both nations have implemented policies to boost science and technology, and these policies shape their collaborative efforts. Research collaboration drives technological progress and is intertwined with cultural exchange. Using data from the Web of Science, the study examines the trends, characteristics, and influencing factors of China-Malaysia research collaboration. The findings offer insights for optimizing collaboration models and guiding future policy, contributing to communication and development between China and Malaysia.

Published

25.03.2025

Section

Automatique et ordinateurs | Automation and Computer Sciences

How to Cite

CLUSTER AND LINEAR REGRESSION ANALYSIS OF CHINESE AND MALAYSIAN RESEARCH COLLABORATION BASED ON BIG DATA AND SCIBERT. (2025). REVUE ROUMAINE DES SCIENCES TECHNIQUES — SÉRIE ÉLECTROTECHNIQUE ET ÉNERGÉTIQUE, 70(1), 103-108. https://doi.org/10.59277/RRST-EE.2025.1.18