ENHANCING CONVERSATIONAL AGENTS USING ROTATIONAL ATTENTION AND GATED SPLINE MODULES
DOI:
https://doi.org/10.59277/RRST-EE.2025.3.18

Keywords:
Natural language understanding, Conversational AI, Enhanced T5 Model, Contextual dual-axis rotational attention, Neural-spline gated linear units

Abstract
In natural language understanding, transformer models like T5 and GPT have achieved strong results in generating contextually relevant responses. However, limitations such as static self-attention in T5 and unidirectional context in GPT hinder their ability to capture deeper inter-token dependencies and nuanced semantics. To address these challenges, we propose an enhanced T5 (ET5) architecture integrating two novel modules: contextual dual-axis rotational attention (CDARA) and neural-spline gated linear units (NS-GLU). CDARA facilitates attention across both token and feature dimensions, while NS-GLU introduces adaptive spline-activated gating for improved nonlinear representation. Experiments on NarrativeQA, SQuAD, MultiWOZ, and DailyDialog show that ET5 consistently outperforms PEGASUS, GPT-3, and T5-LSTM FusionNet. ET5 achieves superior BERTScore (up to 0.971), BLEU (up to 0.77), and a lower word error rate (WER, as low as 0.13), confirming its effectiveness in generating fluent, accurate, and semantically rich responses. These results position ET5 as a promising advancement in transformer-based conversational AI systems.
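The abstract characterizes the two modules only at a high level: CDARA applies attention along both the token axis and the feature axis, and NS-GLU replaces the fixed gate of a standard GLU with a learned spline activation. The following is a minimal, illustrative PyTorch sketch of those two ideas, not the authors' implementation: the module names DualAxisAttention and NeuralSplineGLU, the piecewise-linear spline on a fixed knot grid, and the feature-axis mixing step are assumptions of this sketch, and the rotational (position-encoding) component of CDARA is omitted.

import torch
import torch.nn as nn


class DualAxisAttention(nn.Module):
    """Hypothetical sketch of the dual-axis idea behind CDARA: ordinary
    multi-head attention over the token axis, followed by a softmax-weighted
    mixing step over the feature axis of the same hidden states."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.token_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.feature_proj = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        tok_out, _ = self.token_attn(x, x, x)                  # attention across tokens
        # Feature-axis step: build a (d_model x d_model) affinity matrix between
        # channels and use it to mix information across feature dimensions.
        affinity = self.feature_proj(tok_out).transpose(1, 2) @ tok_out
        feat_scores = torch.softmax(affinity / x.size(1) ** 0.5, dim=-1)
        feat_out = tok_out @ feat_scores                       # (batch, seq_len, d_model)
        return self.norm(x + feat_out)


class NeuralSplineGLU(nn.Module):
    """Hypothetical sketch of a neural-spline gated linear unit: the gate is a
    learned piecewise-linear spline applied elementwise, instead of the fixed
    sigmoid of a standard GLU."""

    def __init__(self, d_model: int, d_ff: int, n_knots: int = 8):
        super().__init__()
        self.value = nn.Linear(d_model, d_ff)
        self.gate = nn.Linear(d_model, d_ff)
        # Learnable knot heights of the spline on a fixed grid over [-3, 3].
        self.register_buffer("grid", torch.linspace(-3.0, 3.0, n_knots))
        self.heights = nn.Parameter(torch.linspace(0.0, 1.0, n_knots))
        self.out = nn.Linear(d_ff, d_model)

    def spline(self, z: torch.Tensor) -> torch.Tensor:
        # Piecewise-linear interpolation of the learned knot heights at z.
        z = z.clamp(self.grid[0].item(), self.grid[-1].item())
        idx = torch.bucketize(z, self.grid[1:-1])              # segment index per element
        x0, x1 = self.grid[idx], self.grid[idx + 1]
        y0, y1 = self.heights[idx], self.heights[idx + 1]
        t = (z - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(self.value(x) * self.spline(self.gate(x)))


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)                                # (batch, seq_len, d_model)
    block = nn.Sequential(DualAxisAttention(512), NeuralSplineGLU(512, 2048))
    print(block(x).shape)                                      # torch.Size([2, 16, 512])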
References
(1) M. Ganga, G. Jasmine, N. Muthukumaran, M. Veluchamy, Red fox-based fractional order fuzzy PID controller for smart LED driver circuit, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 68, pp. 395–400 (2023).
(2) R. Ahmad, D. Siemon, U. Gnewuch, S. Robra-Bissantz, Designing personality-adaptive conversational agents for mental health care, Information Systems Frontiers, 24, pp. 923–943 (2022).
(3) A. Ramaiah, P. Devi Balasubramanian, A. Appathurai, N. Muthukumaran, Detection of Parkinson’s disease via Clifford gradient-based recurrent neural network using multi-dimensional data, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69 (2024).
(4) J. Balakrishnan, Y.K. Dwivedi, Conversational commerce: entering the next stage of AI-powered digital assistants, Ann Oper Res (2021).
(5) A. Appathurai, A.S.I. Tinu, N. Muthukumaran, MEG and PET images-based brain tumor detection using Kapur’s Otsu segmentation and sooty optimized MobileNet classification, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69, 3, pp. 363–368 (2024).
(6) W. Cai et al., Bandit algorithms to personalize educational chatbots, Mach Learn, 110, 9, pp. 2389–2418 (2021).
(7) T.Y. Chen, Y.C. Chiu, N. Bi, R.T.H. Tsai, Multi-modal Chatbot in intelligent manufacturing, IEEE Access (2021).
(8) S. Gong, M. Li, J. Feng, Z. Wu, L. Kong, DiffuSeq: Sequence to sequence text generation with diffusion models, arXiv (2022).
(9) L. Grassi, C.T. Recchiuto, A. Sgorbissa, Knowledge-grounded dialogue flow management for social robots and conversational agents, Int J Soc Robot, 14, 5, pp. 1273–1293 (2022).
(10) H. Honda, M. Hagiwara, Question answering systems with deep learning-based symbolic processing, IEEE Access, 7, pp. 152368–152378 (2019).
(11) C. Hsu, C.C. Chang, Integrating machine learning and open data into social Chatbot for filtering information rumor, J Ambient Intell Humaniz Comput, 12, 1, pp. 1023–1037 (2021).
(12) R.B. Lincy, R. Gayathri, Optimized convolutional neural network for Tamil handwritten character recognition, Intern J Pattern Recognit Artif Intell, 36, 11 (2022).
(13) M.M. Mohsan, M.U. Akram, G. Rasool, N.S. Alghamdi, M.A.A. Baqai, M. Abbas, Vision transformer and language model-based radiology report generation, IEEE Access, 11, pp. 1814–1824 (2023).
(14) K. Palasundram, N. Mohd Sharef, K.A. Kasmiran, A. Azman, Enhancements to the sequence-to-sequence-based natural answer generation models, IEEE Access, 8, pp. 45738–45752 (2020).
(15) Y. Park, A. Park, C. Kim, ALSI-Transformer: transformer-based code comment generation with aligned lexical and syntactic information, IEEE Access, 11, pp. 39037–39047 (2023).
(16) J. Zhang, Y. Zhao, M. Saleh, P.J. Liu, PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization, arXiv (2019).
(17) B. Khan, M. Usman, I. Khan, J. Khan, D. Hussain, Y.H. Gu, Next-generation text summarization: A T5-LSTM FusionNet hybrid approach for psychological data, IEEE (2025).
(18) M.V. Namitha, G.R. Manjula, M.C. Belavagi, StegAbb: A cover-generating text steganographic tool using GPT-3 language modeling for covert communication across SDRs, IEEE Access, 12, pp. 82057–82067 (2024).
(19) D. Mylsamy, A. Appathurai, N. Muthukumaran, S. Kuppusamy, Mojo-based fuzzy agglomerative clustering algorithm with Ed 2 Mt strategy for large-scale wireless sensor networks, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69 (2024).
(20) W.T. Wang, N. Tan, J.A. Hanson, C.A. Crubaugh, A.K. Hara, Initial experience with a COVID-19 screening chatbot before radiology appointments, J Digit Imaging, 35, 5, pp. 1303–1307 (2022).
(21) M.R. Kumar, R. Sundaram, M. Rengasamy, R. Balakrishnan, Effective feature extraction method for unconstrained environment: local binary pattern or local ternary pattern, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69, 4, pp. 443–448 (2024).
License
Copyright (c) 2025 REVUE ROUMAINE DES SCIENCES TECHNIQUES — SÉRIE ÉLECTROTECHNIQUE ET ÉNERGÉTIQUE

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.