ENHANCING CONVERSATIONAL AGENTS USING ROTATIONAL ATTENTION AND GATED SPLINE MODULES
DOI:
https://doi.org/10.59277/RRST-EE.2025.3.18

Keywords:
Natural language understanding, Conversational AI, Enhanced T5 Model, Contextual dual-axis rotational attention, Neural-spline gated linear units

Abstract
In natural language understanding, transformer models like T5 and GPT have achieved strong results in generating contextually relevant responses. However, limitations such as static self-attention in T5 and unidirectional context in GPT hinder their ability to capture deeper inter-token dependencies and nuanced semantics. To address these challenges, we propose an enhanced T5 (ET5) architecture integrating two novel modules: contextual dual-axis rotational attention (CDARA) and neural-spline gated linear units (NS-GLU). CDARA facilitates attention across both token and feature dimensions, while NS-GLU introduces adaptive spline-activated gating for improved nonlinear representation. Experiments on NarrativeQA, SQuAD, MultiWOZ, and DailyDialog show that ET5 consistently outperforms PEGASUS, GPT-3, and T5-LSTM FusionNet. ET5 achieves superior BERTScore (up to 0.971), BLEU (up to 0.77), and a lower word error rate (WER, as low as 0.13), confirming its effectiveness in generating fluent, accurate, and semantically rich responses. These results position ET5 as a promising advancement in transformer-based conversational AI systems.
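The abstract characterizes the two modules only at a high level: CDARA applies attention along both the token axis and the feature axis, and NS-GLU replaces the fixed gate of a standard GLU with a learned spline activation. The following is a minimal, illustrative PyTorch sketch of those two ideas, not the authors' implementation: the module names DualAxisAttention and NeuralSplineGLU, the piecewise-linear spline on a fixed knot grid, and the feature-axis mixing step are assumptions of this sketch, and the rotational (position-encoding) component of CDARA is omitted.

import torch
import torch.nn as nn


class DualAxisAttention(nn.Module):
    """Hypothetical sketch of the dual-axis idea behind CDARA: ordinary
    multi-head attention over the token axis, followed by a softmax-weighted
    mixing step over the feature axis of the same hidden states."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.token_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.feature_proj = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        tok_out, _ = self.token_attn(x, x, x)                  # attention across tokens
        # Feature-axis step: build a (d_model x d_model) affinity matrix between
        # channels and use it to mix information across feature dimensions.
        affinity = self.feature_proj(tok_out).transpose(1, 2) @ tok_out
        feat_scores = torch.softmax(affinity / x.size(1) ** 0.5, dim=-1)
        feat_out = tok_out @ feat_scores                       # (batch, seq_len, d_model)
        return self.norm(x + feat_out)


class NeuralSplineGLU(nn.Module):
    """Hypothetical sketch of a neural-spline gated linear unit: the gate is a
    learned piecewise-linear spline applied elementwise, instead of the fixed
    sigmoid of a standard GLU."""

    def __init__(self, d_model: int, d_ff: int, n_knots: int = 8):
        super().__init__()
        self.value = nn.Linear(d_model, d_ff)
        self.gate = nn.Linear(d_model, d_ff)
        # Learnable knot heights of the spline on a fixed grid over [-3, 3].
        self.register_buffer("grid", torch.linspace(-3.0, 3.0, n_knots))
        self.heights = nn.Parameter(torch.linspace(0.0, 1.0, n_knots))
        self.out = nn.Linear(d_ff, d_model)

    def spline(self, z: torch.Tensor) -> torch.Tensor:
        # Piecewise-linear interpolation of the learned knot heights at z.
        z = z.clamp(self.grid[0].item(), self.grid[-1].item())
        idx = torch.bucketize(z, self.grid[1:-1])              # segment index per element
        x0, x1 = self.grid[idx], self.grid[idx + 1]
        y0, y1 = self.heights[idx], self.heights[idx + 1]
        t = (z - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(self.value(x) * self.spline(self.gate(x)))


if __name__ == "__main__":
    x = torch.randn(2, 16, 512)                                # (batch, seq_len, d_model)
    block = nn.Sequential(DualAxisAttention(512), NeuralSplineGLU(512, 2048))
    print(block(x).shape)                                      # torch.Size([2, 16, 512])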
References
(1) M. Ganga, G. Jasmine, N. Muthukumaran, M. Veluchamy, Red fox-based fractional order fuzzy PID controller for smart LED driver circuit, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 68, pp. 395–400 (2023).
(2) R. Ahmad, D. Siemon, U. Gnewuch, S. Robra-Bissantz, Designing personality-adaptive conversational agents for mental health care, Information Systems Frontiers, 24, pp. 923–943 (2022).
(3) A. Ramaiah, P. Devi Balasubramanian, A. Appathurai, N. Muthukumaran, Detection of Parkinson’s disease via Clifford gradient-based recurrent neural network using multi-dimensional data, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69 (2024).
(4) J. Balakrishnan, Y.K. Dwivedi, Conversational commerce: entering the next stage of AI-powered digital assistants, Ann Oper Res (2021).
(5) A. Appathurai, A.S.I. Tinu, N. Muthukumaran, MEG and PET images-based brain tumor detection using Kapur’s Otsu segmentation and sooty optimized MobileNet classification, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69, 3, pp. 363–368 (2024).
(6) W. Cai et al., Bandit algorithms to personalize educational chatbots, Mach Learn, 110, 9, pp. 2389–2418 (2021).
(7) T.Y. Chen, Y.C. Chiu, N. Bi, R.T.H. Tsai, Multi-modal Chatbot in intelligent manufacturing, IEEE Access (2021).
(8) S. Gong, M. Li, J. Feng, Z. Wu, L. Kong, DiffuSeq: Sequence to sequence text generation with diffusion models, arXiv (2022).
(9) L. Grassi, C.T. Recchiuto, A. Sgorbissa, Knowledge-grounded dialogue flow management for social robots and conversational agents, Int J Soc Robot, 14, 5, pp. 1273–1293 (2022).
(10) H. Honda, M. Hagiwara, Question answering systems with deep learning-based symbolic processing, IEEE Access, 7, pp. 152368–152378 (2019).
(11) C. Hsu, C.C. Chang, Integrating machine learning and open data into social Chatbot for filtering information rumor, J Ambient Intell Humaniz Comput, 12, 1, pp. 1023–1037 (2021).
(12) R.B. Lincy, R. Gayathri, Optimized convolutional neural network for Tamil handwritten character recognition, Intern J Pattern Recognit Artif Intell, 36, 11 (2022).
(13) M.M. Mohsan, M.U. Akram, G. Rasool, N.S. Alghamdi, M.A.A. Baqai, M. Abbas, Vision transformer and language model-based radiology report generation, IEEE Access, 11, pp. 1814–1824 (2023).
(14) K. Palasundram, N. Mohd Sharef, K.A. Kasmiran, A. Azman, Enhancements to the sequence-to-sequence-based natural answer generation models, IEEE Access, 8, pp. 45738–45752 (2020).
(15) Y. Park, A. Park, C. Kim, ALSI-Transformer: transformer-based code comment generation with aligned lexical and syntactic information, IEEE Access, 11, pp. 39037–39047 (2023).
(16) J. Zhang, Y. Zhao, M. Saleh, P.J. Liu, PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization, arXiv (2019).
(17) B. Khan, M. Usman, I. Khan, J. Khan, D. Hussain, Y.H. Gu, Next-generation text summarization: A T5-LSTM FusionNet hybrid approach for psychological data, IEEE (2025).
(18) M.V. Namitha, G.R. Manjula, M.C. Belavagi, StegAbb: A cover-generating text steganographic tool using GPT-3 language modeling for covert communication across SDRs, IEEE Access, 12, pp. 82057–82067 (2024).
(19) D. Mylsamy, A. Appathurai, N. Muthukumaran, S. Kuppusamy, Mojo-based fuzzy agglomerative clustering algorithm with Ed 2 Mt strategy for large-scale wireless sensor networks, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69 (2024).
(20) W.T. Wang, N. Tan, J.A. Hanson, C.A. Crubaugh, A.K. Hara, Initial experience with a COVID-19 screening chatbot before radiology appointments, J Digit Imaging, 35, 5, pp. 1303–1307 (2022).
(21) M.R. Kumar, R. Sundaram, M. Rengasamy, R. Balakrishnan, Effective feature extraction method for unconstrained environment: local binary pattern or local ternary pattern, Rev. Roum. Sci. Techn. – Électrotechn. et Énerg., 69, 4, pp. 443–448 (2024).
License
Copyright (c) 2025 REVUE ROUMAINE DES SCIENCES TECHNIQUES — SÉRIE ÉLECTROTECHNIQUE ET ÉNERGÉTIQUE

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.