Document Classification in HEIs Using Deep Learning

A CNN, RNN, and Hybrid CNN-RNN Approach

Authors

  • Abdullahi Abdulkarim, Department of Computer Science, Federal University of Technology, Minna, Nigeria
  • John K. Alhassan, Department of Computer Science, Federal University of Technology, Minna, Nigeria
  • Sulaimon A. Bashir, Department of Computer Science, Federal University of Technology, Minna, Nigeria

DOI:

https://doi.org/10.62050/fscp2024.462

Keywords:

Higher Education Institutions, Convolutional Neural Networks, Recurrent Neural Networks, Deep Learning, Classification

Abstract

Higher Education Institutions (HEIs) face increasingly complex and changing rules and requirements, and need technology that streamlines document handling. Traditional paper-based methods are often inefficient and error-prone, which can lead to non-compliance. This research addresses these challenges by developing an AI-powered electronic document management system designed to automate compliance checks and simplify document handling as HEIs grow. The primary objective is to create a document classification model using deep learning techniques, namely Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and a hybrid CNN-RNN approach, to improve classification accuracy and compliance. The study involves collecting and preprocessing a substantial dataset of documents, designing and evaluating several deep learning models, and optimizing their hyperparameters. Performance comparisons indicate that the hybrid CNN-RNN architecture outperforms the individual models, achieving superior accuracy, recall, and F1-score, alongside a significantly lower mean squared error (MSE). In initial evaluations on the raw dataset, the CNN, RNN, and CNN-RNN models achieved accuracies of 73%, 44%, and 27%, respectively. With an upgraded dataset, these improved to 76%, 48%, and 79%, respectively, highlighting the hybrid model's stronger ability to classify documents accurately. The findings demonstrate the effectiveness of integrating advanced deep learning techniques to improve document verification processes in HEIs, ultimately supporting better compliance and operational efficiency.
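The abstract compares models on accuracy, recall, F1-score, and MSE. As a rough illustration only, the sketch below shows one common way such figures can be computed for a multi-class classifier. It assumes macro-averaging over classes and an MSE taken between predicted class probabilities and one-hot labels; the abstract does not state which averaging or MSE definition the authors used, and the function names and sample data here are hypothetical.

```python
from typing import Dict, List, Sequence


def one_hot(label: int, num_classes: int) -> List[float]:
    """Encode an integer class label as a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def evaluate(y_true: Sequence[int],
             y_prob: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Compute accuracy, macro recall, macro F1, and MSE.

    y_true holds integer class labels; y_prob holds one probability
    vector per sample. Macro-averaging and the one-hot MSE are
    assumptions made for this sketch, not details from the paper.
    """
    num_classes = len(y_prob[0])
    # Predicted class = index of the highest probability.
    y_pred = [max(range(num_classes), key=lambda c: p[c]) for p in y_prob]

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    recalls, f1_scores = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1_scores.append(f1)

    # MSE between probability vectors and one-hot targets,
    # averaged over every (sample, class) entry.
    squared_error = sum(
        (q - o) ** 2
        for t, p in zip(y_true, y_prob)
        for q, o in zip(p, one_hot(t, num_classes))
    )
    mse = squared_error / (len(y_true) * num_classes)

    return {
        "accuracy": accuracy,
        "macro_recall": sum(recalls) / num_classes,
        "macro_f1": sum(f1_scores) / num_classes,
        "mse": mse,
    }


if __name__ == "__main__":
    # Hypothetical labels and probabilities for three document classes.
    labels = [0, 1, 2, 1]
    probs = [
        [0.8, 0.1, 0.1],
        [0.2, 0.7, 0.1],
        [0.3, 0.3, 0.4],
        [0.6, 0.3, 0.1],
    ]
    for name, value in evaluate(labels, probs).items():
        print(f"{name}: {value:.4f}")
```

Running the same function on each model's predictions over a shared test set would give directly comparable figures, which is the kind of comparison the abstract reports.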

Published

2025-03-01

Section

Physical Sciences

How to Cite

Document Classification in HEIs Using Deep Learning: A CNN, RNN, and Hybrid CNN-RNN Approach. (2025). Proceedings of the Faculty of Science Conferences, 1(1), 38-42. https://doi.org/10.62050/fscp2024.462