Document Classification in HEIs Using Deep Learning
A CNN, RNN, and Hybrid CNN-RNN Approach
DOI:
https://doi.org/10.62050/fscp2024.462
Keywords:
Higher Education Institutions, Convolutional Neural Networks, Recurrent Neural Networks, Deep Learning, Classification
Abstract
Higher Education Institutions (HEIs) face increasingly complex and changing rules and requirements, and need technology that streamlines document handling. Traditional paper-based methods are often inefficient and error-prone, which can lead to non-compliance. This research addresses these challenges by developing an AI-powered electronic document management system that automates compliance checks and simplifies document handling as HEIs grow. The primary objective is a document classification model built with deep learning techniques, namely Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and a hybrid CNN-RNN approach, to improve document accuracy and compliance. The study involves collecting and preprocessing a substantial dataset of documents, designing and evaluating several deep learning models, and optimizing their hyperparameters. On the raw dataset, the CNN, RNN, and CNN-RNN models achieved accuracies of 73%, 44%, and 27%, respectively. With an upgraded dataset, these improved to 76%, 48%, and 79%, respectively. Performance comparisons indicate that the hybrid CNN-RNN architecture outperforms the individual models on the upgraded dataset, achieving superior accuracy, recall, and F1-score alongside a significantly lower mean squared error (MSE). The findings demonstrate the effectiveness of integrating advanced deep learning techniques into document verification processes in HEIs, ultimately supporting better compliance and operational efficiency.
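The abstract compares the models on accuracy, recall, F1-score, and MSE but does not say how these were computed. The following is a minimal sketch of one common way to compute them for a multi-class classifier, not the authors' evaluation code. It assumes macro averaging over classes and an MSE taken over integer-encoded class labels; the function name `classification_metrics` and the sample labels are hypothetical.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy, macro recall, macro F1, and MSE for integer class labels.

    Assumptions (not stated in the abstract): metrics are macro-averaged,
    and MSE is computed on the integer-encoded labels themselves.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    recalls, f1_scores = [], []
    for c in labels:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precision = tp[c] / predicted if predicted else 0.0
        recall = tp[c] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": sum(tp.values()) / n,
        "macro_recall": sum(recalls) / len(labels),
        "macro_f1": sum(f1_scores) / len(labels),
        "mse": sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical labels for three document classes (0, 1, 2).
example_true = [0, 0, 1, 1, 2, 2]
example_pred = [0, 1, 1, 1, 2, 0]
for name, value in classification_metrics(example_true, example_pred).items():
    print(f"{name}: {value:.3f}")
```

Running the same function on each model's predictions over a shared test set would give the kind of side-by-side comparison the abstract reports. Note that an MSE over integer labels depends on the arbitrary ordering of the classes, so it is mainly useful as a relative measure between models evaluated under the same encoding.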

License
Copyright (c) 2025 Proceedings of the Faculty of Science Conferences

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.