Using AI to Auto-Tag Graduate Theses

Kyle Morgan

doi:10.5860/ital.v44i4.17381

Authors

Kyle Morgan, Cal Poly Humboldt, https://orcid.org/0000-0003-0524-105X

DOI:

https://doi.org/10.5860/ital.v44i4.17381

Keywords:

Artificial Intelligence (AI), Metadata Automation, Institutional Repositories, UN Sustainable Development Goals (SDGs), Automated Subject Tagging, Machine Learning Models, Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT)

Abstract

This article presents a practical approach to using artificial intelligence (AI) to tag graduate theses in an institutional repository with the United Nations Sustainable Development Goals. Using strategies that require no prior programming experience, it provides a step-by-step guide, a cost analysis, and lessons learned from two AI-based tagging methods. The two methods met with varying degrees of success, and together they highlight the potential of AI for the thematic tagging of digital library resources.

