Miloš Živadinović – Faculty of Organizational Sciences, Jove Ilića 154, 11000 Belgrade, Serbia
Abstract: The emergence of Large Language Models (LLMs) has brought significant advances in natural language processing (NLP), making it accessible to a far wider audience. This paper examines the application of LLMs in text mining, with a focus on ChatGPT by OpenAI. The author provides a brief overview of LLMs, highlighting their structure, training techniques, and parameter tuning. Using ChatGPT as a representative LLM, the paper identifies the model's capabilities and constraints in extracting insights from textual data. Based on these findings, the author suggests several applications of LLMs for text mining that improve text comprehension and set the stage for further research.
7th International Scientific Conference on Recent Advances in Information Technology, Tourism, Economics, Management and Agriculture – ITEMA 2023 – Selected Papers, Hybrid (Faculty of Organization and Informatics Varaždin, University of Zagreb, Croatia), October 26, 2023
ITEMA Selected Papers published by: Association of Economists and Managers of the Balkans – Belgrade, Serbia
ITEMA conference partners: Faculty of Economics and Business, University of Maribor, Slovenia; Faculty of Organization and Informatics, University of Zagreb, Varaždin, Croatia; Faculty of Geography, University of Belgrade, Serbia; Institute of Marketing, Poznan University of Economics and Business, Poland; Faculty of Agriculture, Banat's University of Agricultural Sciences and Veterinary Medicine "King Michael I of Romania", Romania
ITEMA Conference 2023 Selected Papers: ISBN 978-86-80194-76-9, ISSN 2683-5991, DOI: https://doi.org/10.31410/ITEMA.S.P.2023
Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission.