SUMMARIZATION OF SCIENTIFIC PAPERS IN SERBIAN LANGUAGE USING NLP METHODS

Authors

  • Наташа Ивановић

DOI:

https://doi.org/10.24867/28BE29Ivanovic

Keywords:

Automatic text summarization, NLP, Sequence-to-Sequence models, Transformer models, TextRank algorithm

Abstract

The paper presents a system for summarizing scientific papers in Serbian using NLP methods, aiming to facilitate researchers' work through automatic abstract generation. The solution is implemented in two modules: an extraction module (the TextRank algorithm) and an abstraction module (Sequence-to-Sequence models), for which the Transformer architecture proved to be the best choice.
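The extraction phase mentioned in the abstract can be illustrated with a minimal TextRank sketch: sentences become graph nodes, edges are weighted by word overlap, and a PageRank-style power iteration ranks the sentences. This is a generic illustration of the algorithm from Mihalcea & Tarau (2004), not the paper's actual implementation; the function names, similarity formula guard, and sample parameters are assumptions.

```python
import math
import re
from itertools import combinations

def sentence_similarity(a, b):
    # Word-overlap similarity in the spirit of Mihalcea & Tarau (2004):
    # |overlap| / (log|Sa| + log|Sb|), with +1 inside the logs as a guard
    # against zero denominators for very short sentences (an assumption).
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    overlap = len(wa & wb)
    if overlap == 0:
        return 0.0
    return overlap / (math.log(len(wa) + 1) + math.log(len(wb) + 1))

def textrank_extract(text, top_k=2, d=0.85, iters=50):
    # Split into sentences on end-of-sentence punctuation (simplified).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    n = len(sentences)
    # Build a symmetric weighted similarity graph over the sentences.
    w = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = sentence_similarity(sentences[i], sentences[j])
    # Weighted PageRank by power iteration with damping factor d.
    scores = [1.0] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            rank_sum = 0.0
            for j in range(n):
                if j == i or w[j][i] == 0.0:
                    continue
                out_weight = sum(w[j][k] for k in range(n) if k != j)
                if out_weight:
                    rank_sum += w[j][i] / out_weight * scores[j]
            new_scores.append((1 - d) + d * rank_sum)
        scores = new_scores
    # Take the top-k sentences, then restore original document order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

The abstraction phase would then rewrite the extracted sentences with a trained Sequence-to-Sequence model; that step depends on the paper's Transformer model and training data and is not reproduced here.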

References

[1] Tas, O., & Kiyani, F. (2007). A survey automatic text summarization. PressAcademia Procedia, 5(1), 205-213.
[2] Cachola, I., Lo, K., Cohan, A., & Weld, D. S. (2020). TLDR: Extreme summarization of scientific documents. arXiv preprint arXiv:2004.15011.
[3] Mihalcea, R., & Tarau, P. (2004, July). TextRank: Bringing order into text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (pp. 404-411).
[4] Keneshloo, Y., Shi, T., Ramakrishnan, N., & Reddy, C. K. (2019). Deep reinforcement learning for sequence-to-sequence models. IEEE Transactions on Neural Networks and Learning Systems, 31(7), 2469-2489.
[5] Edmundson, H. P. (1969). New methods in automatic extracting. Journal of the ACM (JACM), 16(2), 264-285.
[6] Kosmajac, D., & Kešelj, V. (2019, March). Automatic text summarization of news articles in Serbian language. In 2019 18th International Symposium INFOTEH-JAHORINA (INFOTEH) (pp. 1-6). IEEE.
[7] Lin, H., & Ng, V. (2019, July). Abstractive summarization: A survey of the state of the art. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, No. 01, pp. 9815-9822).
[8] Pilault, J., Li, R., Subramanian, S., & Pal, C. (2020, November). On extractive and abstractive neural document summarization with transformer language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 9308-9319).
[9] Altmami, N. I., & Menai, M. E. B. (2022). Automatic summarization of scientific articles: A survey. Journal of King Saud University-Computer and Information Sciences, 34(4), 1011-1028.
[10] Shi, T., Keneshloo, Y., Ramakrishnan, N., & Reddy, C. K. (2021). Neural abstractive text summarization with sequence-to-sequence models. ACM Transactions on Data Science, 2(1), 1-37.
[11] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.

Published

2024-09-05

Section

Electrotechnical and Computer Engineering