Recognition of Personal Data in Textual Documents

Authors

  • Đorđe Dragutinović (Author)

DOI:

https://doi.org/10.24867/15BE13Dragutinovic

Keywords:

named entities, personal data, NER, SpaCy, Classla

Abstract

This paper presents an application for the automatic recognition of personal data in textual documents. The requirements and design of the application are specified, the essential elements of its implementation are described, and the usage of the application is demonstrated. The application is implemented using named entity recognition (NER) techniques.
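To illustrate the general idea of locating personal data as labeled spans in text, the following is a minimal sketch. It is not the application described in the paper, which relies on NER tooling such as SpaCy and Classla; here a simple rule-based recognizer (regular expressions plus a small name gazetteer) stands in for a trained model. The names `Entity`, `PATTERNS`, `PERSON_NAMES`, and `find_personal_data`, as well as the labels and sample data, are assumptions introduced only for this example.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    """A recognized span of personal data (hypothetical structure)."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical hand-written patterns, for illustration only. A real system
# would typically use a trained NER model instead of such rules.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d /-]{7,}\d"),
}

# Hypothetical gazetteer of person names known in advance.
PERSON_NAMES = ("Ana Petrović", "Marko Jovanović")


def find_personal_data(text: str) -> List[Entity]:
    """Return all pattern and gazetteer matches, ordered by position."""
    entities = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            entities.append(Entity(m.group(), label, m.start(), m.end()))
    for name in PERSON_NAMES:
        for m in re.finditer(re.escape(name), text):
            entities.append(Entity(m.group(), "PERSON", m.start(), m.end()))
    return sorted(entities, key=lambda e: e.start)


sample = "Ana Petrović can be reached at ana@example.com or +381 11 123 4567."
for ent in find_personal_data(sample):
    print(ent.label, ent.text)
```

A statistical NER model would replace the gazetteer and patterns with learned predictions, but the output shape (text, label, character offsets) would likely look similar.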

Published

2021-11-07

Issue

Section

Electrotechnical and Computer Engineering