ANALYSIS OF THE STACK EXCHANGE NETWORK: CONCEPT GRAPH VISUALIZATION AND TEXT CLASSIFICATION BY PROGRAMMING LANGUAGES

Authors

  • Jovan Ivanović Author

DOI:

https://doi.org/10.24867/04TI01Ivanovic

Keywords:

Stack Exchange, Data mining, Data visualization, Text classification

Abstract

This paper analyzes a data set retrieved from Stack Exchange, a network of question-and-answer (Q&A) websites. It investigates the properties of the data set in two steps: first by implementing a method for visualizing the concepts and topics found in the data, and then by building a classifier that labels questions by programming language.

References

[1] https://stackexchange.com (accessed January 2019)
[2] https://stackoverflow.com (accessed January 2019)
[3] Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008. Chapter 18: "Matrix decompositions and latent semantic indexing."
[4] Lee, Daniel D., and H. Sebastian Seung. "Algorithms for non-negative matrix factorization." Advances in Neural Information Processing Systems. 2001.
[5] Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3.Jan (2003): 993-1022.
[6] Goodfellow, Ian, et al. Deep Learning. Vol. 1. Cambridge: MIT Press, 2016.

Published

2019-10-02