GENERATING IMAGES WITH MULTIPLE LABELED MNIST DIGITS USING GENERATIVE NEURAL NETWORKS
DOI:
https://doi.org/10.24867/08BE26Maletin
Keywords:
Generating images, Generating objects, Generative models, GAN, MNIST
Abstract
Generative neural networks already offer many established methods and principles for image generation. However, current research focuses primarily on generating images that contain a single object. Generating images with multiple desired objects at specified locations, coherent with the background, would significantly broaden the applicability of such models. In this paper, we present an architecture for generating images with multiple labeled digits on a simple background. The model's inputs are the labels and bounding boxes of the digits to be drawn. The trained model successfully generates images for the orderings of digits it has seen during training; for unseen orderings, it does not always generate the specified digits. Steps for further improving the architecture are discussed.
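To make the input described above more concrete, the following is a minimal sketch of one way a set of digit labels and bounding boxes could be turned into a conditioning layout for a generator. The abstract does not specify the encoding actually used, so everything here is an assumption made for illustration: the function name `encode_layout`, the `(x, y, width, height)` box convention, the 32x64 canvas, and the choice of one binary channel per digit class.

```python
from typing import List, Tuple

NUM_CLASSES = 10  # MNIST digit labels 0-9

# Hypothetical box convention: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def encode_layout(labels: List[int], boxes: List[Box],
                  height: int, width: int) -> List[List[List[int]]]:
    """Return a NUM_CLASSES x height x width grid of 0/1 values.

    Channel c is 1 inside every box whose label is c, and 0 elsewhere.
    This is an illustrative encoding, not the one from the paper.
    """
    if len(labels) != len(boxes):
        raise ValueError("labels and boxes must have the same length")
    layout = [[[0] * width for _ in range(height)]
              for _ in range(NUM_CLASSES)]
    for label, (x, y, w, h) in zip(labels, boxes):
        if not 0 <= label < NUM_CLASSES:
            raise ValueError(f"label out of range: {label}")
        # Clip the box to the canvas so boxes partly outside stay valid.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                layout[label][row][col] = 1
    return layout


# Example: a "3" on the left and a "7" on the right of a 32x64 canvas.
layout = encode_layout(labels=[3, 7],
                       boxes=[(2, 2, 28, 28), (34, 2, 28, 28)],
                       height=32, width=64)
```

A layout of this kind could be concatenated with a noise vector or fed to a convolutional generator as a conditioning signal. Whether the presented architecture does this, or uses a different mechanism, is not stated in the abstract.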