Research of semantic segmentation capabilities on images from IoT devices

Authors

  • Miloš Živković, Author

DOI:

https://doi.org/10.24867/14BE23Zivkovic

Keywords:

Semantic segmentation, ERFNet, U-Net, IoT devices, ESP32-Cam

Abstract

In this paper, an IoT system for image acquisition based on the ESP32-Cam module is developed, and two convolutional neural network architectures, ERFNet and U-Net, are implemented. The networks were trained on the Cityscapes and CamVid datasets, and their performance is compared on images acquired from IoT devices and from DSLR cameras, as well as on the test sets of the datasets on which they were trained.

References

[1] Romera, E., Alvarez, J. M., Bergasa, L. M., & Arroyo, R. (2017). ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 19(1), 263-272.
[2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. (2015). Deep Residual Learning for Image Recognition.
[3] Ronneberger, O., Fischer, P., & Brox, T. (2015, May 18). U-Net: Convolutional Networks for Biomedical Image Segmentation.
[4] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The Cityscapes Dataset for Semantic Urban Scene Understanding,” in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[5] Julien Fauqueur, Gabriel Brostow, Roberto Cipolla, Assisted Video Object Labeling By Joint Tracking of Regions and Keypoints, IEEE International Conference on Computer Vision (ICCV'2007) Interactive Computer Vision Workshop. Rio de Janeiro, Brazil, October 2007.
[6] https://github.com/baudcode/tf-semantic-segmentation
[7] Z. C. Lipton, C. Elkan, & B. Narayanaswamy. (2014). Thresholding Classifiers to Maximize F1 Score.
[8] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, & S. Savarese. (2019). Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression.

Published

2021-09-09

Issue

Section

Electrotechnical and Computer Engineering