Research of semantic segmentation capabilities on images from IoT devices
DOI:
https://doi.org/10.24867/14BE23Zivkovic
Keywords:
Semantic segmentation, ERFNet, Unet, IoT devices, ESP-32 Cam
Abstract
In this paper, an IoT system for image acquisition based on the ESP32-Cam module is developed, and two convolutional neural network architectures, ERFNet and Unet, are implemented. The networks were trained on the Cityscapes and CamVid datasets. Their performance is compared on images acquired from IoT devices and from DSLR cameras, as well as on the test sets of the datasets on which they were trained.
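The abstract does not state which metric is used to compare the networks. As a minimal sketch, assuming a per-class intersection over union (IoU) averaged over classes, a predicted label mask can be scored against a ground-truth mask as shown here. The function name `mean_iou` and the toy masks are hypothetical and are not taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in either mask.

    pred and target are 2-D lists of integer class labels with the same shape.
    This is an illustrative metric, assumed here; the paper may use another.
    """
    # Flatten both masks into (predicted label, true label) pairs per pixel.
    pairs = [(p, t) for prow, trow in zip(pred, target)
             for p, t in zip(prow, trow)]
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union == 0:
            continue  # class absent from both masks, so it is skipped
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 2x2 masks with two classes (hypothetical data).
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

Computing the same score on an IoT-camera image and on a DSLR image of the same scene would give one way to quantify the kind of comparison the abstract describes.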
Published
2021-09-09
Section
Electrotechnical and Computer Engineering