by Sabokrou, M, Fayyaz, M, Fathy, M and Klette, R
Abstract:
We present an efficient method for detecting and localizing anomalies in videos showing crowded scenes. Research on fully convolutional neural networks (FCNs) has shown the potential of this technology for object detection and localization, especially in images. We investigate how to involve temporal data, and how to transform a supervised FCN into an unsupervised one such that the resulting FCN ensures anomaly detection. Altogether, we propose an FCN-based architecture for anomaly detection and localization in videos of crowded scenes. To reduce computation and, consequently, improve performance with respect to both speed and accuracy, we investigate the use of cascaded outlier detection. Our architecture includes two main components, one for feature representation and one for cascaded outlier detection. Experimental results on the Subway and UCSD benchmarks confirm that the detection and localization accuracy of our method is comparable to state-of-the-art methods, but at a significantly higher speed of 370 fps.
Reference:
Fully Convolutional Neural Network for Fast Anomaly Detection in Crowded Scenes (Sabokrou, M, Fayyaz, M, Fathy, M and Klette, R), arXiv preprint arXiv:1609.00866, 2016.
Bibtex Entry:
@article{sabokrou2016fullyscenes,
  author   = "Sabokrou, M and Fayyaz, M and Fathy, M and Klette, R",
  title    = "Fully Convolutional Neural Network for Fast Anomaly Detection in Crowded Scenes",
  journal  = "",
  year     = "2016",
  month    = "Sep",
  day      = "4",
  url      = "http://arxiv.org/abs/1609.00866v1",
  keyword  = "cs.CV",
  abstract = "We present an efficient method for detecting and localizing anomalies in videos showing crowded scenes. Research on \textit{fully convolutional neural networks} (FCNs) has shown the potential of this technology for object detection and localization, especially in images. We investigate how to involve temporal data, and how to transform a supervised FCN into an unsupervised one such that the resulting FCN ensures anomaly detection. Altogether, we propose an FCN-based architecture for anomaly detection and localization in videos of crowded scenes. To reduce computation and, consequently, improve performance with respect to both speed and accuracy, we investigate the use of cascaded outlier detection. Our architecture includes two main components, one for feature representation and one for cascaded outlier detection. Experimental results on the Subway and UCSD benchmarks confirm that the detection and localization accuracy of our method is comparable to state-of-the-art methods, but at a significantly higher speed of 370 fps.",
}