About the research
State departments of transportation (DOTs) and city municipal agencies install large numbers of roadside cameras on freeways and arterials for surveillance. One estimate put the number of cameras worldwide at approximately one billion by 2020. However, most of these cameras are used only for manual surveillance. The main objective of this study was to investigate the use of these cameras as sensors for traffic state estimation.
The scope of this project involved detecting vehicles, tracking them, and estimating their speeds. The research team adopted a tracking-by-detection framework: object detection was performed with the You Only Look Once version 3 (YOLOv3) architecture, and tracking with the simple online and realtime tracking (SORT) algorithm. The team tested the framework on videos collected from three intersections in Ames, Iowa. Combined detection and tracking ran at approximately 40 frames per second (fps) on a GeForce GTX 1080 GPU, making the framework well suited to online deployment.
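The tracking-by-detection loop can be illustrated with a minimal sketch. This is not the team's implementation: SORT proper combines a Kalman filter with Hungarian assignment, whereas the greedy IoU matcher below is a simplified stand-in that shows only the frame-to-frame data association step, assuming per-frame bounding boxes from a detector such as YOLOv3.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0


class GreedyTracker:
    """Toy tracker: greedy IoU matching in place of SORT's
    Kalman-filter prediction and Hungarian assignment."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}   # track id -> last seen box
        self.next_id = 0

    def update(self, detections):
        """Match this frame's detections to existing tracks by IoU;
        unmatched detections spawn new track ids."""
        assigned = {}
        unmatched = list(self.tracks.items())
        for det in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, box in unmatched:
                score = iou(det, box)
                if score > best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:
                best_id = self.next_id       # new vehicle enters the scene
                self.next_id += 1
            else:
                unmatched = [(t, b) for t, b in unmatched if t != best_id]
            assigned[best_id] = det
        self.tracks = assigned
        return assigned


# A box shifting slightly between frames keeps its id; a distant
# new detection receives a fresh id.
tracker = GreedyTracker()
frame1 = tracker.update([(0, 0, 10, 10)])
frame2 = tracker.update([(1, 0, 11, 10), (100, 100, 110, 110)])
```

In the full framework, the per-track boxes produced at each frame feed directly into the speed estimation stage.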
Camera calibration was performed by detecting the edges of moving vehicles to automatically locate the vanishing points, while the scale factor was determined manually from a known distance in both the image and the real world. Although this methodology determined the vanishing points automatically, without any manual intervention, the speed estimation error was high (~13 mph). The error can be reduced significantly by performing both calibration and scale factor determination fully manually; however, full manual intervention makes it difficult to scale the algorithm across multiple cameras.
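Once a track's image positions are mapped to the ground plane, speed follows from pixel displacement, the frame rate, and the calibrated scale factor. The sketch below is a simplified illustration, not the report's method: it assumes a single fixed meters-per-pixel scale (measured manually from a known distance), whereas under a full vanishing-point calibration the effective scale varies across the image.

```python
import math


def estimate_speed_mph(positions, fps, meters_per_pixel):
    """Average speed over a track, in mph.

    positions       -- list of (x, y) ground-plane pixel coordinates,
                       one per consecutive frame
    fps             -- video frame rate
    meters_per_pixel -- scale factor from a known real-world distance
                        (assumed constant here for simplicity)
    """
    if len(positions) < 2:
        return 0.0
    # Total pixel distance traveled between consecutive frames.
    total_px = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(positions, positions[1:])
    )
    elapsed_s = (len(positions) - 1) / fps
    meters_per_s = total_px * meters_per_pixel / elapsed_s
    return meters_per_s * 2.23694  # m/s -> mph


# A vehicle moving 10 px/frame at 30 fps with a 0.1 m/px scale
# travels 30 m/s.
speed = estimate_speed_mph([(0, 0), (10, 0), (20, 0)], fps=30,
                           meters_per_pixel=0.1)
```

Because the estimate is a product of pixel displacement and the scale factor, any error in the manually measured scale propagates linearly into the speed, which is consistent with the large error observed when the scale is poorly determined.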
In the future, the detection task can be improved by training the model on a larger dataset, and speed estimation can be improved by extending the automatic camera calibration to automatic scale estimation, which would improve accuracy without sacrificing scalability across cameras.