ICCV 2017
Learning Dynamic Siamese Network for Visual Object Tracking
International Conference on Computer Vision
Abstract

How to effectively learn the temporal variation of target appearance and exclude the interference of cluttered background, while maintaining real-time response, is an essential problem in visual object tracking. Recently, Siamese networks have shown the great potential of matching-based trackers to achieve balanced accuracy and beyond-real-time speed. However, they still lag well behind classification-and-updating based trackers in tolerating temporal changes of objects and imaging conditions. In this paper, we propose a dynamic Siamese network with a fast transformation learning model that enables effective online learning of target appearance variation and background suppression from previous frames. We then present elementwise multi-layer fusion to adaptively integrate the network outputs using multi-level deep features. Unlike state-of-the-art trackers, our approach allows the use of any feasible generally or particularly trained features, such as SiamFC and VGG. More importantly, the proposed dynamic Siamese network can be jointly trained as a whole directly on labeled video sequences, and can thus take full advantage of the rich spatio-temporal information of moving objects. As a result, our approach achieves state-of-the-art performance on the OTB-2013 and VOT-2015 benchmarks, while exhibiting a superior balance of accuracy and real-time response compared with state-of-the-art competitors.
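The matching step the abstract alludes to — correlating template features against search-region features and then fusing response maps from multiple feature layers with elementwise weights — can be illustrated with a minimal NumPy sketch. This is an illustrative assumption, not the authors' DSiam implementation: the function names `xcorr` and `fuse`, the single-channel feature maps, and the fixed fusion weights are all hypothetical simplifications for exposition.

```python
import numpy as np

def xcorr(template, search):
    """Sliding-window cross-correlation of a template feature map
    over a larger search feature map, producing a 2-D response map."""
    th, tw, _ = template.shape
    sh, sw, _ = search.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Score = sum of elementwise products over the window.
            out[i, j] = np.sum(template * search[i:i + th, j:j + tw])
    return out

def fuse(responses, weights):
    """Elementwise weighted fusion of response maps from multiple
    feature layers (weights are normalized to sum to one)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * r for w, r in zip(weights, responses))

# Toy usage: plant a 2x2 target patch in an otherwise empty search region.
template = np.ones((2, 2, 1))
search = np.zeros((5, 5, 1))
search[1:3, 2:4, 0] = 1.0          # target located at offset (1, 2)
response = xcorr(template, search)
fused = fuse([response, response], [0.6, 0.4])
peak = np.unravel_index(fused.argmax(), fused.shape)
```

Here `peak` recovers the planted target offset `(1, 2)`; in the paper's setting the two fused maps would instead come from different network layers, with the template additionally passed through an online-learned transformation before correlation.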
Paper (PDF, 4.07 MB)
Copyright © Liang Wan