A lightweight weak semantic framework for cinematographic shot classification
Abstract The shot is one of the fundamental units in the content structure of a film, and it can provide insights into the director's ideas. By analyzing the properties and types of shots, we can gain a better understanding of a film's visual language. In this paper, we study the task of shot type classification in depth and put forward two hypotheses: that multimodal video inputs can effectively improve accuracy on the task, and that shot type classification is closely related to low-level spatiotemporal semantic features. To this end, we propose a Lightweight Weak Semantic Relevance Framework (LWSRNet) for classifying cinematographic shot types. Our framework comprises two modules: a Linear Modalities Fusion module (LMF Module) capable of fusing an arbitrary number of video modalities, and a Weak Semantic 3D-CNN based Feature Extraction Backbone (WSFE Module) for classifying shot movement and shot scale.
Moreover, to support practical cinematographic analysis, we collect FullShots, a large film shot dataset containing 27K shots from 19 movies with professional annotations for movement and scale. Experimental results validate our hypotheses, and our framework outperforms previous methods in accuracy with fewer parameters and computations on both the FullShots and MovieShots datasets. Our code is available at https://github.com/litchiar/ShotClassification.
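To make the idea of fusing an arbitrary number of modalities concrete, the following is a minimal sketch and not the LMF Module itself. It assumes that each modality is a clip tensor of shape (channels, frames, height, width), that fusion concatenates the clips along the channel axis, and that a single per-voxel linear map then projects them to a fixed channel count. The function name `linear_fuse`, the modality names (RGB, optical flow, saliency), and all shapes are hypothetical choices made for illustration; the actual design is in the repository linked above.

```python
import numpy as np


def linear_fuse(modalities, weights, bias):
    """Fuse video modalities with one per-voxel linear map (illustrative only).

    modalities: list of arrays, each shaped (C_i, T, H, W), same T, H, W
    weights:    array shaped (C_out, sum of all C_i)
    bias:       array shaped (C_out,)
    """
    # Stack every modality along the channel axis: (C_total, T, H, W).
    stacked = np.concatenate(modalities, axis=0)
    if weights.shape[1] != stacked.shape[0]:
        raise ValueError(
            f"weights expect {weights.shape[1]} channels, got {stacked.shape[0]}"
        )
    # Apply the same linear projection at every (t, h, w) position.
    fused = np.einsum("oc,cthw->othw", weights, stacked)
    return fused + bias[:, None, None, None]


# Hypothetical inputs: the modality choice and sizes are assumptions.
rng = np.random.default_rng(0)
rgb = rng.random((3, 8, 16, 16))       # 3 color channels, 8 frames
flow = rng.random((2, 8, 16, 16))      # 2 optical-flow channels
saliency = rng.random((1, 8, 16, 16))  # 1 saliency channel

weights = rng.standard_normal((4, 6)) * 0.1  # would be learned in practice
bias = np.zeros(4)

fused = linear_fuse([rgb, flow, saliency], weights, bias)
print(fused.shape)  # (4, 8, 16, 16)
```

Because the projection only depends on the total channel count, adding or removing a modality changes the width of `weights` and nothing else. Under these assumptions, that is one way a linear fusion step can stay lightweight while feeding a fixed-size input to a 3D-CNN backbone.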