This paper proposes an algorithm for the automatic detection of 3D video shots with different perceptual features. The algorithm identifies distinct three-dimensional visual scenes by detecting 3D video shot boundaries through clustering of depth-temporal features. K-means clustering is applied to a combination of texture variation along the temporal dimension and depth variance to find the stereo frames that constitute the 3D scene boundaries. An important characteristic of the proposed algorithm, compared with published methods for temporal segmentation of classic 2D video, is that it requires neither thresholds in the decision processes nor training data sets. The experimental results show that the proposed method is capable of achieving high recall (e.g., 0.95) and