Stereo is a useful technique for obtaining depth information from images. Depth precision improves as the baseline between the two cameras grows, so a long baseline is desirable; however, a long baseline also makes stereo matching more ambiguous. The matching problem becomes much easier if a sequence of densely sampled images along the camera path is used. Structure from motion (SFM) does not suffer from the matching problem, because many SFM algorithms use multiple images, in particular a sequence of images taken at short time intervals. However, SFM has one basic disadvantage: scale factor ambiguity. The depth extraction method proposed in this paper integrates SFM with a stereo depth extraction algorithm in order to determine the scale factor for SFM and to solve the stereo matching problem, even when the scene contains matching ambiguities and occluded regions. Results on real images illustrate the performance of the pro...
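The trade-off between baseline and depth precision, and the idea of fixing the SFM scale from stereo depth, can be illustrated with a minimal sketch. It assumes a rectified pinhole stereo pair, for which depth is Z = f B / d and a disparity error of δd produces a depth error of roughly Z² / (f B) · δd. The function names, the sample values, and the median-ratio scale estimate are illustrative assumptions only, not the method proposed in this paper.

```python
import statistics


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~ Z^2 / (f * B) * dd.

    The error shrinks as the baseline B grows, which is why a long
    baseline is desirable for precision.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


def estimate_scale(sfm_depths, stereo_depths):
    """Hypothetical scale estimate (an assumption, not the paper's method).

    SFM depths are known only up to a global scale factor; the median
    ratio to metric stereo depths at the same points gives a simple,
    outlier-tolerant estimate of that factor.
    """
    ratios = [
        z_stereo / z_sfm
        for z_sfm, z_stereo in zip(sfm_depths, stereo_depths)
        if z_sfm > 0 and z_stereo > 0
    ]
    return statistics.median(ratios)


if __name__ == "__main__":
    # Doubling the baseline halves the depth error at the same depth.
    print(depth_error(800.0, 0.1, 4.0))
    print(depth_error(800.0, 0.2, 4.0))
    # Made-up depths, for illustration only.
    print(estimate_scale([1.0, 2.0, 4.0], [2.0, 4.0, 8.4]))
```

The median is used here only because it tolerates a few bad stereo matches; any robust estimator would serve the same illustrative purpose.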