In this paper, we propose an approach to recovering the 3D human body pose from static images. We adopt a discriminative learning technique to directly infer the 3D pose from appearance-based local image features. We use a simplified Gradient Location and Orientation Histogram (GLOH) descriptor as our image feature representation. We then employ gradient tree-boosting regression to train a discriminative model that maps the feature space to the 3D pose space. We train and evaluate our algorithm on the walking sequences of a synchronized video and 3D motion dataset. We show that appearance-based local features can be used for pose estimation even in cluttered environments, while the discriminatively learned model allows the 3D pose to be estimated in real time.
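As a rough illustration of this two-stage pipeline (local appearance descriptors regressed to a 3D pose vector), the sketch below uses scikit-learn's GradientBoostingRegressor wrapped in a MultiOutputRegressor as a generic stand-in for the gradient tree-boosting step; the feature and pose dimensionalities, the random placeholder data, and the hyperparameters are illustrative assumptions, not the settings or implementation used in this paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Placeholder training data (assumed shapes, for illustration only):
# X holds one appearance-based descriptor per frame (e.g., a GLOH-like
# histogram), Y holds the corresponding 3D pose vector from the motion data.
rng = np.random.default_rng(0)
n_frames, n_feature_dims, n_joints = 200, 128, 15
X = rng.normal(size=(n_frames, n_feature_dims))      # stand-in descriptors
Y = rng.normal(size=(n_frames, 3 * n_joints))        # stand-in 3D joint coordinates

# Fit one boosted regression-tree ensemble per output dimension of the pose.
regressor = MultiOutputRegressor(
    GradientBoostingRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
)
regressor.fit(X, Y)

# At test time a single pass through the trained trees per frame suffices,
# which is what makes this kind of discriminative mapping fast at inference.
pose_estimate = regressor.predict(X[:1])   # shape: (1, 3 * n_joints)
print(pose_estimate.shape)
```

The design point this sketch is meant to convey is that, once the mapping from feature space to pose space has been learned offline, estimating a pose requires only descriptor extraction and tree evaluation, with no iterative model fitting at test time.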