Shaping (psychology)

Shaping is a conditioning paradigm used primarily in the experimental analysis of behavior. Its method is the differential reinforcement of successive approximations. B. F. Skinner introduced it with pigeons, and it has since been extended to dogs, dolphins, humans and other species. In shaping, the form of an existing response is gradually changed across successive trials toward a desired target behavior by reinforcing exact segments of behavior. Skinner explained shaping as follows:

We first give the bird food when it turns slightly in the direction of the spot from any part of the cage. This increases the frequency of such behavior. We then withhold reinforcement until a slight movement is made toward the spot. This again alters the general distribution of behavior without producing a new unit. We continue by reinforcing positions successively closer to the spot, then by reinforcing only when the head is moved slightly forward, and finally only when the beak actually makes contact with the spot. ... The original probability of the response in its final form is very low; in some cases it may even be zero. In this way we can build complicated operants which would never appear in the repertoire of the organism otherwise. By reinforcing a series of successive approximations, we bring a rare response to a very high probability in a short time. ... The total act of turning toward the spot from any point in the box, walking toward it, raising the head, and striking the spot may seem to be a functionally coherent unit of behavior; but it is constructed by a continual process of differential reinforcement from undifferentiated behavior, just as the sculptor shapes his figure from a lump of clay.

The approximations reinforced are increasingly accurate versions of the response the trainer wants. As training progresses, the trainer stops reinforcing the less accurate approximations. For example, in training a rat to press a lever, the following successive approximations might be reinforced:

The trainer would start by reinforcing all behaviors in the first category, then restrict reinforcement to responses in the second category, and then progressively restrict reinforcement to each successive, more accurate approximation. As training progresses, the response reinforced becomes progressively more like the desired behavior.
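The progressive restriction of reinforcement described above can be sketched as a small procedure. This is a toy illustration under stated assumptions, not a model from the behavioral literature. The step labels in `STEPS`, the function name `run_shaping`, and the rule of tightening the criterion after `advance_after` consecutive reinforced responses are all hypothetical choices made for the example.

```python
from typing import List, Tuple

# Hypothetical step labels, ordered from least to most accurate approximation.
# They are illustrative placeholders, not a list taken from the article.
STEPS = ["turns toward lever", "approaches lever", "touches lever", "presses lever"]


def run_shaping(
    responses: List[int], n_steps: int, advance_after: int = 3
) -> Tuple[int, List[Tuple[int, int, bool]]]:
    """Apply differential reinforcement to a sequence of observed responses.

    Each response is an integer index into the ordered approximations
    (0 = least accurate). A response is reinforced only if it meets or
    exceeds the current criterion. The criterion is raised by one step
    after `advance_after` consecutive reinforced responses (an assumed rule).

    Returns the final criterion and a log of (response, criterion, reinforced).
    """
    criterion = 0  # least accurate approximation that is still reinforced
    streak = 0     # consecutive reinforced responses at the current criterion
    log: List[Tuple[int, int, bool]] = []

    for level in responses:
        reinforced = level >= criterion
        log.append((level, criterion, reinforced))
        if not reinforced:
            streak = 0  # a miss restarts the count toward the next step
            continue
        streak += 1
        # Restrict reinforcement to the next, more accurate approximation.
        if streak >= advance_after and criterion < n_steps - 1:
            criterion += 1
            streak = 0

    return criterion, log
```

Calling `run_shaping([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3], len(STEPS))` walks the criterion from the first step to the last. The fourth and eighth responses go unreinforced because they only meet a criterion that has already been tightened.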

The outcome of the process is an increase in the strength of the response, measured here as the frequency of lever-pressing. At the outset there is little probability that the rat will depress the lever, since it would do so only by accident. Through training, the rat can be brought to depress the lever frequently.

...
Wikipedia