Behav Processes. Author manuscript; available in PMC 2013 May 1.

PMCID: PMC3335979

NIHMSID: NIHMS367701

Resistance to extinction and behavioral momentum

John A. Nevin

University of New Hampshire

Abstract

In the metaphor of behavioral momentum, reinforcement is assumed to strengthen discriminated operant behavior in the sense of increasing its resistance to disruption, and extinction is viewed as disruption by contingency termination and reinforcer omission. In multiple schedules of intermittent reinforcement, resistance to extinction is an increasing function of reinforcer rate, consistent with a model based on the momentum metaphor. The partial-reinforcement extinction effect, which opposes the effects of reinforcer rate, can be explained by the large disruptive effect of terminating continuous reinforcement despite its strengthening effect during training. Inclusion of a term for the context of reinforcement during training allows the model to account for a wide range of multiple-schedule extinction data and makes contact with other formulations. The relation between resistance to extinction and reinforcer rate on single schedules of intermittent reinforcement is exactly opposite to that for multiple schedules over the same range of reinforcer rates; however, the momentum model can give an account of resistance to extinction in single as well as multiple schedules. An alternative analysis based on the number of reinforcers omitted to an extinction criterion supports the conclusion that response strength is an increasing function of reinforcer rate during training.

Keywords: Multiple schedules, single schedules, resistance to extinction, omitted reinforcers, response strength, behavioral momentum

The discriminated operant is defined by an antecedent stimulus, a response that occurs in its presence, and the reinforcing consequences of that response. Behavioral momentum theory (BMT) is concerned with the strength of discriminated operant behavior that has been trained to asymptote under constant conditions, where strength is identified with the resistance to change of responding when those conditions are altered. Interest centers on how a history of reinforcement in the presence of a distinctive stimulus determines resistance to change.

This approach to the strengthening of operant behavior by reinforcement follows from the earliest work in our field. Thorndike (1911) proposed the well-known Law of Effect, whereby reinforcement strengthened the connection between an antecedent stimulus and a response, thus increasing the probability of the response when the stimulus was next presented. Thorndike (1913) suggested that after extended training had established an asymptotic probability of 1.0 for two different responses, various tests of resistance to change, such as distraction or lapse of time, were needed to distinguish stronger from weaker connections. Modern research on resistance to change studies free operant behavior, but its Thorndikean ancestry is clear.

Resistance to change is often studied in multiple schedules, where two or more stimuli signaling different rates or amounts of reinforcement are presented successively – in effect, defining two discriminated operants in the schedule components. Multiple schedules allow the comparison of asymptotic response rates and their resistance to change within subjects and sessions. Many studies have shown that resistance to change in a schedule component is directly related to the rate or amount of reinforcement occurring in that component, regardless of whether all reinforcers are contingent on responding. This general result has been obtained with goldfish (Igaki & Sakagami, 2004), rats (e.g., Blackman, 1968; Shahan & Burke, 2004), pigeons (e.g., Nevin, 1974; Nevin, Tota, Torquato, & Shull, 1990), normal children (Tota-Faucette, 1991), children with developmental disabilities (Ahearn et al., 2003; Mace et al., 2010), college students (Cohen, 1996), and adults with mental retardation (Mace et al., 1990). These studies have employed different sorts of responses and reinforcers, and have evaluated resistance to change by presenting various disruptors including response-independent reinforcers between schedule components, pre-session feeding to devalue reinforcers, response-contingent punishment, conditioned suppression, concurrent distraction, and extinction – i.e., withholding all reinforcers (for reviews see Nevin, 1979; Nevin, 1992b; Nevin & Grace, 2000).

Extinction is probably the most commonly employed method for assessing the effects of a history of reinforcement on the persistence of responding, and resistance to extinction was explicitly identified as a measure of habit strength by Hull (1943). Moreover, extinction is commonly used to reduce or eliminate problem behavior in clinical settings (for review see Petscher, Rey, & Bailey, 2009), so understanding the determiners of resistance to extinction is important for application as well as for behavior theory. I will review some studies of resistance to extinction and interpret their data in relation to BMT.

The metaphor of behavioral momentum

The metaphor is based on Newtonian mechanics. Just as a physical body continues in motion until acted upon by an external force, ongoing behavior maintained by constant conditions of reinforcement continues at a steady rate until acted upon by some external variable. And just as the change in motion of a physical body depends directly on the magnitude of the external force and inversely on the body's mass, the change in the rate of responding depends directly on the magnitude of the external variable and inversely on the behavioral equivalent of inertial mass. Thus, by analogy, when the same disruptor is applied to two asymptotic discriminated operants, the one that is more resistant to change is construed as having the greater behavioral mass, or in traditional terms, greater strength (see Nevin & Grace, 2000, for elaboration of the metaphor and its linkage to preference).

The metaphor may be expressed quantitatively by analogy to Newton's Second Law, Δv = f/m. The behavioral equivalent is ΔB = −x/m, where −x designates the value of a variable that disrupts or decreases the rate of responding and m is behavioral mass. The measure of ΔB that has been employed in many studies is the logarithm of the proportion of baseline response rate observed during disruption. A number of parametric studies with pigeons have found that m is approximately equal to the square root of the rate of reinforcement (Nevin, 2002). Thus, the basic expression for BMT is:

log(Bx/Bo) = −x / rs^0.5,

(1)

where Bo is baseline response rate, Bx is response rate during disruption, and rs is the rate of reinforcement in a multiple-schedule component signaled by stimulus s before the disruptor x is applied.
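A minimal numerical sketch may make the prediction concrete. It is not from the article: the disruptor magnitude x = 2 is an arbitrary value chosen for illustration, and the two reinforcer rates are those of Nevin (1974, Experiment 2), discussed below.

```python
def log_prop_baseline_eq1(x, r_s):
    """Equation 1: log(Bx/Bo) = -x / rs^0.5.

    x   -- magnitude of the disruptor (arbitrary units)
    r_s -- reinforcer rate signaled by stimulus s (reinforcers/hr)
    """
    return -x / (r_s ** 0.5)

x = 2.0  # hypothetical disruptor magnitude, for illustration only
for r_s in (30, 10):
    print(f"{r_s:>2} rft/hr: log(Bx/Bo) = {log_prop_baseline_eq1(x, r_s):.3f}")
# The 30/hr component shows the smaller decrease from baseline, i.e.,
# greater predicted resistance to the same disruptor.
```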

The partial reinforcement extinction effect

The general finding that resistance to change is directly related to the signaled rate of reinforcement has one well-known exception: When a response has been reinforced every time it occurs (continuous reinforcement or CRF), it usually extinguishes more rapidly than if it has been reinforced intermittently (partial reinforcement or PRF), a result known as the partial reinforcement extinction effect (PREE). Because the rate of reinforcement under CRF must be higher than under PRF, Equation 1 incorrectly predicts greater resistance to extinction after CRF than after PRF if x, representing the disruptive effect of extinction, has the same value in both cases.

The overwhelming majority of studies of the PREE have employed discrete trials and compared independent groups of subjects, most often rats. To make sure that the PREE could be obtained in discrete trials with pigeon subjects, with CRF and PRF signaled and alternating irregularly within sessions as in multiple schedules, Nevin and Grace (2005, Experiment 3)1 arranged that a single peck at a white light on the left key of a standard pigeon chamber was always reinforced, and that a single peck at a red light on the right key was reinforced with probability .25. Trials were separated by 25 s, alternated irregularly, and were terminated if no peck occurred within 5 s. Twelve consecutive 40-trial sessions of extinction were conducted after 55 training sessions. For all 4 pigeons, extinction proceeded more rapidly on the white (CRF) key than the red (PRF) key, contrary to the predictions of Equation 1. The average data are shown in Figure 1.

Figure 1. Resistance to extinction, expressed as log proportion of baseline, after training in irregularly alternating trials signaling CRF (probability of reinforcement = 1.0) or PRF (probability of reinforcement = 0.25), exemplifying the within-subject PREE (from Nevin & Grace, 2005, Experiment 1).

By contrast, the extinction data of Nevin (1974, Experiment 2) accord with the predictions of Equation 1. In that experiment, pigeons responded on multiple variable-interval (VI) schedules that arranged 30 reinforcers per hour (rft/hr) in one component and 10 rft/hr in the other. Components alternated every 30 s. A single 5.5 hr session of extinction was conducted after 85 sessions of training. For all 3 pigeons, extinction proceeded more rapidly in the component that had arranged less frequent (10 rft/hr) reinforcement, ordinally opposite to the PREE. The average data are shown in Figure 2.

Figure 2. Resistance to extinction, expressed as log proportion of baseline, in a single 5.5-hr session after training on alternating multiple-schedule components with VI schedules arranging 30 or 10 reinforcers/hr, exemplifying the positive relation between component reinforcer rate and resistance to disruption in multiple schedules (from Nevin, 1974, Experiment 2).

In both examples, the functions are irregular, but the differences between signaled conditions of reinforcement are clear. For example, there were fewer total responses in 12 sessions of extinction for CRF (101) than for PRF (153), but more total responses in 5.5 hr for VI 30/hr (6420) than for VI 10/hr (2036). Relatedly, time to 50% of the pre-extinction baseline was shorter for CRF (3.8 sessions) than for PRF (8 sessions), but longer for VI 30/hr (190 min) than for VI 10/hr (90 min). Thus, responding to a key that signaled CRF during training was less resistant to extinction than responding to a key that signaled PRF, whereas responding to a key color that signaled 30 rft/hr was more resistant to extinction than responding to a color that signaled 10 rft/hr. Of course, these experiments differ in many ways, but it is the ordinal differences in their data in relation to the ordinal differences in reinforcer rates that concern us here. I will suggest that these ordinal differences can be reconciled by analyzing the disruptive effects of extinction.

Modeling resistance to extinction

The momentum model set forth in Equation 1 assumes that the disruptor x is the same for rich and lean schedule components, as is true for disruptors such as prefeeding, intercomponent food, or distraction that are superimposed on baseline conditions so that the contingencies of reinforcement remain intact. Indeed, when prefeeding or intercomponent food were superimposed on reinforced CRF and PRF trials in the study reported by Nevin and Grace (2005, Experiment 3), resistance to disruption was consistently greater on CRF than on PRF trials – opposite to the effects of extinction shown in Figure 1. To reconcile the effects of extinction with those of other disruptors, one must assume that the disruptors differ between CRF and PRF. In particular, the transition from CRF or a high reinforcer rate to 0 must be a more potent disruptor than the transition from PRF or a low reinforcer rate to 0. Catania (1973) suggested that "the discontinuation of reinforcement has two effects: A dependency between responses and reinforcers ends, and reinforcers are no longer delivered" (p. 49). Nevin and Grace (2000) and Nevin, McLean, and Grace (2001) formalized Catania's suggestion by replacing x in Equation 1 with two terms reflecting the effects noted by Catania:

log(Bt/Bo) = −t(c + dΔrs) / rs^0.5,

(2)

where Bo is baseline response rate, Bt is response rate at time t in extinction, c represents the effect of suspending the contingency between responses and reinforcers, and d represents the magnitude of generalization decrement resulting from the change from rs to 0. Multiplication by t implies that both effects increase linearly with continued exposure to extinction. Equation 2 can account for the PREE because as rs increases, the disruptor term in the numerator increases more rapidly than the strength- or mass-like term in the denominator, so that at high values of rs, log(Bt/Bo) becomes increasingly negative.
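A short numerical sketch (not from the article) illustrates how Equation 2 produces the PREE. The parameter values c = 1.1 and d = 0.0025 are those used later to describe the multiple-schedule data in Figure 5, and complete extinction is assumed, so that Δrs equals the training rate rs.

```python
def log_prop_baseline_eq2(t, r_s, c=1.1, d=0.0025):
    """Equation 2: log(Bt/Bo) = -t(c + d*delta_rs) / rs^0.5,
    with delta_rs = rs because all reinforcers are omitted."""
    return -t * (c + d * r_s) / (r_s ** 0.5)

# Predicted log proportion of baseline after one unit of extinction time
# for a range of training reinforcer rates (reinforcers/hr).
for r_s in (10, 30, 120, 900, 3600):
    print(f"{r_s:>4} rft/hr: {log_prop_baseline_eq2(t=1, r_s=r_s):.3f}")
# Predicted resistance increases with reinforcer rate up to about
# rs = c/d = 440/hr and then decreases, so very rich schedules such as
# CRF are predicted to be less resistant to extinction than leaner
# ones -- the PREE.
```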

The parameter c accounts for the finding that response rate decreases when the response-reinforcer contingency is terminated and reinforcers are delivered independently of responding at the same rate as in training (e.g., Rescorla & Skucy, 1969). To extend Rescorla's findings to multiple schedules, Nevin et al. (2001) trained pigeons on multiple VI 60/hr, VI 15/hr and then switched to variable-time (VT) 60/hr, 15/hr schedules that delivered reinforcers independently of responding at the same rate as in training. When the contingency was terminated, response rates decreased over successive sessions, and the decrease was smaller in the richer component, consistent with the effects of other disruptors as described above.

The dΔrs term captures the disruptive effect of generalization decrement arising from the removal of rs presentations of reinforcing stimuli per hr from the stimulus situation within which responding has been reinforced. To evaluate d, Nevin et al. (2001) first estimated the value of c from the data for contingency termination and then estimated d from resistance to extinction after training on these schedules, assuming the same value of c. The data were consistent with the implications of Equation 2: c and d are additive and independent of the rate of reinforcement during training. Accordingly, Equation 2 is used here to characterize the effects of reinforcement on resistance to extinction.

Although Equation 2 has been used most often to characterize data for a single extinction test, or for a few extinction tests separated by extensive baseline training, it has also been applied to extinction in the steady state. Nevin and Grace (2005, Experiment 1) trained pigeons in three-component multiple VI schedules with reinforcer rates ranging from 30/hr to 480/hr, where each session included transitions from a 20-s period of reinforcement to 40 s of extinction, and showed that a modified version of Equation 2 gave an excellent account of the data (cf. Baum, this issue).

Equation 2 is mathematically equivalent to an exponential decay function, which Clark (1959) used to describe empirical extinction functions for rats trained on VI schedules. However, Equation 2 will be used here because it is a linear decreasing function of time in extinction, and deviations from linearity are readily detected.

The irregular data shown in Figures 1 and 2 are not well described by linear decreasing functions. However, some free-operant multiple-schedule studies have yielded more orderly extinction functions. For example, Nevin, Mandell, and Atak (1983) trained pigeons on two-component multiple VI schedules with 129 rft/hr, 42 rft/hr, and 10 rft/hr in the three possible pairwise combinations in counterbalanced order. Components were 1 min long and separated by 30-s intercomponent intervals (ICI); each session included 20 presentations of each component. Resistance to extinction was evaluated over 7 consecutive sessions; the average data are shown in Figure 3. The points for the first extinction session sometimes lie above 0 (no change from baseline), and there is evidence of curvature in some of the functions. Nevertheless, Equation 2 provides at least a rough description of these data, especially after the first session.

Figure 3. Resistance to extinction, expressed as log proportion of baseline, in 7 consecutive 1-hr sessions after training in multiple-schedule components with VI schedules arranging the reinforcer rates noted in the legend of each panel (from Nevin et al., 1983).

Extinction as discrimination or as strength-based momentum?

If the slowing or cessation of responding during extinction depends primarily on discrimination between baseline reinforcement and its absence, response rate should decrease later in a component with a low baseline reinforcer rate than in a component with a high reinforcer rate. This is especially likely when the scheduled reinforcer rate is so low that no reinforcers occur in many components, as in the 1-min components of Nevin et al. (1983) with 10 rft/hr arranged by VI 6-min schedules. Intuitively at least, it should be quite difficult to detect a change from baseline until several sessions of extinction had elapsed. If anything, however, response rates decreased sooner in the leaner components. Figure 4 presents average data for the first session of extinction from Figure 3 for each pair of component schedules. The averages are greater than 0 in the richer component and less than 0 in the leaner component of each pair, strikingly so for the 129/hr, 10/hr pair for which discriminability of extinction should differ most between components (but in the opposite direction). Moreover, Figure 3 shows that after the first session, the functions diverge consistently, with shallower slopes for the richer component of each pair, again contrary to expectation based on the discriminability of changes in the rate of reinforcement.

Figure 4. Resistance to extinction in the first session of the data displayed in Figure 3, showing that the mean log proportion of baseline in the leaner component of each pair (unfilled bar) was similar to or less than in the richer component (filled bar), contrary to expectation based on the discriminability of nonreinforcement. Range bars indicate standard errors of the means.

Equation 2 accounts at least ordinally for the levels and slopes of extinction functions for sessions 1 through 7 in Figure 3. Accordingly, I will use the average log proportion of baseline response rate for each function over the full course of extinction depicted in Figures 1, 2, and 3 to summarize the data. The results are shown in Figure 5, together with predictions of Equation 2 with c = 1.1, d = 0.0025. Although only the CRF/PRF data of Nevin and Grace (2005) confirm the predicted downturn at the right, those data accord at least ordinally with the findings of many studies of the PREE. I suggest that the rate of reinforcement obtained in the presence of a distinctive stimulus determines the strength of discriminated operant behavior in accordance with Equation 1, and that resistance to extinction reflects the strengthening effects of reinforcement during training when the disruptive effects of discontinuing reinforcement are taken into account by the terms in the numerator of Equation 2.

Figure 5. The average value of the log proportion of baseline during extinction as a function of the rate of reinforcement in a schedule component, on a logarithmic scale, for the data shown in Figures 1, 2, and 3. The reinforcer rates for the discrete-trial CRF/PRF data in Figure 1 were estimated by assuming 1-s latencies to both sorts of trials. The smooth curve is the prediction of Equation 2 (see text for explanation and parameter values).

The role of context

The foregoing analyses have shown that when reinforcement is relatively infrequent, resistance to extinction in a given schedule component depends directly on the reinforcer rate in that component. Resistance to extinction in a component with a constant reinforcer rate also depends inversely on the reinforcer rate in an alternated component. Nevin (1992a) demonstrated this inverse dependency with 5 pigeons trained on multiple VI 60/hr, VI 300/hr and VI 60/hr, VI 10/hr in successive conditions, with two different ICI durations. The top left panel of Figure 6 shows that when the ICI was 2 s, resistance to extinction in the constant 60/hr component was greater when it alternated with 10 rft/hr than when it alternated with 300 rft/hr (the alternated component data are shown in the top right panel). When the ICI was 2 min, resistance to extinction increased in all components relative to the 2-s ICI conditions, and the effect of the alternated-component reinforcer rate was reduced or eliminated (middle panels, Figure 6).

Figure 6. Resistance to extinction, expressed as log proportion of baseline, in 7 consecutive sessions after training in multiple schedules with a constant reinforcer rate in one component while the reinforcer rate in an alternated component varied between conditions. Constant-component data are presented in the left column with the alternated reinforcer rate in parentheses. The corresponding alternated-component data are presented in the right column. The top row is from Nevin (1992a), conditions with a 2-s ICI; the middle row is from conditions with a 2-min ICI; and the bottom row is from Grace et al. (2003).

To confirm the effects of varying reinforcer rate in an alternated component, Grace, McLean, and Nevin (2003) conducted a fully counterbalanced replication with 8 pigeons, with VI 40/hr in the constant component and either VI 200/hr or VI 6.7/hr in the alternated component and a 30-s ICI. As shown in the bottom left panel of Figure 6, responding in the 40/hr component was more resistant to extinction when it alternated with 6.7/hr than when it alternated with 200/hr, replicating the 2-s ICI condition of Nevin (1992a).2 In sum, these data suggest that resistance to extinction in a constant multiple-schedule component is inversely related to the reinforcer rate in an alternated component when the components are reasonably closely spaced in time, and is generally greater when the components are more widely separated.

The effects of alternated-component reinforcer rate during training on subsequent resistance to extinction can be modeled by incorporating the context of reinforcement into the denominator of Equation 2:

log(Bx/Bo) = −t(c + dΔrs) / (rs/ra)^0.5,

(3)

where ra is the overall average reinforcer rate in the experimental session. When the reinforcer rate in the alternated component increases, ra must increase, so the ratio rs/ra for the constant component decreases. Because resistance to extinction is directly related to rs/ra, Equation 3 predicts that resistance to extinction will decrease when the alternated-component reinforcer rate increases, and vice versa, as depicted in the top and bottom left panels of Figure 6. Because of the overall decrease in ra when the ICI is lengthened, Equation 3 also accounts for the general increase in resistance to extinction depicted in the middle panels of Figure 6.
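The role of the context term can be sketched numerically. The following illustration is not from the article: it uses the constant 60/hr component of Nevin (1992a) with the c and d values reported for that study in Table 1, a 60-s component duration assumed only so that ra can be computed, and ra taken as total reinforcers per unit of total session time including the ICI.

```python
def session_average_rate(r_target, r_alt, comp_s=60.0, ici_s=2.0):
    """Overall session reinforcer rate ra (rft/hr) for a two-component
    multiple schedule with equal component durations and one ICI after
    each component. The 60-s component duration is an illustrative
    assumption, not a value taken from the cited study."""
    cycle_hr = 2 * (comp_s + ici_s) / 3600.0
    rft_per_cycle = (r_target + r_alt) * (comp_s / 3600.0)
    return rft_per_cycle / cycle_hr

def log_prop_baseline_eq3(t, r_target, r_alt, c, d, ici_s):
    """Equation 3: log(Bx/Bo) = -t(c + d*delta_rs) / (rs/ra)^0.5,
    with delta_rs = rs (complete extinction)."""
    r_a = session_average_rate(r_target, r_alt, ici_s=ici_s)
    return -t * (c + d * r_target) / ((r_target / r_a) ** 0.5)

# Constant 60/hr component alternated with 10/hr or 300/hr, with a 2-s
# or 2-min ICI; c and d are taken from the Nevin (1992a) fit in Table 1.
for r_alt in (10, 300):
    for ici_s in (2.0, 120.0):
        v = log_prop_baseline_eq3(1, 60, r_alt, c=0.05, d=0.0001, ici_s=ici_s)
        print(f"alternated {r_alt:>3}/hr, ICI {ici_s:>5.0f} s: {v:.3f}")
# A richer alternated component raises ra, shrinks rs/ra, and lowers the
# predicted resistance of the constant component; lengthening the ICI
# lowers ra and raises resistance, as in Figure 6.
```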

Nevin (1992b) referred to rs/ra as the contingency ratio, and noted that it is equivalent to the ratio proposed by Gibbon (1981) to characterize the strength of the stimulus-reinforcer relation between key light and food in autoshaping. Gibbon showed that the ratio of the overall time between reinforcers to the time between key-light onsets and reinforcers accounted quite accurately for acquisition of autoshaped pecking across conditions with varied reinforcer probabilities and with different intertrial and trial durations. In addition, the rs/ra ratio is the inverse of Gallistel's (this issue) expression for the informativeness of a CS – the extent to which it reduces uncertainty about when or whether the next US will occur. Equation 3 suggests that the determiners of the strength or behavioral mass of a discriminated operant in BMT can also be expressed as the ratio of the signaled reinforcer rate to the context in which that signal occurs – a potentially unifying link in otherwise diverse perspectives.

Note that in two-component multiple schedules, ra must be the same for both components, rendering the denominator dimensionless.3 When extinction is arranged in both components, the predictions of Equation 3 are identical to those of Equation 2, albeit with different values of c and d. The value of ra affects predicted resistance to extinction for a component with a given reinforcer rate only when the overall average reinforcer rate differs between conditions as in the studies of Nevin et al. (1983), Nevin (1992a), and Grace et al. (2003). Equation 3 was fitted to the data for all 7 sessions of extinction for all components and conditions from these studies depicted in Figures 3 and 6. All fits were performed by Microsoft Excel Solver™; parameter values and variance accounted for are given in Table 1. Figure 7 shows that Equation 3 provides acceptable agreement between predicted and obtained average proportions of baseline, suggesting that the effects of the context of reinforcement on resistance to extinction can be captured by BMT when context effects are expressed as the ratio of reinforcer rate in a target component relative to the overall average rate of all reinforcers in an experimental session.
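Because Equation 3 is linear in c and d once the predictors are formed, the parameters can also be estimated by ordinary least squares. The original fits were performed with Microsoft Excel Solver; the sketch below is only a stand-in showing one way such a fit could be set up, and the data points are invented for illustration.

```python
import numpy as np

# Hypothetical observations: (session t, component rate rs, session
# average rate ra, observed mean log proportion of baseline).
# These numbers are illustrative only, not data from the cited studies.
obs = [
    (1, 129.0, 60.0, -0.10), (3, 129.0, 60.0, -0.35), (7, 129.0, 60.0, -0.90),
    (1,  10.0, 60.0, -0.30), (3,  10.0, 60.0, -1.10), (7,  10.0, 60.0, -2.40),
]

# Equation 3 rearranged:  y = c * [-t/(rs/ra)^0.5] + d * [-t*rs/(rs/ra)^0.5]
X, y = [], []
for t, rs, ra, log_prop in obs:
    scale = -t / (rs / ra) ** 0.5
    X.append([scale, scale * rs])
    y.append(log_prop)

(c_hat, d_hat), *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
print(f"estimated c = {c_hat:.3f}, d = {d_hat:.5f}")
```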

Figure 7. Mean log proportions of baseline predicted by Equation 4 plotted against mean log proportions calculated from the data presented in Figures 3 and 6; parameter values for each study are given in Table 1.

Table 1

Parameter values and variance accounted for by fits of Equation 3 to the data of Nevin et al. (1983), Nevin (1992a), and Grace et al. (2003) depicted in Figures 3 and 6.

Study                 c     d       VAC
Nevin et al. (1983)   0.13  0.0002  .67
Nevin (1992a)         0.05  0.0001  .72
Grace et al. (2003)   0.14  0.0003  .83

Application to single schedules

Figure 5, above, shows that resistance to extinction in multiple schedules is an increasing function of component reinforcer rates over a wide range. However, the data for resistance to extinction on single schedules are quite different. In a within-subject study, Cohen (1998) compared extinction after single-schedule training on VI 30 s (120/hr) and VI 120 s (30/hr) in successive conditions with extinction after training with the same VI schedules in multiple-schedule components. He found that responding was more resistant to extinction after training with VI 120 s than with VI 30 s in single-schedule conditions, but was more resistant to extinction in the VI 30-s component than the VI 120-s component in multiple schedules, confirming the pigeon data described above but exactly opposite to the single-schedule results.

Additional data were reported by Cohen, Riley, and Weigle (1993, Experiment 1). They varied the rate of reinforcement for rats on single VI schedules from 30/hr to 120/hr over successive conditions, and found that resistance to extinction in 3 1-hr sessions was inversely related to reinforcer rate. This result was replicated by Shull and Grimes (2006) over a wider range of schedules that varied in successive conditions from 3.33/hr to 164/hr, with a single 2-hr session of extinction. The average data from both studies are presented in Figure 8 as mean log proportions of baseline on a logarithmic x-axis, using only the first two sessions' data from Cohen et al. to equate total extinction time. The inverse relation between resistance to extinction and reinforcer rate is quite similar across studies, and is exactly opposite to the relation shown in Figure 5 for pigeons trained on multiple schedules arranging the same range of reinforcer rates.

Figure 8. Mean log proportions of baseline during 2 hr of extinction after training on single VI schedules as a function of reinforcer rates on a logarithmic axis. The smooth line is the prediction of Equation 2; see text for explanation and parameter values.

As we have seen, Equation 2 describes the effects of reinforcement rate on extinction in multiple schedules reasonably well (Figure 5), and Shull and Grimes (2006) have proposed that it can also give an account of single-schedule data with Δr interpreted as reinforcers omitted in 1 hr of extinction. A plot of Equation 2, with t = 1, c = 0.01 and d = 0.115, is superimposed on the data in Figure 8. The fit is at least tolerable, but note that the value of d differs by two orders of magnitude from that used to describe the multiple-schedule data in Figure 5 (c = 1.1, d = 0.0025). In effect, the large value of d shifts the descending limb at the right of the function in Figure 5 all the way to the left, thus describing a decreasing function over the full range of single-schedule reinforcer rates. The difference in parameter values may reflect differences between pigeons and rats, between single and multiple schedules, or both. Despite the differences in parameter values, I suggest that the basic ideas underlying Equation 2 – the enhancement of response persistence by reinforcement and the countervailing effects of disruption by reinforcer omission – can accommodate the data on resistance to extinction in single as well as multiple schedules.
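A brief sketch (reusing Equation 2 with the parameter values just quoted, and assuming complete extinction so that Δr equals the training rate) shows why the larger d yields a uniformly decreasing function over the single-schedule range:

```python
def log_prop_eq2(t, r_s, c=0.01, d=0.115):
    # Equation 2 with delta_r = rs (all reinforcers omitted).
    return -t * (c + d * r_s) / (r_s ** 0.5)

# Reinforcer rates spanning the single-schedule studies (rft/hr).
for r_s in (3.33, 10, 30, 120, 164):
    print(f"{r_s:>6} rft/hr: {log_prop_eq2(1, r_s):.2f}")
# With d this large, the function peaks at rs = c/d (about 0.09/hr), far
# below any rate studied, so predicted resistance decreases over the
# entire range, as in Figure 8.
```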

We have seen that Equation 3, which is just like Equation 2 with a term for the context of reinforcement in its denominator, can account for variations in resistance to extinction when the context varies in multiple schedules. Equation 3 cannot, however, be used for single schedules because the discriminative stimulus and the context are undefined (see Cohen, 1998 for a thoughtful discussion of these matters).

Omitted reinforcers: An alternative measure of resistance to extinction

In a study of acquisition and extinction of autoshaped key pecking where the probability of reinforcement varied between groups, Gibbon et al. (1980) found the usual PREE: The lower the probability of reinforcement, the greater the persistence of extinction responding over trials. However, the PREE disappeared when the data were examined in relation to the number of programmed (or "expected") reinforcers that had been omitted during extinction. Therefore, rescaling extinction in relation to omitted reinforcers may provide an alternative to Equation 2, which was proposed to resolve the problem of the PREE for BMT.

Gallistel and Gibbon (2000) discussed the detection of nonreinforcement in the context of their Rate Estimation Theory (RET; Gallistel, this issue, formalizes the detection of changes in reinforcer rate more generally in terms of information theory). RET assumes that during extinction, the decision to respond depends on the ratio of time from the last reinforcer to the average time between reinforcers during training; when this ratio exceeds a threshold criterion value, responding diminishes. Thus, if the time between reinforcers during training is doubled, the time required to reach a given threshold criterion must also be doubled. More generally, RET predicts that extinction to a given criterion will occur after the omission of a constant number of reinforcers, as found for autoshaped key pecking by Gibbon et al. (1980; see Gallistel, this issue). Thus, if the number of reinforcers omitted to some standard extinction criterion is the same after training with CRF and PRF, the PREE may be explained directly, without invoking a strengthening effect of reinforcers that is offset by disruptive effects of nonreinforcement, as in Equation 2.
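A simplified sketch of this decision rule (the threshold value and the stop rule itself are illustrative simplifications of RET, not its full formulation) shows why time to criterion scales with the training interreinforcer interval while the number of omitted reinforcers stays constant:

```python
def ret_extinction_sketch(train_rate_per_hr, threshold=4.0):
    """Responding is assumed to stop once the time since the last
    reinforcer exceeds `threshold` mean interreinforcer intervals from
    training (threshold = 4 is an arbitrary illustrative value).
    Returns (hours to criterion, expected reinforcers omitted)."""
    mean_iri_hr = 1.0 / train_rate_per_hr
    hours_to_stop = threshold * mean_iri_hr
    omitted = hours_to_stop * train_rate_per_hr  # always equals threshold
    return hours_to_stop, omitted

for rate in (10, 30, 120):
    hours, omitted = ret_extinction_sketch(rate)
    print(f"{rate:>3}/hr training: {hours:.2f} hr to criterion, "
          f"{omitted:.0f} omitted reinforcers")
# Leaner training takes longer in clock time (the PREE), but the number
# of omitted reinforcers to the criterion is constant.
```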

The predicted constancy of omitted reinforcers to an extinction criterion does not, however, hold for free-operant behavior in multiple schedules. For example, Nevin and Grace (2005, Experiment 1) examined repeated extinction in free-operant, three-component multiple schedules and found that reinforcers omitted to a criterion of 50% of baseline increased systematically with reinforcer rate in a component. The same ordinal relation holds for the multiple-schedule data reviewed here. Table 2 displays the average numbers of reinforcers omitted to a criterion of 50% of baseline, estimated by linear interpolation, for the data presented in Figures 1, 2, 3, and 6. Note that the number of omitted reinforcers to the 50% criterion is greater for CRF than for PRF (Figure 1), and the same ordering holds for richer vs. leaner components in all of the multiple schedules described above.

Table 2

Average numbers of reinforcers omitted to an extinction criterion of 50% of baseline responding for the extinction data presented in Figures 1, 2, 3, and 6, together with values estimated from three within-subject autoshaping experiments by Rescorla (1999), and the reinforcer rates programmed in each component during training.

Study                       Component schedule   Training rate   Omitted reinforcers
Nevin & Grace 2005 Exp. 1   CRF                  3600/hr         76
                            PRF                  900/hr          40
Nevin 1974 Exp. 2           VI 120 s             30/hr           48
                            VI 360 s             10/hr           8
Nevin et al. 1983           VI 28 s              129/hr          168
                            VI 360 s             10/hr           6
                            VI 28 s              129/hr          193
                            VI 86 s              42/hr           25
                            VI 86 s              42/hr           49
                            VI 360 s             10/hr           7
Nevin 1992 (2-s ICI)        VI 12 s              300/hr          248
                            VI 60 s              60/hr           29
                            VI 60 s              60/hr           85
                            VI 360 s             10/hr           5
Nevin 1992 (2-min ICI)      VI 12 s              300/hr          420
                            VI 60 s              60/hr           77
                            VI 60 s              60/hr           112
                            VI 360 s             10/hr           11
Grace et al. 2003           VI 18 s              200/hr          126
                            VI 90 s              40/hr           19
                            VI 90 s              40/hr           48
                            VI 540 s             6.7/hr          3
Rescorla 1999               100%                 720/hr          23
                            50%                  360/hr          13
                            100%                 720/hr          30
                            25%                  180/hr          10
                            75%                  540/hr          18
                            25%                  180/hr          6
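The interpolation used to obtain values like those in Table 2 can be sketched as follows. The session-by-session log proportions and the reinforcers scheduled per session in the example are invented for illustration, and the exact interpolation procedure shown is an assumption rather than a detail reported in the article.

```python
import math

def omitted_to_criterion(log_props, rft_per_session, criterion=0.5):
    """Reinforcers omitted when responding first falls to `criterion`
    (proportion of baseline), by linear interpolation between sessions.

    log_props       -- log10 proportion of baseline for sessions 1, 2, ...
    rft_per_session -- reinforcers scheduled per session in that component
    """
    target = math.log10(criterion)
    prev = 0.0  # baseline: log proportion = 0
    for session, lp in enumerate(log_props, start=1):
        if lp <= target:
            frac = (target - prev) / (lp - prev)  # fraction of this session
            return (session - 1 + frac) * rft_per_session
        prev = lp
    return None  # criterion never reached

# Illustrative (made-up) extinction function reaching 50% of baseline
# partway through the second session, with 10 reinforcers scheduled
# per session in the component.
print(omitted_to_criterion([-0.15, -0.45, -0.90], rft_per_session=10))
```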

The positive relation between reinforcer rate during training and omitted reinforcers to an extinction criterion is not limited to operant paradigms. Rescorla (1999) replicated the usual PREE in three within-subject autoshaping experiments, and reported that when the extinction data were expressed in relation to omitted reinforcers, extinction was more rapid after partial reinforcement – the reverse of the usual PREE. Because reinforcers omitted to a 50% criterion can be estimated from the data of Rescorla's figures, I have included them in Table 2 for comparison.

There is some systematic evidence that the positive relation between reinforcer rate and omitted reinforcers to an extinction criterion also holds for single schedules. Shull and Grimes (2006) reported that reinforcers omitted to a criterion of 10% of baseline, or to 50% of total extinction responses, increased systematically with single-schedule reinforcer rate, thereby resolving at least the ordinal disparity between the effects of reinforcer rate on resistance to extinction in single and multiple schedules.4 In summary, the finding that the number of omitted reinforcers to criterion increases with baseline reinforcer rate appears to be general to free-operant single and multiple schedules over a wide range of reinforcer rates, and to within-subject discrete-trial CRF and PRF in Rescorla's (1999) autoshaping data as well as those of Nevin & Grace (2005, Figure 1). Speculatively, the constancy reported by Gibbon et al. (1980) may be limited to between-group studies.

The data for reinforcer omission challenge intuitions about discriminative control of behavior during extinction. For example, in multiple schedules with 129 and 10 reinforcers/hr in 1-min components, the average pigeon detected the omission of 6 reinforcers in the lean component, as shown by the decrease in response rate to 50% of baseline. Nevertheless, those same pigeons, in the same sessions, continued to respond in the rich component until 168 reinforcers had been omitted, despite the fact that the omission of rich-component reinforcers must be vastly easier to detect. Why did they continue to peck in the rich component for several more sessions? One possible answer is that their history of frequent reinforcement in the rich component had strengthened their tendency to peck, so that far more stringent evidence of nonreinforcement was required to overcome that tendency. Stated somewhat differently, it may be that the pigeons know that reinforcement has been discontinued in the rich component after a few reinforcer omissions but continue pecking anyway because that is what their histories compel them to do. Either way, the positive relation between reinforcer rate and omitted reinforcers to a criterion is consistent with the notion that response strength as manifested in extinction is an increasing function of reinforcer rate.

Conclusion

The extinction data for discrete-trial CRF-PRF and for free-operant multiple and single schedules, described above, show that if extinction is rescaled in relation to omitted reinforcers, resistance to extinction depends directly on reinforcer rate during training. This ordinal conclusion may be more satisfying (or less amenable to challenge) than the quantitative formulation of extinction proposed by BMT, which relies on free parameters in Equations 2 and 3 to accommodate various data sets described here. Moreover, as noted in connection with Equation 2, BMT can explain the PREE only if the exponent on reinforcer rate, in the denominator, is less than 1.0 so that as reinforcer rate increases, the disruptor term dΔrs in the numerator increases faster than the reinforcer term in the denominator. Despite these limitations in the application of BMT to extinction, the simplest version of BMT, Equation 1, captures the relation between resistance to disruption and reinforcer rate in multiple schedules in a wide range of studies when reinforcement remains in effect. The cost of added terms and parameters in the application of BMT to extinction may be justified by the advantage of its unified framework for the study of resistance to change.

Nevin Highlights

  • Resistance to extinction depends on the strengthening effects of reinforcement as well as the disruptive effects of nonreinforcement.

  • Behavioral momentum theory can explain extinction in multiple schedules, including the PREE in discrete trials.

  • The relation between resistance to extinction and rate of intermittent reinforcement differs for multiple and single schedules.

  • The number of reinforcers omitted to a 50% extinction criterion is an increasing function of reinforcer rate in all data reviewed here.

Acknowledgments

Preparation of the ms. was supported in part by NICHD Grant HD 064576 to the University of New Hampshire. I thank the reviewers of my original submission for helpful comments, and the editors for organizing the meetings and this special issue.

Footnotes

This manuscript is based on a presentation at the meetings of the Society for the Quantitative Analysis of Behavior, May 2011, in Denver, CO.

1These data were originally reported by J. A. Nevin at the meetings of the Eastern Psychological Association, March 1989.

2Grace et al. (2003) also conducted extinction in the constant 40/hr component with continued reinforcement in the alternated component at the same rate as in baseline training, and found that resistance to extinction was greater in the 40/hr component when reinforcement was maintained in the alternated component throughout extinction than when responding was extinguished in both components. This result may be modeled by adding terms to the numerator of Equation 3, but the data are too limited to warrant the addition of further parameters.

3I have ignored the units for the parameters of Equations 1–4 because they are not critical for understanding the ways in which the equations operate. However, dimensional consistency is required for any equation. In Equations 1–4, the left side is dimensionless, so the right side must also be dimensionless. Therefore, in Equation 1, x must be in units of (reinforcers/time)0.5. In Equation 2, the numerator must also be in units of (reinforcers/time)0.5. Therefore, c must be in units of (reinforcers/time)0.5/time, and d must be in units of 1/[time*(reinforcers/time)0.5]. In Equation 3, because the denominator is dimensionless, the numerator must also be dimensionless. Therefore, c must be in units of 1/time, and d (and da in Equation 4) must be in units of 1/(time*reinforcers).

4Baum (this issue) argues that the increasing trend reported by Shull and Grimes may depend on their choice of criteria, and suggests that other measures might support the constancy predicted by RET. He also presents a set of data for repeated extinction following training with VI schedules ranging from VI 1200 s (3/hr) to CRF, and shows that they are reasonably consistent with constancy of reinforcer omissions to several different extinction criteria. However, reinforcer duration varied inversely with reinforcer rate for most schedules in his study, and the extent to which this confound affects the reported extinction criteria is unknown.

References

  • Blackman DE. Response rate, reinforcement frequency, and conditioned suppression. J. Exp. Anal. Behav. 1968;11:503–516. [PMC free article] [PubMed] [Google Scholar]
  • Catania AC. The nature of learning. In: Nevin JA, Reynolds GS, editors. The study of behavior. Glenview, IL: Scott, Foresman; 1973. pp. 30–68. [Google Scholar]
  • Clark FC. Some quantitative properties of operant extinction data. Psychol. Reports. 1959;5:131–139. [Google Scholar]
  • Cohen SL. Behavioral momentum of typing behavior in college students. J. Behav. Anal. Therapy. 1996;1:36–51. [Google Scholar]
  • Cohen SL. Behavioral momentum: The effects of the temporal separation of rates of reinforcement. J. Exp. Anal. Behav. 1998;69:29–47. [PMC free article] [PubMed] [Google Scholar]
  • Cohen SL, Riley DS, Weigle PA. Tests of behavioral momentum in simple and multiple schedules with rats and pigeons. J. Exp. Anal. Behav. 1993;60:255–291. [PMC free article] [PubMed] [Google Scholar]
  • Gallistel CR, Gibbon J. Time, rate, and conditioning. Psychol. Rev. 2000;107:289–344. [PubMed] [Google Scholar]
  • Gibbon J. The contingency problem in autoshaping. In: Locurto CM, Terrace HS, Gibbon J, editors. Autoshaping and conditioning theory. New York: Academic Press; 1981. pp. 285–308. [Google Scholar]
  • Gibbon J, Farrell L, Locurto CM, Duncan HJ, Terrace HS. Partial reinforcement in autoshaping with pigeons. Anim. Learn. Behav. 1980;8:45–59. [Google Scholar]
  • Grace RC, McLean AP, Nevin JA. Reinforcement context and resistance to change. Behav. Proc. 2003;64:91–101. [PubMed] [Google Scholar]
  • Hull CL. Principles of behavior. New York: Appleton-Century-Crofts; 1943. [Google Scholar]
  • Igaki T, Sakagami T. Resistance to change in goldfish. Behav. Proc. 2004;66:139–152. [PubMed] [Google Scholar]
  • Mace FC, Lalli JS, Shea MC, Lalli EP, West BJ, Roberts M, Nevin JA. The momentum of human behavior in a natural setting. J. Exp. Anal. Behav. 1990;54:163–172. [PMC free article] [PubMed] [Google Scholar]
  • Mace FC, McComas JJ, Mauro BC, Progar PR, Ervin R, Zangrillo AN. Differential reinforcement of alternative behavior increases resistance to extinction: Clinical demonstration, animal modeling, and clinical test of one solution. J. Exp. Anal. Behav. 2010;93:349–367. [PMC free article] [PubMed] [Google Scholar]
  • Nevin JA. Response strength in multiple schedules. J. Exp. Anal. Behav. 1974;21:389–408. [PMC free article] [PubMed] [Google Scholar]
  • Nevin JA. Reinforcement schedules and response strength. In: Zeiler MD, Harzem P, editors. Reinforcement and the organization of behaviour. Chichester, England: Wiley; 1979. pp. 117–158. [Google Scholar]
  • Nevin JA. Behavioral contrast and behavioral momentum. J. Exp. Psychol: Anim. Behav. Proc. 1992a;18:126–133. [Google Scholar]
  • Nevin JA. An integrative model for the study of behavioral momentum. J. Exp. Anal. Behav. 1992b;57:301–316. [PMC free article] [PubMed] [Google Scholar]
  • Nevin JA. Measuring behavioral momentum. Behav. Proc. 2002;57:187–198. [PubMed] [Google Scholar]
  • Nevin JA, Grace RC. Behavioral momentum and the law of effect. Behav. Brain Sci. 2000;23:73–130. (includes commentary) [PubMed] [Google Scholar]
  • Nevin JA, Grace RC. Resistance to extinction in the steady state and in transition. J. Exp. Psychol: Anim. Behav. Proc. 2005;31:199–212. [PubMed] [Google Scholar]
  • Nevin JA, Mandell C, Atak JR. The analysis of behavioral momentum. J. Exp. Anal. Behav. 1983;39:49–59. [PMC free article] [PubMed] [Google Scholar]
  • Nevin JA, McLean AP, Grace RC. Resistance to extinction: Contingency termination and generalization decrement. Anim. Learn. Behav. 2001;29:176–191. [Google Scholar]
  • Nevin JA, Tota ME, Torquato RD, Shull RL. Alternative reinforcement increases resistance to change: Pavlovian or operant contingencies? J. Exp. Anal. Behav. 1990;53:359–379. [PMC free article] [PubMed] [Google Scholar]
  • Petscher ES, Rey C, Bailey JA. A review of empirical support for differential reinforcement of alternative behavior. Research Devel. Disabilities. 2009 May-Jun;:409–425. [PubMed] [Google Scholar]
  • Rescorla RA. Within-subject partial reinforcement extinction effect in autoshaping. Quart. J. Exp. Psychol. 1999;52(B):75–87. [Google Scholar]
  • Rescorla RA, Skucy JC. Effect of response-independent reinforcers during extinction. J. Comp. Physiol. Psychol. 1969;67:381–389. [Google Scholar]
  • Shahan TA, Burke KA. Ethanol-maintained responding of rats is more resistant to change in a context with added non-drug reinforcement. Behav. Pharmacol. 2004;15:279–285. [PubMed] [Google Scholar]
  • Shull RL, Grimes JA. Resistance to extinction following variable-interval reinforcement: Reinforcer rate and amount. J. Exp. Anal. Behav. 2006;85:23–39. [PMC free article] [PubMed] [Google Scholar]
  • Skinner BF. Contingencies of reinforcement: A theoretical analysis. New York: Appleton-Century-Crofts; 1969. [Google Scholar]
  • Thorndike EL. Animal intelligence: An experimental study of the associative processes in animals. Psychol. Rev. Monogr. Supp. 2. 1911;No. 8 [Google Scholar]
  • Thorndike EL. Educational psychology: Vol. 2. The psychology of learning. New York: Columbia University Teachers College Press; 1913. [Google Scholar]
  • Tota-Faucette ME. Unpublished doctoral dissertation. Greensboro: University of North Carolina; 1991. Alternative reinforcement and resistance to change. [Google Scholar]

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3335979/
