DRO – An Alternate Reality

Alternate behavior:
All behavior is physical in nature. That is what allows you to take a step, move your eyes or lie down. The core of all behaviors – learned or instinctive – is in your DNA. That is because behavior is a physiological phenomenon, just like muscle tone, blood pressure and vision. If you have no genes for the complex components necessary for sight (eyeballs, optic nerve, retina, etc.), you cannot check out a hot chick from a distance. If you are an Australian Cattle Dog (ACD) puppy and your auditory nerve isn’t hooked up, you will be partially or completely deaf – a common problem with the breed. That is said to be a leftover from the infusion of Dalmatians into the creation of ACDs. The ability to hear is dependent on the hardware – always. If I were going to study behavior, I’d start with those behaviors genetically transmitted from parent to offspring. It’s called the phylogeny of the creature. That’s where it all starts. It’s the foundation.

Behavioral scientists haven’t done that – because they aren’t really scientists. They are haphazard hobbyists. They study minutiae out of the context of phylogeny. They have no idea why specific behaviors happen and they haven’t bothered to catalog which behaviors are common to a specific species. EG: Wolf puppies lick their mother’s mouth when she returns from a hunting trip. If you don’t know that behavior exists, you can’t understand what place it holds in the survival of that species. (It’s a pretty big place, by the way.) You also don’t realize why puppies and dogs like to lick their owners’ mouths and faces. The behavior exists because of genes that were/are functional. It isn’t created or maintained by reinforcement. (Though it can certainly be strengthened by treats and affection.)

Here’s an example of a behavior that illustrates my point. My dog Tucker, at eight months, has taught himself to ‘roll over’. I had no hand in that. No other person has been in contact with him who could have taught him that. He learned it on his own. He enjoys it. I have never ‘reinforced’ the behavior, yet he does it. It’s a complex behavior created by a sequence of moving all four legs, rotating his neck and head and arching his back. The sequence is very exacting. Do any part of it out of sequence or out of concert with other muscles and it doesn’t happen.

Now we get to the vastly common recommendation that to control an undesirable behavior one should cut off the ‘reinforcement’ for it and then reinforce an alternate behavior. It sounds sexy and sophisticated but it is the food of dunces. How does teaching ‘alternate behavior’ fit into a dog’s repertoire?

Here’s why ‘alternate behavior’ achieves other than what is promised. The dog does not make a differentiation between innate and learned behavior – because there is no difference between the two. The capacity to change behavior from nascent forms to functional behaviors is simply a physiological phenomenon. We call that ‘learning’. All animals have the ability to learn – even flatworms like planaria. However, focusing on learning implies that innate behaviors are somehow static and not complete. That presents a false dichotomy. Behavioral scientists call behavior determined by its consequences ‘operant’ behavior. Yet all behaviors are determined by their consequences, even though some are purely reflexive and resist ‘consequences’.

Impala run from predators though the outcome of their running is not a one-to-one success or failure. They survive because the predator only takes one or two out of the herd. Technically their ‘running’ behavior could be an operant behavior – or not. Why is the distinction even important? Virtually all reflexive behaviors can be modified by external events. Some cannot. Pinch the nose of someone and they will eventually breathe through their mouth. Does it matter how much they breathe? How often they inhale? Most behaviors are like that. They are used to perpetuate survival.

EG: Rate of response almost doesn’t exist in the real world yet it is THE criterion worshipped by scientists. Here’s an irony for you. The goal of behavioral psychology was to understand and control the behavior of humans. The method of investigation they have used neither explains human behavior nor allows them to control it. That is what is called a ‘fail’. You would be wise to discount anything coming from something called behaviorism, experimental psychology or behavior analysis. The cycle has occurred at least three times – logical people propose why behavioral science is bogus…so they change the name. That buys them a few years but eventually if it walks like a duck and quacks like a duck it’s going to be recognized as a duck. The leopard can get spray painted but doesn’t lose its spots.

The great problem for behavior analysis is that it focuses on rate of response and denies instinctive behavior. This two-part bias affects how they see the world. If a behavior is part instinctive and part ‘operant’ they claim it’s an operant because there is some reinforcement/punishment component involved. They deny the concept that we are all single organisms with the ability to surf the nature/nurture issue seamlessly. So, what does that word, ‘operant’, have to do with anything? Behavior analysts define that as a behavior that is formed and maintained by its consequences – unlike purely reflexive behaviors. Hmmm. Who says that an organism – any organism – makes that distinction? They don’t. They use reflexive behaviors such as vision to learn how to throw a fastball. It’s a combo behavior. Without reflexes there can be no reaction to reinforcers or punishers.

The concept that there are instinctive behaviors vs. operant behaviors is a bogus human construct. It means nothing. The idea that you can somehow stop an existing behavior by removing external rewards is childish. Once the behavior exists it is always in the dog’s bag-o-tricks. The claim that behavior analysts can predict the likelihood of the behavior based on antecedent reinforcement is a fantasy. Whether Tucker rolls over or not isn’t within the control of an observer. It’s up to him. He is an autonomous creature. If he were Southern, I’d say he has ‘druthers’.

What’s ‘Druthers’?

Druther is a contraction of ‘I’d rather’. When a Southerner asks, “What’s your druthers?” it means ‘what would you hanker to do?’ Hanker means desire. Druthers are more than simple desires. They incorporate the reality that we are autonomous organisms. We have druthers because we must adapt to our environment to stay alive. Once you have accomplished the bare bones of that, you get to have less-than-critical desires. Note: There are three places where druthers are not considered: Russian Gulags, 3rd World sweat-shops and behavioral science labs. That should tell you a great deal about lab rats and how they are treated.

Teach a Conflicting Behavior: DRO

DRO stands for “differential reinforcement of ‘other’ behavior”. You can shorten it to “teach something different.” It is a widely recommended means of fixing behavior problems. For example, if a dog jumps up on people, teach it to lie down instead. There are variations on DRO, such as DRI (teaching an incompatible behavior) and DRA (teaching an alternate behavior). In essence they all mean the same thing – teach the dog something new to replace an existing behavior. Despite the popularity of these tricks, the reality is that they are not suggested because they are the most effective way to stop unacceptable behavior. They are suggested because they avoid considering punishment. This is often presented as the “scientific” way to solve behavior problems. It’s not.

In the history of behavior analysis there is a name that stands out for courage behind enemy lines. Nate Azrin, PhD, was one of the first scientists to study behavior without a bias toward reinforcement or punishment. He started his career trying to correct the preference for positive reinforcement already present in B.F. Skinner’s ideology as early as the 1950s.

“B. F. Skinner’s published views on punishment were well known at the time of my arrival in September 1953 at Harvard University, which I attended for the sole purpose of studying under Skinner. He had coauthored some studies of punishment earlier with W.K. Estes, but had devoted virtually all of his other animal research to the study of positive reinforcement. He was opposed to the use of punishment to influence human behavior, a view strongly expressed in his books Science and Human Behavior (1953) and Walden Two (1948), and indeed shared generally by psychology at large.

My own view at the time was that the strong opinions and ethical views regarding punishment had prevented the serious study of that process to the same extent that was true of positive reinforcement. I believed that punishment deserved more study; more specifically I believed that such study should address some of the same factors, such as the schedule of presentations, as had been found by Skinner to be so important with positive reinforcement.” (Reflections and Comments: JEAB, 2002,77, 373–392 NUMBER 3)

It is hard to disagree with someone who wishes to balance the investigation of behavior into both reinforcement and punishment to create an objective study. Logically, real science doesn’t prefer one natural phenomenon over another. Physicists do not prefer momentum over inertia. By contrast, the bias described by Azrin is stronger now than in the 1950s. It is now an orthodoxy. Behavior analysts, the children of Skinner, hold an overwhelming bias for positive reinforcement over aversive control. You can draw your own conclusions about that and the validity of “behavioral science”. This preference is easy to spot and DRO is one of the best examples that confirm the bias. Consider this comment about changing behavior. It’s from that same Nate Azrin. Remember, his credentials are impeccable. He is just as smart, just as educated and just as much an authority as any other PhD behavior analyst. He worked in B.F. Skinner’s rat lab at Harvard. He had a long and productive career as a behavior analyst. He was the real deal.

“Providing negative consequences is the fastest, most effective means of eliminating unwanted behavior – far faster than developing stimulus control or teaching an alternate behavior.” (Personal communication with the author.)

As you hold that thought, consider that the vast majority of behaviorists and modern trainers recommend "teaching an alternate behavior." That is a direct contradiction of Azrin's simple solution. Their bias leads them to suggest a solution that isn't a solution. The fundamental problem is that DRO is based on an illogical assumption. They believe that teaching new behaviors removes old behaviors. If that were true, teaching you French would make you forget or be unable to speak English. Do you believe that? Au contraire, mes collègues. (On the contrary, my colleagues.)

To show that this isn’t a unique perspective, here is a quote from Dr. Ron Van Houten – a respected behavior analyst. This is from The Effects of Punishment on Human Behavior, Axelrod and Apsche, Academic Press, 1983.

“Another way of suppressing unwanted behavior is to reinforce incompatible behavior. However, just as it can be difficult to teach a new behavior entirely through the use of punishment, it can be very difficult to suppress an old behavior entirely through the reinforcement of incompatible behavior. If reinforcement for the unwanted behavior cannot be completely eliminated, it will likely continue even if several new behaviors are established. Hence, the best formula for suppressing behavior involves reinforcing desirable behavior at the same time that one punishes undesirable behavior. Indeed, as has been pointed out earlier, punishment is most effective when an alternative reinforced behavior that is not punished is available. If, on the other hand, one provides an alternative behavior but does not punish the unwanted behavior, a concurrent schedule of reinforcement would prevail that would be expected to maintain both behaviors at strengths proportional to the amount of reinforcement associated with each behavior.”
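
Van Houten’s last sentence can be put into numbers. What follows is only a rough sketch in Python, not anything from his chapter. It assumes the strict proportionality he describes, and the function name, the behavior labels and the treat counts are all made up for illustration.

```python
def expected_shares(reinforcers_per_behavior):
    """Split responding across behaviors in proportion to reinforcement.

    Assumption: strict proportionality, as described in the quote above.
    Real dogs will not be this tidy.
    """
    total = sum(reinforcers_per_behavior.values())
    if total <= 0:
        raise ValueError("at least one behavior must earn some reinforcement")
    return {name: amount / total
            for name, amount in reinforcers_per_behavior.items()}


# Hypothetical numbers: the dog earns 9 treats for going to its mat, while
# rushing the door still pays off 3 times (a visitor, excitement, etc.).
shares = expected_shares({"go_to_mat": 9, "rush_door": 3})
for name, share in shares.items():
    print(f"{name}: {share:.0%} of responses")
```

With those made-up numbers the old behavior still shows up a quarter of the time. Under this assumption its share only reaches zero if its own reinforcement reaches zero, which is exactly what Van Houten says is hard to arrange.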

To examine this from the ground up, let’s go to a dog example and study how learning takes place.

The Nature of New Behaviors:
When you create a behavior to replace a behavior, the dog doesn’t know your purpose or that one behavior is better than another. It simply goes happily along learning something new. This is like teaching an English speaker the Spanish phrase, “que paso?” The new phrase is integrated into the verbal repertoire of the speaker and used, at will, to elicit a specific response. It does not affect the speaker’s ability to use English phrases such as “what’s happening”, “what’s up?”, “how’s it going”, “hey, dude”, or dozens of other casual greetings. You may choose not to use the new phrase if you are speaking with someone who isn’t familiar with Spanish. After all, why use words that are going to create speed bumps for the listener? Here’s an example of how this works when teaching behaviors.

EG: Teach a dog to turn in a circle. Then teach him to truncate the behavior to half a circle. (If you are using a clicker, simply start clicking when the dog is at the half-way point. Alternate with a word like “wrong”, said in a normal tone of voice, for anything that goes beyond the half-way point.) That means you are now only going to require a half-turn. Teaching the truncated ½ turn doesn’t do anything to the full circle. That a human would call it a mistake if the dog sometimes does a full circle is irrelevant. The organism thinks the new behavior is additive…because it is. However, the addition of a behavior creates no prejudice against the old behavior so it continues to exist and be an option for the dog. Consider buying a new pair of sandals that sits in your closet right next to your insulated snow boots. You have no inhibition against wearing the boots. To the dog, we simply now have Behavior A (full circle) and Behavior B (half circle). That the two appear similar is exactly that – an appearance in the mind of the trainer. That the dog thinks of them as related is an assumption that may or may not be true. We are capable of disassociating things at the drop of a hat. The summer shoes are different from the winter shoes but neither is in the underwear drawer. Selecting one pair over the other is based on knowledge of its particular benefit in a specific context. If you leave the house wearing sandals and drive to a snow resort you need to switch gears and put on your warmer shoes. Aha! We have druthers galore.

Having shoes that appear similar but are treated as different is carried over to many aspects of our behavior. Things may appear completely similar and yet be treated differently based on context. Think of homographs – words that are spelled the same but can only be defined as they are used in a sentence.

“The soldier wound his watch carefully to avoid irritating the wound on his wrist.”

Did you understand the sentence? One word is a verb and the other is a noun, but it looks like the same word. The point to take away from this is that animals make associations as they make associations, not based on a human assumption about what they know or don’t know. Unless you test them logically, making assumptions that a dog “knows” something leads to sloppy control. In the case of this homograph the same written word triggers two separate meanings. The meaning of each is based on context created by syntax, the order in which the word appears in the sentence, and other cues – subject, object and predicate. Now let’s look at how assumptions about additive behaviors influence performance.

A Practical Example: Drop on Recall
Consider a classic obedience behavior called “Drop on Recall.” The dog is asked to come and then, half-way to the handler, is told to drop to the ground. Then the dog is asked to finish the ‘come’. The dog knows both cues as separate things. Come means go to the handler’s position. Drop means lie down. Each has its own power to control behavior in real time. However, when placed in a sequence something odd happens.

Within a few repetitions of the new pattern, ‘come’ followed by ‘drop’ will become a new behavior that needs only the word ‘come’ to trigger. Now the dog comes toward the handler and drops automatically without being asked. This is a huge point. The dog’s natural ability to anticipate has trumped its ability to listen to commands in real time. Come suddenly doesn’t mean “come” anymore but drop still means drop. The embedded behavior is unchanged. The triggering command has morphed. This is no different than someone bringing sandals and snow boots from their closet when you ask for “shoes.” The brain must be able to hold or cancel associations at any given moment to satisfy the demands of a specific context.

This leads to a problem for the obedience trainer. Come will turn into drop on recall gone wrong as the dog stops paying attention to the commands in real time and anticipates what is going to happen – thereby leading to a combination of the two or more behaviors. This is how a chain is formed. That it forms when we don’t want it or when we want something else to happen is irrelevant to the reality. The dog encapsulates the behaviors that appear to be connected by tangible reinforcement. It stops listening and performs a knee-jerk behavior. Now let’s put it into the context of DRO.

DRO and Front Door Arousal: An everyday occurrence
Take a dog that rushes the front door. Teach it to go to a different location. Reinforce that heavily. Nothing has changed except adding a behavior to the dog’s repertoire. (We just bought a pair of sandals during the summer.) Though we have added a new behavior, the old skill still exists. Heavy, continual reinforcement is needed to prevent the old behavior from returning on its own. Slack off on the reinforcement and the dog senses a variation in consequences. That triggers variability and makes any other behavior more likely to occur. Just like the English speaker saying, “what’s happening?” to someone who doesn’t speak English. If the English speaker also knows a Spanish greeting it will be triggered instantly. Both behaviors exist in the speaker’s repertoire and can be used at any time that communication is needed. As an aside, this is what the term fluency means.

The net cost for the trainer is to invest far more effort into teaching and maintaining the alternate behavior. In this case, you lose time fussing with a process that sucks up training time and doesn’t really solve the problem. Few owners will remain diligent enough to make this process work because it requires perpetual diligence, an always hungry dog, treats at the ready and nothing coming through the door that is more important to the dog than the treats that have been traditionally connected with the event. As a recommendation to rapidly solve an unacceptable behavior, DRO fails, every time. Considering that the primary cause of owners taking dogs to shelters is problem behavior, this is a serious problem. Tax the owner’s patience too far and they get rid of the dog. Shelters kill about 80% of what they get. Anything too complicated or too time-consuming puts the dog’s life at risk – and that is not an exaggeration.

EG: Sophia Yin, DVM, did a study of her remote feeder machine. It took her four months to teach 20 dogs to lie down rather than rush the front door. It took hours of devotion by the owners to practice the “solution” and in the end, also required the assistance of live trainers. (This raises the question of why the remote feeder was important. Why not just have live trainers do that work?) In all real-world problems there are time and cost constraints that must be integrated into the solution. Now you know why Nate Azrin’s quote describes reality and DRO does not. You can’t block existing unacceptable behavior by playing “bait and switch”. Unless you confront the original behavior, you will never remove “que paso” or objectionable terms like “The N Word” from someone’s vocabulary.

Nailing a Solution:
Rather than leave you with just a critique of DRO I am going to give you a simple way to prove whether the criticism is justified. Find a dog that rushes the front door. They are all around you. Then find one that doesn’t like having its nails trimmed. They, also, are all around you. Now ring the doorbell. When the dog comes flying forward barking hysterically, restrain the dog and nip the tiniest bit off one nail. Then let the dog go. Ring the bell again. If the dog comes to the door, restrain the dog and nip the tiniest bit from another nail. Repeat. As soon as the dog stops rushing the door, start giving treats for “not rushing.” Done. This process can be accomplished in a couple of five-minute sessions and periodic updates if the behavior starts to return. No harm came to the dog. No horrible side-effects of punishment occurred. Compare that to weeks of trying to teach the dog an alternate behavior and add perpetual monitoring for maintenance. One additional thought. I am not limited to using only punishment or positive reinforcement. In the example of the nail-clippers I would start using positive reinforcement for DRO after the punishment procedure has inhibited the dog from rushing the door. In essence, DRO is a procedure that is logically only part of a solution. Without the punishment component (wearing sandals on a cold, rainy day will punish sandal selection on cold, rainy days) teaching an alternate behavior is unworkable.

Nate Azrin was a brilliant scientist because his research did what good science always does – it reveals nature. People who allow ideological biases to govern their methods invariably create complex, ineffective methods that serve themselves rather than those who need help. The next time someone suggests teaching an alternate behavior to get rid of unacceptable behavior teach them to be silent, instead. Good luck.


One thought on “DRO – An Alternate Reality”

  1. Gary,
    Another great article! I have a question regarding an example you gave here.
    What is the proper way to train my ACD in the “drop on recall” behavior? I’m plagued by issues such as this. Should I use a separate/unique word instead of the word “down”?
    Thank you
