Variable Consequences – Key Ingredient of Learning

Variable Consequences – Part 1.

“Consistency is contrary to nature, contrary to life. The only completely consistent people are the dead.” – Aldous Huxley

While Huxley may not have known much about dog training, his point was well taken. For an organism to stay alive, it must be able to vary its behavior according to changing circumstances. Even though most forms of dog competition require consistent performance, behavioral variability is a key ingredient in creating exceptional performance levels. Trainers who understand how to use both variability and consistency are better able to create and maintain their dog’s great performance.

The first rule of variability is a simple one: variable consequences cause variable behavior. Most scientific studies of behavior have avoided this aspect of behavior entirely. Instead, the experimenter sets up a pass/fail system of reinforcement called “successive approximation.” Any behavior that looks like a step toward the target behavior is reinforced. Behaviors that are not similar to the target behavior are not reinforced. Eventually, the reinforced behaviors become predominant and the unreinforced behaviors disappear. Soon the animal is doing one behavior, over and over again. If you are a behavioral psychologist and only want to count the rate at which an animal performs a single unit of behavior, you are in hog heaven. If you are a dog trainer and want your dog to learn a lightning-fast finish, you have just put your dog into a behavioral straitjacket. Successive approximation depends on consistent reinforcement, and consistent reinforcement causes consistent behavior. The tiny increments of change from repetition to repetition that allow the animal to learn smoothly also prevent it from learning by leaps and bounds. If you want the animal to learn dramatically enhanced versions of the behavior, you must either take a very long time or look to other means of shaping behavior.

OK, here we go. I said that variable reinforcement causes variable behavior. This is really just a restatement of the prime rule of operant conditioning: operant behavior is determined by its consequences. Change the consequences and you change the behavior. If you make the consequences highly variable, you will make the behavior highly variable, i.e., you will cause it to “wiggle.” As the dog tries to adapt to this new set of rules, he is likely to start offering variations of the target behavior. The variations are often harder, quicker, more forceful versions of the original behavior. Mixed in with these desirable effects of variable reinforcement, you may also see slower, lower, weaker or simply different variations of the behavior.

The central theme of this process is to invoke your dog’s ability to experiment with a new behavior. The benefits of learning to vary the reinforcement in this fashion are considerable. First, you teach the dog that after a behavior is shaped with successive approximation, it is time to uncork the creativity and experiment a little. Second, this format allows the dog to bypass the tedious “stair step” style of raising criteria. While some behaviors require methodical increases in complexity, other behaviors are almost impossible to get with such linear methods of shaping. Third, varying the reinforcement teaches the animal, “If at first you don’t succeed, try, try again.” It is difficult to overestimate the importance of teaching a dog to learn in a persistent fashion.

Now, fair warning: if you have a great deal of experience training dogs, this next part is going to feel really uncomfortable. I am going to ask you to take a behavior that you have already shaped and start to make it wiggle. To get you started, I have a list of reinforcements for a series of repetitions. The idea is to take a behavior that has been shaped using consistent reinforcement and suddenly put the reinforcement schedule on a roller coaster. Your goal for this exercise is to watch closely and see what effect this change has on your dog’s behavior. Ready? Here goes.

Random Consequences Project:

NOTE: Use the cue to get the behavior to occur. Only give the reinforcement described if the animal successfully performs the behavior. If the behavior disappears, drop your standards and go back to a one-to-one rate of reinforcement for 5 repetitions, then resume this list at the point where you left off. If you want to use this list over and over again, simply start new sessions in different places, or start at the end and work in reverse order.

1. Click + 10 treats, praise, affection and babytalk

2. No click, no treat

3. Click, no treat

4. Click + one treat

5. No click, no treat

6. Praise and affection only

7. Click + 1 treat, praise, affection and babytalk

8. No click, no treat

9. No click, no treat

10. Click, no treat

11. No click, no treat

12. Praise and affection only

13. Click + 3 treats, praise, affection and babytalk

14. No click, no treat

15. Click, no treat

16. Click + one treat

17. No click, no treat

18. Praise and affection only

19. Praise, affection and babytalk

20. No click, no treat

21. No click, no treat

22. Click + one treat

23. No click, no treat

24. Praise and affection only

25. Click + 5 treats, praise, affection and babytalk

26. No click, no treat

27. Click, no treat

28. Click + one treat

29. No click, no treat

30. Praise and affection only
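If you like to print a fresh cheat sheet before each session, the list above and the rules in the NOTE can be sketched in a few lines of Python. This is only a rough illustration, not part of the exercise itself. The shorthand codes, the function names `session_plan` and `with_recovery`, the choice to wrap around to the top of the list when a session starts in the middle, and the reading of "one-to-one" as one click and one treat per repetition are all my own assumptions.

```python
# The 30 consequences from the list above, in shorthand (an assumed notation):
# C = click, T<n> = n treats, P = praise and affection, B = babytalk, "-" = nothing.
SCHEDULE = [
    "C T10 P B", "-", "C", "C T1", "-", "P",   # items 1-6
    "C T1 P B", "-", "-", "C", "-", "P",       # items 7-12
    "C T3 P B", "-", "C", "C T1", "-", "P",    # items 13-18
    "P B", "-", "-", "C T1", "-", "P",         # items 19-24
    "C T5 P B", "-", "C", "C T1", "-", "P",    # items 25-30
]

# Assumption: "one-to-one" means one click and one treat for each repetition.
RECOVERY = ["C T1"] * 5


def session_plan(start=1, reverse=False):
    """Return all 30 items, beginning at position `start` (1-based).

    With reverse=True the list is first turned end-to-start, so start=1
    begins with item 30. Wrapping around to the top is an assumption.
    """
    items = SCHEDULE[::-1] if reverse else list(SCHEDULE)
    offset = (start - 1) % len(items)
    return items[offset:] + items[:offset]


def with_recovery(plan, completed):
    """Insert 5 one-to-one repetitions after `completed` items,
    then pick the plan back up where it was interrupted."""
    return plan[:completed] + RECOVERY + plan[completed:]


if __name__ == "__main__":
    # Example: a session that starts at item 13 of the list.
    for number, consequence in enumerate(session_plan(start=13), 1):
        print(number, consequence)
```

Starting each session at a different `start` value, or passing `reverse=True`, keeps the dog (and you) from memorizing the pattern while still using the same 30 consequences.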

For the majority of dogs, this type of system will trigger some pretty interesting variations on the target behavior. As you work through this list over a series of sessions, you will start to see some really good variations and some really bad ones. You can start adding a touch of consistency to the process by giving extra recognition for better versions of the behavior, while using “wrong” for marginal performance. If you do see a really exceptional version of the behavior, give the dog a single click and treat, not an extra batch of treats with enthusiastic praise and affection. Why? Think about what I’ve said. An unexpected jackpot causes variable behavior. If you just saw something you wish to capture intact, use a predictable consequence: one click, one treat.
