Getting “Wrong”, right.

Variability and getting ‘wrong’, right.
The least studied topic in training and behavior is variability. The major focus of training is to create a repertoire according to a template for excellence. Deviation from that template is called an error. This misses a fundamental aspect of learning. To learn, one must offer a deviation from former behavior. Without deviation there is no learning. The cause of deviation is innate variability. This ability is critical for survival because the environment is dynamic. One-trick ponies die out if their trick stops working. Variability ensures that an animal can adapt to changing circumstances. This leads to a simple point. What constitutes an error in one setting may be the solution in another. If you punish the deviation as an error, you may unintentionally block its use elsewhere. The better option is to learn how to invoke or inhibit deviation in both learning and performance.

Understanding Variability: Training Exercise
Take a clicker and associate it with treats. Simply replace the words “Good dog” with the click, and soon the dog will start showing a visible startle response when it hears the click. Even if you don’t use a clicker in your work, seeing the visible startle will help you understand this task. If you do not want to mess with your working dog, a very reasonable decision, get some other dog for the experiment.

Ask your dog to do a single behavior like ‘down’. (This is best done with a dog that is not a working dog. A working dog’s ‘down’ is proofed and will not deviate as needed.) Do about ten repetitions. Say ‘down’, the dog lies down, click and treat. Now pay very close attention to repetition #11. This time, click, but do not give the treat. On repetition #12, pay very close attention to any deviation that may occur. Now go back to one-click, one-treat for several repetitions. On about repetition #20, click and give the dog ten times the normal treat and add all kinds of vocal praise and/or play for about 20 seconds. Try to get the dog totally jacked up. Now ask for the ‘down’ again and watch for deviation – any deviation. That might be a faster down, a slower down, a down in a different location, or a down of shorter duration. Select one. Start making that the target behavior by attaching one-click, one-treat to it. The unexpected outcome displays a general rule – consistent consequences tend to create consistent behavior while unpredictable consequences trigger variability. This may include harder, faster, stronger versions of the target behavior or may trigger a completely different behavior or none at all. Don’t worry, once learned, a behavior doesn’t vaporize because the consequences wiggle. If that were true, learning French would affect your ability to speak English or vice versa.

Wrong: A powerful tool when used consistently
We all understand the need to mark good behavior with praise and bad behavior with “NO”. There is a third option. I use the word “wrong” to mark behaviors that will be neither reinforced nor punished. This consequence is similar to putting the wrong key in a lock or attempting to open a car that isn’t yours with your keyless entry. The word ‘wrong’ tells the animal that a behavior will not pay off at this time. It does not act to prevent the behavior from being reinforced later. It gives you a gradient control over errors that does not leave a residual inhibition. Over time, the dog will actually self-correct when it hears the word ‘wrong’ or will enthusiastically switch to a different behavior. It is like wiping a whiteboard clean of information left over from the previous class. Often a dog will tend to offer a new behavior that has been reinforced most, most recently – meaning ‘from the last class’. Using ‘wrong’ is especially important if you want to switch from one behavior to another in a single training session and not have the last behavior continue to assert itself. It’s like saying, “not now, maybe later”. It is also especially useful if you wish to trigger variability, as in our last exercise. The association takes a couple of weeks to take full effect. It gives you a way to instantly terminate an error with no prejudice. That cuts down on the possibility that the dog will continue to offer unwanted behavior, and it does so without applying aversive control that may affect the dog’s willingness to keep working.

Now that we can trigger variations, what’s next? All training starts with a partially formed tendency to do a behavior. A dog has to feel comfortable offering you new behaviors or variants of old behaviors or it becomes unwilling to go 100%. Think of the concept of “slipping the clutch” on a manual transmission. You can’t change gears unless you disengage the drive train. The same is true of behavior. Unless the dog clearly knows when it can experiment and when it’s time for serious work, both areas suffer. The heavy-handed trainer gets a dog that shuts down in training, and the dog that isn’t taught to leave variability at the training grounds adds variation when you need perfect control. That means you need to have clear signals that tell a dog which environment is in play. Want another reason to use a clicker? They are never used in performance. The dog learns that the sound of the click is the time to come up with new things. The absence of the click means serious business. However, if you have to pull your dog out of service to fix a problem, the clicker makes that possible. At the Oklahoma City Bombing, the search dogs were overloaded with ‘dead smell’. Some of them didn’t react well because it was so far beyond anything they had been trained to do. A clicker-trained dog is more likely to be able to leave the search grounds, receive five or ten minutes of training and get back into the game. Without a clear message that training time is on, the dog remains ‘shut down’.

Controlling variability:

Once you have a behavior the way you want it, there are a few things that still need to be done. You have to punish the dog for ‘failure to perform the correct behavior, correctly, in a timely fashion’. This event should be a strong message that neither failure to initiate the behavior nor failure to perform it as trained is ever acceptable. If you need a reason to train to this level, there are several.

1) If you use a powerful punisher you will be able to maintain the high level of performance with lesser punishers to the point of simply a verbal signal.
2) Punishing the error after the animal knows the correct behavior is critical in reducing or eliminating things like ‘false positives’ or failure of the dog to ‘out’ after a suspect is properly restrained. Note: Eliminating false positives is greatly enhanced if you have correctly taught the word “wrong”.
3) By doing this process after the behavior is established with lots of positive reinforcement, the contrast between consequences is greater. This locks the behavior like engaging the clutch. This will reduce sloppiness in the future.
4) While variation is the key to learning, punishment is the key to stopping behaviors, including unwanted variations. We do this naturally – we ‘correct’ behaviors by applying aversive control.
5) If the dog knows the behavior thoroughly, you can punish failure rather than trying to use e-collar nicks or continuous stimulation to compel a behavior. This cuts down on how often you must correct your dog and keeps the three possible consequences (Positive reinforcement, absence of reinforcement and punishment) clearly defined.

Practical Examples:

I was working with some military handlers doing live-fire tactical training. One of the handlers was relatively new and his finger was on his controller continuously. Though the level of shock wasn’t extreme, he was tapping the transmitter on almost every move into the training exercise of breaching a door. Several things were happening, none of them good.

1) The dog was getting used to the shock. Ivan Pavlov demonstrated this 100 years ago. He associated a low electric shock with food. After about 50 repetitions, the electric shock came to mean the same thing as the clicker. In the old days I used low levels of shock to control deaf dogs. They initially overreact because the shock is unusual. Then they become mildly distracted by it and finally they come to love it.
2) By having systematic behaviors that always lead to shock, the handler was teaching the dog hand signals that were perfectly connected to the consequence. The dog would see a hand in a shirt or cargo pants pocket and brace himself. Knowing that a shock is coming is the fastest way to be conditioned to blow it off.
3) There was no signal that meant “No” to tell the dog which aspect of his behavior was being tickled. The critical nature of using a signal to mark the exact behavior is rarely appreciated. Using stimulation without a marker is like throwing a hand grenade instead of firing a sniper rifle. Markers are auditory or visual aiming devices that influence behavior based on the timing of their presentation. That leads us to another rarely known fact – latency isn’t a problem if you use a marker. I have a blog post about that – you can find it here.
https://clickandtreat.com/wordpress/?p=731
