Thursday, February 25, 2010

The Three Parts of Operant Conditioning

What we call "dog training" is also called "operant conditioning."

For all the mumbo-jumbo you hear about dog training, there are only three basic parts to it: positive reinforcement, aversive reinforcement, and extinction.

Positive reinforcement is any kind of consequence that causes a behavior to occur more often. Examples include food, praise, and play. In some situations, positive reinforcement can be the removal of an aversive reinforcement.

Aversive reinforcement is a consequence that causes a behavior to occur less often. Examples include a leash pop, a harsh sound, or any kind of nonverbal aversive communication made through body movement or positioning. In some situations, punishment can also be the removal of a (positive) reinforcement.

Extinction is simply a complete lack of response. The nonresponse should be total -- no eye contact, no noise or sound triggered by the dog, and no responsive body movement. The dog is invisible.



Watch the short animated clip above, and you will note that the cartoon Cesar Millan uses all three methods to train South Park's Eric Cartman after "Super Nanny" collapses and goes insane in the face of the trials and tribulations of this spoiled-rotten child.

Step one in the Cesar Millan bag of tricks is to extinguish Cartman's negative behavior.

What Millan is doing by ignoring Cartman is signaling that a "new sheriff" is in town -- one that will not be overly reactive.

When Millan talks about "calm, assertive energy" what he is really saying is that the owners have to react less.

A calm owner is not sending a lot of signals, and an assertive owner is not sending tentative or confusing signals.

Send fewer signals. Send clearer signals. Do not be drawn into the dog's or the child's drama in a kind of call-and-response situation.

By ignoring young Eric Cartman at the beginning, Millan is creating a "silence" which forces Cartman to pay attention. Suddenly he is not running the show, which means he now needs to pay attention to see how (and if) he can regain control. Cartman is used to running the show and he thinks that is his job. Millan is teaching him something else.

Cesar Millan puts up with a certain amount of nonsense from young Eric, and then he sends a negative signal. The signal has two components; one is tactile, and the other is oral (but not verbal).

Even as he sends the "punishment" of an unambiguous negative signal, Millan is also maintaining his control by ignoring Cartman.

Cartman is not able to "lead" the group by acting out. In fact, both Millan and Cartman's mom are ignoring him! He has gotten a negative reaction, but he has not gotten an empowering response that makes him the center of attention.

At the end of this clip, Millan is seen walking Cartman.

Walking does several things simultaneously -- it gives Cartman something physical to do, and it helps to drain off "the jitters" that both kids and dogs naturally have if they are kept cooped up for too long.

Taking Cartman for a walk also forces the Mother to spend "alone time" with Cartman -- a major reward for Cartman (attention-seeking is one reason he may have been acting out).

The act of taking Cartman for a walk also puts the Mother in the role of initiating, leading and ending the activity.

In short, walking the child or the dog is at once a reward (time with mother), a remedy (activity soothes anxiety), and a recapitulation of the pack hierarchy (the Mother is reinforced as the pack leader).

Watch any episode of The Dog Whisperer, and you will see Millan use these same three techniques over and over again.

And to recap, he is using ALL of the tools of dog training:

  • Positive reinforcement (reward)

  • Aversive reinforcement (punishment)

  • Extinction (nonresponse to minor inappropriate behavior that is not self-reinforcing).


Is Cesar Millan using dog treats and a clicker for positive reinforcement? No, not generally. But yes, that too is a way of giving positive reinforcement. Contrary to what some dog-training faddists might have you believe, however, click-and-treat is not the only way to give positive reinforcement.

Is the punishment harsh? No. Cartman is not being spanked, much less whipped with a telephone cord. What is happening here is simple communication. The goal is to get the child or the animal to understand what is not wanted, as well as what is wanted. Aversives do not need to be harsh for either a human or an animal to want to avoid them.

You will note that Millan does not always use a leash to train. It shocks people that Millan actually touches a dog! Oh. My. God.

But Millan is no fool -- he knows dogs in houses do not (and cannot) spend their lives on a leash, but mild corrections are still needed. The answer: a simple tap with his fingers and a harsh (but not loud or overly threatening) sound serve as a warning that the immediate behavior is improper.

Millan's timing is excellent. He generally corrects dogs in mid-action, and so there is no ambiguity as to what is being said. Sometimes he will "body block" by squaring up his body with the dog -- a way of punctuating his message.

For the record, your life is a product of the same kind of operant conditioning that is being practiced by Cesar Millan.

You get to work on time because of the prospect of positive reinforcement (praise, pay and promotion) and aversive reinforcement (criticism, demotion or termination).

If you tell a racist joke at the water cooler, and your coworkers turn away and act as if you are invisible, your bad behavior will be extinguished pretty quickly.

Here's a question: Do you think people would stop at a red light if they did not get traffic tickets for running through them?

Should a store owner praise you and tell you what a wonderful person you are when you pay for your goods, but simply look the other way if you steal them? If you steal from the store, should the limit of the store owner's displeasure be to tell you "no" and not praise you?

How do you think society would work if there was only praise and no punishment?

How do you think society would work if there was only punishment and no praise?

Think both of those questions over.

You see, the world needs balance. And it needs balanced trainers who come at the job with a complete set of tools.

As I have noted in the past, I can build a house with only six tools, but I need every one of them to do a credible job.

The fact that I do not use a level and a square as often as a saw and hammer does not make these two tools expendable.

And so it is with dog training.

I can train a dog with only three tools, but I need all three to do a credible job.

I would no more salute a dog trainer who never used aversive reinforcement than I would hire a builder who never used a level and a square, and for much the same reason -- lining things up and keeping them tight makes the entire structure more durable under stress and in bad weather.

And really, isn't that when we need a good house most?

As for Eric Cartman, how did the rest of his training go? Well, let's see:


The entire episode can be seen here.

Notice that young Eric Cartman has settled down pretty quickly.

Is he happy that he is not the center of attention and leading everyone around? Not yet! But Mrs. Cartman is not at her wit's end here -- a glimmer of hope is revealed because for the first time ever, Cartman is getting clear and consistent communication. Part of that communication is that bad behavior has consequences, and that the agenda is no longer being set by the small annoyance at the end of the leash.



In the end, Eric Cartman is completely transformed. No longer angry and out of control, he is getting regular positive feedback for engaging in model behavior.

He has learned the most important rule of society: Do good, get good; do bad, get bad.

But of course, it turns out that young Cartman's needs are easier to fill than his mother's!

When Cesar Millan leaves, Mrs. Cartman finds that she is lonely again, and she reverts to her old ways of making Eric the center of the house, sending the wrong signals, and relinquishing all power to "the little monster".

Any question as to how that ends?

Now to restate a point I have made before: Cesar Millan's way is not the only way to train dogs.

That said, all successful training methods are based on only three components: positive reinforcement, aversive reinforcement, and extinction. Almost everything else is chaining, shaping, timing and repetition -- methods to put a point on the pencil.

Different trainers will have different mixes of positive to aversive reinforcement, and some will use extinction to better effect than others.

Some trainers are better at timing and nonverbal communication than others.

Different trainers will have different preferences in terms of rewards and aversives, and most good trainers will change those rewards and aversives based on the type, temperament and preference of the animal.

That said, if a trainer does not ever use extinction and does not ever use aversives in training, you do not have a complete trainer or a complete training system.

Can a man with just a hammer and a saw build a house?

Sure.

But remember that the house will be slower to build, will leak when it rains, and will be hot in summer and cold in winter.

Some people are fine with that -- "Hey, it's just a little cabin in the woods. I'm almost never there."

Other folks demand a higher standard. They want a carpenter with a tape measure, a square and a level as well as a hammer, a saw, and a glass cutter.

Not only will the house that carpenter builds go up faster, it will also do the job better in the long term.

Yes, both carpenters will be working with just a saw and a hammer most of the time, but those four other tools, properly used, actually do make a world of difference.