
Episode 70 | Schedules of Reinforcement, “Baked In” Behaviors & How Dog Training Can Help You Take The MCAT

A listener who is studying for the MCAT exam wrote in to ask if there were any dog training scenarios that could help illustrate some of the terms she needed to know for the psychology section of the exam. Annie, who has learned most of what she knows about dog training from working with dogs rather than from studying terms or taking exams, does her best to help make some "science-y" concepts more understandable through the lens of dog training and human behavior as we experience it in everyday life. She talks about schedules of reinforcement, learned behaviors vs. preinstalled behaviors, learning by observation and more.

Transcript:

 

Annie:

 

Hello, human friends, Annie here.  As I've mentioned before, I've been recording mostly in my neighbor's apartment while he's out of town so that I can escape the craziness of my apartment, but he doesn't have air conditioning. So I couldn't deal with sitting in his apartment to record today. So I came back to my apartment to record after sweating profusely while trying to record down there. 

 

And then I realized you can't really have the air conditioning on anyway when you're recording a podcast because of the background noise, which made me think about all the sweaty podcasters working from home right now. So I blast the AC really, really high, just long enough to try and cool down the room while I record. So anyway, Hi!

 

I wanted to respond to an interesting question I got from a listener who has been in touch with me, uh, before. Supriya is, uh, her name. I might be saying it wrong. Supriya. Such a pretty name that sounds like surprise.

 

She wrote:

 

“Hey Annie, I am currently studying for the psychology section of the MCAT, and while I've never heard of many things in this section before, I am totally nailing the section on classical and operant conditioning, because I've been listening to your podcast for a while now. I was wondering, if you happen to have time before my exam in September, would you be able to expand on operant conditioning in terms of dog training, which is what makes sense to me.

Specifically, I'm studying reinforcement schedules, innate versus learned behaviors, escape and avoidance learning, the Bobo doll experiment and associative versus non associative learning. Then there's also biological constraints on learning, which I'm studying specifically for humans. But I'm curious about this in dogs too. Of course, if you don't have time, this is completely okay.  Just thought I would ask considering I've learned so much from you already.”

 

Isn't that a nice email to get, isn’t that a cool email to get? First of all, just wanted to say that I'm flattered that you feel that you've learned so much. And I'm amazed that anybody is asking me for MCAT advice. I haven't taken a science class since high school and did not do well in science classes in high school. I think I had one math and science requirement in college, and I took an HTML class, which was probably the most useful class I took in all of college.

 

I did take some psychology in college, but I just wanted to preface any answer to this question by saying that I am, I would say, an educated layman on all things relating to the science of behavior.  I think it's an endlessly interesting branch of science that I wish everyone knew more about. And you know, it's an area of science that in my experience in all of my schooling was hardly ever even mentioned. 

 

Now it's something that I love to learn about, love to think about, but that is really only because of my interest in dog training. I didn't get interested in the science of behavior and then follow that path to dog training; I got interested in becoming a dog trainer, not really knowing what that was gonna mean. I mean, I figured I was going to learn how to somehow make a living helping people with their dogs.  But the finer points of how to train dogs or how I was going to turn that into a business, all of that has been a journey. And it's a journey that has made me understand the world in such a new way.

 

It's made me understand all of animal behavior, including human behavior, in a new way. And it's a way of understanding things that makes a lot of sense because we are all behaving all the time. And we are surrounded by people who are behaving all the time. And dog training is just an application of this science that is, you know, it's rewarding.  When you can think about operant conditioning while training your dog, because you can start seeing how the same principles — things are being reinforced, things are being punished — how those apply to your own life.  Your own choices, the things you do, the things your children do, the things your employers do.

 

And when you're living with your dog, it's like you can kind of experiment with how to teach in the most effective way possible and how to make your dog a better learner. And you're kind of doing the same things that dolphin trainers are doing, but you are doing it in a way where it's like you're playing a game with your dog. 

 

Anyway, my entree into the fascinating world of behavioral science was through dog training. And what's interesting is there are people who I know who have discovered the fascinating world of behavioral science, behavior analysis through other fields. Be it working with kids, working with gymnasts, working in workplace management. There are leaders in each of these fields that are trying to systematically show people how each of these areas relates to learning and how a rather basic, even crude understanding of learning theory and the science of behavior — operant conditioning, classical conditioning — can be so transformative.

 

Now this is not to say you cannot train a dog if you don't understand the basics of behavioral science. There are many people in many areas, not just dog training, who have figured out how to use positive reinforcement, and use it wisely and well to teach others. And they may not know any of the sciency terms. And this doesn't make them magicians or anything. It just makes them smart. They've figured out what works and what doesn't work.

 

On the other side of the coin, of course, there are also people who have figured out what works, but they've tacked on all kinds of weird superstitions. In the dog training world, this is the stuff of all the Cesar Millan talking about “energy” and walking through the door before your dog. And also, you know, not committing to using the least invasive methods possible in training. But what they're doing may work as well.

 

I guess what I'm trying to say is like, it's possible to fly a plane without understanding all the physics that goes into keeping that plane in the air. And at first I think I was that kind of pilot. I was just, you know, learning how to fly a plane, in terms of how do I train a dog.  But then I was like, Oh my God, this is so interesting. I want to know why this is working.  But it's also possible to, to fly a plane and not really care why it's staying in the air. Those people could be benefiting from the science without totally understanding it.

 

And then on the other side of things, it's possible to fly the plane while whistling happy birthday the whole time, and be convinced that it's your whistling that's keeping the thing in the air, which is how I would define a lot of dog training that's existed in the last century.  For instance, Rudd Weatherwax, who I talked about in the last episode, who was Lassie's trainer and was a big believer in using some dog training techniques that worked thanks to classical conditioning and operant conditioning, but were perhaps, in my opinion, a little misguided.

 

So I wanted to talk a little bit about some of these things that Supriya was asking about, but wanted to just preface things by saying, I am not an expert. I am just someone who is excited about applying this stuff to dog training and found it through dog training. And I don't have a master's. I don't have a PhD. I do have a bachelor's degree in individualized study, which is about as meaningless as it sounds.

 

But there are two books on this stuff that I have learned a lot from.  One is, I guess it's kind of a college textbook. It's called Behavior Principles in Everyday Life by John D. Baldwin and Janice I. Baldwin. The other one is by Pamela J. Reid, PhD, called Excel-Erated Learning: Explaining In Plain English How Dogs Learn And How Best To Teach Them.

 

So the first thing that she asks about, yeah, is schedules of reinforcement. Now everything is under some schedule or schedules of reinforcement, and there are schedules of punishment as well. And schedules can really affect behavior: how the behavior happens, when the behavior happens, if the behavior even happens, the intensity with which the behavior happens. If you pick up either one of these books, you can read all the fancy names for the different kinds of schedules of reinforcement, and you can read how each one affects behavior, but I'm going to attempt to explain a little bit about schedules of reinforcement as I understand them.

 

First of all, you have fixed [ratio] schedules, and the way I think of it, with most of the training that I do with my dog or with other dogs, training is on a one-to-one fixed [ratio] schedule. Every time a behavior happens, if that's a behavior I like, I am going to reinforce the behavior. Now, of course, in real life and not in the lab, I might sometimes miss certain behaviors.

 

But the way I think of it, let's say something is happening like a Sit. Sometimes I am going to reward that Sit with a treat. Sometimes I might reward that Sit with a pat on the head or just a smile.  But my goal is to maintain the behavior. And I think the best way to maintain a behavior is to be extremely consistent with your reinforcement.

 

Now, there might be times when I neglect to reinforce him, because I don't realize that he's responded to me giving the cue for Sit, or I'm distracted, or, I mean, it can happen, right?  If there's no reinforcement and also nothing bad happens to the dog, then that behavior is now under extinction, which means it is on its way to stop happening.

 

But, you know, I've built up so much... you know, I love the term mass. Like, this behavior of sitting has been rewarded so many times. Like I've put so much money into that bank account by paying my dog for sitting pretty much whenever he sits that I'm not going to be too worried about that behavior going unreinforced some of the time.

 

A behavior being under extinction does not mean that it's just going to suddenly go away or even that it's totally going to go away at all. It just means it's like that, you know, we talk about the four quadrants, and then there's that fifth, the fifth part of operant conditioning, which is nothing happening.

 

In the Good Dog Training Course, I give an example with a little animation: it's similar to getting paid to go to work. That's a fixed interval schedule. You know, I work four or five days and then I get a paycheck. Well, I don't actually get a paycheck after working for five days, [laughs] but in theory, someone could. And if that happened every Friday, you would probably continue showing up to work on Monday, assuming you didn't totally hate your job and that money was worth it to you, et cetera, et cetera.

 

So imagine, you know, you get paid every week for five years.  One Friday, you come into work and there's no paycheck for you.  There's no explanation. You will probably still come into work on Monday, all else being equal, because that behavior of going to work has so much mass.  It's been reinforced so many times.  But if you went to work and got paid at the end of the first week, and then the second week there was no paycheck there, that behavior of going to work maybe hasn't acquired enough mass.  It hasn't been reinforced enough, so it's likely that you wouldn't show up for work the week after that.

 

And another thing with, with something like sitting is, you know, I think that the act of sitting probably feels good to your dog most of the time, if they're in the mood to rest a little bit. So the way I see it, just the very act of sitting, the enjoyment of sitting is also like putting money into that bank account. It's rewarding that behavior, reinforcing that behavior.  Self reinforcement, I've learned, is not actually a real term. So don't study that for your MCATs.  But it's the way in which I think about these things.

 

As far as other schedules of reinforcement, other schedules, the way I see it, are more important when you're teaching a new behavior.  For behaviors that are already learned, that you're just trying to maintain in dog training, we're gonna stick to that one to one ratio.  But there are lots of other schedules you can use which can improve a behavior or solidify a newly learned behavior.

 

And again, go to these books for the more sciency terms, but, you know, shaping, for example.  When you're shaping a new behavior, you may occasionally withhold a reward in order to try to get your dog to try something new. And if you're withholding the reward, well, maybe you're going to be withholding it for something having to do with time. I am only going to reward my dog if he stays in this Sit for five seconds.  That's one kind of schedule.

 

Another kind of schedule, it might be — and actually this is something I often suggest on the street — is to reward your dog at some fixed distance that is a distance that you can remind yourself of.  So I'll say, you know, reward every time you see the bumper of a car or reward every time you see a tree on your block, or every fire hydrant.

 

Your reward might be contingent on time or distance that actually doesn't have anything to do with your dog's behavior.  Actually, to go back to the comparison to people going to work, you know, I am someone who doesn't respond super well to what I guess is called a fixed interval schedule. When I had a job where I got paid at the end of every week, regardless of how I performed, I was just a terrible employee. You know, I would just do the bare minimum. 

 

Actually at this one job, I remember, I went away for two weeks and when I came back, I realized my boss hadn't even realized I'd left. And, you know, I was kind of aware that it was a pretty decent situation I was in if I could just keep getting paid for basically not doing very much work.  But that was a ratio of rewards that did nothing to particularly reinforce particularly good behaviors on my part, good work.

 

And I guess that's kind of like dogs who are fed at the same mealtime every day no matter what the preceding behavior was. And, you know, for most dogs, that's not really a problem. What's being reinforced is, I guess, whatever happened right before the dog gets its meal or what's happening while the dog gets his meal. But most people aren't feeding their dog in such a way where they're considering that that bowl of food is reinforcing any specific behaviors. So, you know, I think most dogs are basically just being reinforced for being in the kitchen because they get their meals in the kitchen at a certain time every day.

 

But you know, where things get interesting when you're training new behaviors is when you are thoughtful about when you are delivering those rewards, because that is what is going to help you engage your dog's brain and get new and cool behaviors. And of course create new associations, and I know for me it's just a much more interesting way to live. I've always done better as a freelancer, as an entrepreneur, doing things that are rewarded at a variable rate. I think anyone who has any hustle in them is like that.

 

In the book Don't Shoot the Dog, I remember Karen Pryor has a great part where she compares how different schedules of reinforcement produce different kinds of behavior, but also kind of like different kinds of people. If I'm remembering correctly, she talks about jazz musicians and actors and how jazz musicians, they do this massively creative improv part of a song, and everybody claps right away. So that behavior is reinforced right away, each time.

 

Whereas an actor maybe gets applauded once at the end of a play, which might be long after his part onstage, or might not ever get any feedback because something goes on TV and they don't sit in a room while other people are watching it and clapping or whatever. Maybe nobody ever reviews their movie, or maybe they do, but it's a year after they filmed it.

 

Anyway. Anyway, it's sort of interesting to think about how those different rates of reinforcement, different schedules of reinforcement, different ratios — I get confused by all of these words, variables [laughs]. I do better with the real examples. 

 

Another example of, I guess, variable versus fixed, and I think about this whenever I take Via: Via is a service in New York City where you can share a minivan with other people. And they have fixed stops, like a bus stop. Versus something like Uber. The Via drivers, I believe, at least they used to be on a salary. They got the same amount of money whether they had passengers or not. They were paid per, I guess per day, or per hour.

 

Whereas Lyft and Uber drivers are paid for how much they drive and how many passengers they have. I'd be interested to talk to some people who have worked at one place or the other, or maybe both, because I'm guessing that those different schedules of payment have affected those drivers’ behavior.

 

To bring it back to dog training, you know, it's like you're starting out when you're training a new behavior, every behavior is going to get rewarded. That's where, you know, I call it criteria zero, you're existing here, you get a treat, right. We're always trying to make sure the dog is just making good associations to begin with. Even if the behaviors aren't stellar, especially when you're working with puppies.  So you're starting out on this fixed ratio, one-to-one.

 

Then you are going to go through a period of not rewarding every single behavior, because that's how you shape new behaviors.  And actually sometimes withholding a [reinforcer] is a great way to get an extinction burst, is what it's called, where like an animal just suddenly tries every, every possible thing they can think of, cause they're a little bit frustrated.  Dog trainers will talk about riding the wave of the extinction burst, like keeping the dog in the game, keeping the dog interested, but withholding your reward in order to get new behaviors.

 

Let's say you're shaping a dog to go to a mat, like a very basic behavior to shape. And actually I have a podcast episode on that, you know, at first you're going to maybe reward when the dog puts one paw on the mat, and then when they put two paws on the mat.  But that might mean after the dog puts one paw on the mat, you're waiting for him to put two paws on the mat, and that there's a few times when he puts one paw on the mat and nothing happens.

 

If you have done a good job of adequately reinforcing your dog up to that point, you will have built enough mass that your dog is probably not going to completely give up and walk away.  The likelihood that that behavior is going to become extinct because you have withheld one or two rewards is probably not great, but that's part of the art of both training and knowing your animal.  Knowing what frustration level your animal can live with. 

 

Once your dog knows a behavior, you will most likely return or try to return to rewarding every successful behavior as you're working to maintain that behavior. Okay. Much more that could be said on this topic. We will address it some other time. It’s rich and meaty and wonderful.  And I'm glad Supriya is asking these questions.
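[Editor's sketch, not part of the episode: the schedules described above can be written as small rules that decide whether a given response earns a reinforcer. This is a minimal Python illustration under stated assumptions. The function names `fixed_ratio`, `variable_ratio`, and `fixed_interval` are hypothetical, and the uniform spread used for the variable ratio is just one simple choice, not something taken from either book.]

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response (n=1 means every response is rewarded)."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n.

    Assumption: the required count is drawn uniformly from 1..(2*mean_n - 1).
    """
    remaining = rng.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal remaining
        remaining -= 1
        if remaining == 0:
            remaining = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

def fixed_interval(length):
    """Reinforce the first response made once `length` time units have passed
    since the last reinforcer (time starts at 0)."""
    last = 0
    def respond(t):
        nonlocal last
        if t - last >= length:
            last = t
            return True
        return False
    return respond

# A Sit that is paid every single time (one-to-one).
sit = fixed_ratio(1)
print([sit() for _ in range(3)])                    # [True, True, True]

# Raising criteria: only every third response is paid.
every_third = fixed_ratio(3)
print([every_third() for _ in range(6)])            # [False, False, True, False, False, True]

# The weekly paycheck: showing up on days 1 and 5 earns nothing, day 7 pays.
paycheck = fixed_interval(7)
print([paycheck(day) for day in (1, 5, 7, 8, 14)])  # [False, False, True, False, True]

# The salesperson or gambler: the payoff comes after a varying number of tries.
sale = variable_ratio(4, random.Random(0))
print(sum(sale() for _ in range(1000)), "sales in 1000 pitches")
```

[The ratio rules count responses, while the interval rule only looks at elapsed time, which is the distinction between the one-to-one Sit and the Friday paycheck.]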

 

Then she asks about innate versus learned behaviors. Well, I mean, when it comes to dogs or any animals, some behaviors are baked in. “Baked in” not being the scientific term. [laughs] You're not going to get those from me. But some behaviors are built in. And these are behaviors you usually see throughout a species, not just in individuals. They're usually behaviors that have been selected for because they provided some kind of evolutionary advantage, have helped the animal adapt in some specific way. But even if the behavior provides no specific benefit now, it's still, like, baked in, in a vestigial way.

 

Konrad Lorenz, who's like the father of modern ethology, talked a lot about fixed action patterns in his work. Fixed action patterns being like a whole series of behaviors a dog, or any animal, might engage in without those behaviors necessarily being learned. I believe I remember him talking about dogs lifting their legs as an example of this.

 

You know, a dog doesn't need to learn to lift his leg (or her leg; sometimes female dogs do it too). It's a built-in behavior that probably evolved for some reason. I don't know, maybe making the pee seem like it's coming from a larger animal, maybe. I'm not sure what the reason is, but clearly it was important because the animals who happened to have a natural tendency to engage in this behavior without even having to learn it were the ones who managed to procreate, and their progeny are the ones in our homes today.

 

A lot of these baked-in behaviors, these behaviors we don't have to learn, have to do with eating and reproduction. And I often will point this out when people are worried about their dogs humping in puppy playtime, or even adult dog playtime. It's a completely natural thing.

 

This is something that came pre-programmed and your dog is still figuring out when it is or isn't appropriate to do this behavior that, for the good of their species, they are born having a very strong desire to do.  Because, you never know — if you're humping the laundry bag or a female dog in heat, there's always that chance you might be doing the right thing.

 

But again, I am giving you the Annie Grossman explanation of these things. I feel like I'm still trying to figure out and probably will always be trying to figure out the finer points of this amazing area of science.  I'm just giving you my best dog trainer translation. 

 

Something like blinking is a tricky thing to think about.  Because, okay, I don't have control over the behavior of my heart beating. You don't have control over those smooth muscles in your body.  But you do have control over, I guess they're called the skeletal muscles in your body.  Blinking you have control over kind of.  You could certainly learn to blink in certain situations and learn not to blink in certain situations.

 

These fixed action patterns or, you know, displacement behaviors also come built in. Things we do when we're stressed or excited. In dogs that can be yawning, or licking their lips, or scratching their ear. These are things they don't learn. They come baked in. With humans, it's, you know, jumping up and down or punching a wall or crossing your arms, things that we do when we're excited or stressed. You can unlearn those things; you could learn to not do them. So again, so much to talk about, so much to learn.

 

Okay, then she asks about escape and avoidance learning.  Escape and avoidance learning is basically everything that doesn't fall into the positive reinforcement quadrant, but that is causing a behavior to be more likely to happen again.  It's all the stuff that lives in the negative reinforcement quadrant.

 

If you raise your hand as if to hit a dog and your dog backs off, your dog is being reinforced for backing away from you, because backing away from you results in avoiding the threat of being hit. You know, my favorite example in the human world of negative reinforcement, or one of my favorite examples, is, you know, negative reinforcement is why you put on your seatbelt when you hear the beeping in your car. Putting on the seatbelt makes that annoying sound go away.

 

Or an example of negative reinforcement that affects both humans and dogs is nagging.  When you do something in order to make someone leave you alone, to stop bothering you.  That is an example, can be an example of avoidance or escape.

 

And then she asked about associative versus non associative learning. I would think that's just learning by association versus learning by consequence, or classical conditioning versus operant conditioning. Although, it can certainly be argued that all of it is operant conditioning. There's always a behavior that results in a consequence.  The behavior might just be nothing. No criteria for the behavior.  And if there's zero criteria for that behavior, then you can call it classical conditioning or learning by association. 

 

So associative learning versus non associative learning. Again, I'm not the one studying for the MCATs. So tell me if you learn that I'm wrong, [laughs] but I think, uh, you should be able to master that if you have a good understanding of classical and operant conditioning, thanks to listening to this podcast.

 

And then she asks about the Bobo doll experiment, which I had actually never heard of. So I asked the Google, and according to Encyclopedia Britannica, the Bobo doll experiment was done in the 1960s and was... you know, I'm just going to read it:

 

“Bobo doll experiment, groundbreaking study on aggression led by psychologist Albert Bandura that demonstrated that children are able to learn through the observation of adult behaviour. The experiment was executed via a team of researchers who physically and verbally abused an inflatable doll in front of preschool-age children, which led the children to later mimic the behaviour of the adults by attacking the doll in the same fashion.”

 

So super interesting. Not something I know very much about.  I do think there is pretty good evidence that dogs can learn from observation, at least from observing each other.  I know that Ken Ramirez who was on the show recently has done a lot of work showing that dogs can not only learn from each other, but be trained to learn from each other.  I know that humans are not special in this way. There are lots of studies of lots of different kinds of animals learning from one another by observation.

 

Again, I don't know very much about any of these studies. I would love to learn more.  I'm vaguely aware of research also that's been done and I'm sure is being done about how dogs can learn by observing people. Of course, all learning by observation, there still is a component of learning by a behavior being reinforced or punished. I guess it’s just, the observation is what prompts the trial of a new behavior, and then that behavior is either encouraged or discouraged.

 

And I'm guessing that learning by observation can also in and of itself be a skill that could be improved or discouraged. Like, if one dog in your household always tends to do what the other dog in your household is doing, it's probably because there are lots of behaviors that have been reinforced as a result of doing what the other dog does. Whereas, if every time the one dog does what the other dog does, he gets punished in some way, the likelihood that he's going to continue to do what the other dog does decreases.

 

Anecdotally, I do think that dogs who witnessed a lot of aggression are more likely to be aggressive. I think those fixed action patterns might be more likely to kick in. And probably it's the same thing with humans. You know, it's kind of like I was saying before, like, thank God we have scientists who are studying these things in labs with inflatable dolls, or what have you.  With pigeons or rats in boxes.

 

But so much of this stuff has already been learned by some people, by some communities, by some dog trainers.  All of us who might never fully grasp the intricacies of the science or know all the words, but that doesn't mean we can't appreciate it, be excited about it and learn from what these researchers are doing.

 

Anyway. Supriya, good luck on your MCATs, thanks for reaching out and do make sure to check out these two books that I can't recommend more. Excel-Erated Learning and Behavior Principles in Everyday Life. And, oh, there's so many passages from both of these books that I would love to share, but I thought I would just end by picking one paragraph from each on the topic of schedules of reinforcement.

 

From the book by the Baldwins:

 

Numerous other activities are on variable ratio schedules of reinforcement, such as hunting, fishing, sales, begging, card games, gambling, and scientific research.  There is no fixed relationship between the number of responses and reinforcement. A salesperson may have to talk to two, four, 10, or 20 customers before making the sale. A gambler may have to make three, five or nine bets before getting a winner.

In all variable ratio schedules, people with the skills to keep the ratio numbers low will earn more reinforcers and show higher rates of responding. People with the skills needed to pick winning horses are usually much more devoted to betting on the horses than people who lose almost all the time, unless the losers are finding other rewards in betting such as being with friends or avoiding work. In contrast,

 

[baby cries]

 

Alright, Magnolia has had enough. Have you had enough of this podcast?

 

[Annie continues reading]

 

Many of us develop a risk aversion if we fail too often at risky activities, such as gambling, and are not so hungry for reinforcers that we will face high risks to obtain them.  But when the lottery jackpot exceeds several million dollars, even risk averse people may take a gamble.

 

And in the book by Dr. Pamela J. Reid, she talks about how there's continuous reinforcement schedules, fixed ratio schedules, and variable ratio schedules. And I'm just going to read this before I run off to tend to my daughter who just woke up from her nap [laughs]. To all those parents out there who are dog trainers, who might be listening to this, she writes (and I was not familiar with this game that she writes about, but I'm going to try it out):

 

Those of you who are parents may be familiar with the timer game, which is an example of a variable interval schedule with a limited hold of zero seconds. The timer game can be played in any situation where you have children confined to a small space with limited activity, such as a long car trip or an airplane or train ride.

It goes like this. You set the timer for an amount of time, but the children can't see it time down. When it rings, the behavior of each child is assessed. If the child is behaving appropriately, a primary or conditioned reinforcer is dispensed. For instance, you could have them earn tokens to trade for TV viewing time later that evening. If the child is not behaving appropriately, no reinforcement is earned, or reinforcement may even be taken away, i.e., TV time is subtracted from the child's total.

The reinforcement schedule is a variable interval because it is based entirely on time, and the interval of time is unknown to the child. Technically it is a random interval schedule because there is no mean time around which the intervals vary. The limited hold is zero seconds because the child must be emitting the desired response exactly when the interval times down.

Theoretically, the child could scream, yell, and throw temper tantrums throughout the interval, but provided she was behaving at the precise moment the timer dings, she would legitimately earn reinforcement. That's the essence of an interval schedule.
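[Editor's sketch, not part of the episode or the book: the timer game can be simulated in a few lines of Python to show why behavior between rings does not matter under a limited hold of zero. The helper names `timer_game` and `random_rings`, the token counts, and the gap sizes are all hypothetical choices made for illustration.]

```python
import random

def random_rings(n, max_gap, rng):
    """Draw n ring times with unpredictable gaps of 1..max_gap seconds."""
    t, rings = 0, []
    for _ in range(n):
        t += rng.randint(1, max_gap)
        rings.append(t)
    return rings

def timer_game(behaving_at, rings, tokens=0, penalty=0):
    """Score the game: only behavior at the exact ring time counts.

    behaving_at: function from a time (seconds) to True/False, a stand-in
                 for looking at the child at that moment.
    penalty:     tokens removed for misbehaving at a ring (0 = nothing removed).
    """
    for t in rings:
        if behaving_at(t):
            tokens += 1        # reinforcer earned at the ding
        else:
            tokens -= penalty  # optional loss, e.g. TV time subtracted
    return tokens

rings = [40, 95, 130]

def always_good(t):
    return True

def good_only_at_rings(t):
    # Tantrums the whole ride, but calm at the instant the timer dings.
    return t in rings

print(timer_game(always_good, rings))         # 3
print(timer_game(good_only_at_rings, rings))  # 3
print(random_rings(3, 60, random.Random(0)))  # unpredictable ring times
```

[Both children end up with the same three tokens, which matches the point in the passage that only the moment the timer dings is assessed.]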

 

All right, guys, I could talk about this stuff all day, but must go to my daughter. And also I have got to turn the air conditioner back on.

 

Links:

Behavior Principles in Everyday Life by John D. Baldwin and Janice I. Baldwin

Excel-Erated Learning: Explaining In Plain English How Dogs Learn And How Best To Teach Them by Pamela J. Reid, PhD

Don't Shoot the Dog by Karen Pryor

Bobo doll experiment

Find our Good Dog Training Course and others here

 

Related episodes:

Episode 33 | How to shape your dog to go to a mat (and to be a polite Thanksgiving guest)

Episode 61 | The Greatest Animal Trainer On Earth: Ken Ramirez

Libby Sills
elizsills@gmail.com