Written by: Ethan Moore (@Moore_Stats)
Follow us on Twitter! @Prospects365
At this point, we’re all familiar with the “times through the order penalty,” which states that, in general, hitters get better results the more times they face a pitcher in a game. This is one of the statistical ideas behind the use of an opener, which decreases the chances that any pitcher will face the top of the order more than twice in the game.
Although this effect is well studied and widely agreed upon, the jury is still out on why it happens. The two prevailing theories are:
- As the game goes on, starting pitchers get tired and begin throwing more hittable pitches as a result of fatigue. Hitters hit these pitches better and thus achieve better offensive outcomes.
- As the game goes on, each hitter gets more familiar with the pitcher and gets better outcomes later in the game simply because of that familiarity, even though the pitches themselves have not decreased in quality.
By analyzing the trends in my pitch quality metric (which I detailed here), we may be able to tease out whether the quality of pitches actually tends to decrease as the pitcher faces the order more times.
The idea for this project was suggested by Alex Caravan on the Driveline R&D Podcast, Episode 20 and the relevant segment can be viewed here. I typically prefer to explore my own ideas in these posts, but I thought this was an interesting question by Alex and that I could contribute to finding an answer given that I already have a pitch quality metric ready to go.
For this analysis, I used my usual 2019 pitch-by-pitch database from Baseball Savant. I only used data for pitchers who started the game, as relievers almost always face each batter only once per game, which could confound our results. This means that if a team used an opener, I used his stats and not the stats of the subsequent “long reliever.”
I did not use a minimum number of games started, so pitchers who started even one game in 2019 are included. Because the opener situation happened pretty infrequently in 2019, I am not too worried about these observations changing our results and including openers may give us a more representative view of how starters can be expected to perform overall.
To measure pitch quality, I calculated the average expected run value (xRV) for 2019 starters for each time through the order, then multiplied by 100 to get expected run value per 100 pitches. Negative values here are good (because we would expect good pitches to lead to a negative amount of runs for the offense), so in order for high quality pitches to have positive values, I multiplied the xRV values by -1 and changed the units to be “Expected Runs Prevented per 100 Pitches.”
As a reminder, my pitch quality metric is context neutral meaning it does not take game situation, count, opposing hitter, etc. into account. Each pitch is graded purely on how other similar pitches performed, allowing us to compare all pitches to each other.
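As a rough illustration of the calculation described above (average xRV per time through the order, scaled to 100 pitches and sign-flipped), here is a minimal Python sketch. The field names `tto` and `xrv` and the sample values are hypothetical placeholders, not the actual database columns or 2019 results.

```python
from collections import defaultdict

def runs_prevented_per_100(pitches):
    """Average xRV per time through the order, sign-flipped and scaled to 100 pitches.

    `pitches` is a list of dicts with hypothetical keys:
      "tto": time through the order (1, 2, 3, ...)
      "xrv": expected run value of the pitch (negative is good for the pitcher)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p in pitches:
        totals[p["tto"]] += p["xrv"]
        counts[p["tto"]] += 1
    # Multiply by -1 so better pitches are positive, and by 100 for a per-100-pitch rate.
    return {tto: -100.0 * totals[tto] / counts[tto] for tto in sorted(counts)}

# Made-up xRV values purely to exercise the function.
sample = [
    {"tto": 1, "xrv": 0.02},
    {"tto": 1, "xrv": -0.01},
    {"tto": 2, "xrv": -0.03},
    {"tto": 2, "xrv": -0.01},
]
result = runs_prevented_per_100(sample)
```

With this sign convention, a larger value for a given time through the order means better pitches on average.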
Here is what I found for 2019 starter average pitch quality by time through the order:
Although conventional wisdom may suggest that pitch quality decreases as a pitcher goes along in the game, this table suggests that pitch quality, on average, actually tends to increase as a pitcher faces a hitter more times. This would make sense if you think about a pitcher getting “warmed up” throughout the game. He starts out a little rusty so his pitches aren’t as good, but once he settles into the game, the quality increases. That is one explanation. Another explanation is much less fun; it’s called survivorship bias.
Not every pitcher faces the lineup four times a game. In fact, only 0.7% of pitches thrown by starting pitchers in 2019 were to a hitter for the fourth time around the order. These 3,220 pitches were not a random sample. They skew toward games where the pitcher was already doing really well and thus survived to face the order for a fourth time. Better pitchers pitch deeper into games, so of course the average pitch quality of the fourth time through the order is high!
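To see how survivorship alone can produce a rising average, consider a toy example in Python. The three starters below are invented, each with a constant pitch quality that never changes during the game; the only difference is how deep into the game each one lasts. None of these numbers come from the 2019 data.

```python
from collections import defaultdict

# Hypothetical starters: each has ONE fixed pitch quality all game long.
# Worse starters get pulled earlier, so they contribute fewer late pitches.
starters = [
    {"quality": -1.0, "pitches_by_tto": {1: 35, 2: 20}},
    {"quality": 0.0, "pitches_by_tto": {1: 35, 2: 35, 3: 20}},
    {"quality": 1.0, "pitches_by_tto": {1: 35, 2: 35, 3: 35, 4: 5}},
]

def average_quality_by_tto(starters):
    """Pitch-weighted average quality for each time through the order."""
    weighted = defaultdict(float)
    counts = defaultdict(int)
    for s in starters:
        for tto, n in s["pitches_by_tto"].items():
            weighted[tto] += s["quality"] * n
            counts[tto] += n
    return {tto: weighted[tto] / counts[tto] for tto in sorted(counts)}

averages = average_quality_by_tto(starters)
for tto, avg in averages.items():
    print(f"TTO {tto}: {avg:.3f}")
```

The average climbs with each time through the order even though no individual pitcher improved, which is the pattern survivorship bias would create.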
The same idea, to a lesser extent, could explain why the pitch quality for third time through the order is higher than the second time through, or why the second time through is higher than the first time through. I’ll touch on that later.
For the more visually inclined, here is the same information shown in the table above but on a graph. The red line is the overall average pitch quality for pitches thrown by 2019 starters.
So we see that starter pitch quality was at its lowest during the first time through the order in 2019. Digging deeper, let’s look at the league’s average pitch quality per inning during the first time through the order:
Starters tended to throw their worst pitches in the first inning last year! This is consistent with the theory from earlier that pitchers need a little time to warm up in a game before they start throwing their better pitches. What’s weird is that, according to a Baseball Prospectus article about wOBA’s time through the order trends, hitters tend to have a first inning wOBA 8 points lower than in all other innings.
Combining these two findings suggests that both hitters and pitchers perform worse than expected in the first inning, perhaps due to not being “in the zone” yet. In a zero-sum game, I think it is interesting to see both pitchers and hitters somehow having suboptimal performances simultaneously.
Second Time Through
To learn a bit more about the survivorship bias in this data, let’s examine 2019 starters’ average pitch quality by inning during their second time facing the lineup:
We see that average pitch quality is higher when the starter is facing the lineup for a second time in the fifth or sixth inning versus when he is already facing the lineup a second time in the second or third inning, for example. This is likely because worse pitchers give up more baserunners and face the lineup for a second time earlier in the game whereas better pitchers get more outs, taking longer to reach the top of the lineup again.
I think this is a good illustration of how survivorship bias explains a lot of the trends we see in this data.
Breakdown by Pitch Type
At the suggestion of Alex Caravan, I decided to check out how pitch quality by time through the order (TTO) splits out by pitch type. Because there were so few pitches thrown in the fourth time through the order category, I omitted them from this graph. Note that the black line is the pitch quality average for each TTO.
Consistent with the findings of my initial pitch quality article, four-seamers, two-seamers, and sinkers are the pitch types with below average pitch quality regardless of TTO. And as expected, knuckle curves and sliders are the pitches with the highest pitch quality regardless of TTO.
But here’s what’s new. Four-seamers and sinkers tend to be of even lower quality the third time through the order, whereas most other pitch types did not see the same effect. Could a decrease in pitch quality as a result of pitcher fatigue only be present for fastballs? Most other pitch types increase in pitch quality as TTO increases. Again, this may be a result of the pitcher getting more comfortable with his pitches throughout the game. Or it may be a result of survivorship bias. Or it may be a result of something else!
Pitcher Specific Case Study
The last thing I want to touch on is how these results may vary by pitcher. Let’s take three of the best pitchers in 2019 and see how their pitch quality varied by time through the order. I want to note that this graph adjusts for each pitcher’s average pitch quality, so a value of 0 means his average pitch quality at that time through the order was exactly equal to his overall average pitch quality. Values above the black line indicate the pitcher had a higher pitch quality that time around the order than he did overall.
Interestingly, we see very different patterns for Cole, deGrom, and Verlander. DeGrom had the highest pitch quality of the three in the first and second times facing the lineup, but his pitch quality clearly decreased each time through the order. In the opposite fashion, Verlander’s average pitch quality tended to increase every time through the order. This makes sense as Verlander is quite legendary for being better at the end of games than he is at the beginning (source).
Lastly, Cole’s pitch quality was pretty consistently between deGrom’s and Verlander’s…until the fourth time through the order when Cole’s pitch quality spiked to an insane level. This indicates that when Cole was “on” in 2019, he was nearly untouchable in a way that the other two starters weren’t.
However, one big issue with this graph is that it does not adjust for the quality of the pitcher, leaving the analysis vulnerable to confounding, survivorship bias, and the influence of outliers. To fix this, we can look at what each pitcher did in relation to their overall average pitch quality. This leaves us with the difference between how we would expect them to perform versus how they actually performed (and puts all starters on the same scale).
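The adjustment described above can be sketched as subtracting each pitcher's overall average pitch quality from his average at each time through the order. This is a minimal Python sketch under assumed inputs: the keys `pitcher`, `tto`, and `quality`, and the sample values, are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def adjusted_quality_by_tto(pitches):
    """Per-pitcher, per-TTO average quality minus that pitcher's overall average.

    A value of 0 means the pitcher matched his own overall average that time through.
    """
    overall = defaultdict(list)
    by_tto = defaultdict(list)
    for p in pitches:
        overall[p["pitcher"]].append(p["quality"])
        by_tto[(p["pitcher"], p["tto"])].append(p["quality"])
    return {
        (pitcher, tto): mean(values) - mean(overall[pitcher])
        for (pitcher, tto), values in by_tto.items()
    }

# Made-up qualities for two placeholder pitchers, "A" and "B".
sample = [
    {"pitcher": "A", "tto": 1, "quality": 1.0},
    {"pitcher": "A", "tto": 1, "quality": 2.0},
    {"pitcher": "A", "tto": 2, "quality": 3.0},
    {"pitcher": "A", "tto": 2, "quality": 2.0},
    {"pitcher": "B", "tto": 1, "quality": 0.0},
    {"pitcher": "B", "tto": 2, "quality": -1.0},
    {"pitcher": "B", "tto": 2, "quality": -2.0},
]
adjusted = adjusted_quality_by_tto(sample)
```

Note that the baseline here is weighted by pitches, so the times through the order where a pitcher throws the most pitches pull his overall average the hardest.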
Doing so for the three pitchers above gets us this graph:
Once we adjust for each pitcher’s overall average pitch quality, we see similar trends, but now with more context. This graph tells us that deGrom’s pitch quality tends to be worse than his overall average during the third and fourth trips through the order, and that Cole and Verlander follow a more similar pattern than is shown in the initial graph. By comparing each pitcher to his own overall pitch quality average, each pitcher’s pitch quality trends by time through the order become clearer and more useful.
Further research is needed, but there is potential here for teams to monitor the pitch quality of their starter as the game goes on and inform their decision about when to pull him by seeing if his pitch quality trend is varying from its typical pattern (in addition to all other factors currently used to make these types of decisions).
The big takeaway here is that this analysis does not support the theory that the quality of pitches tends to decrease as the game goes on due to pitchers getting physically fatigued. Although there is evidence that this could be the case for fastballs, the general trend suggests that a hitter is likely to face better quality pitches the more times he faces the same pitcher in a game.
This finding indirectly supports the theory that hitters’ increased performance as they face a pitcher more times in a game (the “Times Through the Order Penalty” for pitchers) could be the result of increased hitter familiarity with the pitcher and his pitches, tendencies, etc.
Supplemental findings included that pitch quality tends to be lowest in the first inning, most pitch types do not tend to become higher or lower quality as the game goes on, and different pitchers can show different pitch quality patterns by time through the order. All of these findings, and more in this area, are ripe to be studied further.
You may be wondering what the first graph in this post, the one that showed that pitch quality tends not to decrease and actually increases as TTO increases, would look like if we adjusted for pitcher quality like we did in the Case Study section. Here’s your answer:
Again, the line at 0 would be a pitcher performing exactly as expected. In general, 2019 starters underperformed their average pitch quality the first time through, did better the second time through, and started to dip during the third time through. The dip the third time through is the only key insight this graph provides that the initial one didn’t, and I think it matches up quite well with the conventional wisdom of pitchers running out of gas. The graph also shows that pitchers who did survive long enough to face the order a fourth time typically were over-performing their average pitch quality by a lot, which makes tons of sense!
Thank you for reading and please reach out to me on Twitter @Moore_Stats with any questions or feedback!
Special thanks to Alex Caravan (@Alex_Caravan) for guidance on this project.