VizWiz: MLB

Showing posts with label MLB.

June 30, 2024

How to Rank & Filter the Top 5 in Tableau in Under 60 Seconds!

filter , how to , MLB , rank , rank filter , table calc , table calc filter , table calculation , tableau , top 10 , top 5 No comments

In this tip, you will learn how to show the rank of each team in MLB by homeruns for 24 seasons.

We'll start by computing the RANK table calculation for each Season before filtering to only the top 5 Teams in each Season.
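Outside of Tableau, the same rank-then-filter logic can be sketched in a few lines of Python. This is only an illustration of the idea, not the workbook from the video: the function name, the tuple layout, and the sample numbers are all made up, and I'm assuming competition-style ranking (ties share a rank and the next rank is skipped).

```python
from collections import defaultdict

def top_n_per_season(rows, n=5):
    """rows: iterable of (season, team, home_runs) tuples (hypothetical layout).

    Returns {season: [(rank, team, home_runs), ...]} keeping only ranks <= n.
    """
    by_season = defaultdict(list)
    for season, team, hr in rows:
        by_season[season].append((team, hr))

    result = {}
    for season, teams in by_season.items():
        ranked = []
        # Rank is computed against every team in the season first,
        # and only then filtered, so the ranks do not change after filtering.
        for team, hr in sorted(teams, key=lambda t: -t[1]):
            rank = 1 + sum(1 for _, other in teams if other > hr)
            if rank <= n:
                ranked.append((rank, team, hr))
        result[season] = ranked
    return result

# Made-up numbers, for illustration only.
sample = [
    (2023, "A", 250), (2023, "B", 240), (2023, "C", 240),
    (2023, "D", 230), (2023, "E", 220), (2023, "F", 210),
    (2024, "A", 200), (2024, "B", 205),
]
top = top_n_per_season(sample, n=5)
```

The key point is the order of operations: rank within each season first, then keep the top 5.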

October 24, 2022

#MakeoverMonday Week 43 - How have Major League Baseball Pitch Types Changed?

baseball , change , color , comparison , custom number format , group , level of detail , LOD , Makeover Monday , MLB , pitching , pivot , set action , sets , sparklines , tableau , trellis chart , WatchMeViz No comments

This week's data set was pretty simple. We had 21 baseball seasons and a column for each pitch type. Pivoting the metrics made it much easier to work with for me as I could then split the view by pitch type.
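If you want to see what that pivot does to the shape of the data, here is a minimal Python sketch. The column names and percentages are invented for the example; the real data set has its own pitch type columns.

```python
def pivot_longer(rows, id_field, key_name="pitch_type", value_name="value"):
    """Turn one-column-per-pitch-type rows into one row per (season, pitch type)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the id column is repeated on every output row instead
            long_rows.append({id_field: row[id_field], key_name: column, value_name: value})
    return long_rows

# Made-up values, for illustration only.
wide = [
    {"season": 2002, "fastball": 60.0, "slider": 12.0},
    {"season": 2003, "fastball": 59.0, "slider": 13.0},
]
long_rows = pivot_longer(wide, "season", value_name="pct")
```

With the data in this long shape, splitting a view by pitch type is just a matter of grouping on one field.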

During #WatchMeViz (below), I created a trellis view and showed how to create groups, sets, set actions, sparklines, LODs, custom number formatting, a mobile view, and more.

Thank you for tuning in. Here's the video and below is my visualization.

September 29, 2021

#MakeoverMonday 2021 Week 39 - MLB All-Time Offensive Wins Above Replacement

baseball , calculated field , dashboard , dynamic , how to , Makeover Monday , MLB , parameter , quadrant chart , scatterplot , stats , summarize , table calc , war , WatchMeViz , wins above replacement No comments

This is the final installment of Makeover Monday as a community project, and I decided to go back to the very first week of Makeover Monday in 2016 for this week's viz to make over. It looks at a complicated stat in baseball called Wins Above Replacement, which is basically a measure of how much better a player is than an average player that would replace him.

There are several, even more confusing stats included. I thought the easiest way to compare them would be with a scatterplot. So I made an interactive scatterplot that highlights players that are above the average of the top 200 players.

If you watch the Watch Me Viz video (below), you'll see how I used table calcs to highlight those players and also count the number of players in that quadrant. You'll learn how to use parameters to create a dynamic scatterplot, plus some other bits along the way.

Thanks for coming along on this 300 week journey with me. I hope you have developed your skills and become better at your work. Until next time...

May 7, 2019

#MakeoverMonday: Top 10 Major League Baseball Home Run Hitters

animation , bar chart , baseball , baseball-reference , flourish , home runs , lahman , Makeover Monday , MLB No comments

Earlier today I saw this really cool viz created by Will Sutton that's an animation of the top home run hitters from 1985-2016.

31 years of MLB Home Runs!
I've seen plenty of these charts lately, so for #MakeoverMonday I wanted to learn how to make my own. Feedback welcome. Thanks, @TriMyData & @VizWizBI R code available here: https://t.co/8ipKQOzmrO #Rstats pic.twitter.com/GreSt6t1b0
— Will Sutton (@WJSutton12) May 6, 2019

As I posted last week, Sophie Sparkes introduced The Data School and me to Flourish. Flourish makes it super simple to create animated visualizations with tons of customization options. Given that Tableau doesn't support animations in the browser, this is a great alternative. Flourish provides an example, you import your data, do a bit of customization and voila! You have an animated viz.

The data needed to be structured with a column for each season, so I prepped the data in Alteryx and I included all seasons from 1912-2018. I then filtered down to players with 250+ career home runs (to make the list manageable).
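I did the prep in Alteryx, but the reshaping itself is simple enough to sketch in Python. Treat this as a rough illustration rather than my actual workflow: the function name and sample rows are made up, and I'm assuming each season column holds the running career total, which is what an all-time leaderboard animation needs.

```python
from collections import defaultdict

def to_wide_cumulative(rows, min_career_hr=250):
    """rows: (player, season, home_runs) tuples (hypothetical layout).

    Returns {player: {season: cumulative_home_runs}} with one entry per season,
    keeping only players whose career total reaches min_career_hr.
    """
    seasons = sorted({season for _, season, _ in rows})
    per_player = defaultdict(dict)
    for player, season, hr in rows:
        per_player[player][season] = per_player[player].get(season, 0) + hr

    wide = {}
    for player, by_season in per_player.items():
        if sum(by_season.values()) < min_career_hr:
            continue  # filter to keep the list manageable
        running, row = 0, {}
        for season in seasons:
            running += by_season.get(season, 0)
            row[season] = running
        wide[player] = row
    return wide

# Made-up rows with a low threshold so the filter is visible.
sample = [("X", 2001, 5), ("X", 2002, 7), ("Y", 2002, 3)]
wide = to_wide_cumulative(sample, min_career_hr=10)
```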

And here's my animated viz of the top 10 home run hitters of all-time.

May 6, 2019

#MakeoverMonday: Major League Baseball's Most Cost Effective Players

baseball , connected scatterplot , cost effective , database , lahman , Makeover Monday , MLB , OBP , on base percentage , quadrant chart , salary , scatterplot No comments

Since the Makeover Monday Community seemed to enjoy sports data two weeks ago, I thought I would provide some data about Major League Baseball this week. First, here's the original visualization to makeover:

What works well?

The title and subtitle explain what the viz is about.
Dividing the viz into two sections by using different background colors on the scatter plots
Consistent scales for the salaries across the charts for each section
Using gridlines to help the audience understand the approximate values of each point
Only labeling the type of stat once by putting the label between the players and teams charts

What could be improved?

There's no data source listed.
I have no idea why these players or teams are highlighted; an explanation is needed. At first, I thought it was highlighting the most effective player/team, but it's not (at least that's what I see).
The logos are meaningless for people that aren't familiar with the teams.
What does the big logo on the upper right represent? Is that the author?
The data should be filtered to players that meet certain criteria, like at bats in a season. This would then filter out many players near zero.

MY APPROACH

I liked the idea of using a scatter plot like the original, but I wanted to focus on a metric that better measures "effective". There are so many sophisticated metrics now in baseball. I didn't want to use any of those because they're hard for people to understand. I decided to use on base percentage, which is the number of times a player reached base (H + BB + HBP) divided by at bats plus walks plus hit by pitch plus sacrifice flies (AB + BB + HBP + SF).
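As a quick sketch, the OBP formula above looks like this in Python. The function name and the stat line are made up for illustration.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = ab + bb + hbp + sf
    if denominator == 0:
        return None  # no plate appearances that count toward OBP
    return (h + bb + hbp) / denominator

# Made-up stat line: 215 times on base in 570 qualifying plate appearances.
obp = on_base_percentage(h=150, bb=60, hbp=5, ab=500, sf=5)
```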

Why did I choose OBP? Ryan Kelley sums it up best in a post on Quora:

Outs are an extremely scarce resource in the economy of a baseball game, each team has 27 to use (in a 9-inning game) while trying to score as many runs as possible. Every time a batter makes an out therefore, the expected number of runs his team will produce will decrease (assume runs are also a limited resource for now).

A batter's job is to get on base--not make an out in other words. A batter fails to do his job when he makes an out, this failure percentage is 1 - OBP. The success percentage is OBP. If every batter had a perfect 100% OBP, their team would score an infinite amount of runs before ever making an out.

Now, because you're talking about value specifically. OBP alone isn't effective in measuring value. You can make it a better indicator of value by giving it context. That context depends on what kind of value you're talking about.

You could tie OBP to a player's salary. This would give you an indicator of how valuable that batter was to his team in the context of a labor market. After all, baseball players are just employees of franchises in the end. Their jobs are to produce wins. A hitter's job is to produce wins via producing runs. Franchises make money by selling those wins to fans as entertainment.

Each team has a fixed amount of payroll to spend on wins, so the more payroll a batter's salary takes up, the less valuable he is to his team. A good way of illustrating a player's value would be OBP/$ of Salary.

Based on Ryan's explanation, I decided to use OBP as my proxy for batter effectiveness (y-axis). For the x-axis, I wanted to use salary for comparison. However, the data does not adjust salaries for inflation, so a salary in 1985 is not listed in 2016 value. Instead, I came up with a way to normalize the data across all of the seasons.

I created a calculation that compares a player's salary to the average salary of the entire league for each season. I made this a percent difference so that the data would then be normalized. Therefore, a player that was 10% above the 1985 average salary would be comparable to a player that was 10% above the 2016 average salary.

Here are my calculations:

Season avg Salary: { FIXED [Season] : AVG([Salary]) }
△ to Season Average Salary: (AVG([Salary]) - SUM([Season avg Salary])) / SUM([Season avg Salary])
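To make the normalization concrete, here is the same idea sketched in Python. It is not a translation of the Tableau calculations, just the equivalent arithmetic, assuming one row per player per season; the function name and salaries are made up.

```python
from collections import defaultdict

def pct_diff_from_season_avg(rows):
    """rows: (season, player, salary) tuples (hypothetical layout).

    Returns {(season, player): (salary - season average) / season average}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for season, _, salary in rows:
        totals[season][0] += salary
        totals[season][1] += 1
    season_avg = {s: total / count for s, (total, count) in totals.items()}

    return {
        (season, player): (salary - season_avg[season]) / season_avg[season]
        for season, player, salary in rows
    }

# Made-up salaries: very different dollar amounts, same relative position.
sample = [
    (1985, "A", 110.0), (1985, "B", 90.0),
    (2016, "C", 1100.0), (2016, "D", 900.0),
]
diffs = pct_diff_from_season_avg(sample)
```

Players A and C both come out at 10% above their season's average even though their raw salaries differ by a factor of ten, which is the whole point of the normalization.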

BUILDING THE VIZ

First, I applied some filters to only include what I deemed "eligible" players. These are noted at the bottom of the viz.
Now that I have the x-axis (salary variance from season average) and the y-axis (OBP), I created a scatter plot and added a point for each player for each season.
I added reference lines for the average of each axis.
The players on the upper left are the most cost efficient players. That led me to a quadrant chart, but I only wanted to highlight the most cost effective. I created a calculation to determine the points in that quadrant and place it on the color shelf.
The problem now was that it was basically impossible to find a player in the viz. I thought about using a set action to drill in to a player, but that loses all of the context of the other players. Therefore, I created a parameter to allow the user to highlight a player, and I show that player as a connected scatter plot.
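The quadrant logic from the steps above can be sketched like this in Python. It is a simplified stand-in for the calculation on the color shelf: the function names and points are made up, and I'm assuming the reference lines are plain averages of each axis.

```python
def quadrant(x, y, x_avg, y_avg):
    """Name the quadrant of a point relative to the two average reference lines."""
    vertical = "upper" if y > y_avg else "lower"
    horizontal = "right" if x > x_avg else "left"
    return f"{vertical} {horizontal}"

def flag_cost_effective(points):
    """points: (salary variance from season average, OBP) pairs.

    Returns one boolean per point: True for the upper left quadrant
    (below-average relative salary, above-average OBP).
    """
    x_avg = sum(x for x, _ in points) / len(points)
    y_avg = sum(y for _, y in points) / len(points)
    return [quadrant(x, y, x_avg, y_avg) == "upper left" for x, y in points]

# Made-up points, one per quadrant.
points = [(-0.5, 0.400), (0.5, 0.400), (-0.5, 0.300), (0.5, 0.300)]
flags = flag_cost_effective(points)
```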

SOME THINGS I FOUND

Players tend to be more cost effective earlier in their careers. That makes sense since they are on rookie contracts for the first few years of the career.
Once players sign their first big contract, they tend to either move to the upper right (high OBP, high relative salary) or the bottom right (low OBP, high relative salary).
Some players can sustain that for the rest of their careers, but that's rare. Typically it's the superstars that follow this pattern (like Barry Bonds or Chipper Jones).
For many of the other players, as they approach the end of their career, they tend to move either to the lower right (high relative salary, low OBP) or the lower left (low relative salary, low OBP). Neither of these is particularly good for the team.

And here's my final product. I had never thought of combining a scatter plot and a connected scatter plot before. I'm quite pleased with how this turned out.

October 31, 2018

Analyzing Pitcher Performance With Density Heatmaps

baseballsavant , clayton kershaw , density , heatmap , justin verlander , major league baseball , MLB No comments

Tableau 2018.3 brings density heatmaps, a feature I've been playing with quite a bit. I love it for cases where I have a dense concentration of points and a regular scatterplot doesn't work well. Transparency can help with dense dots, but I think the heatmaps work much better.

To give it a test, I downloaded every pitch for Clayton Kershaw and Justin Verlander (two of the best pitchers in Major League Baseball) from 2008-2018 from the great stats website Baseball Savant. Every time I look at baseball data, I'm amazed at the detail of the stats covered; the data far exceeds anything that is covered in other sports.

After downloading the data, I built the small multiples view below for each pitcher so that I could see their progression through the years. Click on the images for the interactive versions. I love how the data shows me how each pitcher has gotten better with their "misses" through their careers. For example, when they throw sliders for balls, they now tend to miss below the strike zone. This is a great sign that they have command of their pitches and are less likely to miss in an area where the batter can take advantage.

The density heatmap feature will most likely be used by most people on maps, which makes sense, but consider looking at it as an alternative whenever you need to plot x/y coordinates and have lots of points to display.

October 22, 2018

Makeover Monday: Historical Major League Baseball Beer Prices

baseball , beer , highlight , major league baseball , Makeover Monday , MLB , small multiples , team marketing report No comments

It's #TC18 and we're hosting a Makeover Monday Live with 1000 people. I thought it would be fun to use (1) a simple data set and (2) my very first Makeover Monday viz. We decided to use this viz so that people could see how practicing week after week can improve their skills like it has mine.

What works well?

The title is clear and tells the reader what the data is about.
The user can sort the data based on their preference.
The placement of the sort options encourages interaction.
The rank helps show where a team falls amongst the league.
The color of the bars goes with the beer theme.

What could be improved?

The data source is not listed.
Having labels on the end of every bar makes the viz too busy.
The beer mug icons are completely unnecessary.
The font looks very small.

What did I do?

The new data set has data for 2013-2018 (except 2017), so I wanted to make sure I looked at the data over time.
Made the title more descriptive so that the user (hopefully) understands what the line represents.
I borrowed several techniques I learned from Workout Wednesday week 41:

Shading those that have increased prices vs. 2013 with a red background
Labeling the top middle with the team and the latest price
Labeling the end of each line; in WW the labels were all placed on the lower-right of each pane, but I didn't like how it looked in this case

Ordered the teams from highest to lowest based on the latest price
Organized the teams in a trellis format so they fit nicely into a 6x5 grid
Included the data source, my name, and the inspiration for the design

And here's my Makeover Monday week 41. Click on the image for the interactive version. I can't wait to see what everyone creates at MM Live!

February 28, 2018

Workout Wednesday: The MLB Diversity Gap

action , area chart , baseball , ethnicity , highlight , labels , MLB , Workout Wednesday No comments

This week, Rody had a few tricks up his sleeve for Workout Wednesday. At first, I thought we were going to need to do something with polygons like he did in this tip about shading between two lines. I started down that road, but quickly realized I'd used up all of the shelves and I needed more, so back to the drawing board I went.

Suddenly a possible solution popped into my head (I figured it out by hovering again and again over his viz). I'm not going to give away any spoilers. Here's my solution if you get stuck...but give it a solid effort before you look at someone else's solution.

Good luck!

February 5, 2018

Makeover Monday: Did the rise of Latino players signal the decline of African American players?

baseball , black history month , major league baseball , Makeover Monday , MLB , SABR 6 comments

This week marks a month long partnership that Tableau has asked us to kick off for Black History Month. To start the month, Eva posted this visualization from SABR.org about the breakdown of demographics in Major League Baseball since the year before Jackie Robinson's debut in 1947 (he was the first African American to play in MLB, also known as the person who broke the "color barrier").

What works well?

The x-axis is labeled every 10 years starting with the first year in the data set. This works well since there are 70 years in the data set.
Labeling the y-axis for every 20% keeps that axis from getting too cluttered.
The title is straight to the point.
Placing the legend in the middle of the graph allows the chart to use the entire space.
Stacking "White" on the bottom is a good choice since it's always the largest segment.

What could be improved?

As it's stacked bars, it's harder than necessary to determine the percentage that Black and Latino comprise since their position is influenced by the colors below them.
The bars appear to be of differing widths and that makes it look a bit blurry to me.
An area chart would be much easier to understand.
Consider more distinct color choices, particularly for White and Black.
The visualization doesn't flow well with the accompanying story, which was about the increase in blacks and the more recent decrease. There's no indicator to the audience that this is what the chart is about.

What did I do?

I started by exploring the data and looking for a more interesting story. Was there a reason or cause for the recent decline of blacks in MLB? Is this the same for other minorities? How does WAR come into play, if at all? All of these questions are super simple to answer with Tableau's ability to support the way your brain thinks.

In the end, the most interesting story I found was the relationship between the decline of African American players and the rise of Latino players. So my viz focusses on that.

March 1, 2017

Workout Wednesday: World Series Game 7 - Pitch-By-Pitch

barcode , baseball , Brooks Baseball , Chicago Cubs , Cleveland Indians , diverging bar chart , MLB , shapes , Workout Wednesday , World Series 16 comments

When I set out to do this week's workout, I really wanted to recreate this amazing World Cup infographic:

I love this graphic! So much information packed in a compact space. But I couldn't find the data anywhere. What I decided to do instead was look at game 7 of the 2016 World Series. It's talked about as one of the greatest games of all time, so I thought I'd create something similar, but on a pitch-by-pitch basis.

I was able to find the data on Brooks Baseball. I then imported it into Google Sheets for each pitcher and then unioned them all in Tableau. I'd recommend you just use the TDE I've created this week as I've removed all of the extra columns you won't need. You can download it here.

Here are the requirements:

Each inning should be an individual row
Within each inning, show every pitch from left to right
The home team (Cleveland Indians) pitched first, so their bars should point up. The visiting team (Chicago Cubs) follows, and their bars should point down.
Each pitch is color coded based on the outcome - Ball, Strike, or In Play
The final outcome of each batter should be displayed as a shape and color coded. See the subtitle in my viz. Note that the open circle is filled in the middle with white so that the bar can't be seen through it.
Match my tooltips
Include the data source at the bottom
Match my title and subtitle
Viz must be a single worksheet
Viz should be 450x800
Optional: Match my font, Rubik in this case.

That's it! This shouldn't be as challenging as some of the past challenges I've posted. For me, this was more about trying to replicate a viz I liked. If you don't understand the fields or get stuck, ask for help. Good luck!

November 30, 2016

How Many Times Have Teams Been to the World Series?

baseball , Business Insider , MLB , World Series 6 comments

Yesterday I wrote about how much I liked a World Series viz created by Business Insider. One of my favourite ways to learn Tableau, and one I highly recommend to everyone, is to reproduce work that inspires me.

What was most fun about creating this viz is that it’s built completely with ASCII squares. Yes, I use a measure for the axis, but the measure is merely a placeholder. I learned a lot creating this viz this way; basically you can easily create a unit chart without having to densify the data by using a simple calculation that trims the ASCII squares instead. I also included bar charts in tooltips.
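The trimming trick is easier to see in code than in words. Here is a minimal Python sketch of the idea, not the calculation from the workbook: start with a long string of squares and keep only as many as the count you want to show. The function name, the symbol, and the maximum length are my own placeholders.

```python
def unit_bar(count, max_units=50, symbol="■"):
    """Return one square per unit by trimming a fixed string of squares."""
    full = symbol * max_units
    # Clamp so negative counts give an empty string and large counts stop at max_units.
    return full[:max(0, min(count, max_units))]

# A team with 3 appearances gets 3 squares.
bar = unit_bar(3)
```

Because the squares are just text, there is no need to add extra rows to the data to get one mark per unit.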

Download the workbook to see how I did it. In the meantime, here’s my take on the frequency of teams appearing in the World Series.

January 4, 2016

Makeover Monday: Bryce Harper Had the “Most Valuable” Season of Any MLB Player Since 2002

bar chart , baseball , fivethirtyeight , Makeover Monday , MLB , scatter plot , slope graph 2 comments

This week marks the beginning of a new series of makeovers. Throughout this year, Andy Cotgreave will be joining me in these makeovers so that we can learn from each other. You can participate and/or follow along every week here.

This week we looked at this table from FiveThirtyEight. The main data point in this table and the article is the Surplus Value column. Essentially, FiveThirtyEight uses WAR as a way to calculate a player's value and then compares that to what they were actually paid.

There's nothing particularly terrible about this table. It serves its main purpose: looking up facts. But what it does lack is a simple way to make comparisons between the players and more quickly show the differences between them. Ideally, I want to answer the question: How great was the 2015 season from Bryce Harper?

With this in mind, I created this visualisation. Click on the image to interact, as I have included some highlight actions. However, the view itself can stand alone without the interactivity as well.

November 19, 2015

Philadelphia Has the Worst Sports Teams in North America

76ers , Eagles , Flyers , MLB , NBA , nfl , NHL , philadelphia , Phillies , sports

I was listening to Sports Radio WIP yesterday as I was driving to get a cup of delicious Dunkin Donuts coffee. I heard one of the hosts talk about how Philly is the worst sports city in all of North America for cities that have teams in all four major sports leagues. A quick Google search turned up this article.

How bad are Philly sports teams?

The Eagles are more or less unwatchable. They’re inventing new ways to lose.
The 76ers have lost 20+ games in a row. That’s really, really hard to do in the NBA.
The Flyers couldn’t score if there was no goalie in the opposing net.
The Phillies…well, they did their best to be one of the worst baseball teams of all-time.

I took the ugly table of numbers from the article and built the interactive dashboard you see below, confirming my worst fears.

Philadelphia Has the Worst Sports Teams in North America
Based on results for the 2015 season as of Nov 16, 2015 for cities with teams in all four major sports

This merely confirms the misery that is being a Philadelphia sports fan.

September 28, 2015

Makeover Monday: How Much More Valuable are NFL Franchises than Other Leagues?

business intelligence , change , Chart of the Day , context , Cork Gaines , Makeover Monday , MLB , NBA , nfl , NHL , sports , variance

A couple weeks ago, Business Insider published a very simple bar chart showing the total value of all franchises for the four major professional sports in the USA. At the Data School, I’m always stressing context in visualisations.

Business Insider’s chart is lacking context, so in today’s makeover, I walk you through a few simple methods for adding context to a simple bar chart.


April 28, 2014

Makeover Monday: What a beer will cost you at every Major League Baseball stadium

bar chart , baseball , beer , Business Insider , Chart of the Day , Cork Gaines , makeover , makeovermonday , MLB , rank , tableau , Tableau Mac 3 comments

Slight change of direction - Makeover Friday is being changed to Makeover Monday. There's a certain twang to it! On to today's makeover.

Anyone that goes to a professional sporting event in the US knows how ridiculously expensive it is to enjoy some frosty goodness at the game. Cork Gaines of Chart of the Day created this bar chart to show the most expensive beers in Major League Baseball stadiums.

The basic problems:

As always, Cork has sorted the chart in the wrong order. Sorting should be based on what you want to emphasize. In this case, the story is about the most expensive beers, so the bars should be sorted in descending order.
A horizontal bar chart would be much easier to read.
Since the beers are not all the same size, it might make more sense to show an alternative view of cost per ounce.

Here's my alternative, created with Tableau for Mac.

Going to a MLB game? How much will a beer cost you?

I've not only addressed the issues I outlined, but I've also made it interactive. You can now answer more questions. Perhaps you're more interested in where you can find the cheapest beer or the best deal (per ounce). This is a much more informative version than Cork's.
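The per-ounce view is just price divided by size. A tiny Python sketch with made-up stadiums and prices shows why it can change the ranking:

```python
def price_per_ounce(price, ounces):
    """Cost of one ounce of beer."""
    return price / ounces

# Made-up (stadium, price in dollars, size in ounces) rows, for illustration only.
beers = [("Stadium A", 6.00, 12), ("Stadium B", 7.00, 20)]

# Most expensive beer first, the order the original story calls for.
by_price = sorted(beers, key=lambda b: b[1], reverse=True)

# Best deal first: lowest cost per ounce.
by_value = sorted(beers, key=lambda b: price_per_ounce(b[1], b[2]))
```

Stadium B has the pricier beer but the better deal per ounce, so the two sort orders disagree.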

Have a better way to display this data? Download the data here and the workbook here and leave a link in the comments.

May 16, 2012

Is drug testing working in baseball? An interactive analysis.

AL , baseball , baseball-reference , Chart of the Day , Cork Gaines , drugs , interactive , MLB , NL , sports , sports chart of the day , statistics , stats , steroids , tableau , Testing 2 comments

Cork Gaines wrote about the HR trend in baseball since testing started for performance enhancing drugs. He presented a chart of the trend (surprisingly effective given his past charts), but he never answered his own question: is testing working?

One way to determine the answer is through comparisons to other statistics.

I downloaded the season averages across both leagues and MLB in total from baseball-reference and built this interactive analysis. The stats are ordered by batting stats then pitching stats.

This viz allows you to compare home runs to many other statistics through the selectors at the top right. In addition you can:

View any two statistics to look for trends by choosing a primary measure and a comparison
Filter the time frame to all years, the pre-testing era, and the testing era (1993+)
Filter the leagues to focus your analysis
Click on a league at the bottom to highlight that league

In this initial view of HR vs. ERA, I see a couple of things:

HR are on a slow descent in the testing era, especially since 2000
ERA is in a similar decline, possibly indicating that improved pitching has had as much of an impact as testing
Batting Average has remained flat. This means that the reduction in HR has not impacted BA.
Teams are simply scoring fewer runs, likely due to the reduction in long balls
The trend in complete games is despicable

What do you see? Play around with the different stats and see if you can draw any conclusions.

April 4, 2012

Let your Umpire Ejection Fantasy League preparations begin!

baseball , ejections , fantasy , interactive , managers , MLB , tableau , UEFL , umpires No comments

If you are interested in fantasy baseball, but want a slightly different take on things, I have just the game for you. Join the Umpire Ejection Fantasy League. I stumbled across this league as I searched for managerial ejection data, but found this umpire-specific data much more interesting. You can download the data here.

Living here in Atlanta for the past 15 years, I’m more than aware of the historical significance of Bobby Cox’s ejections (he’s the all-time leader…or is it last place?), but I wanted to know which umpires draw the most ire from managers today (Bobby retired after the 2010 season).

To help you with your UEFL draft, take a tour of this interactive viz. There are filters on the right side of each sheet to assist you with your own analysis.

The Ejections Summary gives a quick overview of:

The spread of ejections across innings – not surprisingly most ejections occur towards the end of the game
How umpires perform as a whole at the different segments of the game – they’re correct more often at the end of the game as well, leading directly to more ejections
The top 5 reasons for ejections – arguing balls and strikes is an automatic ejection, so there’s no surprise it’s #1

The 2nd sheet, Who to Argue With & When, helps you isolate the specific time when you are most likely to benefit (or not) from an argument. In particular, I like the bar chart on the bottom right. This chart tells you the best umpire and time to get ejected if you want to turn the game from a losing position into a winning outcome.

The last sheet, Which Umpires Eject the Most, is a simple list of the umpires most likely to eject someone and the managers most likely to get ejected. Click on any manager or umpire to see who they get in the most arguments with.

Good luck in your draft!

March 4, 2012

Are teams benefiting from relievers pitching less? A visual analysis.

BABIP , baseball , BB , BF/G , ERA , ERA- , FanGraphs , FIP , FIP- , MLB , stats , strikeout , walk 2 comments

If you love baseball and particularly if you love baseball stats, you need to follow FanGraphs. The depth of the analysis is simply incredible, but one of the things I find lacking is visual analysis. There are often tables and some rudimentary charts, but I think the writing could be enhanced by adding some viz to the terrific explanations of the numbers.

Recently they wrote about the use of relief pitchers in Major League Baseball and whether adding depth to the bullpen resulted in a strong ROI. In this post, I’m going to quote directly from the article, but all of the charts and graphs that supplement the words were created by me.

All of the data that I used can be found here and the Tableau workbook I used to create the charts can be found here.

Batters Faced per Game
“The change in bullpen usage is the biggest difference in the sport now compared to 30 years ago.”

“Despite the fact that modern bullpen roles have been well established for quite a while, the dwindling rate of batters faced per appearance shows no signs of slowing down. While the drop from 1982-1991 was the most extreme, the last two decades have each seen the league shed an additional half a batter per reliever appearance, and given that we’ve seen teams now expand to carrying 13 pitchers at times, there seems to be no end in sight to this trend.”

The article only provides a table, and if the writer had not included the analysis in words, there’s no way anyone would have been able to identify this trend by scanning their eyes across a list of numbers.

The chart above is broken down by decade and by year, and includes three methods for analyzing batters faced per game (BF/G).

BF/G (top lines) – This is simply a trend of batters faced per game over the last 30 years. As the writer points out, the drop in the first decade is the most extreme (1.6 BF/G decline), but the last two decades have each declined more than a half batter (0.6 and 0.5 BF/G respectively).
BF/G vs. 1982 (middle lines) – I wanted to understand how drastically the number of batters faced per relief appearance has really changed from 30 yrs ago. The numbers and trend are truly staggering: a 19.8% decline by 1991, an additional 12.3% decline by 2001, and another 7.5% decline through 2011. That all adds up to an almost 40% decline.
BF/G vs. Start of Decade (bottom lines) – This is similar to #2 except the calculation “resets” each decade. The idea here is to measure how much the BF/G rate has changed within the decade. If the –11% trend from the last two decades continues, you can expect relief pitchers to be facing fewer than four batters per appearance by 2021. So basically, every pitcher would be treated like a closer.
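The difference between the "vs. 1982" and "vs. Start of Decade" methods comes down to which baseline you divide by. Here is a rough Python sketch of both, using made-up BF/G values; the function names are mine, and I'm assuming the baseline for each decade is the first year of that ten-year block (1982, 1992, 2002).

```python
def pct_change_vs_first(series):
    """series: (year, value) pairs sorted by year. Baseline is the first year."""
    base = series[0][1]
    return [(year, (value - base) / base) for year, value in series]

def pct_change_by_decade(series, first_year):
    """Same calculation, but the baseline resets every ten years from first_year."""
    out, bases = [], {}
    for year, value in series:
        decade = (year - first_year) // 10  # 0 for 1982-1991, 1 for 1992-2001, ...
        base = bases.setdefault(decade, value)  # first value seen in the decade
        out.append((year, (value - base) / base))
    return out

# Made-up BF/G values, for illustration only.
series = [(1982, 10.0), (1991, 8.0), (1992, 8.0), (2001, 6.0)]
vs_1982 = pct_change_vs_first(series)
vs_decade = pct_change_by_decade(series, first_year=1982)
```

With these numbers, 2001 is 40% below 1982 but only 25% below the start of its own decade.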

Wow! Bullpen strategy sure has changed!

Walk and Home Run Rates
“With pitchers facing fewer batters, you’d expect them to be able to throw harder and exploit platoon advantages for better results overall. The trade-off should be more quality for less quantity.”

“Looking at the numbers, we don’t really see much evidence that the modern bullpen has helped relievers perform better at all.”

“Over the last thirty years, walk rates by relievers are essentially unchanged. They went up a bit when the home run barrage took over the late-1990s, but have gone back down as home runs have become less common.” (top two lines)
“The ratio of walks to home runs is pretty steady and consistent over the last thirty years, and there’s certainly no evidence that the modern day bullpen has helped pitchers avoid the base on balls.” (bottom lines)

Strikeout Rates
“On the other hand, strikeout rate has skyrocketed, increasing by 40% since 1982. This would seem to support the idea that relievers can be more effective in shorter stints, and that playing the match-ups can help prevent run scoring.”

I have broken down the strikeout rates similarly to the BF/G rates.

K% (top lines) – This is simply a trend of strikeout percentage over the last 30 years. K% has been on a steady increase of about 2-3% over each of the last three decades.
K% vs. 1982 (middle lines) – As the writer noted, the strikeout rate has increased 40% since 1982, with the biggest increase from 1992-2001 of 18.4%. But he also goes on to explain this:

“While (starting pitchers’) strikeout rate has been raising at the same time that the modern bullpen has been evolving, this seems to be a case where correlation is not causation. If starters are seeing the same rise in strikeout rate, that points to a more fundamental shift among hitters – more sluggers swinging for the fences, the rise in acceptance of the strikeout as just another out among organizations – rather than a specific benefit being given to relievers from their new roles.”
K% vs. Start of Decade (bottom lines) – Again, this is similar to #2 except the calculation “resets” each decade. This provides stronger evidence of the “swinging for the fences” effect of the late-1990s; strikeout rates increased 19% from 1992-2001.

BF/G vs. K%
We’ve seen the writer discuss BF/G and K% rates, but do these two have a relationship? When I look at relationships between two stats, I like to look at them two ways: (1) a dual axis line chart and (2) a scatter plot.

The strikeout rate for relievers is clearly correlated to batters faced per appearance. As BF/G goes down, K% goes up. This is clear and easy to understand in both of these charts. This would have been a nice nugget for the writer to include.
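
If you want a number to go with the picture, a correlation coefficient is the usual companion to a scatter plot. Below is a rough sketch of a Pearson correlation in plain Python. The `pearson_r` helper and the two short lists are assumptions for illustration only; the lists are made-up placeholder values, not the real BF/G and K% series.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values only (NOT the real data): BF/G falling while K% rises.
bf_per_game = [7.0, 6.5, 6.0, 5.5, 5.0]
k_pct = [15.0, 16.0, 17.5, 18.0, 20.0]

r = pearson_r(bf_per_game, k_pct)
```

A value of `r` close to -1 would indicate the kind of inverse relationship described above: as one stat goes down, the other goes up.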

BABIP and HR Rates
“Likewise, it doesn’t appear that relievers are really generating much of a benefit when hitters do put the bat on the ball.”

The writer makes a few notes about the stats, but I don’t agree completely.

“Home run rates have risen at a similar rate as what starting pitchers have experienced.” Ok, nothing to argue with here. I have to take his word for it since I don’t have the data for starting pitchers.
“Batting average on balls in play has increased significantly over the years.” I slightly disagree here. BABIP has only increased 11 points, or 4%. Is that significant? I don’t think so.

Let’s extend the analysis a bit farther. Let’s look again at the relationship between the two stats to find correlations.

When looking at the two lines together, there isn’t a clear correlation, like the obvious inverse relationship between BF/G and K%. But what is interesting is when you plot them on a scatter plot. I added the averages across all 30 years to each axis to make a nice quadrant chart. The R-Squared is only 0.688, but what sticks out to me is how the years within each decade cluster together nicely, for the most part.

Nine of ten years from 1982-1991 were below average for both BABIP and HR/9, with the tenth year also below the average BABIP.
Eight of ten years from 1992-2001 were above the average BABIP with seven of those years also above the average HR/9 (remember, there was a significant increase in HRs in the late-1990s). Note how much higher the BABIP and HR/9 rates are for the years above average.
From 2002-2011, nine of ten years were above average for both BABIP and HR/9, but not nearly as far above average as 1992-2001. Notably, the HR/9 rate fell from 1.03 in 2006 to 0.85 in 2011, a 17.5% decrease (this can be seen in the line chart).
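
The quadrant logic and the percent decrease are both simple enough to sketch in a few lines of Python. The HR/9 figures (1.03 and 0.85) come straight from the text above; the `quadrant` helper, its labels, and the average values passed to it are hypothetical and only there to show the idea.

```python
def pct_change(old: float, new: float) -> float:
    """Percent change from old to new (negative means a decrease)."""
    return (new - old) / old * 100


def quadrant(babip: float, hr9: float, avg_babip: float, avg_hr9: float) -> str:
    """Label a season by where it falls relative to the two average lines."""
    horiz = "above" if babip > avg_babip else "at or below"
    vert = "above" if hr9 > avg_hr9 else "at or below"
    return f"{horiz} average BABIP, {vert} average HR/9"


# HR/9 values quoted in the post: 1.03 in 2006 and 0.85 in 2011.
hr9_change = pct_change(1.03, 0.85)

# Hypothetical season and averages, purely to illustrate the quadrant labels.
label = quadrant(babip=0.300, hr9=1.00, avg_babip=0.290, avg_hr9=0.90)
```

Rounding `hr9_change` to one decimal place gives the decrease quoted above.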

ERA- and FIP-
“If you look at (ERA- and FIP-), there’s just no evidence that bullpens are preventing runs at a better rate now than they were before the current roster construction norms came along. Any improvements in quality of performance by the elite relievers have been offset by the fact that more innings are now being given to inferior arms, so the trade-off has essentially resulted in a change of no real benefit.”

If you truly trust the writer, then your only choice is to take him at face value here. Me though, I like to “see” the data. I’ve done a couple of things here to quantify the data, but first, two notes.

For ERA- and FIP-, values below 100% are better than the league average. The lower the number, the better. Think of them like an index. If the ERA- is 95%, then that means it’s 5% better than the league average.
Notice that the axes range from 80-120%. This was done to emphasize the lack of significant year to year variances.
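
To make the index idea concrete, here is a tiny Python sketch that turns an ERA- or FIP- value into "percent better or worse than league average". The function name is my own; the 95 example is the one from the note above.

```python
def pct_better_than_league(index_value: float) -> float:
    """Convert an ERA- or FIP- style index (100 = league average) into a
    percent difference. Positive means better than average, negative worse."""
    return 100.0 - index_value


era_minus_example = pct_better_than_league(95.0)   # the 95% example from the note
league_average = pct_better_than_league(100.0)     # exactly average
```

Because lower is better for these stats, an index above 100 comes back as a negative number, meaning worse than the league average.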

For this particular chart I have:

Added a reference line at 100% to remind us that this is the average
Synchronized the axes so that you can see how ERA- and FIP- compare to each other
Added color bands below and above the average to indicate levels of goodness and badness. That is, the darker the green, the better, and the darker the tan, the worse.

Now, after having “seen” the data, I agree with the writer that “there’s just no evidence that bullpens are preventing runs at a better rate now than they were before the current roster construction norms came along.”

So what do you think? Do these charts and graphs make it easier to interpret the stats? Do they help tell the story more effectively?
