Baseball Prediction

(grades 7-12)

 
Introduction:    In Major League Baseball, a number of statistics are kept for each team.  Perhaps the most important team statistic is the winning percentage (the percentage of total games won).  Your team has not had a good season for a couple of years, and your owner has decided to invest some serious money in building the team.  She has charged your group with determining which of the following players would do the most to improve the winning percentage:  a home run hitter, a high-average hitter, a hitter who bats in more runs, a base stealer, or a pitcher with a low earned run average (ERA).  In this webquest, you will examine the relationship between each of these statistics and the winning percentage to determine which statistic is most closely related to it.

Task:  Using internet resources, you will obtain the team statistics for at least ten teams and use correlation to determine which other statistic is most closely related to the winning percentage.  Once you have determined the most closely related statistic, find the player who led the league in that statistic, and use linear regression to predict your team's winning percentage with that player on the roster.
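
As a rough illustration of the correlation part of the task, here is a minimal Python sketch.  The function name pearson_r, the dictionary layout, and all of the numbers are assumptions made up for this example (they are not real team statistics); a statistical calculator will report the same r value for the same data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented placeholder numbers for five imaginary teams (NOT real MLB data).
win_pct = [0.400, 0.450, 0.500, 0.550, 0.600]
stats = {
    "home_runs": [150, 140, 180, 170, 200],
    "era": [4.80, 4.50, 4.20, 4.10, 3.60],
}

# Correlate each statistic with winning percentage.
correlations = {name: pearson_r(values, win_pct) for name, values in stats.items()}

# "Most closely related" means farthest from zero, so compare absolute values.
strongest = max(correlations, key=lambda name: abs(correlations[name]))

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
print("Most closely related:", strongest)
```

With these made-up numbers, ERA comes out with a negative correlation (a lower ERA goes with more wins), which is why the comparison uses the absolute value of r rather than r itself.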

Resources:

Major League Baseball:  http://www.majorleaguebaseball.com/
Baseball Statistics Archive (USA Today):  http://www.usatoday.com/sports/baseball/sbstats.htm
Current Baseball Stats (CNN SI):  http://www.cnnsi.com/baseball/mlb/teams/

Process:

  1. Determine which ten teams you will study.
  2. For each of the ten teams, use internet resources to find the following team statistics:  Winning Percentage, Home Runs, Batting Average, RBIs, Stolen Bases, and ERA.
  3. Using a statistical calculator, find the correlation of each of the other five statistics with winning percentage.  You will also want to obtain the regression equation for each, where winning percentage is the y (dependent) variable and the other statistic is the x (independent) variable.
  4. Identify the variable that is most strongly related to winning percentage, that is, the one whose correlation is farthest from zero, whether positive or negative.
  5. Use internet resources to determine which player led the league (your choice:  National or American) in this statistic.
  6. Use the regression equation obtained in step 3 to predict the winning percentage, given that your team obtains this player.  (That is, his stat becomes the x variable in the equation, and you are solving for y.)
  7. Prepare an oral argument for the owner that explains your reasoning and convinces her that your team needs this player.
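
The regression and prediction in steps 3 and 6 can be sketched in Python as follows.  This is only one way to do it, under stated assumptions:  the helper names fit_line and predict, the ERA and winning-percentage lists, and the value 3.90 are all invented for illustration and are not real statistics.  Because the equation is built from team statistics, the x plugged in here is assumed to be a team-level value.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x on the fitted line."""
    return slope * x + intercept


# Invented placeholder numbers for five imaginary teams (NOT real MLB data).
team_era = [4.80, 4.50, 4.20, 4.10, 3.60]          # x: independent variable
win_pct = [0.400, 0.450, 0.500, 0.550, 0.600]      # y: dependent variable

slope, intercept = fit_line(team_era, win_pct)

# Hypothetical x value assumed for the team after acquiring the player.
hypothetical_era = 3.90
predicted = predict(slope, intercept, hypothetical_era)

print(f"win_pct = {slope:.4f} * ERA + {intercept:.4f}")
print(f"Predicted winning percentage at ERA {hypothetical_era}: {predicted:.3f}")
```

The slope is negative for this made-up ERA data, so a lower x produces a higher predicted winning percentage.  Predictions are most trustworthy when x stays within the range of the values used to build the equation.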


Conclusion:
 

In this activity, you will use the internet as a data source for a problem-solving activity.  You will connect baseball data to analysis and representation, use your reasoning skills to reach a conclusion based on the data, and communicate your conclusions in an oral report.

Regression Worksheet Sample