Thursday, July 7, 2011

The + in Google+ is Temporary

It seems obvious to me now that the + in Google+ is just temporary. Google+ is the next major version of Google. It is Google 2.0, a grand unification of all of their products which we will soon collectively refer to as just Google.

Videos, pictures, search, and mail are becoming features of the product called Google. It won't be long until people start saying things like "Did you see the pictures I posted to Google?"

The + is going away. And it is going to happen very fast.

Sunday, April 10, 2011

Optimizing a Batting Order With Brute Force Code

It is springtime, and for me that means another softball season. With it come many of the same bar arguments my softball buddies and I have year after year. Most often these arguments are about softball strategy. Yes, we like to have fun in our co-ed softball league, but we also like to win! One of the more common debates we have is how best to optimize our batting order. The two competing strategies on the team are:
  • strategy #1: stack the lineup - put all your best players together (up front) so we have a few very strong innings at the expense of more weak innings.
  • strategy #2: spread the lineup - mix your best players with the weaker players so you reduce the number of very weak innings.
Today I decided to try to finally settle the issue by doing one of the few things I enjoy as much as playing softball: writing code.

I figured it shouldn't be too hard to write a program to simulate a bunch of softball games with various lineups and report back the average runs scored per game per lineup. Then we can look at the best and worst performing lineups and see what it tells us.

Obviously no simulator is perfect. You must always make simplifications and assumptions. With that said, here are the basic rules I used to develop my simulator:
  • A player is defined by 1) a batting average and 2) a "hit profile".
  • The batting average defines the probability he/she will get a hit (instead of an out).
  • If the batter gets a hit the "hit profile" specifies the probability of getting each type of hit.
  • I defined 5 different types of hits:
      • bloop single - advances everyone 1 base
      • line drive single - advances everyone 2 bases (except the batter who stops at 1st base)
      • double - advances everyone 2 bases
      • triple - advances everyone 3 bases
      • homerun - advances everyone 4 bases
  • For each lineup I simulate 100 games and calculate the average runs scored per game.
  • I defined 5 general classes of players:
      • "power" - .600 average, 45% singles, 30% doubles, 5% triples, 20% homeruns
      • "hitter" - .500 average, 70% singles, 30% doubles
      • "avg" - .400 average, 80% singles, 20% doubles
      • "belavg" - .300 average, 100% singles
      • "weak" - .200 average, 100% singles
  • On my roster I defined 11 players:
      • 2 "power" hitters
      • 1 "hitter" hitter
      • 2 "avg" hitters
      • 3 "belavg" hitters
      • 3 "weak" hitters
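To make the rules above concrete, here is a minimal Python sketch of one way they could be coded. It is not the code I actually ran, and it leans on a few assumptions the rules do not spell out: a game is 7 innings of 3 outs each, every out is a plain batter out (no outs on the bases), and each class's "singles" percentage is split evenly between bloop and line drive singles. All of the names (`PLAYER_CLASSES`, `advance`, `simulate_game`, `average_runs`) are made up for this illustration.

```python
import random

# Hit type -> (bases each runner advances, base the batter ends up on).
HIT_TYPES = {
    "bloop_single": (1, 1),
    "line_drive_single": (2, 1),
    "double": (2, 2),
    "triple": (3, 3),
    "homerun": (4, 4),
}

def profile(singles, doubles=0.0, triples=0.0, homeruns=0.0):
    """Build a hit profile. ASSUMPTION: singles split 50/50 between the two kinds."""
    return {
        "bloop_single": singles / 2,
        "line_drive_single": singles / 2,
        "double": doubles,
        "triple": triples,
        "homerun": homeruns,
    }

# Player class -> (batting average, hit profile), as listed in the rules.
PLAYER_CLASSES = {
    "power": (0.600, profile(0.45, 0.30, 0.05, 0.20)),
    "hitter": (0.500, profile(0.70, 0.30)),
    "avg": (0.400, profile(0.80, 0.20)),
    "belavg": (0.300, profile(1.00)),
    "weak": (0.200, profile(1.00)),
}

def advance(bases, runner_advance, batter_base):
    """Apply a hit. bases is [first, second, third] as booleans.

    Returns (new_bases, runs_scored).
    """
    runs = 0
    new_bases = [False, False, False]
    for index, occupied in enumerate(bases):
        if occupied:
            target = index + 1 + runner_advance  # 4 or more means home
            if target >= 4:
                runs += 1
            else:
                new_bases[target - 1] = True
    if batter_base >= 4:
        runs += 1
    else:
        new_bases[batter_base - 1] = True
    return new_bases, runs

def simulate_game(lineup, rng, innings=7):
    """Play one game and return runs scored. ASSUMPTION: 7 innings, 3 outs each."""
    runs = 0
    at_bat = 0  # the batting order carries over from inning to inning
    for _ in range(innings):
        outs = 0
        bases = [False, False, False]
        while outs < 3:
            average, hit_profile = PLAYER_CLASSES[lineup[at_bat % len(lineup)]]
            at_bat += 1
            if rng.random() >= average:
                outs += 1
                continue
            hit = rng.choices(list(hit_profile), weights=list(hit_profile.values()))[0]
            bases, scored = advance(bases, *HIT_TYPES[hit])
            runs += scored
    return runs

def average_runs(lineup, games=100, seed=0):
    """Average runs per game for one lineup over the given number of games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games
```

A lineup here is just a list of class names in batting order, so something like `average_runs(["power", "hitter", "avg"] + ["weak"] * 8)` scores one ordering. With only 100 games per lineup the averages will be noisy, so small differences between lineups should be read with some caution.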
With 11 players on the team we have almost 40 million different possible lineups (11! = 39,916,800). My simulation tried ALL of them. It took about two hours to run on my MacBook Pro (2.66 GHz Core 2 Duo with 8 GB RAM) and produced a 4.8 GB output file.
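One thing worth noting about that number: assuming a player is fully described by his/her class (as in the rules above), swapping two players of the same class gives a lineup the simulator cannot tell apart from the original. The short sketch below counts both the raw orderings and the distinct class arrangements. The `ROSTER` name is just for illustration.

```python
from collections import Counter
from math import factorial

# The 11-player roster, described by class only.
ROSTER = ["power"] * 2 + ["hitter"] + ["avg"] * 2 + ["belavg"] * 3 + ["weak"] * 3

# Every ordering of 11 individual players.
total_orderings = factorial(len(ROSTER))

# Orderings that differ only by swapping same-class players are equivalent,
# so divide by the factorial of each class count (multiset permutations).
distinct_lineups = total_orderings
for count in Counter(ROSTER).values():
    distinct_lineups //= factorial(count)

print(total_orderings)   # 39916800
print(distinct_lineups)  # 277200
```

So the brute force run effectively samples each of the 277,200 distinct arrangements 144 times (2! x 1! x 2! x 3! x 3! = 144), at 100 games each.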

And the results?

The worst performing lineup scored an average of 2.57 runs per game. This losing lineup (with batting averages in parentheses) was:
  1. belavg (.300)
  2. hitter (.500)
  3. avg (.400)
  4. weak (.200)
  5. weak (.200)
  6. weak (.200)
  7. avg (.400)
  8. belavg (.300)
  9. belavg (.300)
  10. power (.600)
  11. power (.600)
The best performing lineup scored an average of 6.4 runs per game. This winning lineup (with batting averages in parentheses) was:
  1. power (.600)
  2. hitter (.500)
  3. avg (.400)
  4. avg (.400)
  5. power (.600)
  6. belavg (.300)
  7. weak (.200)
  8. weak (.200)
  9. weak (.200)
  10. belavg (.300)
  11. belavg (.300)

For now I will reserve judgement and just report the findings. What do you think?

The code I developed for this simulator can be found here:

Note: Two important factors that my simulation currently ignores are player speed and gender:
  • player speed - the simulation currently assumes that all players advance around the bases at the same speed, which obviously does not match reality. A slow base runner may hold a hitter to a single or double when he/she might have had a triple or homerun with faster runners on base. Slow runners on base will also lower the current batter's batting average, since the base runners are more likely to get forced out at their next base.
  • gender - our league has a few rules with respect to gender that are not currently simulated. One rule, for example, penalizes a team for "walking a male player to get to a female player". Whether it was intentional or unintentional, if a male walks and the next batter is female, the male gets two bases instead of one. This effectively increases the likelihood that male batters who hit directly before females end up with doubles.
These may be supported in version 2 of my sim. :-)
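As a starting point for the gender rule, the walk penalty itself is easy to express. The sketch below only encodes the rule as described above; the function name and the "M"/"F" labels are hypothetical, and the current simulation has no notion of walks at all, so a walk probability per player would still need to be added.

```python
def bases_awarded_on_walk(batter_gender, next_batter_gender):
    """Bases a walked batter receives under the league rule described above.

    Genders are given as "M" or "F" (labels chosen for this sketch).
    """
    # A male batter walked directly ahead of a female batter gets two bases.
    if batter_gender == "M" and next_batter_gender == "F":
        return 2
    # Every other walk is an ordinary one-base walk.
    return 1
```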