Free Super-Crunching Software

I probably have an unhealthy attraction to the powers of Excel. I taught my daughter how to use it when she was 7. When I teach corporate finance, I try to make sure that my law students come away from the course knowing how to crunch in Excel.

It would be embarrassing to teach students how to use Microsoft Word in a law-school course; but one of the goals of my corporate finance course is to make sure that they can comfortably manipulate its numerical cousin.

A middle-school math teacher recently told me that there are some things you can do on a graphing calculator that you just can’t do in Excel. I’m pretty sure (like 99 percent sure) that this is not true. In fact, Microsoft has expanded the functionality of Excel so that it’s starting to invade the domain of statistical packages.

The just-published (shameless plug) paperback edition of Super Crunchers has a new chapter that describes several different free tools that make it easier and easier to crunch numbers.

1. Microsoft has a new data-mining add-in that lets you run all kinds of cool statistical procedures inside Excel. Taking a page from the Google playbook, Microsoft is just giving this add-in away (but it only works if you’ve purchased the Office 2007 version of Excel).

2. Google (taking a page from its own playbook) is giving away its Website Optimizer, which will let you run randomized experiments on your own web page.

Any webmaster who is not running randomized trials on different page content is making a serious mistake.

Here’s an explanatory video. I’ve used the Website Optimizer myself and it is a joy to use.

3. I’ve created and assembled links to a bunch of cool “prediction tools” that let you plug in a few numbers and predict how long you’ll live, predict your due date (if you’re pregnant), rate the quality of a book title, or even predict political or sporting contests.

One of the cool things about these tools is that they provide feedback on the precision of predictions that is easy to digest. When you see the results of an experiment like this one below, you have a pretty clear idea of not only the winner, but of how confident you should be in the results.

INSERT DESCRIPTION

(As with all other statistical tests, you should not just blindly accept the p-values in the print out, but these graphics are still a huge leap forward.)

A fourth freebie is the open-source statistical package called “R.” While most members of the Freakonomics crowd tend to use Stata as their statistical package of choice (and businesses tend to run SAS or SPSS), R is the Linux of statistical software. It lets you do an awful lot for free.

Of course, having mastered the commands of Stata and SAS, I have poor incentives to learn the commands of a new (GNU) software. And R is probably not kept up to speed on the cutting-edge empirical methods as quickly as the traditional packages. (I should disclose that SPSS and SAS have paid me handsomely to give Super Crunching talks, so I may not be the most objective observer.)

But then again, R has plenty of power to run the vast majority of statistical techniques. There is still a huge discrepancy between the techniques that are used by academics and those used in business.

In fact, here’s a Super Crunchers bleg: Can anyone identify an instance where a business has run an instrumental-variables regression?

The I.V. approach has been around for decades and is a standard (if misused) technique in hundreds, if not thousands, of academic articles. But provocatively, I’d almost bet that it has never yet been used by a corporation to help make a business decision. We’ll send some Freakonomics schwag to the first person who can prove me wrong.

Leave A Comment

Comments are moderated and generally will be posted if they are on-topic and not abusive.

 

COMMENTS: 59

  1. alex says:

    It’s not just a divide betwee busiess and academics.

    There is also a divide between academics and policy people in government.

    And the on that really gets me is how simplistic the analytical methodogy of most published papers in most fields are. Perhaps this is not true of economics — not my field — but the linear regression rules in most other fields.

    Thumb up 0 Thumb down 0

  2. jonathan says:

    Cool. I always wanted to be able to predict Abalone Age.

    Thumb up 0 Thumb down 0

  3. Scott says:

    I’ve heard that Google uses instruments when evaluating the effectiveness of new tools given to advertisers to optimize their AdSense ROI. Selection may occur when unobservable characteristics of the advertisers are related to both outcomes and the decision to use the tool.

    Thumb up 0 Thumb down 0

  4. Damon says:

    I’m also a business SAS user with experience in Credit Risk and Marketing. Both environments are heavy SAS due to acheiving a critical mass in expertise a long time ago. While I agree that SAS requires a lot of training up front compared to R the two industries prefer them for a couple reasons. 1) R is open source and I just don’t think financial companies want to risk their analyses accurateness on open source software that they have to justify to various regulatory bodies. They prefer other risks. 2) The analysts themselves perpetuate their use. It’s an expensive barrier to entry between formal education then adding SAS technical training. This keeps the supply of analysts low and our salaries high.

    As for IV? Can’t say that I’ve used it, then again my bread and butter is logistic.

    Thumb up 0 Thumb down 0

  5. Alex says:

    Excel? S.A.-what?

    Matlab is the way to go.

    Thumb up 0 Thumb down 0

  6. Ankur Goel says:

    Excel can solve complex expressions for your choice of variables – it’s the add-in called “solver” (you have the option to install it when you install Excel), which will use a simplex solver for linear programs, and a non-linear solver for non-linear problems. Unlike your calculator, it will solve for integer variables and can have multiple constraints on your variables (such as bounds). Uber-cool…

    As for integrals and derivatives – Excel cannot do this out of the box, but those numerical methods are relatively simple to program as VBA macros.

    Thumb up 0 Thumb down 0

  7. Chewxy says:

    I’ve raised this question to my math and stat friends (who also happen to be champions of free software) – why use R when EViews or Stata gives you beautiful results with a click of a button? (yes, R is free, but I value my time more :D )

    As for IV regression, my startup (pressyo.com, launching October 10th) uses it in the business process. Then again we are regressing to find causes of market fluctuations of sorts. It’s more of a background thing than a research commissioned to find causes.

    Thumb up 0 Thumb down 0

  8. Andy says:

    When discussing Excel’s limitations, it is useful to remember that Excel allows creation of macros using VBScript. Technically, this makes its actual limitations more akin to the limitations of the computer itself. When you’re talking about things that Excel “can’t do”, you really mean things that it can’t do out of the box, or things that it can’t do quickly, or things it can’t do (INSERT ADVERB HERE). If you can program VBScript, or know someone who can, you can make Excel do just about anything you want.

    As evidence of Excel’s surprising capabilities, and as an attack against everyone’s productivity, I present:

    http://www.excelgames.org/

    Thumb up 0 Thumb down 0