Thursday, December 30, 2004

H.R. 4818 Part 1

The Omnibus Spending Bill for Fiscal Year 2005 is known as H.R. 4818. Here are some quick statistics on this bill.


It consists of:

  • 685 pages
  • 7,675 paragraphs
  • 27,869 lines

Reading at an average rate of 350 words per minute, it would take you about 14 hours to read the whole thing. I can assure you, I haven't even gotten 10% of the way through it yet. Not sure I'm that much of a masochist.

I did start to see some patterns in the bill, which I wanted to explore with an actual analysis of the individual values authorized for the various expenditures. I noticed that $1,000,000 was a very frequent dollar value. In fact, the vast majority of all dollar values were nice round numbers, as in $1,000,000, $2,000,000 and so on. I suspected that $1,000,000 was the most common amount for an expenditure.

I started doing a manual count (using the browser's Find function) of the text '$1,000,000' within the bill and lost count several times during the exercise. Frustrated, I went ahead and imported the full text of the bill into Microsoft Word and wrote a Word macro to filter out all the numbers in the text. This filtered list was then imported into Microsoft Excel, which allowed me to do some analysis on the individual expenditure values.
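For anyone who would rather not fight with Word macros, the same extract-and-count step can be sketched in a few lines of Python. This is not the macro I used, just a rough equivalent. The regular expression, the function name `count_dollar_values` and the sample text are all assumptions made for illustration; the pattern only handles whole-dollar amounts and ignores cents.

```python
import re
from collections import Counter

# Assumed pattern: a dollar sign followed by either comma-grouped digits
# (e.g. $1,000,000) or plain digits (e.g. $50). Cents are ignored.
DOLLAR_PATTERN = re.compile(r"\$(\d{1,3}(?:,\d{3})+|\d+)")


def count_dollar_values(text):
    """Count how many times each whole-dollar amount appears in the text."""
    amounts = (int(match.replace(",", "")) for match in DOLLAR_PATTERN.findall(text))
    return Counter(amounts)


# Hypothetical sample text, not actual language from the bill.
sample = (
    "$1,000,000 for project A; $2,000,000 for project B; "
    "$1,000,000 for project C; and a fee of $50."
)

counts = count_dollar_values(sample)
for amount, frequency in counts.most_common():
    print(f"${amount:,} appears {frequency} time(s)")
```

Run against the full text of the bill, `counts.most_common()` would give the same kind of frequency list I built in Excel.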

Sure enough, $1,000,000 was the most frequent expenditure. In fact, $1,000,000 was referred to 117 times, or roughly 7% (117 of 1,650) of all references to dollar values that appear more than once in the document. I'll note that not all the dollar values were specifically for expenditures. Sometimes a value of $50, for example, established a fee amount as opposed to a grant or expenditure.

More food for thought on these non-unique (or frequently occurring) dollar values: dollar values that appear more than once were referred to a total of 1,650 times. Those 1,650 references cover 225 distinct dollar values.

A quick Pareto analysis (the 80/20 rule) on these 225 distinct values shows that the top 38% of them by frequency count (86 of 225) account for 80% of all non-unique references (1,321 of the total 1,650).
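The Pareto cutoff is easy to compute once you have a frequency count. Here is a minimal Python sketch, assuming the counts live in a `Counter` keyed by dollar value. The function name `pareto_cutoff` and the small example counts are hypothetical; only the 86, 225, 1,321 and 1,650 figures come from my spreadsheet.

```python
from collections import Counter


def pareto_cutoff(counts, share=0.80):
    """Return (k, covered, total), where k is the smallest number of values,
    taken in order of decreasing frequency, whose references cover at least
    `share` of all references."""
    total = sum(counts.values())
    covered = 0
    for k, (_, frequency) in enumerate(counts.most_common(), start=1):
        covered += frequency
        if covered >= share * total:
            return k, covered, total
    return len(counts), covered, total


# Hypothetical counts for illustration only.
example = Counter({1_000_000: 5, 2_000_000: 3, 500_000: 1, 50: 1})
k, covered, total = pareto_cutoff(example)
print(f"Top {k} of {len(example)} values cover {covered} of {total} references")

# The figures from the bill, checked directly.
print(f"{86 / 225:.0%} of the values, {1321 / 1650:.0%} of the references")
```

In the example, two of the four values are enough to cover 8 of the 10 references.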

So about 7 out of every 100 references to a repeated budget number were to $1,000,000. The frequently used values like $1,000,000, $2,000,000 and $500,000, which together account for 80% of the repeated references, make up almost 40% (86 of 225) of the distinct non-unique numbers.

Essentially, a few numbers were used very frequently. The total dollar value of this frequently referenced sub-population of budget numbers was $22,421,572,500. The largest value referred to more than once was $35,000,000,000, which appears twice: once for the Highway Budget and again with regard to the National Housing Act.

The statistical average dollar value of this sub-population of budget numbers is $515,646,599. The statistical median dollar value among the values referred to more than once is $8,000,000, which appears 15 times.
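The gap between the average and the median is what you would expect when a handful of huge values sit alongside a lot of modest ones. A small Python sketch with made-up values shows the effect. The list below is hypothetical, and computing both statistics over the distinct values (rather than weighting by frequency) is an assumption of this example.

```python
from statistics import mean, median

# Hypothetical distinct dollar values: mostly modest, one very large.
values = [100_000, 1_000_000, 8_000_000, 25_000_000, 35_000_000_000]

average_value = mean(values)
median_value = median(values)

print(f"Average: ${average_value:,.0f}")
print(f"Median:  ${median_value:,.0f}")
```

One $35,000,000,000 entry drags the average into the billions while the median stays at $8,000,000.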

Here's a quick rundown of the next most frequently referenced dollar values (Dollar Value/Frequency):

  • $2,000,000/94
  • $500,000/66
  • $5,000,000/54
  • $10,000,000/53
  • $100,000/45
  • $4,000,000/37
  • $3,000,000/31
  • $25,000,000/29

As one can see, these are all nice round numbers. Any guess as to whether these values were rounded down or up from the original funding requests?

In my next post I'll start looking a bit closer at some of the actual expenditures by looking closely at those nice round $1,000,000 amounts.


