Google has been pushing the need for speed on the web for some time now, and more and more articles are being written about how speed affects search engine optimization. I have always contended that the faster and smaller a page is, the better, especially for smaller, newer sites. Consider Google's web crawlers: as large as Google is, the web is far bigger, and the crawlers can only handle so much data at any given time. The amount of data they process is somewhere between absurd and ridiculous, but it is still not unlimited. Google knows which sites people want to see and which sites people visit, and it indexes those sites more often. This is why a site like Yahoo! can change and be re-indexed in Google almost instantly, while smaller sites have to wait hours, days, or even weeks to get indexed.
In the SEO field, getting pages indexed is half the battle, especially on large sites. If Google allocates only so much crawl time to a website, you had better make sure your pages are as fast as they can be. If your server is a little slow and your pages are bloated, then instead of crawling 30 pages, Googlebot may get through only 15, because each page took that much longer to download and process.
Based on this theory, we spent the past few weeks optimizing queries and reducing database calls to improve page load times. The amount of data we have is large compared to the site's current reach, and as we grow and expand resources we have to keep treating speed as a priority. Instead of throwing hardware at the problem, we started by optimizing code. As a result, we have seen a large increase in pages indexed, in correlation with faster page downloads. We also recently overhauled our URL structure completely, resulting in a gain of thousands and thousands of pages in the index, with hopes of more in the future. Take a look at the graph below to see the correlation between page speed and pages crawled per day.