Google: A Behind-the-Scenes Look
I just found this webcast of a lecture on the technical details of Google. It is probably not of much general use, but it is very interesting to a geek like me.
One thing it did highlight for me is just how much processing power they have, and how easy it is for them to access and experiment with it. In August 2003 they ran almost 30,000 jobs, accounting for 217 years of CPU time and reading 3,288 TB of data. Yet, on average, each job lasted just over 10 minutes.
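To put those numbers in perspective, here is a rough back-of-envelope calculation. It is only a sketch: I round "almost 30,000" to 30,000 and "just over 10 minutes" to 10, and I assume the CPU time was spread evenly across jobs and across each job's running time, which the lecture does not actually say. The function name `job_stats` is just something I made up for illustration.

```python
# Back-of-envelope arithmetic on the rounded figures quoted above.
# Assumption: CPU time is spread evenly over jobs and over each job's run time.

MINUTES_PER_YEAR = 365 * 24 * 60  # ignoring leap years


def job_stats(jobs, cpu_years, data_tb, wall_minutes):
    """Return (CPU-minutes per job, implied CPUs busy per job, GB read per job)."""
    cpu_minutes_per_job = cpu_years * MINUTES_PER_YEAR / jobs
    # If a job burns this much CPU time in wall_minutes of real time,
    # roughly this many CPUs must have been working on it at once.
    implied_cpus = cpu_minutes_per_job / wall_minutes
    gb_per_job = data_tb * 1000 / jobs  # using 1 TB = 1000 GB
    return cpu_minutes_per_job, implied_cpus, gb_per_job


cpu_min, cpus, gb = job_stats(jobs=30_000, cpu_years=217,
                              data_tb=3_288, wall_minutes=10)
print(f"CPU-minutes per job: {cpu_min:.0f}")
print(f"Implied CPUs per job: {cpus:.0f}")
print(f"Data read per job:   {gb:.1f} GB")
```

On these rounded inputs the average job works out to a few thousand CPU-minutes squeezed into about ten minutes of real time, so each job must have been spread over hundreds of processors at once. Treat it as an order-of-magnitude estimate rather than a real measurement.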
With the recent talk of LSI, I had been running some tests myself and had pretty much convinced myself that it would be too processor-heavy to use: a test on 2,000 product descriptions took 5 hours to compute. Having seen that presentation, though, it seems clear to me that they are using LSI or something that produces very similar results.
There are a couple of related papers:
MapReduce & Google File System