Wednesday, April 09, 2008

Google takes over the Internet

Let me bring you up to speed.

Ok, that's not exactly hyperbole. While Google has been essentially the most popular first click on the Internet, it has done so as an honest broker. It has said that pages sink or swim in its search results on the basis of algorithms, not favors for favors. It is a road sign on the (sigh) information superhighway, rarely an offramp or destination.

There are some exceptions. Google has steadily built a stable of productivity applications: Gmail, Gcal (the calendar), Google Maps, Google Docs and Spreadsheets (think Word and Excel, less powerful, but collaborative). They also own some content, like YouTube (and Google Video), and of course Blogger.

You can find the bleeding-edge stuff at labs.google.com. Google 411 is a fascinating example. You can call a toll-free number from anywhere, 1-800-GOOG-411, talk to a robot, search for a business by type, then get connected to the business without putting down the phone. The whole service is always free.

Why is Google doing Google 411? No, no, it's not just to put regular 411 out of business, which it totally would if anyone knew about it. There's a much awesomer reason to do it. While you talk to the robot, your voice data is being collected to train speech recognition algorithms. They're trading information so their robot can listen to you. It's not much of an invasion of privacy, really (if you called real 411, you'd be telling a person what you were looking for anyway). You can even block your caller ID before calling in, if you're paranoid.

Last week, Google stuck a program out there that allows you to use their office products even when you're offline, and seamlessly resyncs your data when you regain your internet connection. Google Docs and Spreadsheets are now competing directly with Microsoft Office...

Trust me, they are doing stuff there you could not imagine.

Until now. Google's system scales better than anything else out there; the amount of data they use and process is mind-boggling, and they've created several unique tools to analyze it all efficiently. This week Google opened up its internet infrastructure (in beta), giving people access to their application servers, the Google File System (GFS) and Bigtable databases. It's called Google App Engine.

They're allowing 500 MB of space for free, plus a generous amount of bandwidth per month. That level of use stays free into the future, with a limit of three applications per person. If you want more space or page views, you'll pay Google some amount of money.

Chances are good that the next great internet application, the next Facebook, will be written on these servers. And best of all, it takes 90% of the thinking out of hosting internet applications. It is Internet programming for the masses, in Python. It costs nothing just to go make a cool project that anyone in the world can see. A lot of budding young programmers will do just that. And Google will buy that awesome application that already runs on their infrastructure, along with the programmers who wrote it.
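To give a sense of how little code a small web application in Python needs, here is a minimal sketch. It does not use App Engine's own API; it assumes plain WSGI and the wsgiref module from Python's standard library, and the names app and call_app, the greeting text, and port 8080 are all made up for illustration.

```python
# A minimal sketch of a Python web app, using only the standard library.
# Assumption: this is plain WSGI via wsgiref, not App Engine's own API.
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Answer every request with a short plain-text greeting."""
    body = b"Hello from a budding young programmer!\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Call the app directly, without a network socket.

    Returns a (status, body) pair. Hypothetical helper for trying the app out.
    """
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    # Serve on localhost:8080 (the port is an arbitrary choice) until interrupted.
    with make_server("localhost", 8080, app) as server:
        server.serve_forever()
```

Running the file and pointing a browser at http://localhost:8080/ should show the greeting. A hosted service would take over the part at the bottom, the serving, which is presumably most of the thinking it saves you.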

I decided to be a budding young programmer. To go with my Emacs and Unix obsession, I'm getting back into Python by fooling around with it to process my Netflix Prize data. And of course, I'm on the waiting list for the Engine... internet programming is a somewhat hairy world. I am taking my baby steps through Unix first, then onward.

I am on the home stretch of The Baroque Cycle. It's been just about perfect. I'm on the last 200 pages (out of 2500 or so). I'll be sorry to see it go. But there's a new Neal Stephenson novel coming out in the fall, so I'm right on time. I've also never read Catch-22, so I got that from the library.
