Google Developers Blog: July 2005
There have been a fair number of papers published on Google technologies. At conferences, most people I've met have read the paper on which Google was founded, but the ones on GFS and MapReduce are less well known. If you're interested, most Google research papers are posted here. I want to highlight one in particular: Rob Pike's recent draft, "Interpreting the Data: Parallel Analysis with Sawzall," submitted to Scientific Programming Journal's Special Issue on Grids and Worldwide Computing, is quite interesting. If processing vast amounts of data is your thing, you may want to check it out.
We've made a small change to the patches page in response to your feedback. There's now a page of sample videos, along with the full patched source tarball of VLC that we build Google Video's player from. We've also put the Google Search Appliance distribution and kernel mirror online, so as to best comply with the GPL. You can find links to all of these on the patches page. In what we think is an insanely cool development, we've gotten word that the VLC team has taken our patch and is integrating elements of it into VLC. You can see the changeset on their site. Thanks to the VLC team. You folks rock.
A lot of people have expressed interest in adding data to Google Earth, so we're happy to now have documentation and a tutorial about the KML file format it uses. The folks on the Keyhole BBS have done some amazing work and we wanted to help them continue to invent.
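For a rough idea of what adding data in KML looks like, here is a minimal sketch in Python that writes a single placemark. It is not taken from the documentation or tutorial mentioned above. The function name `placemark_kml` and the sample values are hypothetical, and the namespace URI is the later OGC KML 2.2 one, which is an assumption relative to the KML version Google Earth used in 2005.

```python
import xml.etree.ElementTree as ET

# Assumption: the OGC KML 2.2 namespace; the 2005-era format used a different URI.
KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name, lon, lat, description=""):
    """Return a KML document (as a string) holding one point placemark.

    Hypothetical helper for illustration only.
    """
    # Serialize KML elements without a prefix (default namespace).
    ET.register_namespace("", KML_NS)

    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")

    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    if description:
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description

    # KML coordinates are ordered longitude,latitude,altitude.
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"

    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Sample values are illustrative, not from the original post.
    print(placemark_kml("Sample placemark", -122.084, 37.422, "An example point"))
```

Saving the printed output to a file with a `.kml` extension should give something Google Earth can open, though the exact elements it accepts depend on the KML version it supports.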