Google Developers Blog: July 2005
A fair number of papers have been published on Google technologies. At conferences, most people I've met have read the paper on which Google was founded, but the ones on GFS and MapReduce are less well known. If you're interested, most Google research papers are posted here. I want to highlight one in particular: Rob Pike's recent draft submission, "Interpreting the Data: Parallel Analysis with Sawzall," which was submitted to Scientific Programming Journal's Special Issue on Grids and Worldwide Computing. If processing vast amounts of data is your thing, you may want to check it out.
We've made a small change to the patches page in response to your feedback. There's now a page of sample videos, along with the full patched source tarball of VLC that we build Google Video's player from. We've also put the Google Search Appliance distribution and kernel mirror online, so as to best comply with the GPL. You can find links to all of these on the patches page. In what we think is an insanely cool development, we've gotten word that the VLC team has taken that patch and is integrating elements of it into VLC. You can see the changeset on their site. Thanks to the VLC team; you folks rock.
A lot of people have expressed interest in adding data to Google Earth, so we're happy to now have documentation and a tutorial about the KML file format it uses. The folks on the Keyhole BBS have done some amazing work and we wanted to help them continue to invent.
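For a rough idea of what adding data looks like, here is a minimal sketch that writes a single KML placemark using Python's standard library. It is an illustration under stated assumptions, not the documented tutorial: the helper name `make_placemark_kml` is hypothetical, and the namespace used is the later OGC KML 2.2 one, which may differ from the namespace in the documentation this post refers to. KML coordinates are written as longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

# Assumption: the OGC KML 2.2 namespace; older KML files may use a different one.
KML_NS = "http://www.opengis.net/kml/2.2"


def make_placemark_kml(name, lon, lat, alt=0.0):
    """Return a KML document string containing one Placemark (hypothetical helper)."""
    # Serialize with KML as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", KML_NS)

    def tag(local_name):
        # ElementTree expects namespaced tags in "{namespace}name" form.
        return "{%s}%s" % (KML_NS, local_name)

    kml = ET.Element(tag("kml"))
    document = ET.SubElement(kml, tag("Document"))
    placemark = ET.SubElement(document, tag("Placemark"))
    ET.SubElement(placemark, tag("name")).text = name
    point = ET.SubElement(placemark, tag("Point"))
    # KML orders coordinates as longitude,latitude,altitude.
    ET.SubElement(point, tag("coordinates")).text = "%s,%s,%s" % (lon, lat, alt)
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Example values are illustrative only.
    print(make_placemark_kml("Sample point", -122.084, 37.422))
```

Saving the printed output to a file with a `.kml` extension should give something Google Earth can open, though the exact set of elements it accepts is best checked against the KML documentation itself.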