Overview
Whether links fail because of DDoS attacks, censorship, or just plain old link rot, reliably accessing linked content is a problem for Internet users everywhere. This isn't a new problem. ("Cool URIs don't change" exists for a reason!) To combat this, some centralized initiatives such as the Internet Archive have long been attempting to crawl and snapshot the Internet.
But more and more, just a handful of centralized entities host information online. Online centralization creates “choke points” that can restrict access to web content. The more routes we provide to information, the more all people can freely share that information, even in the face of filtering or blockages. Amber adds to these routes.
Amber automatically preserves a snapshot of every page linked to on a website, giving visitors a fallback option if links become inaccessible. If one of the pages linked to on this website were to ever go down, Amber can provide visitors with access to an alternate version. This safeguards the promise of the URL: that information placed online can remain there, even amidst network or endpoint disruptions.
Lots of copies keeps stuff safe. By default, Amber stores snapshots directly on the host website. But users can choose to store snapshots using a combination of the following third-party storage and archiving systems: the Internet Archive, Perma.cc, and Amazon Simple Storage Service (Amazon S3). Amber users can choose to use these existing services to free up space, take advantage of donated host space, or simply to contribute to existing web preservation efforts. The more snapshots created and distributed on independent platforms, the greater the potential to preserve access to critical content.
Amber consists of two separate components: 1) a set of stored snapshots of links and 2) a queue of links to check and snapshot. As content such as posts or pages is produced on your website, its links are added to the queue. Amber then periodically visits each queued link and snapshots the content found there. This all happens on your own web space, and no external calls are made by default.
Before it is preserved, every link is evaluated against the size limits and site-specific exclusions you set up, as well as the web crawler permissions set by the linked site. If the link is eligible, a snapshot is taken. If the content at the linked URL is inaccessible, or if the site has requested that Amber not create a snapshot of it, the page is not preserved immediately and is instead flagged for review at a later date.
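The eligibility check described above can be sketched roughly as follows. This is an illustration only, not Amber's actual code: the function name `is_eligible`, its parameters, the use of regular expressions for exclusions, and the `amber` user agent string are all assumptions made for the example, and the linked site's crawler permissions are modeled here as the lines of a robots.txt file parsed with Python's standard `urllib.robotparser`.

```python
import re
from urllib.robotparser import RobotFileParser


def is_eligible(url, size_bytes, max_bytes, excluded_patterns,
                robots_lines, user_agent="amber"):
    """Return True if a link may be snapshotted (hypothetical helper)."""
    # Size limit configured by the site owner.
    if size_bytes > max_bytes:
        return False
    # Site-specific exclusions, modeled here as regular expressions.
    if any(re.search(pattern, url) for pattern in excluded_patterns):
        return False
    # Crawler permissions published by the linked site (robots.txt lines).
    robots = RobotFileParser()
    robots.parse(robots_lines)
    return robots.can_fetch(user_agent, url)


robots_txt = ["User-agent: *", "Disallow: /private/"]
print(is_eligible("http://example.com/post", 1000, 5000, [], robots_txt))
print(is_eligible("http://example.com/private/x", 1000, 5000, [], robots_txt))
```

A link that fails any of the three checks would be skipped for now and revisited at a later review rather than snapshotted.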
Linked URLs are periodically reviewed so that your website, and eventually your viewers, can be provided with an existing snapshot if the link goes down. (A linked URL is considered down when your web server receives a 400- or 500-level HTTP response status code for it. Due to browser security limitations, Amber cannot determine the status code as returned to the visitor.) The date of the next review of a link's status depends on whether the site's status has changed in the past. Sites whose status has recently changed are flagged for review within one day. (For example, a site that was once marked as up and is now marked as down will be flagged for review.) At each review where the site's status has not changed, the time between the current check and the next check is increased by one day. Under this configuration the maximum time between checks is thirty days, so the status of each link is reviewed at least once every thirty days. If Amber is configured to update snapshots periodically to keep content fresh, new snapshots are created at this stage.
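The review schedule described above amounts to a simple back-off rule, sketched below. The names `LinkCheck`, `link_is_up`, and `schedule_next` are hypothetical and not taken from Amber's code. The sketch also assumes that any status code below 400 counts as "up" and does not model failures that return no status code at all, such as timeouts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_INTERVAL_DAYS = 30  # maximum time between checks, per the text above


@dataclass
class LinkCheck:
    up: bool               # status recorded at the last review
    interval_days: int     # days between the last review and the next one
    next_check: datetime   # when the link is due for review again


def link_is_up(status_code: int) -> bool:
    # 400- and 500-level responses mark the link as down.
    return status_code < 400


def schedule_next(previous: LinkCheck, status_code: int, now: datetime) -> LinkCheck:
    up = link_is_up(status_code)
    if up != previous.up:
        # Status changed: review again within one day.
        interval = 1
    else:
        # Status unchanged: wait one day longer than last time, capped at 30.
        interval = min(previous.interval_days + 1, MAX_INTERVAL_DAYS)
    return LinkCheck(up, interval, now + timedelta(days=interval))


now = datetime(2024, 1, 1)
stable = schedule_next(LinkCheck(True, 5, now), 200, now)
print(stable.interval_days)   # 6
broken = schedule_next(LinkCheck(True, 5, now), 404, now)
print(broken.interval_days)   # 1
```

Under this rule, a link that stays up is checked less and less often until the thirty-day ceiling is reached, while any change in status resets the interval to one day.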
Who is behind Amber, besides you, potential contributor? Amber is an open source project led by the Berkman Klein Center for Internet & Society. It builds on a proposal from Tim Berners-Lee and Jonathan Zittrain for a "mutual aid treaty for the Internet" that would enable operators of websites to enter easily into mutually beneficial agreements and bolster the robustness of the entire web. The project also aims to mitigate risks associated with increasing centralization of online content. Amber also continues the work completed by students from the 2011-2012 Ideas for a Better Internet "Mirror As You Link" group, which developed working code for an extension to mirror WordPress blogs.
[Overview](https://github.com/berkmancenter/amber_wordpress/wiki/Overview)
[Requirements](https://github.com/berkmancenter/amber_wordpress/wiki/Requirements)
[Installation](https://github.com/berkmancenter/amber_wordpress/wiki/Installation)
[Default](https://github.com/berkmancenter/amber_wordpress/wiki/Default)
[Configuration](https://github.com/berkmancenter/amber_wordpress/wiki/Configuration)
[Dashboard](https://github.com/berkmancenter/amber_wordpress/wiki/Dashboard)
[Report feedback and bugs](https://github.com/berkmancenter/amber_wordpress/wiki/Report-feedback-and-bugs)
[Known Issues](https://github.com/berkmancenter/amber_wordpress/wiki/Known-Issues)
[Frequently Asked Questions](https://github.com/berkmancenter/amber_wordpress/wiki/FAQ)
Need help? If you can't find the answer here, shoot us an email: [amber@cyber.law.harvard.edu](mailto:amber@cyber.law.harvard.edu)