Our web archive

This service is not available until further notice. Please check our social media for updates.

The UK Web Archive is a digital archive that collects and preserves websites and online content published in the UK. Its goal is to make sure that important digital content is not lost, and is accessible for future generations of researchers, historians, and the public.

The National Library of Scotland joined the UK Web Archive Consortium in 2005. The Library also maintains an Archive-It collection. Archive-It fulfilled the Library’s obligation to archive the Scottish Government website between 2011 and 2012.

What is web crawling

The UK Web Archive uses web crawlers to capture and preserve selected websites, social media, and other online content. Web crawlers are a type of software that captures web pages as they look on a particular date. A crawl is the path taken by specific software to capture web pages so that they can be accessed in the future. Helpful tutorial videos are available on the UK Web Archive YouTube channel.
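
The idea of a crawl as a path through linked pages can be sketched in a few lines of Python. This is an illustration only, not the software the UK Web Archive runs: the in-memory SITE dictionary, the crawl function and the capture date are all hypothetical, and a real crawler would download pages over the network rather than read them from a dictionary.

```python
from collections import deque
from datetime import date

# Hypothetical in-memory "website": each page maps to the links it contains.
SITE = {
    "/": ["/about", "/news"],
    "/about": ["/"],
    "/news": ["/news/2023", "/about"],
    "/news/2023": [],
}

def crawl(seed, site, capture_date):
    """Walk the site breadth-first from the seed page.

    Returns a list of (page, capture_date) pairs in the order visited,
    which is the "path" the crawl took.
    """
    seen = {seed}
    queue = deque([seed])
    captures = []
    while queue:
        page = queue.popleft()
        captures.append((page, capture_date))
        for link in site.get(page, []):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return captures

captures = crawl("/", SITE, date(2023, 11, 2))
for page, when in captures:
    print(when.isoformat(), page)
```

Each page is recorded once, together with the date of the crawl, even though several pages link back to one another.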

The archive uses a range of methods to select and curate content. This includes:

capturing important events and issues,
preserving websites that are at risk of disappearing,
and partnering with organisations to identify and preserve key online resources.

What we collect

The Library works to select material which does not automatically fit into the criteria for the domain crawl, or which we consider would benefit the collection by being copied more frequently. You can browse collections and check for archived versions of websites from the UK Web Archive. Please note that you may need to visit us in person to read some archived copies.

Particular collections that relate to Scotland include:

Website design is always changing, and web archiving is always catching up. We make a best effort to collect sites so that they can be replayed accurately, but copies may not always be complete.

How to archive a site with us

If you would like a website to be included in the collection, please nominate it. You can find out more about what can be archived on the UK Web Archive technical information page.

The copyright for material that is copied under legal deposit law remains with the owner. To make this material accessible both onsite and online, the Library must ask for permission. Alternative licensing could make it easier to make copies accessible off-site. Please consider publishing under a Creative Commons licence.

The Archive Ready site is a useful tool for self-assessment, which includes some advice about archivability.

Following best practice in content design helps make sure that website content can be archived and found by end users. A website archives better when it follows accessibility and web standards, because crawlers work similarly to screen reader software. Sometimes the way your website is designed can prevent a web crawler from archiving your content. The web archive site provides guidance on making your website crawler friendly.
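
One common reason a crawler may skip content is a robots.txt rule. The sketch below uses Python's standard urllib.robotparser module to test which addresses a set of rules would allow. The rules, the "example-archiver" user agent and the example.org addresses are made up for illustration, and whether a particular archive's crawler follows robots.txt is an assumption here, so please check the web archive's own guidance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# "example-archiver" is a placeholder name, not a real crawler's user agent.
print(can_crawl(ROBOTS_TXT, "example-archiver", "https://example.org/news/"))
print(can_crawl(ROBOTS_TXT, "example-archiver", "https://example.org/private/report"))
```

With these rules the first address is allowed and the second is blocked, so anything published under /private/ would be missing from a crawl that respects them.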

News about us

Information about our collections work can be found in our blog under "21st century collections". News about web archiving can be found via social media on @natlibscot under the #WebArchiveWednesday hashtag.

External data and tools

GLAM workbench: This award-winning project has a useful section on analysing web archives. For example, it includes tools which analyse archived pages to identify when their text has changed. This can be used to look at open access copies from our collections in much more detail.
SHINE: This is a prototype historical search engine which queries data acquired by JISC from the Internet Archive (IA). It includes all .uk websites in the IA web collection crawled from around 1996 to April 2013.
Memento: This service can be used to check how well a site has been archived across public web archives, including the National Library of Scotland's. This can be particularly useful for finding an archive copy from a particular day.
British Library labs digital research space.
Archive of Tomorrow project: This has data derived from the web archive during the Archive of Tomorrow project, and metadata from the collection, with examples of work which use them.
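
As a rough sketch of how a Memento-style lookup for a particular day works: the Memento protocol (RFC 7089) lets a client send the date it wants in an Accept-Datetime request header, and the service replies with the capture closest to that date. The Python below formats such a header and picks the closest capture from a list. The capture list and the archive.example addresses are invented for illustration, and no network request is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def accept_datetime(target):
    """Format a timezone-aware datetime as an HTTP date for Accept-Datetime."""
    return format_datetime(target.astimezone(timezone.utc), usegmt=True)

def closest_capture(captures, target):
    """Return the (datetime, uri) pair whose datetime is nearest to target."""
    return min(captures, key=lambda capture: abs(capture[0] - target))

# Hypothetical captures of one site; a real list would come from an archive.
CAPTURES = [
    (datetime(2014, 9, 10, 8, 0, tzinfo=timezone.utc),
     "https://archive.example/20140910080000/https://example.org/"),
    (datetime(2014, 9, 19, 6, 30, tzinfo=timezone.utc),
     "https://archive.example/20140919063000/https://example.org/"),
    (datetime(2014, 10, 1, 0, 0, tzinfo=timezone.utc),
     "https://archive.example/20141001000000/https://example.org/"),
]

target = datetime(2014, 9, 18, 12, 0, tzinfo=timezone.utc)
print("Accept-Datetime:", accept_datetime(target))
# prints: Accept-Datetime: Thu, 18 Sep 2014 12:00:00 GMT
print(closest_capture(CAPTURES, target)[1])
```

Here the capture from 19 September 2014 is returned, because it is nearer to the requested date than the captures from 10 September or 1 October.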

Web archive research and training

The International Internet Preservation Consortium has more information and training about using web archives.
'Research infrastructure for the Study of Archived Web materials' (RESAW) is a biennial conference for digital humanists.
The Digital Preservation Coalition holds training events on making web archives.
The Web Archiving Initiative list on Wikipedia (begun by the Portuguese Web Archive) is an interesting starting point for those with a general interest in international web archiving.
Ogden, J., Maemura, E. '‘Go fish’: Conceptualising the challenges of engaging national web archives for digital research'. Int J Digit Humanities (2021). Open Access article.
Milligan, Ian. 'History in the Age of Abundance?: How the Web Is Transforming Historical Research'. 2019. Web. Electronic Legal Deposit Item – Available on Library premises.
Brügger, Niels, and Ralph Schroeder. 'The Web as History: Using Web Archives to Understand the Past and the Present'. London: UCL, 2017. Web. Electronic Legal Deposit Item – Available on Library premises.
Ingram, Darren P. 'Web 25: Histories from the First 25 Years of the World Wide Web.' Electronic Legal Deposit Item – Available on Library premises.
Big UK Domain Data for the Arts and Humanities project (2014-2016)

Page last updated: 2 November 2023.
