Okay, so once again I need to thank Cliff Lynch for inviting me to give this talk, and for letting me use the participants in Berkeley's information access seminars as guinea pigs to debug it. This talk is basically what I did on my summer vacation: writing a report under contract to the Mellon Foundation entitled Emulation and Virtualization as Preservation Strategies. As usual, you don't have to take notes or ask for the slides, because an expanded text with links to the sources will go up on my blog shortly. The report itself is available from the Mellon Foundation and from the LOCKSS website.

Now, I'm old enough to know that giving talks with live demos over hotel internet is really a bad idea. So I must start by invoking the blessings of the demo gods and asking you to free up the bandwidth. If you're interested in this talk, please put your device to sleep. If this is just a comfortable place to drink coffee and use Facebook, then please go away.

So emulation and virtualization technologies have been a feature of the information technology landscape for a long time, going back at least to the IBM 709 in 1958, but their importance for preservation was first brought to public attention in Jeff Rothenberg's seminal 1995 Scientific American article, Ensuring the Longevity of Digital Documents. Although, as he wrote, Apple was using emulation in the Macintosh's transition from the Motorola 68000 to the PowerPC, the experience Rothenberg drew on was the rapid evolution of digital storage media such as tapes and floppy disks, and of applications such as word processors, each with their own incompatible format. His vision can be summed up as follows: documents are stored on offline media which decay quickly, whose readers become obsolete quickly, as do the proprietary, closed formats in which they're stored. If this isn't enough, operating systems and hardware change quickly in ways that break the applications that render the documents. Rothenberg identified two techniques by which digital documents could survive in this unstable environment, contrasting the inability of format migration to guarantee fidelity with emulation's ability to precisely mimic the behavior of obsolete hardware.

Rothenberg's advocacy notwithstanding, most digital preservation efforts since have used format migration as their preservation strategy. Isolated demonstrations of emulation's feasibility, such as the collaboration between the UK National Archives and Microsoft, have had little effect. Emulation was regarded as impractical because it was thought, correctly at the time, to require more skill and knowledge to both create and invoke emulations than scholars wanting access to preserved materials would possess.

It took Nick Lee about four hours to get this emulation of Mac OS 7 running on his Apple Watch. Hacking Jules followed with Nintendo 64 and PSP emulators on his Android Wear watch. Simply getting one of the many available emulators running in a new environment isn't that hard, but that isn't enough to make them useful. Recently, teams at the Internet Archive, at Freiburg University and at Carnegie Mellon have shown frameworks that can make emulations appear as normal parts of web pages. Readers need not be aware that emulation is occurring. Some of these frameworks have attracted substantial audiences and have demonstrated that they can scale to match. So this talk is in four parts.
First, I will show some examples of how these frameworks make emulations of legacy digital artifacts, those from before about the turn of the century, usable for unskilled readers. Next, I will discuss some of the issues that are hampering the use of these frameworks for legacy artifacts. Then I will describe the changes in digital technologies over the last two decades, and how they impact the effectiveness of emulation (and migration, for that matter) in providing access to current digital artifacts. I'll conclude with a look at the single biggest barrier that has hampered, and will continue to hamper, emulation as a preservation strategy.

So a digital preservation system that uses emulation will consist of three main components: one or more emulators capable of executing preserved system images; a collection of preserved system images, together with metadata describing which emulator, configured in which way, is appropriate for executing them; and a framework that connects the user with the collection and the emulators, so that the preserved system image of the user's choice is executed by the appropriately configured emulator, connected to the appropriate user interface.

And here's where the risky part starts. So from 1995 to 1997 Theresa Duncan produced three seminal feminist CD-ROM games: Chop Suey, Smarty and Zero Zero. Rhizome, a project hosted by the New Museum in New York, has put emulations of them on the web. Playing these games has proved very popular; in the days after their initial release they were being invoked on average once every three minutes.

So what was going on when I clicked, well, what would have been going on if we'd waited for a while after I clicked, Smarty's play button? The browser connects to a session manager running in Amazon's cloud, which notices that this is a new session. Normally it would authenticate the user, but because these CD-ROMs are publicly open access, it doesn't need to. Then it assigns one of its pool of running Amazon instances to run the session's emulator. Each instance can run a limited number of emulators. If no instance is available when the request comes in, it can take up to 90 seconds to start another; I think that was the first part of the delay. Then the emulator starts, short delay, the user sees the Mac boot sequence, and then the CD-ROM starts to play, which is where we left it. At intervals, the emulator sends the session manager a keep-alive signal. Emulators that haven't sent one in 30 seconds are presumed dead, and their resources are reclaimed to avoid paying the cloud provider for unused time.

So Rhizome and others, such as Yale, the DNB, and ZKM Karlsruhe, use technology from the bwFLA team at the University of Freiburg to provide emulation as a service. The GPLv3-licensed framework runs in the cloud to provide comprehensive management and access facilities wrapped around a number of emulators. It can also run as a bootable USB image or via Docker. bwFLA encapsulates each emulator so that the framework sees three standard interfaces: first, data I/O, which connects the emulator to data sources such as disk images, user files, an emulated network containing other emulators, and the internet; second, interactive access, connecting the emulator to the user using standard HTML5 facilities; and third, control, providing a web services interface that bwFLA's resource management can use to control the emulator. The communication between the emulator and the user takes place via standard HTTP on port 80.
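To make that flow a bit more concrete, here is a minimal sketch of the kind of bookkeeping such a session manager has to do. This is not bwFLA's actual code; all the names and numbers are invented for illustration:

```python
# A minimal sketch (invented names; not bwFLA's actual code) of session-manager
# bookkeeping: hand out emulator slots from a pool of cloud instances, start a
# new instance if the pool is exhausted, and reap sessions whose keep-alives
# have stopped arriving.
import time
import uuid

EMULATORS_PER_INSTANCE = 4   # each cloud instance runs a few emulators
KEEPALIVE_TIMEOUT = 30       # seconds without a keep-alive => presumed dead

class SessionManager:
    def __init__(self):
        self.instances = {}  # instance_id -> number of emulators running on it
        self.sessions = {}   # session_id -> (instance_id, last_keepalive_time)

    def start_session(self):
        """Called when a browser asks to play an emulation."""
        for inst, count in self.instances.items():
            if count < EMULATORS_PER_INSTANCE:
                break
        else:
            inst = self._launch_instance()   # may take up to ~90 seconds in practice
        self.instances[inst] = self.instances.get(inst, 0) + 1
        sid = str(uuid.uuid4())
        self.sessions[sid] = (inst, time.time())
        return sid                           # the browser talks to its emulator over plain HTTP

    def keepalive(self, sid):
        inst, _ = self.sessions[sid]
        self.sessions[sid] = (inst, time.time())

    def reap(self):
        """Reclaim emulators whose keep-alives stopped, to avoid paying for idle time."""
        now = time.time()
        for sid, (inst, last) in list(self.sessions.items()):
            if now - last > KEEPALIVE_TIMEOUT:
                del self.sessions[sid]
                self.instances[inst] -= 1

    def _launch_instance(self):
        return "instance-" + str(uuid.uuid4())[:8]
```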
There's no need for a user to install software or browser plug-ins, and no need to use ports other than 80. Both of these are important for systems targeted at use by the general public.

bwFLA's preserved system images are stored as a stack of overlays in QEMU's QCOW2 format. Each overlay on top of the base image represents a set of writes to the underlying image. For example, the base system image might be the result of an initial install of Windows 95, and the next overlay up might be the result of installing WordPerfect into the base system. Or, as Cal Lee pointed out yesterday, an overlay might result from redacting the underlying image. Each overlay contains only those disk blocks that differ from the stack of overlays below it. The stack of overlays is exposed to the emulator, via FUSE, as if it were a single normal image file (there's a short sketch of building such a stack a little further on). The technical metadata that encapsulates the system disk image is described in the paper presented to the iPres conference last November, using the example of emulating CD-ROMs at the DNB. Broadly, it falls into two parts, describing the software and hardware environments needed by that particular CD-ROM. It's encoded in XML, and it refers to the software image components via the Handle system, providing a location-independent link to access them.

So, Ilya Kreymer has used the same technology to implement oldweb.today, so let's try that one. Okay, so there's a pull-down menu which allows you to choose a past browser, and I'm choosing IE on Windows, and then, if the gods are with us, which they may not be, we can look at old websites using a contemporary browser. Okay, so it's booted up Windows, and here's the BBC News front page from October the 13th, 1999, and as you can see, not a lot has changed in politics since then. So, as well as the emulation behind here, there's another interesting point. This is using Memento to collect preserved resources from a whole range of web archives. In this particular case, as we see in the bottom left here, all these resources come from the Internet Archive. It's also looking at the Library of Congress, the British Library's collection, I think almost a dozen web archives now, and one of the interesting things is that if you use different old browsers to look at the same web page, they don't necessarily all load the same set of resources. This is going to give me hours of harmless fun, playing with it and figuring out what's going on. Let's go back over here. So oldweb.today is an excellent example of how emulation frameworks can deliver useful services layered on top of archived content.

Here is, if I can try this, oops, here we go. This is Windows 3.1, and this is 1997's TurboTax. So let's see what my tax situation looked like 20 years ago. Let's go to the forms. David S. You notice how the IRS doesn't understand that some people have two middle names? I'd probably better stop there before I reveal anything too embarrassing. So the outside window is from the Chromium browser running on my Ubuntu 14.04 laptop, which is actually a Chromebook, an incredibly affordable way of doing it. The top and bottom bars in the window here control the emulation, and between them is the QEMU emulator emulating a Windows 3.1 PC. So the interesting question is: where's the system disk that has Windows 3.1 and TurboTax installed on it? The answer is that the system disk is actually a file on a remote Apache web server.
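Stepping back for a moment to bwFLA's overlay images: here is the promised sketch of how such a stack can be built with QEMU's standard qemu-img tool. The file names are invented, this isn't bwFLA's actual tooling, and qemu-img's options vary between versions (recent ones also want the backing format spelled out with -F):

```python
# A rough sketch (invented file names; not bwFLA's actual tooling) of building
# a QCOW2 overlay stack with qemu-img.  Each overlay records only the blocks
# written on top of its backing file.
import subprocess

def make_overlay(backing, overlay):
    subprocess.run(
        ["qemu-img", "create", "-f", "qcow2",
         "-b", backing, "-F", "qcow2", overlay],
        check=True)

# base image: a clean Windows 95 install
# first overlay: the writes made while installing WordPerfect
# second overlay: the writes made during one reader's session (discardable)
make_overlay("win95-base.qcow2", "win95-wordperfect.qcow2")
make_overlay("win95-wordperfect.qcow2", "session-scratch.qcow2")
```

One attraction of the stack is that a reader's session can write into a scratch overlay that can simply be thrown away afterwards, leaving the curated layers below untouched.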
The emulator's disk accesses are being demand-paged over the internet, using standard HTTP range queries to the file's URL. This system is Olive, developed at Carnegie Mellon University by a team under my friend Professor Mahadev Satyanarayanan, and released under GPLv2. VMNetX, which is the framework, runs a sophisticated two-level caching scheme to provide good emulated performance even over slow internet connections like the one we were just using. One reason this works so well is that successive emulations of the same preserved system image are very similar, so prefetching blocks into the pristine cache, the low-level cache that contains the unmodified image, is effective in producing good performance. If the system writes to the disk, the writes are captured in a high-level cache which maintains the modified system image. So, because the emulator is actually running on my laptop, the only network traffic is for the system image, and with this sophisticated caching scheme you get good performance. In fact, you can get really good performance even over cellular networks. But it requires a non-standard software install and an up-to-date Linux or Windows-based system. The system can run in the cloud, but it needs to be a cloud that is really close to you, so that the latency is good.

Okay, so this, I'll escape over here, is VisiCalc, Dan Bricklin and Bob Frankston's world's-first spreadsheet. The previous two emulations used QEMU; this one is using MAME/MESS. MAME was the emulator developed by game enthusiasts for emulating arcade games. Another, separate team took that emulator and adapted it to emulate computers, which is MESS. And recently, after an enormous effort to get everybody to agree to the license, they have merged the two systems. The important thing to notice about the world's first spreadsheet is that, in typical Microsoft fashion, when they launched Excel they changed all the key bindings. So you will not be able to figure out how to use this spreadsheet until you've found the reference card, which Dan Bricklin fortunately put on the web. And in the, I think, yes, in the reviews, I put a link to where you can find the reference card. But once you've found it, it is a perfectly usable spreadsheet.

So what was going on there? When I clicked on the go button, the browser loaded a JavaScript program that then loads metadata describing the emulation. Then the program loads the emulator, which is in JavaScript, and the system image, and the emulation runs inside the browser. The emulators are compiled from their original source code into JavaScript. You might think that the performance of running the emulator locally, by adding another layer of emulation, the JavaScript virtual machine, would be inadequate. But this isn't the case, for two reasons. Firstly, even my Chromebook is enormously more powerful than the Apple II that it's emulating. And second, the performance of JavaScript in the browser is critical to its success, so large resources are expended on optimizing it.

This framework is the one underlying the Internet Archive's software library, which currently holds nearly 36,000 items, including more than 7,300 for MS-DOS, 3,600 for the Apple II, 2,900 console games and 600 arcade games. Some can be downloaded, but most can only be streamed. The oldest is an emulation of a PDP-1 with a Type 30 display running the Spacewar game from 1962, which is more than half a century ago.
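Going back for a moment to how Olive feeds the emulator its disk: here is a crude sketch of the demand-paging idea, with invented names and none of VMNetX's actual machinery. Blocks are fetched from the remote image with HTTP range requests and kept in a local "pristine" cache, while anything the emulated system writes is captured in a separate local cache, so the remote image is never modified:

```python
# A crude sketch (invented names; nothing like VMNetX's real code) of
# demand-paging a remote disk image over HTTP range requests, with a
# "pristine" cache for fetched blocks and a separate cache for writes.
import urllib.request

BLOCK = 64 * 1024  # fetch granularity; an arbitrary choice for this sketch

class DemandPagedDisk:
    def __init__(self, url):
        self.url = url
        self.pristine = {}   # block number -> bytes fetched from the server
        self.modified = {}   # block number -> bytes written by the emulation

    def read_block(self, n):
        if n in self.modified:
            return self.modified[n]
        if n not in self.pristine:
            req = urllib.request.Request(
                self.url,
                headers={"Range": f"bytes={n * BLOCK}-{(n + 1) * BLOCK - 1}"})
            with urllib.request.urlopen(req) as resp:   # server replies 206 Partial Content
                self.pristine[n] = resp.read()
        return self.pristine[n]

    def write_block(self, n, data):
        self.modified[n] = data   # kept locally; the server never sees writes
```

The point is that only the blocks the emulation actually touches ever cross the network, which is why the scheme works even over slow links.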
As I can testify, having played Spacewar and similar games on Cambridge University's PDP-7 with a Type 340 display seven years later, the Internet Archive's PDP-1 emulation works really well. The quality of the others is mixed; resources for QA and fixing problems are limited, and with a collection of this size, problems are to be expected. Jason Scott crowdsources most of the QA. His method is to see whether the software boots up and, if so, put it up and wait to see whether visitors who remember it post comments identifying problems, or whether the copyright owner objects. The most common problem is sound. Problems with sound support in JavaScript affect bwFLA and, to some extent, Olive as well.

So all three groups share a set of concerns about emulation technology. The first is about the emulators themselves. There are a lot of different emulators out there, but the open source ones used for preservation fall into two groups. First, there's QEMU. It's well-supported, mainstream open source software, part of most Linux distributions. It emulates or virtualizes a range of architectures including x86, x86-64, ARM, MIPS and SPARC. It's used by both bwFLA and Olive, but both groups have encountered irritating regressions in its emulation of older systems such as Windows 95. It's hard to get the QEMU developers to prioritize fixing them, since emulating current hardware is its primary focus. There's an interesting paper, I think in SOSP, from a group at the Technion who got together with Intel to use the tools that Intel uses to verify silicon to verify their emulator. They found and fixed a lot of interesting bugs. Second, there are the enthusiast-supported emulators for old hardware, including MAME/MESS, Basilisk II, SheepShaver and DOSBox. These generally do an excellent job of mimicking the performance of a wide range of obsolete CPU architectures, but have some issues mapping the original user interface to modern hardware. Jason Scott has done great work in encouraging the retro gaming community to fix problems with these emulators, but for long-term preservation their support causes concerns.

Another concern is metadata. Emulations of preserved software, such as those I failed to demonstrate, require not just the bits forming the image of a CD-ROM or system disk, but also several kinds of metadata. First, there's technical metadata describing the environment needed in order for the bits to function. Tools for extracting technical metadata for migration, such as JHOVE and DROID, exist, as do the databases on which they rely, such as PRONOM, but they're inadequate for emulation. The DNB and bwFLA teams' iPres paper describes an initial implementation of a tool for compiling and packaging this metadata, which worked quite well for the restricted domain of CD-ROMs, but much better, broadly applicable tools and databases are needed if emulation is to be affordable. Second, you need bibliographic metadata describing what the bits are, so that they can be discovered by potential users. Third, you need usability metadata describing how to use the emulated software; an example is the VisiCalc reference card describing the key bindings of the first spreadsheet. And fourth, you need usage metadata describing how the emulations get used by readers, which is needed by cloud-based emulation systems for provisioning, and for PageRank-type systems in discovery. The web provides high-quality tools in this area, although a balance has to be maintained with user privacy.
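Purely to make those four kinds concrete, here's an invented illustration of what a minimal record for the VisiCalc emulation might contain; none of the field names, identifiers or URLs come from any real system:

```python
# An invented illustration (no real system uses these field names) of the four
# kinds of metadata an emulation needs, using the VisiCalc example above.
visicalc_record = {
    "technical": {            # what's needed for the bits to run
        "emulator": "MAME/MESS",
        "emulated_machine": "Apple II",
        "disk_images": ["hdl:example/visicalc-disk"],   # hypothetical identifier
    },
    "bibliographic": {        # what the bits are, for discovery
        "title": "VisiCalc",
        "creators": ["Dan Bricklin", "Bob Frankston"],
        "date": "1979",
    },
    "usability": {            # how a reader is supposed to drive it
        "notes": "Key bindings differ from modern spreadsheets",
        "reference_card": "https://example.org/visicalc-reference-card",  # placeholder URL
    },
    "usage": {                # how it gets used, for provisioning and ranking
        "sessions_last_30_days": 0,
    },
}
```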
The Internet Archive's praiseworthy policy of minimizing logging does make it hard to know how much their emulations are used. There are no standards or effective tools for automatically extracting the bibliographic or usability metadata; it has to be hand-created. The Internet Archive's approach of crowdsourcing the enhancement of initially minimal metadata in these areas works fairly well, at least for games.

Then there's fidelity. In a Turing sense, all computers are equivalent, so it's possible, and indeed fairly common, for an emulator to precisely mimic the behavior of a computer's CPU and memory. But physical computers are more than a CPU and memory. They have I/O devices whose behavior in the digital domain is more complex than Turing's model. Some of these devices translate between the digital and analog domains to provide the computer's user interface. A user experiences an emulation via its analog behavior, and this can be sufficiently different to impair the experience. Consider the emulation of Spacewar on the PDP-1. The experience of pointing and clicking at the Internet Archive's web page, pressing left control and enter to start, watching a small patch in one window on your screen among many others, and controlling your spaceship from the keyboard is not the same as the original. That experience included loading the paper tape into the reader, entering the paper tape bootstrap from the address and test word switches on the left, pressing the start switch at the bottom left, and then each player controlling their ship with three of the six sense switches on the right. The display was a large, round, flickering CRT.

Another concern is load and scaling. One advantage of frameworks such as the Internet Archive's and Olive's is that each additional user brings along with them the compute power needed to run their emulation. Frameworks in which the emulation runs remotely must add resources to support added users. The release of the Theresa Duncan CD-ROMs attracted considerable media attention, and the load on Rhizome's emulation infrastructure spiked. That experience led Rhizome to deploy their infrastructure on Amazon's highly scalable Elastic Beanstalk infrastructure. Klaus Rechert of Freiburg computes that Amazon EC2 charges about 50 cents an hour for an 8-CPU machine. In the case of Bomb Iraq, which was another of their emulations, the average session time of a user playing with the emulated machine was 15 minutes, hence the average cost per user is about 2 cents if the machine is fully utilized. At the peak, this would have been about $10 a day, ignoring Amazon's charges for data out to the internet. Nevertheless, automatically scaling to handle unpredictable spikes in demand always carries budget risks, and rate limits are essential for cloud deployment.

So why are most of the emulations games? Well, emulation for preservation was pioneered by video game enthusiasts. This reflects a significant audience demand for retro gaming, which, despite the easy informal availability of free games, is estimated to be a $200 million a year segment of the $100 billion a year video games industry. Because preserving content for scholars lacks the business model and fan base of retro gaming, it's likely to remain a minority interest in the emulation community. There are relatively few preserved systems other than games, for several reasons. First, the retro gaming community has established an informal modus vivendi with the copyright owners.
Most institutions require formal agreements covering preservation and access, and, just as with academic journals and books, identifying and negotiating individually with each copyright owner in the software stack is extremely expensive. Second, if a game is successful enough to be worth preserving, it must be easy for an unskilled user to install, execute and understand, and thus easy for a curator to create a preserved system image. The same is not true for artifacts such as artworks or scientific computations, and thus the cost per preserved system image is much higher. Third, a large base of volunteers is interested in creating preserved game images, and there's commercial interest in doing so; preserving other genres requires funding. And fourth, techniques have been developed for mass preservation of, for example, web pages, academic journals and e-books, but no such mass preservation technology is available for emulations. Until it is, the cost per artifact preserved will remain many orders of magnitude higher.

Okay, so that's legacy artifacts. As we've seen, emulation can be very effective at recreating the experience of using them, but the artifacts being created now are very different, in ways that have a big impact on their preservation, whether by migration or emulation. Before the advent of the web, digital artifacts had easily identifiable boundaries. They consisted of a stack of components, starting at the base with some specified hardware, then an operating system, an application program and some data. In typical discussions of digital preservation, the bottom two layers were assumed, and the top two were instantiated in a physical storage medium, such as a CD. The connectivity provided by the internet, and subsequently by the web, makes it difficult to determine where the boundaries of a digital object are. For example, what appear on the surface to be traditional digital documents, such as spreadsheets or PDFs, can have full functionality that invokes services elsewhere on the network, even if only by including links. The crawlers that collect web content for preservation have to be carefully programmed to define the boundaries of their crawls. Doing so imposes artificial boundaries, breaking what appears to the reader as a homogeneous information space into discrete digital objects. Indeed, what a reader thinks of as a web page typically now consists of components from dozens of different web servers, most of which do not contribute to the reader's experience of the page at all. They're deliberately invisible, implementing the web's business model of universal, fine-grained surveillance.

Tim Berners-Lee's original web was essentially an implementation of Vannevar Bush's Memex hypertext concept, an information space of passive, quasi-static, hyperlinked documents. The content a user obtained by dereferencing a link was highly likely to be the same as that obtained by a different user, or by the same user at a different time. Since then, the web has gradually evolved from this original static linked-document model, whose language was HTML, to a model of interconnected programming environments whose language is JavaScript. Indeed, none of the emulation frameworks I've described would be possible without this evolution. The probability that two dereferences of the same link will yield the same content is now low; content is dynamic. This raises fundamental questions for preservation. What exactly does it mean to preserve an artifact that's different every time it's examined?
The fact that artifacts to be preserved are now active makes emulation a far better strategy than migration, but it increases the difficulty of defining their boundaries. One invocation of an object may include a different set of components from the next invocation. So how do you determine which components to preserve?

And then there's the scale. In 1995, a typical desktop three-and-a-half-inch disk held one to two gigabytes of data. Today, the same form factor holds four to ten terabytes, about four to five thousand times as much. In 1995, there were estimated to be 15 million web users. Today, there are estimated to be over three billion, nearly 200 times as many. At the end of 1996, the Internet Archive estimated that the total size of the web was a terabyte and a half. Today, they ingest that much data roughly every 30 minutes. The technology has grown, but the world of data has grown much faster, and this has transformed the problems of preserving digital artifacts. Take an everyday artifact such as Google Maps: it is simply too big, and worth too much money, for there to be any possibility of preservation by a third party such as an archive. And its owner has no interest in preserving its previous states.

While the digital artifacts being created were evolving, the infrastructure they depend on was evolving too. For preservation, the key changes were, first, GPUs. As Rothenberg was writing, PC hardware was undergoing a major architectural change. The connection between early PCs and their I/O devices was the ISA bus, whose bandwidth and latency constraints made it effectively impossible to deliver multimedia applications such as movies or computer games. This was replaced by the PCI bus, with much better performance, and multimedia became an essential component of computing devices, forcing a division of system architecture into a central processing unit and what became known as a graphics processing unit, or GPU. The reason was that CPUs were essentially sequential processors, incapable of performing the highly parallel task of rendering the graphics fast enough to deliver an acceptable user experience. Now, much of the silicon in essentially every device with a user interface implements a massively parallel GPU whose connection to the display is both very high bandwidth and very low latency.

Second, smartphones. Both desktop and laptop PC sales are in free fall, and even tablet sales are no longer growing. Smartphones are the hardware of choice. They and tablets amplify interconnectedness. They're designed not as autonomous computing resources, but as interfaces to the internet. The concept of a standalone application is no longer really relevant to these devices; the app store supplies custom front ends to network services, as these are more effective at implementing the web's business model of pervasive surveillance. Apps are notoriously difficult to collect and preserve. Emulation can help with their tight connection to the hardware platform, but not with their dependence on network services. The user interface of mobile devices is much more diverse. In some cases, the hardware is technically compatible with traditional PCs, but not functionally compatible. For example, mobile screens typically are both smaller and have much smaller pixels, so an image from a PC may be displayable on a mobile display, but it may be either too small to be readable or, if scaled to be readable, may be clipped to fit the screen. In other cases, the hardware isn't even technically compatible.
The physical keyboard of a laptop and the on-screen virtual keyboard of a tablet are not compatible, as you'll find out if you try running some of the Internet Archive's games on your phone.

Third, there's Moore's Law. For about the first four decades of Moore's Law, what CPU designers used the extra transistors for was to make the CPU faster. This was advantageous for emulation: the modern CPU that was emulating an older one would be much faster. Although Moore's Law continued into its fifth decade, each transistor gradually became less effective at increasing CPU speed. Further, as GPUs took over much of the intense computation, customer demand evolved from maximum performance per CPU to processing throughput per unit power. Emulation is a sequential process, so the fact that CPUs are no longer getting rapidly faster is disadvantageous for emulation.

Fourth, there's architectural consolidation. Brian Arthur's 1994 book Increasing Returns and Path Dependence in the Economy described the way the strongly increasing returns to scale in technology markets drove consolidation. Over the past two decades, this has happened to system architectures. Although it's really impressive that MAME/MESS emulates nearly 2,000 different systems from the past, going forward, emulating only two architectures, Intel and ARM, will capture the overwhelming majority of digital artifacts.

And fifth, threats. Although the Morris Worm took down the Internet in 1988, the Internet environment two decades ago was still fairly benign. Now, Internet crime is one of the world's most profitable activities, as can be judged by the fact that the price for a single zero-day iOS exploit is about a million dollars. Once a vulnerability is exploited, it becomes a semi-permanent feature of the Internet; for example, the seven-year-old Conficker worm was recently found infecting brand-new police body cameras. This threat persistence is a particular concern for emulation as a preservation strategy. A paper called Familiarity Breeds Contempt by Sandy Clark et al. shows that the interval between discoveries of new vulnerabilities in released software decreases through time. Thus, the older the preserved system image, the exponentially greater the number of vulnerabilities it will contain, and the more likely it is to be compromised as soon as the emulation starts.

OK, warning: I'm not a lawyer, and this part is US-specific. Most libraries and archives are very reluctant to operate in ways whose legal foundations are less than crystal clear. There are two areas of law that affect using emulation to re-execute preserved software: copyright and, except for open source software, the end-user license agreement, which is a contract between the original purchaser and the vendor. Software must be assumed to be copyrighted, and thus, absent specific permission such as a Creative Commons or open source license, making persistent copies such as are needed to form collections of preserved system images is generally not permitted. The DMCA contains a safe harbor provision under which sites that remove copies when copyright owners send takedown notices are permitted to operate; this is the basis on which the Internet Archive's collection operates. Further, under the DMCA it's forbidden to circumvent any form of copy protection or digital rights management technology. These constraints apply independently to every component in the software stack contained in a preserved system image; thus, there may be many parties with an interest in an emulation's legality.
Streaming media services such as Spotify, which do not result in the proliferation of copies of content, have significantly reduced, though not eliminated, intellectual property concerns around access to digital media. Streaming emulation systems should have a similar effect on access to preserved digital artifacts. The success of the Internet Archive's collection, most of which can only be streamed, and of Rhizome's, is encouraging in this respect. Nevertheless, it's clear that institutions will not build collections of preserved system images, or provide access to them even on a restricted basis, at the scale needed to preserve our cultural heritage, unless the legal basis for doing so is clarified. Negotiating with copyright holders piecemeal is very expensive and time-consuming. Trying to negotiate a global agreement that would obviate the need for individual agreements would, in the best case, take a long time; I would predict that the time would be infinite rather than long. If we wait to build collections until we have permission in one of these ways, much software will be lost.

An alternative approach worth considering would separate the issues of permission to collect from the issues of permission to provide access. Software is copyrighted. In the paper world, many countries had copyright deposit legislation allowing their national libraries to acquire, preserve and provide access, generally restricted to readers physically at the library, to copyrighted material. Many countries, including most of the major software-producing countries, have passed legislation extending these rights to the digital domain. The result is that most of the relevant national libraries already have the right to acquire and preserve digital works, although not the right to provide unrestricted access to them. Many national libraries have collected digital works in physical form; the DNB's CD-ROM collection includes half a million items. Many national libraries are crawling the web to ingest web pages relevant to their collections. But it does not appear that national libraries are consistently exercising their right to acquire and preserve the software components needed to support future emulations, such as operating systems, libraries and databases. A simple change of policy by major national libraries could be effective immediately in ensuring that these components were archived. Each national library's collection could be accessed by emulations on site, in reading-room conditions, as envisaged by the DNB. No time-consuming negotiations with publishers would be needed.

If national libraries stepped up to the plate in this way, the problem of access would remain. One idea that might be worth exploring as a way to provide it is lending. The Internet Archive has successfully implemented a lending system for their collection of digitized books. Readers can check a book out for a limited period, and each book can be checked out to at most one reader at a time. This has not encountered much opposition from copyright holders. A similar system for emulation would be feasible: readers would check out an emulation for a limited period, and each emulation could be checked out to at most one reader at a time. One issue would be dependencies. An archive might have, say, 10,000 emulations based on Windows 3.1. If checking out one blocked access to all 10,000, that might be too restrictive to be useful.

So I hope I've shown that the technical problems of delivering emulations of preserved software have largely been solved.
Concerns remain but most are manageable. The legal issues are intractable unless the National Libraries are prepared to use their copyright deposit rights to build collections of software. If they do, some way to provide off-site access will be needed. But at least the software will be around to be emulated when agreement is reached on that. Okay.