Next talk is non-visual access to Linux, from boot-up to shutdown. Our speaker has been using Linux since 1998. He's been involved with various accessibility working groups, including at the W3C from 2000 to 2004. He has recently submitted his PhD and is waiting for approval. Can you please welcome Jason White.

Thank you very much. The idea for this talk derives from the quotation I present in the abstract from Janina Sajka, a specialist in the accessibility of Linux, free and open source software, and open standards, based in the United States. On her website she says that equal access is every blind computer user's right, from boot-up to shutdown. And Janina argues, quite correctly in my judgment, that a boot-up to shutdown review of the operating system and of its applications is the ultimate test of accessibility. My purpose in the presentation today is to attempt such a review. I start with the boot process, discuss the various types of user interfaces used by the operating system itself and by applications, and examine what the accessibility options are at each stage of the process.

I don't plan to dwell extensively on the tools used for text-to-speech synthesis, for screen reading, and for Braille access by means of refreshable Braille devices. I discussed these in much more detail in my LCA 2008 presentation, the video of which is still available online. However, given that many of you probably weren't present at the LCA 2008 presentation, I do want to give an overview of the tools, just to provide some context, because they will necessarily be mentioned in the course of the central discussion.

So, obviously I'm interested in output via synthetic speech and via refreshable Braille displays, which are electromechanical devices that can present a single line of text at a time in Braille. There is a variety of text-to-speech systems available, including hardware devices, many of which are no longer manufactured, and more recent software-based text-to-speech systems, including free and open source ones. The situation regarding the software that provides a user interface by means of these output methods can be understood by considering the various projects currently on offer. In the notes accompanying this talk, I summarise what I consider to be the central projects in this area.

As far as Braille is concerned, there's one all-encompassing project, namely BRLTTY, which has been around for quite a long time and has a highly active community of users and developers. It provides Braille reading functionality at the level of the Linux console, but also, by way of its BrlAPI interface, it can interoperate with Orca, which I shall discuss shortly, to provide access to applications under graphical desktop environments.

Next we have Speakup, which is an interesting program. It's actually a kernel module that provides speech output by means of a hardware or software synthesiser, and it allows a user to review the screen interactively using speech, with the keyboard for input. Speakup recently entered the staging tree of the Linux kernel as of 2.6.37. There are bugs that need to be resolved before it can migrate permanently into the mainline, so if you know any kernel developers who might be interested in accessibility and who have some spare time, perhaps that's one project to which they might be interested in directing their attention.
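To give a flavour of how these tools are brought up in practice, here is a minimal sketch. The driver code, the device specifier, and the use of espeakup as the bridge to a software synthesiser are illustrative assumptions, not the only way to do it:

```sh
# Sketch: start BRLTTY for a USB display. "ht" (Handy Tech) and the
# "usb:" device specifier are examples; substitute your own hardware.
brltty -b ht -d usb:

# Sketch: software speech with Speakup. Load the soft-synth module,
# then run espeakup, a small daemon bridging Speakup to eSpeak.
modprobe speakup_soft
espeakup
```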
Emacspeak is another project, and a very different one from the preceding. I might mention it in more detail later on, but basically it's a speaking extension to Emacs written by T. V. Raman, and when we talk about Emacs applications later, I'll describe briefly how it works. What makes it different from the other projects under discussion is that Emacspeak is not a screen reader: it's not intended to provide a non-visual rendering of the user interface. Rather, it intercepts Emacs functions directly to provide a highly customised speech interface to Emacs itself and to its extensions.

Finally, there's the most complex project of all, Orca, which is a screen reader for graphical desktop environments under the X Window System. It's a GNOME project, so it provides access to GNOME itself, but as we'll discuss later, it also offers access to various X applications. Orca is written in Python. It uses the GNOME accessibility API to interact with the desktop environment and with applications. Basically, there are scripts in Orca that receive events from the accessibility API, and it constructs its Braille and speech interface based on those events. Essentially, it operates by tracking the focus and reading the item currently in focus as the user navigates the graphical desktop.

So those are the essential projects we're concerned with, and now we can move on to the substance of the talk, starting with the boot process itself. Now, obviously this is somewhat important, and we're assuming here, of course, a user who has a confident understanding of the operating system, who is administering their own system and wishes to have access to it. In this situation, you don't want to apply power to the machine, wait the amount of time the boot process usually takes, only to discover that nothing happens and there's no Braille or audio feedback. That kind of failure is highly undesirable. Unfortunately, it can happen, and indeed it can happen very early in the boot process itself.

The CPU starts executing, and the first thing that happens is that firmware runs. This is not something the Linux and free and open source software community has yet been able to address, but there are issues at this level. Of course, if the firmware has support for a serial console, and if the user has a second machine with Braille or speech access that can be connected over a serial port, then the user is able to interact with the primary machine over the serial console and operate the menus or command-line interface in the firmware. Serial consoles are quite common on non-x86 systems and on servers, but unfortunately the typical x86 machine lacks them, sorry to say.

The second option worth mentioning is IPMI. IPMI 2.0, as I understand it, mandates support for serial over LAN, so essentially you get a serial console over the network, and that's obviously another option a user equipped with a second machine can employ. But in the absence of either of those, we have a completely inaccessible user interface to the BIOS, which is obviously not helpful when problems occur or when parameters need to be changed. So what we really need is, first of all, for coreboot or a similar project to find its way into the firmware of everyday systems, and secondly, for some accessibility project to overcome the difficulties associated with providing Braille or speech access at this level.
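For the IPMI serial-over-LAN option mentioned a moment ago, here is a sketch of what the user might run from the second machine, assuming a management controller reachable with ipmitool; the address and user name are placeholders:

```sh
# Attach to the target machine's serial-over-LAN console via IPMI 2.0.
ipmitool -I lanplus -H 192.0.2.10 -U admin sol activate
```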
Once we get past the firmware and into territory where the Linux community has had an influence, we're in a somewhat better situation. So we move on now to what happens in the boot loader, and in particular I'd like to talk about GRUB. Of course there are other boot loaders, but GRUB is widely used and has significant accessibility features, so why not talk about the boot loader that gives us the best support in this context.

First of all, it should be pointed out that a serial console is available in GRUB, and in the notes accompanying this talk I refer to the relevant section of the draft GRUB 2 documentation at the FSF. Also, Samuel Thibault, who is a highly experienced researcher and kernel developer with an interest in accessibility, and who has been contributing to accessibility-related projects very extensively, has added code to GRUB 2 (I'm quite sure, based on mailing list posts, that this is his work) whereby you can configure GRUB to play a tune when the GRUB menu appears. This is useful because not every system will beep at you when it begins booting from the hard drive. To provide access, if we can't get serial console access, we at least need to know when the GRUB menu appears so that the user can start typing. For instance, they might press the down-arrow key and Enter to select the single-user-mode option of their boot menu. Or they might type text at the GRUB command line without any accessible feedback; if you know what to type and you're accurate about it, then yes, you can recover from some tricky situations. So this configuration option can be set, and you'll get a sound with the pitch and duration specified in the option when the GRUB menu is displayed.

Let's suppose we get past the GRUB stage and the kernel loads, along with the initial RAM disk, the initrd in most distributions. It should be noted at this point that Speakup can in fact be started fairly early, and likewise we can install BRLTTY into the initrd image if we want to. I haven't actually tried that, but if you were to do it, it would provide access at a very early phase of the boot process, where occasionally matters can go wrong, especially if you're playing with interesting new file systems such as Btrfs, for example.

Also, I should mention, in relation to GRUB and the entire boot process, that if you're booting not a physical machine but a virtual one, there are interesting accessibility options available. KVM and QEMU, which is of course the user-space element of that, can display a text console on a terminal using ncurses. There's a -curses option to qemu-kvm which you can use, and that will display everything from the virtualised BIOS through to the boot loader and everything that subsequently takes place. The first time I actually read a GRUB prompt was when I installed KVM on a system. If you're using a non-virtualised system, then of course you have to work with the accessibility techniques I've talked about.

The rest of the boot process is fairly straightforward. Speakup, as I said, can start quite early; so can BRLTTY. If you get to the usual console login prompt, then everything is fine at that point: it's entirely accessible by means of those tools.
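To make those GRUB and QEMU options concrete, here is a sketch of the relevant settings in /etc/default/grub for GRUB 2, plus the virtual machine invocation. The tune values are the documented example, and the disk image name is a placeholder:

```sh
# /etc/default/grub: play a tune when the GRUB menu appears
# (tempo, then pitch/duration pairs).
GRUB_INIT_TUNE="480 440 1"
# Or drive the menu over a serial console instead:
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0"
# Then regenerate grub.cfg (Debian-style):
update-grub

# For a virtual machine: render everything from the virtualised BIOS
# onward as text on an ncurses console.
qemu-kvm -curses -hda disk.img
```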
GDM is also accessible, and in fact the latest version of GDM, which in Debian is called GDM3 (it's the version implemented with GNOME 3 technologies), has support for Orca associated with it. What that means in effect is that it creates a fairly minimal GNOME session, and in that session it can invoke Orca to provide Braille or speech output from the login menu of GDM. You can configure the relevant variables using gconftool to make this happen, and that means a graphical login is also accessible, should that be desired.

Now let's suppose we go to the console and log in. What's the state of accessibility here? If you'd like to know my reasons for choosing Linux over other operating systems, one of the central reasons has to be this user interface, by which I mean the user interface Linux inherits from the Unix tradition: the shell utilities, ncurses applications, configuration files, text editors, all the usual Unix tools. It's the power and efficiency of this, coupled with the accessibility software I mentioned earlier, that really makes Linux a pleasure to use for those of us who know how to use it. This is the real, fundamental user-interface advantage that Linux has over many other non-Unix operating systems. So the console is highly accessible.

Now, there are some features that can be added to applications to make it even more accessible, and in the notes I've prepared a table listing some of these. Essentially, the idea is that the screen access software cannot itself handle certain behaviours, for instance when the real terminal cursor is parked at the bottom of the screen and highlighting is used as a substitute for the cursor. Of course it would be possible, as was done in the days of DOS many years ago, to write code in your screen access software to track the highlighting, but it's even easier to write a patch for the relevant application so that the cursor is positioned properly and you don't have to try to identify and track highlighting at all. And indeed, many applications already have this built in; the notes indicate what the configuration options are for Links, for instance, and for Alpine. There's also a single-column folder-list option in Alpine that many speech users find valuable, just so they can read their folder list one line at a time and navigate through it effectively.

Another issue that arises quite frequently concerns automatically updated text. If you're using a speech system, you really don't want the line and column information in your text editor read out every time you type a character. So for Vim there's an option to turn off the ruler, which disables that part of the status line in which the line and column information is presented and frequently updated.

And a final option really worth mentioning here is the one in Mutt. This is in fact very helpful to Mutt users relying on either Braille or speech: the so-called Braille-friendly option. Essentially what it does is place the cursor on the first line of the body text when an email message is displayed. That means you don't have to read through the headers to find the start of the message; you can begin reading the message immediately, and that obviously saves time. So there isn't really much else to say at this level.
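As a concrete illustration of the sort of thing the table in the notes covers, here are the configuration lines involved; the option names are real, and the file locations shown are just the usual defaults:

```
" ~/.vimrc: suppress the constantly refreshed line/column display.
set noruler

# ~/.muttrc: park the cursor on the first line of the message body.
set braille_friendly=yes
```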
Some people like to create aliases for single-column ls output, for example, if they're using speech. There are all kinds of ways in which one can personalise and customise a Linux environment, and that, of course, is very much part of what Linux is all about.

Having discussed that, I'd next like to talk about graphical desktop environments, where the situation is rather interesting. As soon as our user ventures in here, let's just begin by making it clear that their access options are highly variable. Now, I have mentioned that Orca is part of the GNOME project, and I've described how it interacts with the GNOME accessibility API. This is an API that has to be implemented within user interface libraries and applications. Often the support lives largely within the user interface library, GTK+ for example, but sometimes there are function calls that have to be made within the applications themselves. The library provides information, in the form of a tree of objects and events as those objects change, through to your assistive technology, in this case to Orca, the screen reader. So the idea is that if the applications are, on the one hand, keyboard accessible and, on the other hand, support the accessibility API correctly, then it should be possible to navigate through and use the graphical user interface by means of speech or Braille output, or indeed both. Screen enlargement is also supported by Orca, and there are difficulties there in the migration to GNOME 3, which I mention in the notes.

I should in fact point out that even though the focus of the talk today is on the operating system itself and applications, this entire GNOME accessibility infrastructure is undergoing transformation. AT-SPI2 is a complete re-implementation which uses D-Bus, rather than CORBA and Bonobo, as its inter-process communication mechanism. It's part of what they call the Bonobo deprecation project; I suppose that's what happens when you name your software after an endangered species. So anyway, we now have this new D-Bus interface that's coming of age. There are performance issues people are working on; low latency is very important. And by the way, a lot of the work associated with GNOME accessibility has been experiencing resource difficulties recently. Orca in particular lost its only full-time developer immediately after Sun was acquired by Oracle, and unfortunately there isn't a lot of funding around for this work at the moment, so it really would benefit from development effort from people who are interested in this kind of problem and who are experienced in writing this sort of infrastructure, or who are interested in becoming so, coming in to help with it. So all of this is in flux. What that means, from the perspective of our hypothetical user, is that what works today might work better or worse tomorrow, depending on what happens in the move toward GNOME 3. It's also interesting to note what's not included here, but I'll mention that in a minute.

I suppose we should start with the applications that are accessible. Obviously there are core parts of the GNOME desktop itself: gnome-terminal, Nautilus, the core desktop and so on. There are various GNOME applications such as Evolution and others. But OpenOffice.org and LibreOffice are also accessible, because they implement the GNOME accessibility API, and Mozilla Firefox and Mozilla Thunderbird implement it as well.
The Mozilla developers have been cooperative and responsive over time, and in fact I think they're still funding some projects in this area. Eclipse as well; what else is there? There's an entire list here. I think I've mentioned the central projects, or at least the ones that get the most attention on the mailing list. Pidgin is fairly popular among some users. But what I should also say is that even though a project might appear on the list I've given, and I've given a fairly comprehensive list in the notes, that shouldn't be taken as implying that the accessibility support works. What it means is that there's a script, a Python script in Orca, which is supposed to support that application. It means somebody's written the script for it and somebody's tested it to some degree, but that doesn't mean it's necessarily going to work reliably.

And that gives rise to the question of how extensively, and how well, the accessibility support in the graphical desktop environment actually does work. There were interesting questions raised about it earlier this week on the GNOME accessibility mailing list. There are people intending to coordinate more extensive user testing, as well as more extensive automated testing, so that they can try to catch regressions. Because of course there are regressions, and I think one of the central reasons for this is that it's entirely possible for the accessibility API interface to break without any corresponding breakage in the graphical interface that most developers are testing with. The result is that the accessibility support is somewhat fragile.

You'll notice that Qt applications and KDE are not included on the list, and there's a very good reason for that: support for the GNOME accessibility API, and the whole accessibility infrastructure under Linux, is not yet connected up with Qt. There have been efforts in this area; occasionally there'll be posts on a mailing list about it. I'm not sure exactly what the status is, other than that there isn't anything out there ready for testing at the moment. I know there are people who are intending to do it, however, which is obviously a good thing. Support for WebKit is well underway. It's still very much developmental, but I understand there are patches around and that changes have been integrated into WebKitGTK+. So I suppose that means at some point, assuming all goes as planned, we'll get access to WebKit.

So that's effectively the situation in graphical environments. They're not really in any way my preferred style of user interface. It's not so much a matter of their being graphical; it's really the way in which they require you to interact with graphical user interface elements. They don't provide the same expressiveness you get on a command line, or in an editor such as Emacs or Vi, where you have hundreds of key bindings you can use to get work done efficiently. So I work in a graphical desktop environment quite rarely. I do it, for example, to access Firefox so that I can work with JavaScript-intensive web pages that require the Document Object Model, XMLHttpRequest, or any of the other modern JavaScript APIs that console-based web browsers don't implement. But essentially my personal approach is to avoid that kind of environment as much as I can.
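Before we leave graphical environments: to make the event mechanism I described less abstract, here is a minimal sketch in Python of a client receiving focus events from the accessibility API via the pyatspi bindings, which is the style in which Orca's scripts operate. The handler body is purely illustrative and is not Orca's actual code:

```python
import pyatspi

def on_focus(event):
    # Each event carries the accessible object concerned; a screen reader
    # would render its name and role as speech or Braille rather than print it.
    acc = event.source
    print("%s (%s)" % (acc.name, acc.getRoleName()))

# Ask the AT-SPI registry for focus-change events, then enter its event loop.
pyatspi.Registry.registerEventListener(on_focus, "object:state-changed:focused")
pyatspi.Registry.start()
```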
Obviously, for users coming from certain proprietary operating systems, the graphical environment is somewhat attractive in the sense that it's similar to what they're used to, but I'm not sure how much of an advantage that is over the long term, as someone becomes a confident Linux user.

At any event, the next kind of user interface I'd like to discuss is Emacs, and this is for one very important reason: T. V. Raman has provided a superb program in Emacspeak, and it takes a very different approach to providing access. What's interesting here, I think, can be made clear by the contrast between the accessibility API and what has to be done in connection with Emacs. As I mentioned before, the user interface toolkits and applications need to support the accessibility API. In Emacs, you have such flexibility that you can actually extend Emacs itself, and Emacs applications, to provide spoken feedback without having to modify any of their source code. I've actually worked with this, and it's a really superb development environment. You can write advice functions for Emacspeak, test them immediately, find the bugs (I always seem to introduce bugs) and then fix them, with a very short development and testing cycle, and I find this extremely liberating. Basically, Emacs provides an advice facility. The way it works is that you can write advice functions that are called before, after, or around any given Emacs function, and those advice functions can perform useful work, such as telling Emacspeak to read a message, for example, or to play an auditory icon. The idea is that you write a fairly small amount of code that customises the speech interface for any particular Emacs package. It should be clear, I think, at this point that the extensibility is excellent, and that the auditory and speech user interface is likewise excellent.

The one class of applications I haven't described in detail, and it's a class of growing importance, comprises web applications. We're getting more of these JavaScript, HTML and CSS combinations that provide application functionality. They've been around for a very long time, and they've been growing in sophistication, especially in recent years. One of the problems here is that user interface components implemented in JavaScript don't use standard HTML form controls. It's not the only accessibility problem, but it's one of the central difficulties. If they don't use standard form controls, then how is the software providing your Braille and speech access supposed to know what's going on when JavaScript is making modifications to the document? How is it supposed to know, for example, that a div element with a lot of JavaScript and CSS attached to it is in fact a checkbox?

Now, there's a well-developed standard for solving this problem, namely ARIA: Accessible Rich Internet Applications. It's a W3C specification, and it recently achieved Candidate Recommendation status within the W3C. It's already been implemented by a number of web browsers and libraries, including Mozilla Firefox and WebKit. So it's a very important specification, and the idea is that you add, and maintain in your JavaScript code, various attributes which are included in the HTML document. These attributes tell the assistive technology what kind of element it is, what kind of update might be happening, and whether the update should interrupt what the user is reading, for example, or can just be ignored for now. All of the details you need in order to provide a non-visual interface that at least works are available, as long as the ARIA attributes are maintained and kept up to date, and indeed, of course, present in the document.
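Here is a minimal sketch of what that looks like in markup: a div acting as a checkbox, carrying the ARIA attributes that tell the assistive technology what it is and what state it's in. The label and inline handler are illustrative only; real toolkits also manage keyboard events and focus properly:

```html
<!-- The role and state attributes are what the assistive technology reads;
     the script must keep aria-checked in sync on every toggle. -->
<div role="checkbox" aria-checked="false" tabindex="0"
     onclick="var c = this.getAttribute('aria-checked') === 'true';
              this.setAttribute('aria-checked', c ? 'false' : 'true');">
  Subscribe to announcements
</div>
```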
There are various free and open source libraries for constructing JavaScript applications that already provide support for ARIA attributes, and the two I highlight in the notes are Google Web Toolkit, which is under an Apache licence as I understand it, and Dojo (however they pronounce it), for which, if I remember correctly, IBM developers contributed the accessibility support. So the idea is that if you use one of these toolkits and you pay attention to the accessibility requirements, then you can write an accessible web application. The way it works in practice, if you're using Firefox and Orca, is that there's a mapping between the ARIA attributes and the accessibility API: Firefox itself processes the ARIA attributes, and updates to those attributes, and makes the required accessibility API events available through the event registry; those go to Orca, which then provides the Braille or speech feedback. So what we're doing, essentially, is taking the accessibility API and the GNOME accessibility infrastructure I talked about before and adding another layer of accessibility API on top of that for our JavaScript application. Now, if all of those layers work and everything is fine, then we have an accessible web application. Obviously there are plenty of failure possibilities here, but we don't need to talk about them right now, do we? Let's be optimistic for a moment.

So our user has now had a tour of the console, graphical desktop environments (GNOME in particular), Emacs, and web applications. I think that essentially covers everything they're likely to use. The shutdown process, we hope, is uneventful; after all, it's very much the reverse of the boot process. Kernel panics are an interesting kind of situation. One technology I haven't mentioned so far is netconsole. It's a rather useful little kernel module, especially if your machine is like my laptop and doesn't have a serial port. Netconsole essentially provides you with access to the kernel log messages from another machine, using UDP packets, and that can be useful for accessibility if you know you have kernel problems. I've been lucky: I mostly don't have kernel problems, but there are situations in which people do. They could, for example, be hardware problems; there are cases in which failing memory can cause all sorts of panics and crashes. Speakup, I understand, because it's a kernel module, can read the oops message that appears on the console when there's a kernel panic. I haven't actually experienced that, as I'm not a heavy Speakup user, but I'm sure people who've been developing and using Speakup extensively have encountered their share of kernel crashes over the years, and I understand that should be possible as well.
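For completeness, a sketch of the netconsole setup just described; the addresses, interface, and port are placeholders:

```sh
# On the machine being debugged: send kernel log messages as UDP packets
# to port 6666 on 192.0.2.7, out through eth0.
modprobe netconsole netconsole=@/eth0,6666@192.0.2.7/

# On the second machine, listen for them (traditional netcat syntax):
netcat -u -l -p 6666
```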
I think that essentially covers the boot-up to shutdown review. I also believe it should be quite clear where the central problems and challenges are, and that there are opportunities for the development community to step in and help solve those challenges, in cooperation with a very vibrant existing community of developers and users. I should perhaps pre-empt one fairly obvious question in connection with the GNOME accessibility API and the infrastructure associated with it: namely, what should you do, if you're writing a graphical application or you're involved in one of the graphical user interface toolkits, in order to implement the API properly and make sure everything works. There are test tools available. Documentation for the GNOME accessibility API is somewhat lacking; most of it is rather old and out of date. It should be possible to generate some reference documentation from the new D-Bus-based implementation, based on the XML interface files I looked at when I was examining the source tree. Also, there are some people actively writing documentation that should be useful to developers. So on that particular problem, which underlines the importance of documentation, something that has of course been mentioned on a number of occasions throughout this conference, I'm hoping that work will be of great help to developers interested in fulfilling their side of the accessibility scenario.

So that essentially covers my talk. I hope that time is looking in our favour, because it would be good to open it up for questions, and I'll happily elaborate on any particular matters you'd like to discuss. Any questions?

Okay, I've got a couple. Has there been any work on simple audio cues, like beeping for a panic or shutdown to notify success, through the PC speaker or something that's not likely to have crashed along with, say, your audio driver?

Sorry, do you mean audio editing?

For panic notification, like the old sad Mac sound.

I'm not aware of anything of that kind as yet. There are interesting issues around audio as well. There's a project called Speech Dispatcher, which is supposed to allow multiple speaking applications to share an audio device without interfering with each other. That's an interesting project that's had a difficult history, including a fork that lasted for a while. That seems to have settled down, but at the same time there are still issues they're trying to work out, especially in their interaction with distributions that load PulseAudio servers by default.

I'd like to say that you're not alone in finding kernel panics absolutely annoying. The text of a kernel panic isn't always displayed when the graphical interface is loaded, so your system just locks up and plays dead with no message about what has happened to it.

Yes, that can hardly be helpful, right; I think we can understand that. I suppose there's kexec, and you're supposed to be able to get a core file. One of the other difficulties with all these solutions, of course, is that you really need to set them up ahead of time, and you're only likely to do that if you believe there's a reasonably high probability you're going to experience one of these events. So if you're doing kernel development, which I'm not, or if you're experiencing problems of this kind on a reasonably regular basis, you're likely to do it, but otherwise not. And I'm sure the interesting ones are those crashes that are completely unexpected, that just take your system down.

I've been getting one about once a week, but you know, I think that's my fault.

Jason, you were talking about the regressions happening in, what was it, the GNOME accessibility API. Is there a test suite of some sort to guard against those regressions that you're aware of? I mean, is that an area where some work would benefit the accessibility API?
Well, there's a tool that will show you what the API is providing as you interact with the application, and Orca itself has a debugging facility that I've found useful in the past when I've been submitting bug reports. The way that works is that it logs every event it receives from the accessibility API, and then it logs what it did in response to that event, so you can try to work out whether the bug is in the Orca script that's interacting with the application, or in what the application is providing via the accessibility API. In the latter case, you really need the author of that application, or somebody associated with the project, or at least somebody who knows the code well enough to fix it, to go in and address the problem. And there's a lot of testing and fixing involved. It's a bit like having established a theorem: we have a good general solution, in principle, to the graphical user interface accessibility problem, and it's been around for a very long time; the difficulty lies in the implementation of that solution. And yes, we do need more automated testing of these matters; that's the reason for the discussion on the mailing list earlier this week. We need automated regression testing, and obviously better documentation and better implementation. But it's a very long way from a situation in which somebody can run any X application of their choice and expect it to be accessible, whereas, of course, at the console level they can run any console application of their choice and be reasonably confident that it will be entirely accessible, unless it does something really odd.

Okay, our last question.

Jason, you've mentioned Google Web Toolkit and Dojo. For other JavaScript frameworks, jQuery and the like, can you briefly say what they would need to do to support ARIA? You've mentioned that a div is not a checkbox; is there anything else you can highlight?

So, with all of their user interface components, they need to include the ARIA attributes, and then as the states change, in other words, as the user interacts, they need to update the ARIA attributes. That has to be done by manipulating the Document Object Model. As I understand it, that's essentially what they need to do. I'm not sure whether they need to do something special in relation to keyboard access and focus; I'm not completely sure of that. But ARIA itself simply specifies what the attributes are and how they're going to be interpreted by web browsers and assistive technologies, and it's up to the JavaScript application to support that.

ARIA is pretty simple to work with; I've played with it a fair bit in the past. And now, could you please put your hands together for our speaker, Jason. Now, as a sign of appreciation from LCA to Jason, I have this bowl, which is made from Queensland macadamia nuts. Thank you. Thank you very much.