Welcome again to the Tower. This is the talk on Topper by Maxim Alt and Dario Rapizardi. Again, the IRC channel for questions is #dc6-talks-tower on Freenode, and could I also remind people with questions to wait until they've got a microphone, because otherwise we can't hear you. There you go.

Hi. So this is the project that we've been nursing at Intel for probably the past three or four months. Nothing is definite here. We don't know what it's going to be like; we're still forming and scoping it, and it would be great to get your opinion on what we are doing, whether you like it, and get some discussion going. At the end we'll have a demo, but for now I want to talk about what Topper is and what it isn't.

That's a lot of words here, but in general we realized, as a big corporation, that free software gives a lot of choices, and the opportunities basically grew exponentially in the definition of solutions that any of those companies could provide. One of those things is that Intel hardware has a big spectrum of different combinations available in the market. At the same time, the spectrum of choice in the software enabled by free software has become really, really large. So the challenge we faced is that our ecosystem is quite large and dispersed, including operating system vendors, independent software vendors, and the people who are using it and working on defining and creating a solution, and there are problems with maintenance and support. We also realized that every single time somebody tests their devices or their computer, this information is gone. We don't know whether it worked or whether it didn't. If it's a big company, we probably hear that it didn't work, but otherwise this information is basically
gone, and it's dispersed. In general we would also like to have some kind of intelligence built in, in terms of hardware and software dependencies and the relationship between them.

The project was born from the first steps that Intel took with free software. We realized that, using Linux, we got a lot of questions from people who had never worked with Linux before, and from system integrators. They were asking: how do we know if something will work with your equipment, and if it doesn't work, what do we need to do to make it work? We were thinking about something more generic, but now we are scoping it down to be a little less generic: a solution that encompasses an open source database, or I would say knowledge base, that captures all the tests that have been done previously with various kernels and various hardware, and that tracks and knows the software and hardware dependencies and the system requirements for every single device that we're using.

We also wanted to see if we can apply the open source philosophy to support and maintenance. This is quite new to corporations: whether we can create collaborative communities and equip them with the tools and utilities to resolve the problem. So it's not just a website or a database or an expert system. It's a collaboration, a community that we basically form around every single problem and device. This way we will also utilize dispersiveness in localization and customization. Let's say there is some device that is particular to the Spanish market. The scalability of those issues is very difficult to handle if we're not customizing and localizing the support structure, and it definitely has to be open.

So this is just a demonstration of the complexity of the problem. If we have a client Linux OS system, and we have
a bunch of, as you see, this cloud of open source packages there, and different versions of the kernel, of Mesa, of X; it's all in that cloud. So the question comes: what do we need to do in order to support ICH7? If we have Nvidia, and high definition audio, and a SATA drive, what kind of packages do we need to use? The problem actually gets worse if we start to increase the granularity to features. Let's say we talk about suspend: we want to see suspend to RAM or suspend to disk. Or we're talking about power management, and whether we have a device integrated into the chipset or a non-integrated device. So things get really, really messy, and the choices are really not clear: what do we need to have in a system to provide the right set of packages? So the question is, what's the minimal set of packages that we need to pick from this big cloud in order to make it work?

There are definitely a lot of goals and benefits we're looking for with this knowledge base. We realize that most of our partners are really not Linux savvy. They don't know what to do, so we need to provide them with these tools. This also has to be a trusted resource of information, so that they know that if they use this solution they are going to get a credible resolution.

Now, in terms of operating system vendors, we believe it's also a very important solution, because if they used that framework, they could use this knowledge of the relationship between software and hardware to build the distribution in the right way, to support the hardware in the first place. So it is a trusted resource of information, and it could also help direct their development to tackle the most critical hardware issues. Let's say I want to know which device is supported the least, and I want to target the development of the OS to cover those issues. So this is something that
will expose the gaps much faster and much more easily, I think.

So these are the questions that the framework is supposed to answer. Suppose we have a software solution, a software stack, identified. The question is: is the specified hardware fully supported with this OS and software stack? Or what hardware could be fully supported with what I have? Or if not, what do I need to do in order to ensure the support? And we need to make that accessible to both skilled and unskilled users. And for operating system vendors: what software do I need to have in my distribution to support any given set of hardware?

So here's the analogy. There are many different analogies that we could come up with for that framework. The first one is the internet. In the beginning, in the 90s, we had a vast spectrum of data that was pretty much chaotic, until Google came up. Right now we can just type in a question, or type in something we're trying to find, and we have some kind of organization, a systematic way of ordering the information. It's the same thing with open source software.

We also have the analogy of preparing food. Let's assume that we have this cloud of open source software, and it's raw ingredients for us; it's a bunch of packages. And we have what we call restaurants: Red Hat and Novell. They basically take these packages and prepare the food for us. We need to pay, and maybe it's tasty or maybe not, depending on your taste, but there's no hassle, no time wasted on cooking. Debian represents the supermarket: we can go and pick stuff up from the shelves.

Question: I'm not sure why Debian itself isn't a restaurant in your analogy. We do provide systems for end users to use.

It could be a restaurant.

It is. She is running a Debian system now, and she didn't go to any restaurant. I didn't precook or do anything for her. I just took the latest
Debian installer, showed it to her, and she installed it. I didn't install it for her. So she was eating prepackaged food from Joey Hess.

Well, what this probably means is that if you want to cook at home, if you choose to, it's both: say, a restaurant that also has a do-it-yourself kitchen area where you can go and cook your own food if you want to.

Yes. Well, it's more that this slide kind of makes Debian seem less complete than, say, Ubuntu and Red Hat, which offer things for end users, and I object to that. I think we are as fully capable as any commercial offering. The second point I want to make, and I should have asked this earlier: I'm confused. Isn't this driver framework mostly a way to get closed source drivers into an operating system?

Well, I'll start with the second question. This has nothing to do with closed source or open source drivers. We just want to make the system work. The first question is a little more complicated. We do want to cultivate people cooking at home, because they know what they like. They customize to their taste, whatever they want: they want meat, or they're vegetarians, or they like spicy food. I don't know if we have a flexible enough infrastructure right now to make that happen for people who don't really know how to cook. So we're talking about people who come to this big supermarket, see so much stuff, and say, "Well, what am I supposed to do to suit my taste?" It's not to put anybody down, and not to say, "This is the restaurant, and you had better pay to get good food." It's basically, "I don't know what to do if I have to satisfy my taste." And that's basically what the last three bullets say: we want to have a recipe book, and we want to give them kitchen supplies to cook and satisfy what
their needs are.

So what is Topper? We envision it as something like a Devicepedia, like Wikipedia. In the US we have opentable.com, which is also an analogy to restaurants: anybody can go and say, "This restaurant was good," or "The food wasn't good, the service really sucked," and put different opinions on opentable.com. It's the same thing with Amazon: you can give your ratings, and anybody can say whether they liked the book or not. And eBay, same thing. Whether it's a set of papers, a collaboration tool, or a collection of recipes, the answer here is all of the above. Topper is the framework to identify a device's readiness for use. As for what it is exactly, Dario made a prototype and he's going to show a demo, but first we'll talk about what the prototype actually looks like.

Thank you. Well, I will talk a little bit about the prototype itself. Perhaps it's a little rough in the way it looks. Topper is actually an expert system: a core set of libraries that provide different kinds of information about the relationships between hardware and software components. Like any expert system, it has information and data in it. One of the first things you have to do when you have lots and lots of data in an expert system is to verify the integrity and validity of the data. So I will start by showing what kind of data Topper will receive. We have a very diverse scope, since sources can be wide and diverse: it could be any kind of user of any distribution. With large amounts of data, there's a need for an automated validation process. So there's a proposal to have a two-step validation process. The first step is an offline process, which applies techniques very common in data mining, and then there is an online process, which is more to correct the glitches and
fix stuff in a more interactive way, like Wikipedia or something like that.

Speaking about the offline validation process, let's imagine that you are a 23-year-old guy who got his driver's license one year ago, recently moved to a foreign country, and wants to buy a 16-valve car. Also imagine that you spent one day in jail for speeding when you were a teenager. Actually, this is almost real: I had a black car, not a red one. Now imagine that you go to an insurance company. For them, given all that, you will likely have an accident in the next year. So what do they do? Option A: you won't get insured at all. Option B: the insurance company will insure you, but perhaps with a more expensive policy rate. So in the end it's just a matter of more or less.

Now the analogy. We are gathering data from many users. Let's suppose lots of people input information about some specific video card. When you collect data you always need some kind of tool. Collecting the hardware information and the software information is pretty easy, but when you enter the field of whether video is working, you are more in a cloud. It basically depends on the general consensus of the users. What this process does is try to detect outliers: if you have a hundred guys with a certain video card saying that it works well for them, and then one guy shows up with the same setup saying that it doesn't, you have an outlier. That guy is perhaps not being quite honest in his input, and you discard it. Or not: you actually set a threshold that decides when to take that kind of information into account.

More technically speaking, when you receive this anonymous information you run several validation processes. There are the classical ones: syntax errors, empty attributes, data with illogical values. These are like the features that you put in on purpose; I
mean, if some test says that the system doesn't boot but has X working, the guy is probably lying, so you discard it. Then you have the removal of non-relevant data attributes. When you have several test cases and you are looking at the behavior of, say, Ethernet networking, you can easily see, using the Relief measure, whether some of those attributes have any influence on the scope you are looking at. Then you can test and evaluate the new data against the data you already have, using common algorithms such as Bayes for training, and ten-fold validation. With a very large set of data we can support a very high reliability threshold. So basically it's about having loads and loads of data.

Then there's the online validation. There are community-based efforts in information-related projects, such as Wikipedia, which seems to be very accurate. Think of it as a Devicepedia, but unlike Wikipedia it is not meant to depend entirely on community input. Basically, first the automated process will process the information, trying to clean it up, and any glitches or missing information will then be fixed by human intervention.

Now I can show you a bit of the prototype, what one of these tools to gather data will look like. It's a little difficult for me because my monitor is not working. So basically this is Cherry Topper. Topper itself is the data and a set of libraries to gather the information; then you can easily build applications on top of it, although "applications" is a very pretentious word for something like this, to access the data. I will discuss a very few things, just to play it out. I will first take a look at the feeding process. You have a program, top scanner. I will just run it. One second.

Meanwhile, I wanted to say that we've done the prototype based on the hardware database that was given to us by Ubuntu, and we had about
200,000 entries from Ubuntu, and a few entries that Dario made from LinEx, from Extremadura.

Thank you. Well, this is basically how the data looks. These are Prolog facts. Actually, I'm an XML hater, so I prefer this kind of thing. You have the definitions of everything: PCI devices, USB devices, software packages, loaded kernel modules, CPU, whatever. You can also have definitions about features just by adding them. This is very handy, because you can solve the constraints by just ignoring some of the facts, or by looking for the missing facts, and so on. And you can have an application that also adds the information about the features, for example whether the sound works or doesn't work. Something like this, for example, just reads the data, and according to the data that I just uploaded it asks me questions. If it finds a display controller, it asks questions about whether the X environment works or doesn't work.

For the prototype I didn't want to get into the details of what, for example, "sound working" means. Perhaps I'm playing a WAV file and it sounds, but not with very good quality. Perhaps some guy has a 5.1 audio card and it doesn't work at all. So this is a very difficult area to get into. It is mostly a matter for the tools or for the definitions of the users; it's not quite in scope at this moment. But once you answer the whole form you can add comments, and this information goes into a queue to be validated in offline mode.

Okay, so once you have the data, what is the logic behind Topper? In several expert systems regarding hardware and software, the minimum unit of information is always the device. In Topper, we say that the minimum unit of information is the test itself: a whole computer with its hardware, its software, and its behavior. To put it a little more clearly, we have a relationship between hardware components
which is established by standards like the AGP bus, the PCI bus, and the different socket types for CPUs. Then you have a relationship between software packages, established by package dependencies. But how do you define the relationship between hardware and software? I mean, how do you say that to make this piece of hardware work you need this and this and this software package installed?

You have easy cases, where the hardware behavior is defined by just one software component. IDE support, for example: you just have support in the kernel and you have it. You have hard cases, where the behavior depends not only on the kernel but also on some external software packages. Accelerated X, for example: you have the kernel and its modules, you have X modules, you can even have some proprietary modules. Who knows, it can get very difficult. And then you have impossible cases, which are most likely bugs, but which happen. For example, some component works, but if you plug some device into the machine, some kernel module is loaded and the feature stops working. Typically, for example, in software suspend: you plug in something that doesn't behave and doesn't reload well on resume, and it just breaks the feature.

So, the big question again: how to define software and hardware dependencies? The short answer is that you cannot define them, but you can track them, and hopefully tracking will lead to a definition. This is the logic behind Topper.

Let's say that user A installs a Linux distro with a 2.6.15 kernel, and software suspend works out of the box. So we can easily say that the Linux 2.6.15 kernel makes software suspend work. Now we have user B installing the same distro on the same hardware, but he forgot to set up the swap partition. So even though we have the same software packages, we can say that you need not only 2.6.15 but also a swap partition to make the feature work. We can keep going: user C installs the same distro with a swap partition on a different
PC, but his video card has problems with framebuffers on suspend, so it locks up and never resumes. So we have the same constraints as before, but you cannot have this particular video card. And it can get even more hairy. For example, user D, with the same setup as A, plugs in a USB keyboard that doesn't work again on resume. It also looks like he has security patches applied, so he has different versions of the core libraries. Then you have a problem, because you don't really know whether what is breaking the feature is the USB device he plugged in or some kind of regression in the software packages, and you start getting nervous. But if several users with the same setup as D, but without the USB device, can software suspend, then the possible causes are narrowed down. In the end this is what developers do in the BTS, trying to figure out the problems. This is a kind of automation of that, using the same logic. So one part of Topper's logic is to identify those constraints, and the other part is to retrieve the information in a smart way.

So I'm going to talk a little bit about information retrieval. This is a kind of expert query screen. It would be more usable if I were a better web designer. The thing is that here you just pick any constraints that you want: this, and this. You have PCI devices, USB devices, software packages, distributions, IDE devices, kernel modules, features, CPUs, and machine types. They are like tags. So basically you can ask things like: okay, I'm looking for this Wi-Fi card, where it does not work. And you have one result. This is what I meant about the minimum unit of information being the test: you have information about the test, the distribution, the CPU, the PCI bus, the IDE bus, USB, software packages, features, and so on. You can also look at a bigger one, you can retrieve transversal information, and so on. You can gather all the data that you want. This is the roughest
example, because you have access to everything. Let's look at another one, about how to focus that kind of information retrieval.

Several parties seem to be interested in this kind of information. In the particular case of Spain, there happen to be lots of OEMs and clone builders that just want to ship their boxes with Linux working out of the box. I don't know if you are aware of it, but in Spain almost every community has its own Debian-based distribution, so they tend to install those distributions. But sometimes the distribution doesn't work: one distribution cannot work with a given kind of hardware while another one does, and the OEM gets very nervous because he cannot deliver. Government agencies also want to buy hardware for their distribution, and you need to take into account that most of the time the people in government who buy hardware are public servants, people who are not technically savvy. So they actually need some kind of central repository of information to pick hardware that works. It's also useful for localized distro builders to know what their neighbor did to make a piece of hardware work.

In a more general sense, we can think of upstream developers looking for regressions in their software; big distributions interested in the general quality of the distribution and its derivatives; big distributions interested in possible causes of problems in their derivatives. You can also have statistics about the most used pieces of hardware that are not supported, and that kind of stuff. More on that later.

So, just an example about a local OEM. OEMs usually just want the answer to the question: which distribution should I install to make it work? You can fill in the forms, or, for example, this is the test data that I just collected, and I will erase any information about software, distribution, whatever. This is what a constraint solver is all about: you have one piece of information and the solver looks for the rest. I mean,
that data was mostly describing my laptop, which is what I'm running, and it gives too many possibilities. For example, this one. Well, this is another: it found two distributions. Anyway, you can focus the question according to the scope. Okay, thank you.

So what do we need to go forward? We need to finish architecting the solution, involve Linux-savvy or open source software partners, scope the data structure so that it is good for both packaging worlds, deb and RPM, have a publicly accessible knowledge base, and populate the expert system with existing open source solutions. We are also in conversations with OSDL regarding Topper. It would also be nice to have Topper hosted on Alioth, if some DDs are interested in this. And then begin gathering data. Most of the data so far was given by Ubuntu, from the first hardware detection tool that they had; the other part was from LinEx, in their own labs.

Just before finishing, this is pretty neat, and I did this last night. It's sometimes slow; it takes about 30 seconds to gather the data. This is looking for the top 20 most used hardware devices that don't work. So this is a kind of shame list, because it shows the vendors that currently don't have much support in Linux. I don't know if I will be able to show you, but Topper is actually a Python library, which you import, and then you can start asking normal questions like "get all PCI devices" and so on. It also has a pretty neat free query form, where you just send a code block of Prolog to the underlying engine. And this is it. It's ugly. Just showing the first column: this is the class of the devices, and the vendor. When it says none, it looks like these devices are not present in the PCI IDs database, so it looks like this is even useful to complete that kind
of information. Mostly it's, well, PC TV cards and multimedia cards. We have some Sound Blasters here and there; it looks like Creative support is not very nice. In the end, it's so modular that you can build any kind of interface to the information: statistics, scopes, whatever. So that's what I finish with.

So the main challenge of this framework is that it has to be saturated with data, because that's its main property. The main intellectual property of that framework is the data. It's not the code itself, it's not tools, it's not scripts that we write; it's the data. It's very similar to Wikipedia: what makes Wikipedia Wikipedia, and what makes it usable, is the vast volume of data that's in there. It's the same thing here. It has to get to a saturation point where people will start using it. And there are issues with data integrity and with trusting the validity of the entered data. Who's the authority? Do we need a committee to commit changes, to commit facts into this knowledge base? Then scoping is a problem: what does it mean that a device works or doesn't work? What if it works slowly? What if it doesn't work fast enough? What if only four channels of audio work? Is quality a problem? A generic interface to a validation process would also be required. And what is the product going to be? What's the framework? Is it just a website? What is the knowledge management solution here? It's always a struggle to define a good, usable package, so that people would use it, from everyone who uses a desktop or laptop to system integrators and OEMs. As we said, the concept depends on mass adoption, and it could be mirrored to the RPM world. And the problem is that the Linux kernel doesn't like separation of the core from the driver code, and
that represents a big problem, because dependencies are going to continue to represent complex systems. So here is why it is Topper: that's the name that Dario came up with, and that's why. We would like to have your questions and feedback.

Yes. How is the data going to be gathered? Is it going to be gathered anonymously, or is it going to be identified somehow, by email address or something? Because I can see, for example, that people will send data, and maybe they have made a wrong selection, and they're going to resubmit it with the corrected information. Can you then still match the original report with the new report, so that you can say the original report is not valid anymore? Because if you get a lot of erroneous reports that are later corrected, you will get dirty data.

Well, actually, for the prototype, since most of the data was coming from fairly trusted sources, we didn't really even start thinking about it. That was one of the reasons for the proposal to host it on Alioth: to get suggestions and help and that kind of stuff. The possibilities of this can get pretty neat, but it will involve so many specialties that the help would be useful. So far there is nothing regarding that.

I was also wondering about the data gathering: whether you plan to make it completely automated or whether there's going to be a lot of user interaction during it. I think that's going to reflect on how much data you get. And also whether we can tie it in with things like the installers, so you automatically get a report of an installed system, and that kind of thing. What are your plans? And I also wanted to congratulate you, actually.

Well, since this is intended to be useful for multiple distros, and right now many distros have something like that for gathering, it will be just a matter of fixing or adjusting the format,
so it will end up depending on each distro. Or you can provide some kind of program to download; I know it's not possible to execute something directly from the browser, and that kind of thing is not very nice stuff. So it will end up depending on each distro, on the tool that they have, and on what they do to define whether sound is working or not. And the adjustment will follow. For example, if you have two distributions, with users using the same hardware and the same software packages, and one distro's users mostly say that sound doesn't work, perhaps it is because that distro's detection tool is not very accurate about it. All the better for the distribution: it will be able to fix it. It can leverage a lot of things, even the statistics lists. It can leverage a lot.

I just also wanted to congratulate you on working on this, because I was part of an unsuccessful effort two DebConfs ago to do some kind of hardware database, and I'm really happy to see it actually happening. So congratulations.

Thank you.

Okay, time's up. Thank you very much.
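
The tracking logic described in the talk (users A to D and software suspend) can be sketched in a few lines of Python. This is only an illustrative sketch and not Topper's code: the talk says Topper keeps Prolog facts behind a Python library, while this version uses plain sets. Every name here (`Report`, `required`, `suspects`, the component strings) is hypothetical, and user E stands in for the "several users with the same setup as D but without the USB device".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One submitted test: the whole machine plus the outcome of one feature."""
    user: str
    components: frozenset  # hardware and software identifiers (made-up names)
    feature_works: bool


def required(reports):
    """Components present in every passing test (candidate requirements)."""
    passing = [r.components for r in reports if r.feature_works]
    return frozenset.intersection(*passing) if passing else frozenset()


def suspects(reports):
    """Components seen in some failing test but in no passing test."""
    seen_ok = frozenset().union(*(r.components for r in reports if r.feature_works))
    seen_bad = frozenset().union(*(r.components for r in reports if not r.feature_works))
    return seen_bad - seen_ok


base = {"kernel-2.6.15", "swap-partition"}
reports = [
    Report("A", frozenset(base), True),                    # suspend works
    Report("B", frozenset({"kernel-2.6.15"}), False),      # forgot the swap partition
    Report("C", frozenset(base | {"video-card-x"}), False),
    Report("D", frozenset(base | {"usb-keyboard", "patched-core-libs"}), False),
]
# Another user with D's patched libraries but no USB keyboard, for whom suspend works.
more_reports = reports + [
    Report("E", frozenset(base | {"patched-core-libs"}), True),
]
```

With only A to D, `suspects(reports)` still contains the patched libraries alongside the video card and the USB keyboard. Adding E clears the libraries, which is the narrowing step from the talk. Comparing `required(more_reports)` with B's components shows the missing swap partition. A real system would also need the consensus threshold discussed earlier, since a single passing report here is enough to clear a component.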