All right. Last year I was at the Perl Workshop, and I was the only one doing numerical work. I had discussions with people about what they were doing, which is quite different from what I'm doing. For my numerical work I'm using the Perl Data Language, PDL. I want to give you some idea of what it is, what it can be used for and what kind of people can use it, and give you some examples of what I use it for in my company.

What I'll discuss is what I just announced: the other solutions for numerical work, some benefits of PDL versus those other solutions, performance, some other uses, and then what I do with it myself.

So first, what is PDL? PDL is an extension to Perl; I assume that you know Perl. In PDL you have basic storage of arrays. The basic type of PDL is an array that is stored in memory the way you would do it in a conventional language like C: basically all the bytes one after the other. For the basic types you have bytes, integers, floats, doubles, you name it.

Is it comparable to ROD sequence?

I don't know; I don't know ROD sequence.

The usual Perl operators are overloaded, so that you can use the Perl syntax for addition, subtraction, etc. on these arrays as well. The calculations run at binary speed: if you add two of these arrays, the sum is calculated in C-style compiled code. PDL itself will also distribute the processing over multiple processors if they're present in your system. And there are interfaces to the main numerical libraries. If you're doing science you will often use various forms of the LAPACK numerical libraries, and PDL provides interfaces to these standard libraries. So if you have a data set and you want to invert a matrix or whatever, you can use the normal, optimized numerical libraries that are present on your system.

So, a short piece of code. I have a variable A; I made it a PDL object and initialized it with two arrays. I add two to every member of the arrays, and the output is here: a two-by-three array in which all the numbers have been increased by two. This is of course not a very interesting example, but it gives you an idea of what happens. In this assignment the individual Perl data items are taken and converted into a binary form that is internal not to Perl but to PDL. So here you get an array of six values, and PDL stores: okay, I have six values, and they should be interpreted as a two-by-three matrix. That is done relatively efficiently, also if those arrays are very large, because the administration of what's in there is no longer done on a per-member basis, as in a normal Perl array, but on the entire array. And you have the different data types.

What's very convenient about PDL is that taking parts of a matrix, transposing it, or reshaping a two-by-three matrix is all done by manipulating the administration around the data. If I do a transpose in PDL, the six basic values stay in memory, and it's just the administration about how they should be interpreted that is adjusted. For large matrices that makes it very efficient in terms of memory, because you don't copy data, and very fast, because only the administration changes. You can really easily re-dimension it: if I have those six values and I want the matrix to be one by six or six by one, I just tell PDL, and it's a bit of administration. That's convenient if you want to Fourier transform some parts of the data, or if parts have different meanings.

Most numerical routines are there. Matrix multiplication is in the core of PDL, and solving, finding optimum values (so minimization problems) and matrix inversion are available as well. What I often use it for is this: I use Perl hashes to recode values into numerical values, into array indices, so that I can do my calculations in matrix form in PDL, with all the I/O done in Perl.

There's a very good introduction to what's really inside PDL on PerlTV.org. It's an hour long and covers all the internals of how PDL works. I don't want to cover that here, because I just want to give you a taste of what's possible, rather than tell you: okay, you can do it this way, here's an example with code, this is how you transpose, this is how you calculate.

Other solutions. Basically what I'm doing is numerical calculations, so it's basically what scientists do. In the 80s it was all Fortran and Pascal, the languages of that moment, which really only had these numerical data types and were very poor at string handling, at I/O and all that. C and C++ were what science went for in the 90s, but string manipulation and recoding things are still a hassle there. You see a lot of solutions in science based on Matlab, or its open source equivalent, Octave. Those languages have very concise ways to do all the operations; you can really write what you want in a few lines of code. But PDL, in the end, doesn't need any more lines. However, I found Matlab very constrained in the way you import external data and export it to the data types you want; Perl is way more versatile. Of course there are also things like Maxima, but that is symbolic manipulation, which is a completely different ballpark.

What you see now is that in science there is a lot of Python and NumPy. Astronomy, for example, has built large libraries based on Python, and in a lot of scientific fields that is the way science has gone. What you see is that Perl and PDL are faster, but not many libraries have been built on top of them for the different user groups.

Have you ever looked at Wolfram's work on the universal definition of physical stuff?

I haven't.

Wolfram is... he's the creator of Mathematica, and he's also made other things, so he's doing a lot of good stuff.

What I'm mostly involved in is taking data from databases in companies and trying to make sense of it. What I've found is that data has errors in it, and that those universal languages aren't very good at handling databases that, from version to version, have different fields, different definitions and all that. Perl is very good at gluing those things together.

I miss SPSS in your overview. I ask because yesterday there was something about importing data into SPSS, and I wonder whether what is done there with SPSS could be done with this.
Oh, I know people doing that: all the stuff that you would do in SPSS, they do in PDL. I'm not sure about the t-tests and the normal SPSS tests. There should be libraries for them, but I'm not aware of any, because that's not the stuff that I'm doing. I know that PDL is very fast, because it's all in memory. But for data I really love Perl. If something is coming in and you need to reconfigure things, or if the data columns are different, and all that stuff, then hashes in memory are great. And PDL has its own great way of handling binary files. If you just have binary files coming from some source, for example WAV, an audio file, you can really easily import that, and that's all in PDL.

As for performance, I'm very pleased. Perl is really fast at text handling, putting things into hashes and concatenating things. Perhaps you've seen benchmarks where Perl is even faster, with C and C++ close to that, so Perl is very good at that. PDL performance is also very good, because of the way the loops are implemented in the underlying compiled code. It's very optimized in memory use and in processing speed, and it can also use the different processors in your system to share the load. So as far as performance goes, I'm very pleased with this solution.

You said that it does things quite lazily. So if you multiply everything by two, does it just put that in the administration, saying that these values have to be interpreted with a factor of two?

No, not multiplication, but reordering. You have the values in memory. If you add two to them, what it will actually do, if it finds two processors, is let each take half of the array and do all the additions in memory. However, if I say it's not two by three but just six values, because I want to have the average or whatever, then that is reordering: just changing the administration of those values, not changing the values but changing the layout. That is what it does lazily.

So multiplying everything by two is really carried out on the values?

Yeah.
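The distinction drawn in this exchange, between operations that only touch the "administration" and operations that touch the values, can be illustrated with a small sketch. This is plain Python, not PDL and not the speaker's code: the `View` class, `row_major_strides` and `add_scalar` are hypothetical names invented for the illustration, and the row-major layout is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from itertools import product
from math import prod
from typing import List, Tuple


def row_major_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Strides (in elements) for a contiguous row-major layout of `shape`."""
    strides = []
    step = 1
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))


@dataclass
class View:
    """Hypothetical 'administration' (shape + strides) over a shared flat buffer."""
    data: List[float]
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]

    def get(self, *index: int) -> float:
        # Translate a multi-dimensional index into an offset in the flat buffer.
        return self.data[sum(i * s for i, s in zip(index, self.strides))]

    def transpose(self) -> "View":
        # Only shape and strides are reversed; the buffer itself is not copied.
        return View(self.data, self.shape[::-1], self.strides[::-1])

    def reshape(self, new_shape: Tuple[int, ...]) -> "View":
        # This sketch only re-dimensions views that are still contiguous.
        if self.strides != row_major_strides(self.shape):
            raise ValueError("sketch only reshapes contiguous row-major views")
        if prod(new_shape) != prod(self.shape):
            raise ValueError("number of elements must stay the same")
        return View(self.data, tuple(new_shape), row_major_strides(tuple(new_shape)))

    def tolist(self) -> List[float]:
        # Values in logical index order, read through the administration.
        return [self.get(*idx) for idx in product(*(range(n) for n in self.shape))]


def add_scalar(v: View, k: float) -> View:
    # An arithmetic operation really visits every value and writes a new buffer.
    return View([x + k for x in v.tolist()], v.shape, row_major_strides(v.shape))


a = View([1, 2, 3, 4, 5, 6], (2, 3), row_major_strides((2, 3)))
t = a.transpose()        # 3 by 2, same buffer
flat = a.reshape((6,))   # six values in a row, same buffer
b = add_scalar(a, 2)     # new buffer with every value increased by two
mean = sum(flat.tolist()) / prod(flat.shape)
```

In this sketch `t` and `flat` share the list held by `a`, so they cost no extra memory for the values, while `b` owns a freshly computed list. A real array library would run the loop in `add_scalar` in compiled code and could split it over several processors, which this illustration does not attempt.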
So that's done. There are a couple of other uses that are known; I've taken this from the pdl.perl.org site. There is a nice program that plots the pollution in different European cities. The original developers of PDL come from astronomy and astrophysics, so a lot of stuff is done there, where people actually do the calculations for their scientific papers in PDL.

What I do in my company is focused on the rail sector: we provide consultancy services to the rail sector. What I have there is data sets like the number of travelers in each train in the Netherlands over the last couple of years, so very many data points, and then I forecast how many travelers there will be in every train, or evaluate how much power all the electric trains in the Netherlands have used, based on speed profiles and the times they were in certain sections. In order to do that I need to split up my data easily. What I use for that myself is the facility of the Unix system that can put different jobs on different processors. I usually make small jobs that run for something like 15 minutes, and then collect all the data together.

So what do I do? Passenger forecasts. Up till February, all the forecasts of how many people there are in the trains were based on calculations that I've done. What we've taken there is people who pass through the trains and count the number of people; we have a couple of trains where the weight of the passengers in kilos is measured, and we convert that into a number of passengers; and we use the chip card data. With that we forecast how many people there will be in the next train. Because we have all these different data sources it was very convenient to use Perl, while the calculations are done in memory in the data language. There is a paper on that in the Dutch journal Start.

Another thing that I'm evaluating is the amount of energy used by the Dutch trains. This is a graph that we use internally. We have a baseline: at the beginning of 2010 we had no savings in energy. From 2010 onward we have been training all the drivers of Dutch trains to drive more efficiently. What they do is speed up first, then switch off the engine and let the train roll out, so as to be exactly on time at the end station. I am doing the monitoring for that. For each of the groups of drivers in the Netherlands (the Netherlands is split into, say, 10 regions) I determine how much they have saved compared to the beginning of 2010. You see here that on average it's in the order of magnitude of 3%, which is a lot of money. But certain groups of drivers are even better at saving energy than others. So what I'm doing in the next months is really calculating how efficiently all the drivers have driven their trains and reporting that back. This is used as management information for the management of these groups, to reach their target that the drivers drive efficiently.

Do you get feedback from drivers who drive very efficiently, so that they can train other train drivers?

Yeah, we have a couple of people who are front runners. What you see here is that this group is actually the 20 people who are very close to the central organization on eco-driving. They're actually developing new ways, also for the other drivers, to improve their performance, but they're also committed to driving.

This is grouped by region?

Yeah.

Can it be that it's easier to drive more eco-friendly, or to use less energy, in certain areas?

It's normalized for where they drive, for the routes. And we see that it has more to do with management attention than anything else.

Besides, what happened in August? It's normalized from 2005 to 2008, and I see a sharp increase in savings starting in August.

That's management. Yeah. We had been training all the drivers, and we thought that it would help. It didn't help a lot. At a certain moment we presented to the management, which has this target in its target letter, that it was very unlikely that they would reach the target. And as managers go, they react to that, some managers more than others.

And they often squeeze interest into it.

That's your opinion.

This is really about how you yourself drive.

The other thing that I'm doing is monitoring the amount of time it takes passengers to walk through stations. This is an example graph for the station in Utrecht, where at zero we have the time that a train arrives, and here the number of minutes it takes people to reach the end of the station. It's also based on chip card data. What we see here is that there are two platforms, 18B and 19B, where the travelers take a lot more time to reach the end of the station. Those platforms are really a long way from the central part of the station. We're using this to optimize the station, to rearrange platforms, and to understand what's happening with the travelers.

Do they actually listen to you? Because the station is all a mess now.

Yeah. Whether they are listening or not listening, they've got about five years to rebuild. They are changing things, for better or for worse, I don't know. So this is the platform going to the centre. This will get a new platform, which will go on the local area, so it will be visibly improved. You know that this is a temporary situation.

You're about five minutes over time.

We're very happy. So...

Will you be able to wrap up?

So my summary is: I enjoy using PDL, and I can only say congratulations for that. Thank you for your attention.

Oh, really?
You're so quick. Or you're not. Questions?

What's interesting here is extensions.

Oh, extension libraries.

You wrote something yourself for this, but there's also BioPerl?

Yeah, there's BioPerl, which uses PDL. What you see is that mine are very dedicated projects that are not really transferable. I do this for a rail company, and that's it. But if you have a bigger community, for example in genetics or whatever, where there are many researchers working, what you really need on top of PDL is basic databases and basic libraries to read the basic data, so that most people can get on with improving their algorithms. And you see that the Python community is now way stronger in supporting groups that are building things onto the environment than the PDL community.

So the way to go is to be compatible with Python? Does Perl have importers for that?

No, not really, because it's also the algorithms you write in the language that you want to transfer and modify. I found that for me PDL works great, but I have yet to be shown larger and wider groups that actually use it. What I see now is individuals enjoying it, but not really broad communities that say: well, we, those 100 or 200 to a thousand researchers, use it. And I never see them at conferences.

Universities teach Python.

Yeah.

At my university, all the physics students are taught Python.

Is it used in optics? Because optics uses really large matrices.

Yeah, you can use it for that. I'm not aware of people using it for that.

I mean, this is basically just a mathematical language. I've used this for 20 years.

Yeah. And what you see is that there are great packages and modules for that. Once those packages are there, that really defines what you're doing, but it also defines the limitations of what you can do.

Is it possible to use it with GPU acceleration?

I haven't seen it, but I think it should be possible. I'm not aware of it. So it's not a yes or a no.