Hello everyone, welcome to this session. First I'd like to thank the organizers for letting me present the project that I've been working on for three months now. I'm going to keep my phone with me because I might receive some phone calls. I might also have some notes here; I need some reminders.

Before going deep into the subject, I would like to start by asking you a question. Who relied on public transportation to get to work today? Who takes the train to go to work? I guess that you are used to this station. If you are not, I'm going to explain my experience with the train in Belgium. I was living in that part of Belgium, somewhere here. I lived there for 36 years, and for one week now I have been living much closer to Brussels, mainly because of this issue. As you can see, I was living in the worst part of Belgium when it comes to train punctuality. So that was a big issue for me. Door to door, the train took me more or less one hour and 15 minutes. And with the car, one hour and 15 minutes, when I was leaving rather early in the morning, because if I left at 7, it took me three hours to get to Brussels. So if you do the math, it's four hours lost in public transportation every day.

And what to do on the train for four hours every day? The first thing that we can do is of course sleeping, but it's not always possible: the train is so noisy that for me it was almost impossible. Lines 96 and 97, which I was taking from Mons, always use the old trains without automatic doors, and it's really, really noisy. The second thing that we can do is reading, of course. The third thing is listening to music. Of course not too loud, to not bother other people, but this never happened. You can work. For me, this is my favorite thing to do on the train, because I can quickly focus on something else, and that's what I was doing on the train every day for years.
Or you can learn, by reading or listening to some podcasts. In my case, as I told you, I'm always working on the train when I'm not sleeping, which is really rare. Being a contributor to the Drupal community, in PHP, I usually maintain and develop my own projects there, committing some patches and reviewing what's new on my projects. And just for your information, Drupal is a framework written in PHP and created by Dries Buytaert, a fellow Belgian.

So how did the story of SNCB Alerts begin? Of course it started because of this. I was wondering why, in 2018, there is no way to get consistent information about train delays. When you go to the station, you see on the liveboard that the train is delayed, then you check on your phone and you don't get the same information. How can this be possible nowadays? So my idea was to create a kind of platform where you can get all your train delays and alerts in real time, or almost real time: a tool that would alert train users of the delays and the alerts. The tool is named SNCB Alerts. There is no fancy name; I didn't know I had to find a really fancy name. It is open source. It uses the Symfony framework. It relies on Git for source hosting and deployment, and it is meant to be pluggable. I will probably come back to this later.

Before going deep into the application, the first thing to do is to understand what a delay is. Okay, everyone knows what a delay is, but this is the definition that I found on the internet. In our case, a delay is when the train doesn't leave the station at the right time. For me, a delay starts after one minute: if the train hasn't left after one minute, it's already delayed. But for the SNCB, it's 15 minutes later. But that's another subject. So where do we get that data? Where are the train delays? Is this data available somewhere? Actually, yes, you can find it in the station, of course.
But in our case, if we want to build an application, we need the raw information about these delays. What can you find on these liveboards? You can find the destinations, the departures, the delays, the cancellations. But you cannot find the line number, for example. It's struck through right now because this is information that I wanted to provide to the users, but it's not really easy to find. I will explain how I found it later. So the idea behind the tool is to provide information about delays and alerts in a standardized way. I got my data from iRail.be, which is from Pieter Colpaert. And what can I get from this data? I can get, of course, the departure station and the destination station. The train line is not available. And I can get the delays.

Okay, where to get the train line? We'll get there. In order to get the train line, what I had to do was create a matrix of a little under 600 columns by 600 rows, with all the stations of Belgium on the columns and on the rows, and in each cell I had to fill in manually: okay, line 96, line 97, that's Mons. And I had to do that for every station. If you count, that's roughly 360,000 cells to fill in manually. Of course you can divide it by two, because the line from station A to station B is the same as from station B to station A. But still, roughly 180,000 cells to fill in manually is a huge amount of work. It was completely crazy. I started to do it, then I submitted the document to the guys from iRail, and we found another solution. I first asked the question on GitHub, and then had some talks with Pieter about how we could get that train line data, because it is not available on iRail. And I guess, but I'm not sure, that if it is not available on iRail, it's because the SNCB/NMBS is not providing it directly.
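As a quick check on the size of that station-by-station matrix, here is a minimal sketch of the arithmetic. It assumes a round figure of 600 stations (the talk only says "less than 600"), and the function names are purely illustrative. It counts the full grid, and then the distinct pairs once the A-to-B and B-to-A cells are treated as the same and the diagonal is skipped.

```python
from itertools import combinations


def matrix_cells(n_stations: int) -> int:
    """Cells in a full station-by-station matrix (rows x columns)."""
    return n_stations * n_stations


def unordered_pairs(n_stations: int) -> int:
    """Distinct station pairs when A->B equals B->A and A->A is skipped."""
    return n_stations * (n_stations - 1) // 2


# Cross-check the closed form against brute-force enumeration on a small case.
assert unordered_pairs(5) == len(list(combinations(range(5), 2)))

if __name__ == "__main__":
    n = 600  # assumed round figure; the talk says "less than 600"
    print(matrix_cells(n))     # 360000
    print(unordered_pairs(n))  # 179700
```

Either way, the count is far too large to fill in by hand, which is why another source for the line data was needed.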
So, I was told that I could use Wikidata, which I didn't know about until December of last year, to complete the train data and all that stuff. I started to manually edit all the 600 stations that were on Wikidata. Of course, some of them were missing, some of them were outdated, and I completed them. What information did I add? The iRail identifier: each station on iRail has its own identifier, so I copy-pasted the URL and added it to Wikidata. I also completed and fixed all the adjacent stations and all the lines connected to each station, and I did some name normalization.

In order to see the percentage of completion of that work, I created a tool that you can see here. It's just a map that shows all the stations in Belgium, with green, yellow and red markers. A green marker means that all the information I mentioned before is complete for that station. A yellow marker means that most of the information is there, but some is missing. And red means that nothing is there: we have to complete the data. This was on the 24th of December, when I had already been working on this for maybe two or three weeks. When I started, it was mainly red and yellow points. And this is the status today. As you can see, most of the Belgian stations are complete, except some of them. You can see some yellow stations, because these stations only have one adjacent station. That's because they are at the end of a line: a station with two adjacent stations is considered complete, but at the end of the line you only have one adjacent station. For example, if you take Mons, you can see all the information is there, and it's a green marker. So that was the first tool that I used to check whether all the data was complete. In order to fill in the map, a Node.js library has been written just to feed the map.
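The green, yellow and red rule for the completion map can be sketched roughly as follows. This is only an illustration of the idea, not the code behind the actual map: the `Station` fields, the `marker_color` function and the exact thresholds are assumptions, and the sample stations are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Station:
    """Hypothetical record holding the fields the talk says were completed."""
    name: str
    irail_id: Optional[str] = None
    adjacent: List[str] = field(default_factory=list)
    lines: List[str] = field(default_factory=list)


def marker_color(station: Station) -> str:
    """Classify a station by how complete its data is (assumed rule)."""
    has_id = station.irail_id is not None
    has_lines = len(station.lines) > 0
    neighbours = len(station.adjacent)
    if not has_id and not has_lines and neighbours == 0:
        return "red"     # nothing filled in yet
    if has_id and has_lines and neighbours >= 2:
        return "green"   # identifier, lines and both neighbours present
    return "yellow"      # partially filled, e.g. only one neighbour


if __name__ == "__main__":
    through = Station("Through", "id-1", ["Left", "Right"], ["96"])
    terminus = Station("Terminus", "id-2", ["Through"], ["96"])
    empty = Station("Empty")
    for s in (through, terminus, empty):
        print(s.name, marker_color(s))
```

Under this rule a terminus stays yellow even when its data is complete, which matches the remaining yellow markers described in the talk.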
The Node.js library is available as a package on NPM, and it can be really useful to query the iRail.be API. It's fully working, and a bonus with that library is that it can also get the data from Wikidata and from iRail and merge them together. This is what I'm using to feed the map that you see here.

Also, I learned a new kind of database, which is Neo4j. It was used to display the relationships between the stations, because seeing the information like this is fine, but you cannot see the relationships between the stations. It means I cannot check whether the adjacent-station data is valid or not; I cannot check the consistency. So I created this kind of map, where you can see all the links between stations, and that was really helpful to check whether a station was correctly connected to another. I like this tool so much that I did the same for the London metro stations. Unfortunately I was not able to do it for the Belgian metro stations because not all the data is there yet, but for London it's easy to find the data.

So, the result. You now have three gateways for the small tool that I built. You have a Twitter gateway: it's a Twitter account that tweets all the delays and alerts greater than 15 minutes. You have a Telegram public channel that you can join, where you get all the delays almost in real time. It's really spamming all the users, actually, because as soon as there's a delay, even if it's one minute, there's a message. And you also have a Telegram bot that you can customize in order to get specific alerts. For example, you live in Leuven, and you want to get all the alerts regarding Leuven. You can customize it just by sending some commands to the bot, and you get the alerts in real time. On Twitter right now you have almost 28,000 tweets since, I think, the 6th of December. I was not able to go back through the history; there were so many messages that I think Twitter removed some of them. So this is the first version of the...
This is a message from the first version of the software on Twitter. This is the default template. You also have the geographical location added to the tweet. And this is the second version, displaying the line. Thanks to all the information that I added to Wikidata, it's now possible to get the line where the delay actually is. So you can easily subscribe to the keyword on Twitter and get the information that way too.

On Telegram you have a public channel and a bot. Here is the web version of Telegram. You can see the public channel and all the messages; there are a lot of messages there every day. And here you can get a glimpse of the Telegram bot, where you have some specific commands like alert reset, and you can add or remove an alert. Here, for example, you have only two alerts, for line 97 and for the keyword Quaregnon. So when there is a delay, I get the message directly on my phone. You can find the URL of the Telegram bot on the Twitter account: there's a link on the left where you can click and join the bot.

So what did I learn by doing this? I learned a new framework, which is Symfony 4. I did this on purpose because I wanted to learn it first. Then I learned to write Node.js libraries. I learned how to use Wikidata. I also learned SPARQL, which was really difficult for me, actually. I also learned how to use Heroku, which is really nice for pushing a PHP application. It's really, really powerful; I really enjoyed it. And I also learned the graph database Neo4j. What do I give back to the community? A Twitter account spamming its subscribers with train delays greater than 15 minutes. A Telegram channel spamming its subscribers with delays greater than 1 minute. And a Telegram bot that subscribers can customize to receive their delays and alerts in real time. So what about the future? I'm taking my car to go to work. Basically.
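The idea of several gateways, each with its own delay threshold, can be sketched as below. This is an assumption-laden illustration and not the Symfony application's code: the `Alert`, `Gateway` and `dispatch` names are hypothetical, the message format is invented, and treating the 15-minute and 1-minute limits as inclusive is a guess.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Alert:
    station: str
    line: str
    delay_minutes: int


@dataclass
class Gateway:
    """A hypothetical output channel with a minimum delay it cares about."""
    name: str
    min_delay_minutes: int
    send: Callable[[str], None]

    def accepts(self, alert: Alert) -> bool:
        # Assumption: the threshold is inclusive.
        return alert.delay_minutes >= self.min_delay_minutes


def dispatch(alert: Alert, gateways: List[Gateway]) -> List[str]:
    """Send the alert to every gateway that accepts it; return their names."""
    message = f"Line {alert.line}: +{alert.delay_minutes} min at {alert.station}"
    delivered = []
    for gateway in gateways:
        if gateway.accepts(alert):
            gateway.send(message)
            delivered.append(gateway.name)
    return delivered


if __name__ == "__main__":
    gateways = [
        Gateway("twitter", 15, print),           # large delays only
        Gateway("telegram-channel", 1, print),   # almost everything
    ]
    print(dispatch(Alert("Example", "97", 3), gateways))
    print(dispatch(Alert("Example", "97", 20), gateways))
```

Adding an SMS or email gateway in this sketch would just mean appending another `Gateway` with its own `send` function.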
I just moved, so I'm not taking the train anymore for now. But the application is really pluggable, and we could imagine having multiple gateways for it, for example an SMS gateway, an email gateway or anything like that. Everything is open source; it's on GitHub. You can check out the application, you can do whatever you want with it. And that's it. So, if you have any questions.

My question is more a suggestion, about a frustration when you take public transportation: when you take the train, the connections. I found a way to circumvent this. It would be for regular travelers who have passes to have an intelligent chip. When they enter a carriage, they scan their passes and the information is sent to the central system. So when the first train comes to the connecting station, the people there in the station know exactly how many people are travelling in that carriage and need to make a connection. Maybe there is a huge number of people who need to connect, so the second train will wait a little bit longer, taking that information into account.

Technically, it's possible, I guess. But there are so many ways to improve train stuff in Belgium. I think it would require another session just to talk about that. But yes, I agree that it's a good idea and it should be done.

The entire iRail project accepts pull requests. In the language of open source, that means that you can make a contribution as well. So you can write it in software, contribute it as open source, and then we have it.

So instead of solving the root cause, did you have some contact with SNCB to integrate your application in one way or another, or to have them refer to it, something like that?

I had no contact because I have been really busy with my move for two months now. I haven't been able to work on the application for, I think, one month and a half, maybe two months. And I lost a bit of...
I put that aside for now. But no, I haven't had any contact with them. I know that they released a new site for giving statistics about... what's the name? Mobicules, I think. And I think that could be integrated into the application as well, because we can scrape the page twice per day, get the data and send it to different gateways really easily. But I had no contact with SNCB.

Maybe, I don't know how your relationship with SNCB has evolved, and whether there is now some more openness from the institution to go for innovation?

I think before talking about that with SNCB we should publish the application, and instead of providing only Telegram and Twitter gateways we should provide another kind of gateway, probably email, for example. We should let people subscribe to some keywords, for example, I don't know, Mons, Boussu, Quaregnon, and then receive the alerts by email, which is more formal than Twitter or Telegram. That would maybe be much more interesting for them. Right now I have no time to work on this, but it's in my head, so maybe one day I'll do it.

So this wasn't just a side job?

Yes, of course. I'm doing that on the train. I have four hours to spare.

That's why I said, if you stop, you cannot do it in the car.

Exactly. It's more there in a timer. Yeah.

So yeah, as the initial creator of iRail (but then of course a lot of other people joined as well; Brecht and Bert, who are sitting here, also contributed to iRail), I'm really happy that someone started using this thing that we created. I believe that's really the core of what we wanted to do: stimulate people to come up with creative solutions for their own frustrations. It started from frustration, so I find that really interesting. My question to you is: if someone in the audience is now triggered to also start using the iRail API and the iRail data that's out there, what would be your hints? Did you make errors that they should not make right now?

Yes.
I made some mistakes with the Node.js library, actually. Technically, I was sending all the requests at the same time to your server. So at a certain point I was blocked: sometimes instead of getting a 200 I was getting a 500, something like that. So I'm using a library, which is called promises, for Node.js, where you can pass a specific parameter, the concurrency, and send for example only two requests at a time, and I really improved the performance thanks to that. I also use the cache in Symfony 4 to save, for example, the station list: instead of requesting it from your server every two minutes, it's cached for 24 hours.

Thank you for that.

For the PHP application, the main application, a lot has been done on caching because it needs to be fast. It really needs to be fast because it has to process all the data within two minutes, and all the data needs to be sorted by time. So I cannot broadcast a message as soon as I receive it. I first have to get all the messages, sort them and then broadcast them. So this was a challenge.

So that's 600 requests every two minutes?

Yes, exactly. To the liveboards, yes. But I think you are already using some cache tags on your server, and this is taken into account in the query: if the data hasn't been modified, it doesn't do the request. So yes, this is taken into account, of course.

Thanks for that. Yes?

The data you created already exists, but it's closed. The line data linked to the station.

No, it's not closed. It's on Wikidata. It's just that when you have a delay, it's always from station A to station B. So I'm keeping all the data from Wikidata somewhere in the cache, and when I receive a delay from iRail I check which lines are connected to station A and which lines are connected to station B, and then I do a diff, actually, to get the line which is impacted. This is how I do it.
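One way to read the line lookup described in this answer is as a set intersection: take the lines connected to station A, take the lines connected to station B, and keep what they share. The sketch below works under that interpretation only. The station names and line assignments are placeholders, and `common_lines` and `impacted_line` are hypothetical helpers, not functions from the application.

```python
from typing import Dict, Optional, Set

LinesByStation = Dict[str, Set[str]]


def common_lines(lines_by_station: LinesByStation, a: str, b: str) -> Set[str]:
    """Lines that serve both stations (empty if either station is unknown)."""
    return lines_by_station.get(a, set()) & lines_by_station.get(b, set())


def impacted_line(lines_by_station: LinesByStation, a: str, b: str) -> Optional[str]:
    """Return the single shared line, or None when it cannot be decided."""
    shared = common_lines(lines_by_station, a, b)
    if len(shared) == 1:
        return next(iter(shared))
    return None  # no shared line, or several candidates


if __name__ == "__main__":
    # Placeholder data; in the talk this comes from Wikidata via a cache.
    cached = {
        "A": {"96", "97"},
        "B": {"97"},
        "C": {"96", "97"},
    }
    print(impacted_line(cached, "A", "B"))  # 97
    print(impacted_line(cached, "A", "C"))  # None
```

Returning `None` when two stations share more than one line keeps the sketch from guessing in the ambiguous case.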
For some stations it's not possible to get these lines because several lines pass through them. But for most of them, yes, I have the line data.

Any more questions? Yeah, we're perfectly on time. Thank you.

Thank you everyone. Thank you.
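The throttling advice from the questions (send only a couple of requests at a time instead of all of them at once) can be sketched with a semaphore. The talk describes doing this with a Node.js promise library; the version below swaps in Python's `asyncio` purely for illustration, `map_limited` is a hypothetical helper, and the fake worker stands in for a real liveboard request.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def map_limited(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    concurrency: int = 2,
) -> List[R]:
    """Run worker over items with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def run(item: T) -> R:
        async with semaphore:  # wait here until a slot is free
            return await worker(item)

    # gather keeps the results in the same order as the input items.
    return list(await asyncio.gather(*(run(item) for item in items)))


async def demo() -> Tuple[List[str], int]:
    """Fetch five fake liveboards and record the peak number in flight."""
    in_flight = 0
    peak = 0

    async def fake_fetch(station: str) -> str:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for the network round trip
        in_flight -= 1
        return f"liveboard:{station}"

    results = await map_limited(["A", "B", "C", "D", "E"], fake_fetch, concurrency=2)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With around 600 liveboards to poll every two minutes, a small concurrency limit like this spreads the load on the server instead of sending everything in one burst.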