So, two years ago we had a discussion at the Dev Summit about moving towards a service-oriented architecture. The main arguments that were brought up were around decoupling teams, establishing interfaces between teams and services; improving security by isolating functionality and reducing the privileges each service has to the bare minimum; improving testing by having these interfaces to test against, rather than one big monolithic application; fault isolation, so that if one service goes down, ideally only a particular feature is affected rather than the entire site; an increased ability to leverage other platforms and projects, so we can use existing libraries and tools even if they are not in PHP; and finally an incremental path towards narrow interfaces from the fairly monolithic application we have in MediaWiki. Two years later, we have a lot more services. The green ones are basically new since then; Parsoid, which people see as one of the first services, was already deployed by then. So we got Mathoid, which is basically a super thin wrapper on MathJax, and MathJax is maintained by a separate foundation. We have Graphoid wrapping Vega, again third-party client-side code. Citoid is wrapping Zotero, which is actually usually a browser plug-in; we are not super happy with it, but at least it got us started. Then there are a couple of others that we built ourselves, like RESTBase. ORES uses machine learning things in Python. The Wikidata Query Service uses a graph database. ContentTranslation is a Node.js service. The offline content generator is a Node.js service generating PDFs; when you click on "download this article as PDF", that is basically what is producing it. And finally, Reading has created a service to massage content for the apps and, increasingly, also the web experience.
And EventBus is the latest arrival, which is a small wrapper around Kafka, basically an event bus and event queue. And there are a couple more in the pipeline, like Thumbor: using Thumbor, an existing Python thumbnailing service, for thumbnailing. And the security team, I know, is considering moving password storage out into a separate service, to minimize the code that has access to this information. And there's discussion about maybe having an API-driven front-end service. And there are a couple more services that we don't typically consider services, mainly maintained by third parties; some of them are new, like Elasticsearch, Cassandra, Logstash, Kafka, and then of course the classics: MariaDB/MySQL, Apache, Varnish. So a lot of services actually, and a lot of them new.

So what has worked well? This is a couple of things that I came up with. I think the clear interfaces that we defined, the APIs, have helped decouple development. Early on, for example, we had one team developing the parsing side in Parsoid and one team working on VisualEditor, and there was an interface in between: a DOM spec that allowed both teams to coordinate at this boundary, test independently at this boundary, and so on. So I think that has overall worked well. And we even had third-party users use these APIs for things that we didn't foresee; content translation, for example, came out of an experiment that third-party users just set up. Testing is working generally fairly well. I think most of these bigger services have pretty good test coverage, like RESTBase and Parsoid and so on. There's mocking that we're doing, basically mocking out existing API calls, and in some cases we just use the existing infrastructure, just hit production basically. We used quite a few third-party projects, as already mentioned. We got to share some things between client and server, including both code and skills.
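To make the mocking approach mentioned above concrete, here is a minimal sketch in Python. It is hypothetical, not the actual test code of RESTBase or Parsoid: the function `fetch_sitename`, the endpoint URL, and the injectable `opener` parameter are all illustrative. The idea is that production code makes a real HTTP call, while a test supplies a canned response instead of hitting the live API.

```python
import io
import json
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # illustrative endpoint

def fetch_sitename(opener=urllib.request.urlopen):
    # In production this makes a real HTTP call; tests inject a fake opener.
    with opener(API_URL + "?action=query&meta=siteinfo&format=json") as resp:
        return json.load(resp)["query"]["general"]["sitename"]

def fake_opener(url):
    # Canned response standing in for the live API, as in the mocking
    # approach described above. No network access happens.
    body = {"query": {"general": {"sitename": "Wikipedia"}}}
    return io.BytesIO(json.dumps(body).encode())

print(fetch_sitename(opener=fake_opener))  # mocked call, no network
```

The same pattern generalizes to the other option mentioned in the talk: passing the real `urllib.request.urlopen` (or an opener pointed at production) instead of the fake.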
For fault isolation, we had cases where services like OCG went down and it only affected the PDF render feature. It didn't bring down the entire site, even though it was a catastrophic failure for that service. That is a good thing: it's not the case that with every deploy potentially everything breaks because one minor feature had a problem. Then we have Swagger, now OpenAPI, specs. They have been renamed: the Linux Foundation actually adopted the Swagger spec, and it's now under their auspices, I guess, and renamed. We use that for API documentation and generated documentation, and there's also derived monitoring, testing and client generation. So you can basically run a tool that, based on the spec, generates a Python client for an API that makes it easy to interact with. And we have managed to share some infrastructure across services by standardizing a couple of things, like service-runner and service-template-node, a project to make it easy to get started building a Node.js-based service. And there's a puppet module that is used across these services by exploiting this uniform interface.

Issues: it's not all awesome. Basically bug number zero: documentation could be better. And infrastructure could also always be better and more streamlined; that's a perennial thing, and so we are working on that. It's not an unfixable problem, but it takes time. Installation and maintenance for third-party users is something we have mostly ignored for quite a while; we didn't really have a plan for how to tackle it. Recently we've created a Docker-based prototype that basically makes it easy to install and keep a MediaWiki install up to date. That is very early days; we don't have clear ownership for it, and we don't have any resources for it officially, so it's a side project at this point. But I do think that we need to figure out who owns this and what our strategy for it is.
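As an illustration of the Docker-based direction mentioned above, a bundled install might be described with a compose file along these lines. This is a hypothetical sketch, not the actual prototype: the `example/*` image names are made up, and only the `mediawiki` and `mariadb` images correspond to real published images.

```yaml
# Hypothetical sketch of bundling MediaWiki with its services.
version: "2"
services:
  mediawiki:
    image: mediawiki        # the PHP MediaWiki application
    ports:
      - "8080:80"
    depends_on:
      - database
      - restbase
  database:
    image: mariadb
    environment:
      MYSQL_ROOT_PASSWORD: change_me
  restbase:
    image: example/restbase # illustrative; REST content API
  parsoid:
    image: example/parsoid  # illustrative; wikitext <-> HTML service
```

The point of such a bundle is the one made in the talk: a third-party user gets all the services with one `docker-compose up`, instead of installing and upgrading each one by hand.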
We had a session about it yesterday, and a straw man was presented there that we could consider: basically solving this problem and then setting a date at which we say we don't support PHP-only installs anymore; we now require services. They have become part of MediaWiki, and that requires us to actually solve this and demonstrate that it's solved. So I think we need to get serious about it, make a plan, get the resources in place and do it, and tell everybody that we're committed to it, rather than it just being an experiment.

Then, ownership and responsibility for some services is not clear. That is not necessarily special to services; it's an issue in other code as well. But it needs to be solved nevertheless. One example is OCG. It's a service that was started under a lot of time pressure, because the shutdown of a data center depended on it being ready, and now there's no clear team owning it. We basically need to find a process to find a new owner to adopt the service, to take on the responsibility of fixing issues, and also to get the freedom to make the decisions that are necessary. And those are often product decisions, like: do we need PDF rendering at all, or how good does PDF rendering need to be? Those are prioritization and resource allocation questions, not just technical questions. So we need to make sure that the team that owns such a service, and has the responsibility to run it, can also make these decisions, I think. Ultimately it's about aligning incentives: if you build it, you run it, kind of. So if you mess up and build something unreliable, you will get the pager, and that will actually teach you very quickly. There's a nice talk by Randy Shoup about this; the slides are linked from the talk description, so if you want to have a look, there's a link.
It goes into depth about aligning incentives and nudging people in the right direction, making it easy to do the right thing, rather than having one big central planning effort. And now I'm done. So, discuss.

This is Matt Flaschen. Regarding being able to develop things in new programming languages and environments: I think that's definitely a benefit in some cases, because it allows you to use a tool that's better for the job. But on the other hand, having a wide variety of programming languages and environments in use imposes a greater maintenance cost in the long run. So that has to be considered by the overall organization and ecosystem, balancing the need for maintainability against the need to use the best tool for the job, or what is considered the best tool for the job. So I think we need to bear that in mind.

That was something that was pushed especially by ops initially: that we should limit the number of platforms that we support, because there's a cost to supporting each of them. Right now we have basically PHP, Python and Node.js primarily, and some Java now, but there was a lot of resistance. There's still a lot of resistance, so.

This is Rob Lanphier. You just mentioned the resistance to other programming languages, and you've been a champion for us basically using the right tool for the job, whatever programming language is appropriate for that particular job. Do you see there being sensible limits to how many programming languages we would introduce?

Yeah, absolutely. I agree with having a limited number, because there are diminishing returns. I mean, what do you gain from having 25 programming languages, apart from the novelty factor? There are ultimately not that many platforms that actually have decent libraries and so on, where there's a strong case to be made that this is the best tool for something.
We used OCaml for a long time, but I think most people are happy that we don't depend on it much anymore, because we didn't have a lot of people maintaining it, and we have a lot more JavaScript developers.

Hey, thanks for this, I think it's quite excellently put. By the way, I'm with the performance team. What we should remember is that the point is not to have 16 or 17 programming languages, or to identify the precise numerical amount, but, I think, to have a shift in our thinking where we put more value on the power that individuals and teams can have when we give them the freedom to use their talents, their skill and their experience in the way that they know, as opposed to exercising an exaggerated level of mutual oversight that often leads to a stifling of innovation and a diminution of overall productivity.

So, I will also offer a critique of services: I don't think we've quite lived up to the security promises that we've made. I feel like we have not done the segmentation that we should do. I think that's a critique of, well, just not having enough time to do it all. So I think we should keep pushing on that, and I would certainly encourage it. On the languages side of things: definitely, as we bring in new languages, I like how with Node.js we have a template that we're starting from. I think that's a sane way to start, especially with the security review process. It's easy to say: okay, they're using the services template, so I have a basic expectation about how this is going to work and where things are going to happen. When we start fresh with a new Python framework, we have to start over from scratch and do a lot more review. So I would strongly encourage: let's keep doing templates for anything that we decide we're going to roll out in production.
I want to push back a little bit on the team autonomy thing. Teams definitely can work better in the short term if they have more autonomy: I'll use the library I know best, because I have experience with it and I think it's a good tool for the job, so I'm going to add another library to Composer and just go ahead and use it. But in the long term, that team is not going to be maintaining that feature forever. People are going to move on, get assigned to a different feature, maybe leave the Foundation, stop working on MediaWiki. So the autonomy has to be balanced against the desire to have one ecosystem that we're going to be maintaining long term, with a critical mass of stuff that we either have the knowledge to maintain or are within touching distance of learning to maintain, rather than everyone picking their own library and having full autonomy over absolutely everything.

Yeah, I agree with that, also from a vertical perspective. For teams like security and the performance team, being able to have operational insight into what different teams are doing, and being able to assist them when they need help, is quite important, and that works best when there are not too many languages at play.

I'm Murray Pesser. I'm working with the Department of Agriculture, using the MediaWiki software. I didn't see it up here, but I know it's in everybody's thinking that Composer is an important tool here. And I'd like to say that maybe we should have something that lets us talk to shared hosting providers as well, because not all users of the MediaWiki software are going to be able to get to the command line and use Composer effectively. So if there's some way to insinuate that into the conversation a little bit, I'd like to do that.
Yeah, we had that discussion yesterday to some degree. But I guess the question is whether you start from a solution, or whether you start from "I'm a user with these technical skills, I want to spend this much money, and I really want to get as much nice functionality and performance as possible." If you treat it more as a high-level optimization problem, then you can choose between several solutions, and maybe there are others, like virtual machines, that are now coming within reach, although there are still problems to be solved there. I personally think that building upon shared hosting, an environment that we don't control, is very difficult, because we have to deal with a lot of variability and we can't do most of the things that we would like to do. In a more constrained environment, like a virtual machine with containers, we can, with very limited effort and by standardizing on a couple of things, provide a lot more functionality for the users. So I personally think that's a more promising approach, but yeah, I take your point.

I think the question I'm going to get as an application developer is going to come from the people that run a data center, and they're going to say: what do you need in order to build your application here? And if I could hold up a piece of paper that had the whole list on it, that would be very convenient. That's just the way the government is going to do it.

Yeah, a recipe.

Yeah, I'm Chad. I just want to add to what you said about "that's the way the government is going to do it": it's also how a lot of private industry is going to want to do it. Even if we go the route of properly packaging everything into a nice container, so it's effectively "throw it on a VM and everything's there for you, already contained".
That's not how the security review process works in government and in a lot of private industry, especially the banking industry. So if we have users in those areas, even though we've built the container for them, they're still going to want to review the different environments that we're exposing them to. If, for example, we have RESTBase, then Node.js is going to have to be acceptable in their environment, not just the core PHP LAMP stack that we have right now. So that's just something to consider as well: even if we make it easy to install, there's still going to be cognitive overhead for third-party users who have concerns about installing these extra things in environments where they might have real security concerns. It's just something to keep in mind.

It's kind of two ends of the spectrum: the super locked-down users who want to move really slowly, and the people who install the wiki once and never upgrade, even if there are major security issues, because it's just too hard. And I guess we can maybe focus more on the small installs, because I think those have the hardest time keeping up right now.

I just want to add to what he said; I forget what his name was, the comment a few comments ago, I think it was Roy. He mentioned small installs on shared hosting. I just wanted to add that, like he said, there's a balance between how much access a user has and how capable they are as an engineer or site administrator. There tends to be some connection there, but not necessarily. For example, I manage things in production at Wikimedia, but I also have some shared hosting wikis, because I've had them for a long time and it's just convenient to keep them around that way.
It's also convenient not to have the overhead of restarting the server or things like that. But I wanted to add that it doesn't necessarily require command-line access to install MediaWiki, right? If you want to make contributions back upstream, you might need Composer and Git and all those things. But if you just want to install MediaWiki: aside from the many shared hosting providers that have a one-click install solution, which is even easier, we provide tarballs that already include all the vendored components, so you wouldn't actually need to run composer install manually. We provide a package that already contains all the PHP files, ready for you to put on a new server; without a container, that is, it's just a tarball of some PHP files, basically.

I'm Max. I would like to point out that, in general, services are much harder to debug and tune than scripts that you just drop into an Apache web root, simply because it requires more qualifications. You can't simply hack something quickly in a service, because it's harder to get the data out of it. And I mean even simple sysadmin tasks, not development: you started Parsoid, something goes wrong. I would say that debugging problems in Parsoid is harder than doing the same for MediaWiki.

Well, I guess if you juxtapose a simple script with a rather complex service like Parsoid, then I agree with you. But I think the alternative is not that; it's one huge blob of complex functionality that interacts in random ways, versus several smaller blobs that do complex things but are relatively smaller, and whose interactions are easier to understand because you can look at the traffic at their interfaces. That's basically what this is aiming at: the scaling issue of how I can still understand what a service does, by limiting its scope and defining clear interfaces.
So, a context question: what was the purpose of this meeting? Is it more educational, or solution-finding? The reason why I ask is that it feels like there's just a laundry list of complaints right now, and I'm wondering if you wanted more of a "let's talk about what we should do next" type of conversation. Because in my view, from Release Engineering, I think services are going well. The only things I really want to see improved, from my perspective, are the things that you want to improve, which is ownership: you get the pages, you get responsibility, but also who's responsible for what. So doing that exercise of who's responsible for which services, with an actual person's name, or a team, whatever, but that kind of conversation. I think that might help push forward some of the other aspects people are complaining about, languages or whatever; it might make those conversations actually practical at that point. More accountability.

Yeah, I think I was hoping to see whether there are other things that people see, or whether these are mainly the ones that most people see as pressing, and then maybe brainstorm.

So I have two things to say. First of all, saying that you either have a big monolithic blob or a small herd of services is something of a false dichotomy, because you can have well library-ized code that runs in PHP. The second thing I wanted to say is that I like services for performance and scalability, but I'm not all that fond of having bespoke services that are required for core functionality. And again, that would be a nice case for a PHP library that we could use in core, or from a service running in PHP for higher scalability and performance. But I don't think that has been considered at all by anybody.

That's a recurring discussion, I guess: could we write Parsoid in PHP, basically. We had that a couple of times yesterday, and it's not easy.
There's no, as Subbu mentioned... Subbu is here. Subbu, do you want to say something? Closing words?

If this is specifically about Parsoid: I said this yesterday. Right now, even today, there is no HTML5 parser in PHP, so we cannot do it. We cannot do Parsoid in PHP. That's all I have to say about that right now.

I just have one question: is there an HTML5 parser in C or C++?

There are some, but interfacing them with PHP...

Okay, why not make a... sorry, I apologize. Why not make a PHP module or an Apache module that interfaces with those?

We actually investigated that. Anyway. I did review a whole lot of HTML5 libraries, including looking at the C and C++ modules, and really, there aren't many. The one that is in Mozilla is actually generated code; it's generated C++ code which depends on a whole lot of random stuff throughout the Mozilla Firefox code base, so you can't easily split it out and put it in a library. So really, there aren't C or C++ options for HTML5 parsers. There is actually a PHP option. It's not perfect; it would probably need a couple of months of work. But yeah, it's not far off.
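To illustrate why an HTML5 parser is a substantial piece of work: the tokenizing stage is comparatively easy, and is sketched below using Python's standard library as a stand-in (illustrative only; Parsoid itself runs on Node.js). The hard part, which was the missing piece in PHP, is the spec's tree-construction stage with its mandated error-recovery rules, for example the "adoption agency algorithm" that fixes up misnested markup like `<b><i></b></i>`.

```python
from html.parser import HTMLParser

# A minimal tokenizer on top of Python's stdlib HTMLParser, showing the
# first stage of HTML parsing. Note: the spec's tree-construction stage,
# which must recover from misnested tags deterministically, is NOT done
# here; that is exactly the hard part an HTML5 parser has to implement.
class TokenCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.tokens.append(("text", data))

c = TokenCollector()
# Misnested input: </b> closes before the still-open <i>.
c.feed("<p><b>bold <i>both</b> italic</i></p>")
print(c.tokens)
```

The tokenizer happily emits the misnested `end b` before `end i`; turning that token stream into a well-formed DOM tree, the way every browser must per the HTML5 spec, is where the real complexity lies.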