Okay. Hello, everyone. Today I will tell you a little about Python from a C++ programmer's perspective and why it's cool, even for me. Before I go right into it, I want to show you what I will talk about. First, I will give you a short motivation why Python is cool, even for C++ users. Then I will tell you how to embed the CPython interpreter. I will tell you how to interact with Python code from C++. And I will give you hints on what you can do to improve the user experience, so to say, to include batteries. And then I will finally come to the end of my talk with a short summary. Okay. So, Blue Yonder is a data-driven company. We do predictions for clients and do automated decision making. For this, we have developed an in-house prediction platform. At its core, it's a distributed task scheduling system, which also handles a few other things, for example access to data sources such as databases and key-value stores. We've designed it for reliable 24/7 operations, and it's written in C++. And while our platform simplifies and reduces many chores for model developers, it's actually not a real help for model development itself. Because C++ is a statically typed language, iteration cycles tend to be pretty slow, and library support is not as good as elsewhere. So we looked beyond C++ and basically immediately found Python. Python has a thriving community with great libraries and toolkits, such as scikit-learn, which basically gives us machine learning algorithms for free and also provides a framework for plugging our own algorithms in. Python can also be quite efficient through the use of optimized libraries such as NumPy and pandas. And the core language of Python provides us with a quick feedback loop: you can just open your Python interpreter or an IPython notebook, play around, and see what happens.
So, the solution we opted for was to embed Python in C++, to somehow combine the worlds we already have: the safe, reliable operations of our C++ application together with the fast model development Python allows us. And we also get the added benefit that we are quick to production, because there's no need to redevelop a Python model in another language. Okay. So, how do we go about integrating the Python interpreter? This is probably the hard bit of the talk, where I start throwing C++ code at you. There are a few essential calls into the CPython reference implementation we need to make. What you would do in Python is create a context manager class, and this is what I also did in C++. We define a C++ constructor, which corresponds to the __enter__ method you would have in a Python context manager, and in it we basically just call the Py_InitializeEx function with signal handling disabled. This is useful to gain control over the interpreter. And of course, where you have an __enter__ in a context manager, you also need a destructor, or __exit__ method. In this destructor we just call CPython's Py_Finalize function. Okay. So, how would you use this? It's pretty simple. You take an existing C++ program, for example here the main function, and you just create an instance of this Python interpreter class. This would be similar to opening a with block in Python. And then you can do anything you want with the Python interpreter. So this is pretty neat, but it doesn't come without caveats, because interpreter re-initialization can be a problem; that's well documented in the CPython API. So we tend to keep just one interpreter alive for the whole lifetime of our process. Okay. Things become a little more involved when we go to a threaded environment, and this is basically due to the global interpreter lock (GIL) that the CPython implementation comes with. Due to this, we have to make some adjustments.
In the main thread, we have to make sure that the interpreter is actually aware that other threads might be using it, and we have to release the global interpreter lock for other threads to acquire. And obviously, when we spawn other C++ threads, we have to make sure that any time we execute Python code, we acquire and release the global interpreter lock accordingly. Okay. So what do we actually have to change? This is the Python interpreter class I showed you before, in a slightly condensed form. Basically, we have to add some operations to the constructor. First, we have to call PyEval_InitThreads, which makes the Python interpreter aware that other threads might be using it. Then we have to save the current thread state for later use and release the GIL; this is done with the PyEval_SaveThread function. And when we save the thread state, we also have to restore it at some point. This is done in the destructor, where we call PyEval_RestoreThread to make sure everything is properly matched. Okay. So this is basically the only change we have to make for the main thread. For the other threads, we need another context manager; let's call it GlobalInterpreterLock. In its constructor, we call PyGILState_Ensure. This will make sure that there's a thread state for the C++ thread, that the global interpreter lock is acquired, and it does everything else that's necessary for us. And in the destructor of this class, we call PyGILState_Release, just to match the bracket, basically. Okay. To use it, you would just go into your worker thread, which might be some function executed by a thread, and create an instance of the GlobalInterpreterLock class. This will hold the GIL as long as it lives, and you can do anything with the Python interpreter in that state. Okay. So far, I've explained how the actual embedding is done, but I haven't shown you yet how to interact with the Python interpreter.
For interaction, we have found the Boost.Python C++ library most beneficial. You can download it for free from boost.org, which is essentially the home of a collection of mature and commercially usable open-source software, and basically the go-to place when you look for a C++ feature that is not in the standard library. Okay. Boost.Python has lots of interesting features which we use. For example, it wraps lots of functions of the CPython API, and it does so, for example, to relieve C++ users of the manual lifetime management of Python objects; this is automatically done for you. What is also done for you is type conversion between C++ and Python, which is pretty convenient. It also comes with some rudimentary exception handling support. And there are a few convenient features to expose C++ code to Python, so you can use C++ code in Python just by using import and the standard Python syntax. Okay. So how could I evaluate some Python code from C++? The first line basically introduces bp as an abbreviation for boost::python, much like you can do with the import statement. Then I define some Python code, which is two times 21. To evaluate this, we use the eval function provided by Boost.Python, and what we get back is a Boost.Python object as a result. To do anything in C++ with this object, we have to convert it to some C++ data structure, for example an integer. This is done with the Boost.Python extract function. Okay, so cpp_result would just be the integer 42, obviously. Okay. We also have to add a little bit of error handling, and for this it's typically common sense to create a try-catch block, or try-except, as you would say in Python. Boost.Python will throw an error_already_set exception any time it encounters an error. And here's one of the weaknesses of Boost.Python: the exception is pretty useless. It doesn't contain a stack trace, or the error message, or the type of the error.
It basically just indicates that an error has been encountered, and you have to do something about it yourself, for example add some improved error handling. Okay. This is why I suggest that whenever you use Boost.Python features, you wrap them again with some decent error handling of your own. Okay. So that was the basics of interacting with the Python interpreter. Now it's time to include the batteries, to make Python users happy. And I think this is probably the most important slide of my talk, besides the summary: C++ developers should always think of embracing Python. So whenever you have a cool C++ data structure that you spent months developing, try to think how you can expose it to Python. There are a few defaults you should always consider. One of them is: make it a list. The other obvious default is: make it a dictionary, because, I mean, Python users love dictionaries. And whenever you have more than just simple data, you should check for existing standards in the Python community. For example, when you expose something that is range-like, use the standard iterator interface. When you have a database connection and you want to pass it somehow to Python, for performance reasons or whatever, take a look at the Python DB API; there's a PEP for it, I think. And try to make it look like a real Python object. This leads to the general guideline that Python code which is provided by your users and executed in your embedded environment should never know, or need to know, that it is embedded. And I will show you an example of this, which I call logging out of the box. The idea is: most of you will know about Python's logging facility, the standard logging module. There's actually some Python code here: just type import logging and then something like logging.warning and some error message, and it will be logged. So what if we didn't log this message just to some logger users have to configure in Python?
What if we would automatically forward all log messages to C++ and log them to the same storage we have configured in C++? This would basically remove the chore of configuring the logging module both for C++ and for Python. And when you know a bit about the logging module, the obvious idea is to register a custom logging handler which does exactly this. Okay. So how would this look? We have some CppLogger class on the C++ side, which provides a log function. And then we have the logging.Handler class on the Python side, which provides basically the interface for all logging. Now we somehow have to bridge the gap between both worlds, because it's not trivial to implement a C++ class which implements this Python concept. So we opt for the solution which is quite often the best one in computer science: we just add another layer of indirection. On the Python side, we create another class; let's call it ForwardToCpp. This ForwardToCpp class implements the logging.Handler interface, so basically it just has to provide an emit function, which is about the only thing you really must provide for a logging handler. And the task of this ForwardToCpp class is to receive records and pass them on to some C++ reference which we have stored before. I will briefly show you how this looks in code, starting with the C++ part. So, we have a simple CppLogger class. It exposes nothing but a very trivial log method, which takes a message as a string and just prints it out to the console. This is quite a stupid thing to do for logging, but you can replace it with anything more sophisticated. Now we have to expose this class to Python, so that Python knows what a CppLogger is. And again, we rely on Boost.Python to do exactly this. We use the Boost.Python class_ template, and we basically tell it to take the C++ class CppLogger and expose it to Python as a class of the same name, CppLogger.
The bp::no_init here basically indicates that whenever the initializer of CppLogger is called, it should throw an exception, so that Python users can't initialize this class themselves. And finally, the last line indicates that we want to expose the log function of our CppLogger class as the __call__ method of the Python class CppLogger. Okay, so once this code has run, Python knows exactly what to make of a CppLogger instance created in C++. On the Python end, we have to define this intermediate class, ForwardToCpp. Obviously, it needs to be derived from logging.Handler to satisfy all the necessary concepts. In the initializer, we take one argument as a receiver and just store it in an instance variable. Then we have to define the emit function, to fill in the gap that the standard logging.Handler leaves open. The emit function takes a record, which will be passed to you by the logging framework; we just extract the message from this record and pass it to our receiver. Okay. So the only thing that's left to do is to pass an instance of our exposed C++ class to the Python class, and we would have to do this in Python. First of all, we create a C++ instance of our CppLogger class. Then we need to traverse through the dictionary of the main module to finally obtain the ForwardToCpp class, and the end result of this will be a Boost.Python object, called create_handler on the C++ side. And we can call this create_handler function. By calling create_handler and passing it the actual C++ logger, quite a lot of things happen. Boost.Python checks whether the CppLogger class is already registered for automatic type conversion. Because we have done what I showed you two slides earlier, this is the case, so it translates this instance to a Python object with the signature I specified before. And this object will be passed to the constructor, or initializer, of ForwardToCpp.
And so it will know that there is a C++ object living somewhere and that it can call the bracket operator to call it. Okay. And once I have the handler, I can just use the usual methods to register the handler with the logging framework, and users can just use the standard logging commands. Everything they log will be forwarded to C++, and this is fine, because you don't have to configure things twice. Okay. Let me come to the end of my talk with a short summary. What I want you to take home from the last couple of minutes is that embedding Python is not really that hard. We have the basic API calls, and the API calls are well documented, so basically there's no secret and no reason not to try it. When you try, though, we have found that Boost.Python helps, actually a lot, even though from time to time it may be a little clumsy to use, especially when it comes to error handling, where you have to add something yourself. And I think the key message is: whenever you embed a Python interpreter, make sure that the users of the embedded interpreter are still allowed to write Python code. Don't force your C++ data structures on them; instead, adhere to the well-established Python standards and conventions. If you do this, your users will basically love you. And, most importantly of all, if you do this you still maintain a very important property, namely unit testability. If you can still run the very same code in your Python interpreter or your IPython notebook, you can just use the Python unit testing module to test your code. And this is something you basically have to do, because otherwise all the code that is written for your embedded interpreter cannot be maintained in the future. Okay. So if you have any questions, please ask them now, or just visit us down at our booth. And obviously, Blue Yonder is hiring. Thank you. Hi. So you said a couple of things like, okay, you should be using Python protocols and you should embrace Python.
And so I'm a little surprised that you chose to use Boost.Python for embedding Python, because that actually has a very C++-centric view, so it sounds like somewhat of a contradiction to me. You showed this logging example, for example, which goes through three different pieces of code and is kind of complex. So could you explain a bit why you chose Boost.Python instead of something that approaches the embedding in a more Python-centric way? Okay. So we decided to use Boost.Python mainly because the Boost libraries are heavily used in our code; there are lots of other Boost modules in there. And it's been around for quite a while, so we were pretty sure that it works. And to be honest, it hasn't proven too much of a problem so far. I mean, from the perspective of our Python users, they don't really know that they're running inside of C++; it just feels natural to them. For me as a C++ developer, Boost.Python feels natural as a library to program with. So I don't see this paradigm problem. A C++ programmer has a C++ view, and he can use that, but it still feels Python-like for the Python user. I don't see the need for a C++ programmer to use more Python than is required. So first, I just want to say I'm glad to see more Boost.Python; it's a beautiful library and I think deeply underappreciated in the Python community. I had two questions. One, I noticed that in your Python interpreter class, your destructor was calling Py_Finalize. But last time I checked, that's actually not supposed to be done if you're using Boost.Python, because there are some static initialization issues that might occur. Are you not seeing any problems with Py_Finalize, or is it just not an issue for you? It's not really an issue for us, because we only do it once. I've seen issues when you do it multiple times. And you're right, there are some issues with static things; it depends on what you really do.
But so far, we basically have a main thread, we start the Python interpreter there, and it's all tidied up correctly. I mean, we don't see segmentation faults when we quit our programs. Okay, that's all fine. Okay. And I guess my second one is not really a question, more of a pointer. There's a library called Acquart. It's a C++ library, and it's designed to simplify embedding Python into C++. It exposes a lot of things, like bytes objects and bytearray objects, and the logging library, for instance, is exposed into C++. In fact, you can create logging handler subclasses directly in C++, without any infrastructure in Python, using this library. Have you ever even heard of it? Have you used it at all? Actually, I haven't heard of it yet. It sounds very interesting, and I will definitely talk to you afterwards. I'll track you down and we'll talk that way. To leech all the information I can get. Yeah. Okay. Thanks. Okay. Thanks a lot. Could you explain a bit why you've chosen to embed Python in C++ rather than wrapping your C++ code and making it available as a Python module? Yeah, I can try to elaborate a bit. I mentioned that we basically have this task scheduling system; let's call it a legacy project, even though it is still in development. It was already written in C++, and it had all of these C++ tasks. And we thought it would probably be easier to embed the Python interpreter in some separate plug-in to our framework. And actually it was easy: the basic prototype was done in like two or three hours, and of course then you elaborate on it. It also gives us additional benefits. I mean, we don't have to rewrite all our C++ code. We can use our existing infrastructure, which was already adapted to the format of our executable, to the command-line parameters, to the configuration we can pass to it. And it also allows us to mix C++ and Python tasks.
So I can say: here is some cool, optimized C++ code, it will return something, and the result will automatically be converted for Python. Python can then do some cool transformations and just return the result to the next C++ job. And you can chain it without even knowing that there's Python in there. So it seemed easier. I have another remark regarding this question. Some people think that embedding is something very different from extending, so that you can either provide a module or you can embed Python. But the only real difference is who starts the Python runtime; from that point on, it's exactly the same thing. There's no difference anymore, because you have interaction between C++ and Python and between Python and C++ in both directions. So it's really just about who calls Py_Initialize. Who starts up the Python interpreter: is it the person on the command line who runs it, or is it the C++ code starting it? From that point on, there's really no difference anymore; it's exactly the same kind of code and behavior. Thanks for your remark. Can I just speak to that a little bit? 90% of what you said is true, but there are some really substantial differences between the two. One is GIL management, which you brought up: that's something that python.exe, or Python on Linux, does for you, but that you have to actually take direct control of in an embedding situation. You only have to take care of the GIL management as soon as you have threads started by C++. When you use threading in Python, so the thread is started in the Python interpreter, this is all automatically taken care of for you. This is basically just something we had to do because we handle requests in separate threads spawned from our main thread. I agree with you; we're saying the same thing. I think the other substantial difference, and this is a development issue, is that PDB can't be attached to embedded Python very easily. You have to insert things.
You can't start a C++ program under the control of PDB, and this is actually a strong argument that you can use, in a lot of cases, for not using embedding, and instead taking your C++ program and making it into an importable module, even if it's 500 megabytes. This is something you can do so that you can actually get multi-stack debugging more easily. That's another substantial difference I've seen in cases where embedding has been used. So, just a quick remark on that. I buy the argument that debugging is a bit different when you have an embedded interpreter. But the threading point isn't really a difference, because you can just run C++ threads from an embedded Python interpreter: you can just start up a C++ thread somewhere, and the other way around as well, so you can have a C++ module in Python that starts up a C++ thread. So no difference there. I think we're all saying the same thing; the only point is that you have to manage the GIL if you do that. Okay, thanks for all those questions. Okay, thank you also for your talk. That was really interesting. [Applause]