... in which Konstantinos Meichanetzidis will talk about QNLP in more depth. It will cover the experiments we've realized and what we're working on at the moment. This will be followed by an interactive session by Richie, who is going to give you a demo of lambeq, our QNLP toolkit, and hopefully make it a bit interactive. He's already sent you the notebooks on Slack, so you can already download them. Do focus on what Konstantinos is saying, but if you have a bit of time you can download and install it. All right, the floor is yours.

Thank you very much. Can everyone hear me? Yeah? Okay, hello. We will close with experiments on QNLP. I will show you how we realize in practice all of the stuff Bob was talking about, specifically for realizing some quantum natural language processing on actual quantum computers as well as in numerical simulations, and I will close with the vision for the next year and the experiments with which we will dazzle all of you in a year or so.

At Quantinuum, in the Oxford team, we are, as Bob has probably said, a big group focusing on experiments for QNLP. I will guide you through the theory of how we build models for QNLP, and then at the end I will show you, very briefly, some of the results we have obtained; if you want more details we can talk later. At the very end I will show you one specific experiment, basically a blueprint for an experiment, but that doesn't mean it will be the only one; it is only the first step towards large-scale QNLP with text circuits. So let's get started.

I have here Bob, I have John Firth, and I have Jim Lambek, and each one has contributed a specific idea. If you put these three ideas together, you get quantum DisCo models: quantum from Bob; distributional from John Firth, who said that you shall know the meaning of a word by the company it keeps (basically he was saying: create word embeddings from large text by counting how words occur in each other's contexts); and Jim Lambek, who said, I love algebra, so I will make a grammar model based on algebra. And we will see how quantum and algebra play very well together. I'm sure Bob talked about this briefly, but here I will show you how we use this to build quantum models specifically. So John Firth's idea for word embeddings will here specifically mean: I will create my word embeddings, I will stick my meanings inside Hilbert space, and my model will manipulate these meanings so that I can make a task in NLP work.

The compositional aspect is what gives us some sort of science-based, scientific handle on what's going on in language, because I don't want to be doing the mainstream thing, which is: stick a bunch of layers together in huge neural networks, train it on an initially stupid task, and then it does impressive things, like the things you have seen in the news, GPT and whatever. However, what goes out of the window is interpretability, and knowing what's going on inside these giant inscrutable matrices of real numbers.

To tie things together, let's start from where Bob left off, but I won't dwell too much on this because it has been covered. I will just say that my tools will be boxes and wires, as it has been throughout the whole day. I have states, I have effects, I have processes.
I have scalars, and I also have an operation that allows me to kill wires, which is the discard. It's all boxes and wires, and it always has been. So this is our basis for thinking about things and building models.

If you remember from the previous talks today, there are two ways to compose boxes: one is sequential and one is parallel. One process after the other, and things happening at the same time. With these two ways of composing diagrams (and Richie will show you during the demo how you can do this in software with DisCoPy), with these two ways of composing boxes, you can make any bigger diagram composed of small boxes. So from small processes you can compose them to make big processes. And there are special kinds of boxes which, if we open them up, are just wirings: for example a swap, the identity, a cup and a cap.

Most importantly, I want to focus on what happens when you stick cups and caps together to make a snake. Bob has talked about this already; this is like in quantum teleportation, but here, abstractly, it just means that wires can wiggle. I don't care about their shape, I just care about how they are connected. The big selling point of this is that I want to build models such that the wiring actually means something, so that when I look inside my model, when I open it up and can see what is connected to what, I can at least somehow understand something. Because if you look, for example, at the wirings that people draw in big neural network diagrams, those wirings don't actually mean anything; you have to train on a big task and then go back and inspect. I don't want to have to do this. I want my model to be such that its architecture actually means something, and here the wiring will mean something: it will basically mean how information is flowing around. In quantum it is quantum information, but more generally just information; it depends on what type of distributional semantics I give the models. Today we will only talk about quantum semantics, so the wirings will tell us how quantum information, i.e. meaning, is flowing around in my models.

Linguistic processes: let's go from word to sentence. I have some words in a sentence, I compose them together according to grammar, as Jim Lambek was saying, and I make a sentence; I make the meaning of the whole. This is how pregroup grammar composes things: every word gets some types assigned to it, and the types obey an algebra that says that n, together with n^r, its right adjoint, compose; and this composition I will show with these cups, these wirings. So my grammar will only be wirings; all of my meaning is inside the word states, and grammar is wiring: it just tells me how to compose things. So if I have the meanings of the words, I compose them according to grammar and I get the meaning of the whole sentence, and that meaning is flowing along the s-type wire.

The first obvious model-native NLP task you can do is to check how similar two sentences are, and this is how you do it. You take one sentence, "sun melts gelato"; it's a state, a sentence state. You take an effect, which is some other sentence upside down (upside-down things are effects), and that other sentence is, what was it, "monk dissolves ego". Maybe "melts" is similar to "dissolves", but the nouns are quite different, so this thing will have some overlap. But I cannot evaluate my overlap yet; this is just the shape of it.
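Before moving on to text: to make the two composition modes and the snake equation above concrete, here is a minimal NumPy sketch (my own illustration, not the lambeq/DisCoPy API). Sequential composition is matrix multiplication, parallel composition is the Kronecker product, and the cup/cap states let you check numerically that the snake is just the identity wire.

```python
import numpy as np

d = 2  # wire dimension (one qubit per wire)

# Sequential composition: one process after the other = matrix product.
f = np.random.rand(d, d)   # process f: wire -> wire
g = np.random.rand(d, d)   # process g: wire -> wire
g_after_f = g @ f          # f, then g

# Parallel composition: side by side = Kronecker (tensor) product.
f_tensor_g = np.kron(f, g)

# Cup (Bell-like state) and cap (its effect), as used in the snake.
cup = np.eye(d).reshape(d * d)        # state on two wires
cap = np.eye(d).reshape(1, d * d)     # effect on two wires
id_ = np.eye(d)

# Snake equation: (cap x id) after (id x cup) = identity on one wire.
snake = np.kron(cap, id_) @ np.kron(id_, cup.reshape(d * d, 1))
print(np.allclose(snake, id_))        # True: the wire just "wiggles"
```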
This is a blueprint from which I will build my model later; I haven't shown you that part yet. And then, if you want to go beyond the sentence level, if you want to go to text (because there is this rock star whom I follow on Twitter who says that text is the future, and I believe him): if you compose sentences, you get text. How do you compose sentences? Well, with this co-circ. What does this co-circ do? It says: I don't believe in sentence space; sentence space is basically a bunch of nouns, if you look inside.

So this co-circ says nouns are first-class citizens. You can initialize a bunch of meaning vacuums, let's call them that, and every one of these is distinguishable, so to every one of these wires a specific noun corresponds. Then all of these nouns go into some text process, as we had before for a sentence, but now it's for a whole text. There is a grammar that we are developing in Oxford, a text-level grammar, not only a sentence-level one. And then the nouns, after they get modified by that text process, exit modified. It's a very dynamic thing.

Now let us see what this text process is composed of. The most basic thing is: let's make some noun states. I have this meaning vacuum, and then there is a noun-preparation box; this prepares noun states. So basically you have the meaning vacuum, you have a noun-preparation box, and this prepares a noun state. (Sorry for the squeaks.) Okay, this is how I make noun states. They are initial states; they enter a text.

So now I have order-one processes. These are boxes; they are not states, they are processes. If you remember, before, at the sentence level, I had every word be only a state; everything was zeroth order, and grammar was telling you how to compose. Here I'm making distinctions: I said nouns are first-class citizens, which makes them states, zeroth order. If I go to other parts of speech, like adjectives or verbs, they will be processes that modify the lower-order stuff, which is nouns. An adjective modifies a noun, right? So if I have, for example, an adjective, you will ask, oh my god, how does this work? Let's see if this helps: I will have some noun entering, and this will give me back a modified noun. For example "car" goes in, and out comes the state of "red car". The intuition here, the point, is that we are following our intuition to build models, but it's not just our intuition, it's not just loose ideas; all of this stuff is based on formal linguistics, and there's a lot of formal language theory behind it that formulates it in a way that generalizes.

Now let's see how a verb would look. Say you have a verb like "loves", the standard example. You have "dog" and "human"; dog and human enter "loves": dog loves human. So this is the combined state for the phrase "dog loves human". This "loves" box is coupling them together, and then they are not separable anymore. They entered as two independent things, they were acted on together by some verb, and now they are in a combined state, a state which, if it were a quantum state, I would tell you is entangled. That's exactly what I'm going to tell you later, but for now the pictures tell you exactly the same story without me having to say anything about quantum. So you could say it's a feature of the model for language in the abstract. Okay, so much for the order-one stuff.
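As a toy illustration of the "loves" example (my own sketch, not code from the talk): a verb-style box can take two product noun states to a non-separable joint state. Here a CNOT plays the role of the trained two-noun unitary, purely as a stand-in, and the Schmidt rank shows that the output no longer factorizes.

```python
import numpy as np

# Toy one-qubit noun states (hypothetical embeddings).
dog = np.array([1.0, 1.0]) / np.sqrt(2)
human = np.array([1.0, 0.0])

# A "loves"-style order-one box acting on both noun wires.
# A CNOT is used only as a stand-in for a trained two-noun unitary.
loves = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]], dtype=float)

joint = loves @ np.kron(dog, human)        # state of "dog loves human"

# Schmidt rank > 1 means the two noun wires are no longer separable.
schmidt_rank = np.linalg.matrix_rank(joint.reshape(2, 2))
print(schmidt_rank)                        # 2: the verb has entangled the nouns
```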
Now for order-two stuff. Yes, the order matters, of course. It matters whether dog is here and human is there, or the other way around. Of course we also love our dogs, but sometimes love is not a two-way street and you have drama, right? Yes, that's right. But the places where I have inputs and outputs are distinguishable. I mean, I'm not drawing my boxes like the spiders of ZX; this box is different from that one.

I will go only one order higher, to second order. Let's think of an example. An example would be an adverb. An adverb is a modifier of a modifier: an adverb modifies a verb, and a verb modifies nouns. So, for example, I can have "quickly" and I can have "runs"; this is "quickly runs". The higher-order thing modifies the lower-order thing. This is the point of going to order two. There are also order-three things sometimes, but I'm not going to get there today; ask me later.

And now suppose you have a text, like a story. This is my simplified version of The Matrix, best movie ever; this is the script of the movie. You have the nouns that go in: Neo, Morpheus, Trinity, the Matrix, kung fu. And this is how it goes: the verbs become boxes that modify nouns. You see: Morpheus finds Neo; then Neo quickly exits the Matrix; then Trinity loves Neo; and so on. We have the characters with their initial states, there is a story that happens to them, and then they exit modified.

By the way, our team is developing a tool that does this automatically for a large fragment of English. It's all based on CCG parsers; they exist, they are trained on huge CCG treebanks, and our team has figured out a way to use coreference resolution to work out which nouns are the same across different sentences, and then you string everything together into a big text diagram. So this is not just drawings. I mean, these are my drawings, but you can see how it works in software too, and Richie can show you some of this later. I don't think we can make it available yet, but it's coming soon.

So you see, the bigger the story, the bigger the text diagram. I'm not going to draw you huge text diagrams, but I really want you to remember that this is how it's going to be: my scaling parameter for my problem size here is the text diagram. This is going to be important later, when I talk about advantage and why we even want to go quantum. The bigger the story, the bigger the text: wider means more nouns, and deeper means more stuff happens in the story, nouns, adjectives, verbs, and so on.

One model-native task is text similarity: you take two texts and you want to see how similar the stories are. There are nouns that are common to the two stories, there are nouns that belong only to the one text, and nouns that belong only to the other text. All of these sets of nouns together form a noun universe. I put all of the nouns side by side in a tensor product; the nouns relevant to text one enter that text, and the nouns relevant to text two enter the other text from the bottom. Text two has been turned upside down because it's an effect. And then, if you take this overlap, it should have the meaning of text-to-text similarity, in the same way as sentence-to-sentence similarity worked. It's just the same idea, generalized.
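Going back to the Matrix example: as a schematic of what such a text diagram might look like as plain data (my own sketch; the actual tool and its output format are not public), a story can be recorded as a set of noun wires plus an ordered list of boxes, each box naming the word, its order, and the noun wires it acts on.

```python
# Hypothetical, minimal encoding of the simplified "Matrix" story as a text diagram:
# noun wires are declared once, then boxes are applied to them in story order.
nouns = ["Neo", "Morpheus", "Trinity", "Matrix", "kung fu"]

# Each box: order 1 = verbs/adjectives acting on noun wires,
# order 2 = adverbs attached as modifiers of an order-1 box.
boxes = [
    {"order": 1, "word": "finds", "wires": ["Morpheus", "Neo"]},
    {"order": 1, "word": "exits", "wires": ["Neo", "Matrix"],
     "modifiers": [{"order": 2, "word": "quickly"}]},
    {"order": 1, "word": "loves", "wires": ["Trinity", "Neo"]},
]

# Width of the diagram = number of noun wires; depth = number of boxes.
print(len(nouns), len(boxes))   # these are the scaling parameters of the problem
```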
Compositionality allows us to take small ideas and generalize them to larger contexts without thinking more; that's the point. Another model-native task, one that is not global: it does not take all of the nouns in the universe and do something with them; it is a local task. I discard all of the other nouns that exit from a text; I don't care about them. I just care, for example, to check: the noun at position i, what is its similarity with its past self after it has exited? I gave it a cute name: quantifying the character arc. And if you take this idea of throwing out all of the nouns you don't care about and doing something local with one or a few of the nouns, you can do something which is actually done in the real world; there exists real-world data for this task. It's called question answering.

So what do you do? You take a context text that states a bunch of facts; that's text T (the subscript 1 on the slide shouldn't be there, it's just T). You discard all of the nouns that are not relevant to the question; you take the question, you turn it into an affirmative statement, and you stick it on as an effect. So basically you do text similarity between the text and the question, but the question acts locally as an effect, so you can throw away all of the other nouns. Now imagine my context text scales; my question will most likely not also scale. Usually questions are small: context can scale, but usually you ask questions about some finite amount of things, or with finite support, to be precise. So I have a local question, and then you should imagine that the context text grows in size.

So now let's finally go quantum; all of this so far was abstract model building. I started at half past, right? Okay. When we go quantum, we want to respect the tensor products.

Audience: If I want to ask a question about an attribute, how does this work? For example, I have "red car", some higher-order thing, and if I say "give me the color of the car", that's quite tough, because it's not alone.

Yes, you can say "car is red". It's a statement; stick it as an effect onto "red" and "car". So "red" is an adjective... well, let me think. Yes, "car is red" would be a legitimate text diagram. You can flip it upside down, dagger it; now it's an effect. So "car" will exit the context text, which says whatever it says about the car; you stick "car is red" on as an affirmative statement, as an effect, and if you get a high overlap you're confident that it's red. If the context text said "car is blue" and then you stick on "car is red", you should get a lower overlap. That's the idea. Any other questions before I go to model building? Yes. I need the microphone, otherwise it's not going to be recorded.

Audience: When you mention the overlap between the entering state and the effect, you have to define it quantitatively, right? What is the overlap between red and blue, or happy and sad? Because otherwise everything is orthogonal if it's not equal. So there should be some metric, I guess.

This is the point of going to model building.
So I haven't said anything yet about how you make anything quantitative; this has all been abstract, only the compositional part. I have so far only been talking about the compositional part, but this framework is called DisCo, distributional compositional, and the distributional aspect is exactly what you're asking about: it's what makes everything quantitative. Now we can give distributional semantics: we can decide what the spaces are on which the states are defined, what the state spaces are that the processes modify. We will decide that we want them to be quantum Hilbert spaces and quantum processes: states will be defined in quantum Hilbert spaces, processes will be quantum maps, and so you can get quantitative results in exactly the same way as for anything else in quantum: you take overlaps of states, you measure operators, things like that.

Okay. Now, why do I have a big picture of the tensor product here? Because quantum theory is inherently built on the tensor product: systems compose according to the tensor product, and I want that, because my grammar theory, the boxes and wires that describe my grammar model, the compositional part, also behaves as if it composes with tensor products, because basically I get non-separability when I compose things. This is what Bob was calling Schrödinger compositionality, or quantum compositionality; I'll just call it composition. The tensor product, which before, in the diagrams, just meant that things happen in parallel, will, once you give the model semantics, become the Kronecker product, the usual one. So if you are simulating these things with linear algebra (because linear algebra is how you simulate quantum theory on your laptop, or by hand, or on a blackboard), it's the Kronecker product of two matrices or vectors or tensors or whatever. In quantum theory, if you have a quantum computer and you put quantum systems together in a controlled setup, they will just compose by themselves with the tensor product. So yes, the usual tensor product; here I'm really beating a dead horse to show you that the tensor product is flowing around.

And this F is what I'm calling my semantic functor. Jargon aside, it's basically a map that does box-wise substitution of the boxes and the wires (there is a mistake on the slide here, sorry about that, but okay). Every wire gets assigned some qubits. So, for example, at the sentence level we had that every word gets a type: the noun type gets q_n qubits, the sentence type gets q_s qubits. Every wire carries some number of qubits, and it's up to us to decide how many. The number of qubits decides how big the Hilbert-space dimension is that flows along that wire: if I have q qubits on a wire, the Hilbert space is 2^q-dimensional; everyone knows that. My mistake in this picture is that I have bent the wire to make it an effect on one side, but I didn't do the same on the other side; negligence on my part, but you will understand from context in the next slide what's going on.

Now I want you to focus on "sun", which goes into "melts". "Sun" becomes a state-preparation circuit, a unitary, so it's a quantum circuit U parameterized by a control parameter set theta, with subscript s: theta_s is the theta for "sun".
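A minimal sketch of what this word-to-circuit assignment could look like numerically (my own illustration, with made-up parameter values): each wire type is given a qubit count, and each word's parameter set drives a small rotation circuit that prepares its state from |0...0>.

```python
import numpy as np

q_n, q_s = 1, 1                  # qubits per noun wire and per sentence wire (hyperparameters)
dim = lambda q: 2 ** q           # Hilbert-space dimension carried by a wire with q qubits

def ry(theta):
    """Single-qubit Y rotation, a typical parameterized gate in the word circuits."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# Hypothetical parameter set theta_sun (in practice these parameters are trained).
theta_sun = [0.7]
sun_state = ry(theta_sun[0]) @ np.array([1.0, 0.0])   # U(theta_sun)|0> on a q_n-qubit wire

print(dim(q_n), sun_state)
```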
Okay, so "sun" gets its own parameters that go into a quantum circuit, and they prepare a quantum state for the word "sun". The same happens for "melts": there is a theta_melts that goes into a parameterized quantum circuit and prepares a state for "melts". And then you post-select: okay, let me show you here. The cup becomes a CNOT followed by post-selections. You can prove this with the rules of ZX, and it's trivial, you just do this: you have a cup, like in Bob's lectures; you put an identity spider on it, which doesn't mean anything; by the way, you can grow a one-legged spider from it by unfusion; and then, oh look, this is a CNOT gate. So a cup is basically a CNOT followed by post-selection on the plus state and the zero state. Now that you are experts in ZX, this should be obvious to you. So the cups become CNOTs and post-selections.

If you bend things around, which you can, you have to be careful that you're bending with a 180-degree rotation: that is not the dagger, that is transposition, and it's easy to make this mistake when you write the code. You have to automate it once and then forget about it, but this is crucial: it is transposition when you bend states into effects in DisCoCat at the sentence level.

And now, if you have two sentence quantum states, you can of course take their overlap. This comes back to the earlier question of how you quantify things: you take the overlap of two quantum states, using whatever protocol you like, the swap test or something; there are standard ways to do this. So you have the quantum state of one sentence, the dagger of the quantum state of the other sentence, and this is a valid quantum operation; you can evaluate it on any quantum computer you like. I haven't yet told you what these thetas are that parameterize my meanings; I will tell you about that later.

Then, when we go to the text level, which is the most interesting part, let us again see how I make circuit components for all of the parts of speech according to their order. I said I have zeroth-order, first-order and second-order things. The zeroth-order things are the nouns. So what happens to the nouns? Well, I choose a convention for my meaning vacuum: it's going to be the all-zero state. All of the wires in this co-circ carry noun types; we said nouns are first-class citizens, nouns flow around and stuff happens to them. So all of the wires have the same number of qubits, because they're all of the same type: q_n, which I choose once; it sets the dimension of the Hilbert space flowing along my wires. And a noun-preparation circuit is this U(theta_n), where theta_n depends on the noun n. So "dog" has its own parameter set, "house" has its own parameter set, they are different, and so on. This thing prepares a state: for example, you have theta_car, and the state is prepared by U(theta_car) hitting the vacuum, which I chose by convention to be the zero state, without loss of generality.

If you go to the first-order things, they are easy: I'll just make everything unitary. This is a choice; I could make them generic quantum maps by adding ancillas and discarding them, but let's not go there. Everything I will say from now on still holds if I make these more general quantum maps.
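Going back to the cup-to-CNOT step: here is a small numerical check (my own sketch) that the cup bent into an effect really is a CNOT followed by post-selection on |+> on the control and |0> on the target, up to a scalar, and that bending a state into an effect is transposition rather than the dagger.

```python
import numpy as np

# CNOT with the first qubit as control.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

plus = np.array([1, 1]) / np.sqrt(2)
zero = np.array([1, 0])

# Effect = post-selecting <+| on the control and <0| on the target after the CNOT.
effect = np.kron(plus, zero) @ CNOT          # row vector of length 4

cap = np.eye(2).reshape(4)                   # sum_i <ii|, the cup bent into an effect
print(np.allclose(effect, cap / np.sqrt(2))) # True: equal up to a scalar

# Bending a state through a cap gives its transpose (no conjugation):
psi = np.array([1, 1j]) / np.sqrt(2)
bent = cap @ np.kron(psi.reshape(2, 1), np.eye(2))
print(np.allclose(bent, psi))                # True: entries are unconjugated
print(np.allclose(bent, psi.conj()))         # False: it is not the dagger
```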
So let's stay with everything being unitary: whenever I see a box, I will make it a unitary box. And notice that, whenever I have adjectives or verbs, I always respect this: you never have a situation where more wires enter a box than exit it. There is a conservation of nouns: as many nouns enter a text as exit it, so it's fine to have everything unitary, because unitaries need to have the same input and output dimensions. So if I have some adjective, for example, I give it a unitary, parameterized by a parameter set specific to that adjective.

Now the higher-order things are a bit strange. One choice is to break them up like a sandwich: I have this comb thing, a higher-order box, like for an adverb, and I break it into unitaries that sandwich something. Let me show you what I mean. "Quickly runs" was our example before. I will have U(theta_quickly,1), a parameter set with index one, and there will be a second parameter set for "quickly"; so "quickly" has two parameter sets. Then U(theta_runs) just gets sandwiched by these two boxes, and these two boxes for "quickly", sandwiching "runs", give me back a circuit that stands for the whole of "quickly runs".

Of course, I can simplify my life. I can say that from order two and above I don't want things to be quantum processes; I don't want everything to be quantum processes all the way up to the highest orders. Also, you know, you won't have infinite-order stuff: in language, everything usually stops at order three or four. I mean, how often do you have a modifier of a modifier of a modifier of a modifier? It's kind of insane. So another choice is: I have "runs", and I have a classical control for "quickly" that modifies the parameters of "runs". This can be a classical function, like a neural network, parameterized by its own parameters for "quickly", and its input is theta_runs. So theta_runs enters "quickly", gets modified, and enters here; that whole thing together is "quickly runs". This also makes sense: "quickly" just modifies "runs"; it's like a knob. Or take "very": "very" is also higher order, as in "very red"; I can have a classical control on the circuit for "very" that just modifies the parameters of "red", and I get "very red", or "less red", things like that. But all of these are just design choices, and they're all valid. Okay, similar idea.
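A minimal sketch of that second design choice (my own illustration, with hypothetical function names): the adverb is a small classical function that rewrites the verb's parameters before the verb circuit is built.

```python
import numpy as np

def quickly(theta_runs, theta_quickly):
    """Hypothetical order-2 word as classical control: it takes the verb's
    parameters and returns modified parameters, instead of being a quantum map."""
    # Here just an affine reparameterization; in practice this could be a small
    # neural network with its own trainable weights theta_quickly.
    scale, shift = theta_quickly
    return scale * np.asarray(theta_runs) + shift

theta_runs = np.array([0.3, 1.1])   # parameters of the verb circuit U(theta_runs)
theta_quickly = (1.5, 0.2)          # the adverb's own (classical) parameters

theta_quickly_runs = quickly(theta_runs, theta_quickly)
# U(theta_quickly_runs) is then used wherever "quickly runs" appears in the text circuit.
print(theta_quickly_runs)
```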
Now, to come back to the point of how you do something interesting with quantum text circuits, take question answering, for example. As I said before, I have a context text; it is now a big quantum circuit, and this theta_T is all of the parameters: the collection of the parameter sets of all of the words involved in that text T. Theta_Q, similarly, is the set of all parameter sets of all of the words that appear in the question Q. So I have a big quantum circuit. All of the nouns go in: zero states are initialized on q_n qubits for each wire you see here; every wire is for a noun. They enter a big text circuit, like my Matrix story from before. Then the nouns that are irrelevant, the quantum wires that are irrelevant to the question, are discarded. I never measure them; that's what discarding means. And then I apply the dagger circuit of the question text, which has been turned affirmative; it's an effect. And then basically I measure everything that is left; I do a bunch of measurements.

I have deliberately drawn these in green, because they are the so-called "bastard spiders" from the dodo book that Aleks and Bob wrote, Picturing Quantum Processes. If you want to see how quantum and classical wires interact, first of all you need to draw two types of wires: the white ones are quantum, the green ones are classical, and then there are extra rules for how classical and quantum wires interact. The green dot is basically a Z-spider, and, very conveniently, it models decoherence, i.e. measurement in the Z basis. So this thing tells me: measure all of the qubits there. If you measure all of the qubits there, you get a probability distribution over all of the bit strings, in the computational basis states defined by all of the green wires. If you then look at the probability of all zeros, this quantifies the overlap, because zero comes in when you prepare a noun: you prepared a noun, and you want to see how much zero comes out, because this theta_Q basically includes the un-preparation of the nouns that appear in the question. So how often I measure the all-zeros outcome there gives me the overlap for this whole question-answering setup. This is very important; any questions about this? Because this is basically the whole setup; you can take it and generalize it to other things.

And since I set up the whole thing motivated by wanting my wiring to mean something, I set it up so that it has tensor structure. I could have said: well, forget quantum, or I don't know about quantum, maybe I wanted everything to just be tensors, and I would have ended up here anyway, because the model says that's how you should do question answering. This is how the task looks in the model; this is the native way of doing question answering, or text similarity, or local text similarity. If I had generic tensor semantics, evaluating this thing becomes exponentially hard as the context text scales, because contracting tensor networks, exactly and also approximately, is hard; you need exponential resources. However, I set everything up so that it is valid as a quantum operation. Of course, to simulate these quantum operations you would use tensor networks and linear algebra on a computer, which is exponentially hard; quantum mechanics is hard to simulate. But if you have a quantum computer, this is how you do it.

Now, whether this beats GPT or whatever is beside the point here, because we started from a motivation: we started from a motivation to build a model, and then you make it work. The other way around is backwards; it's not "make a brute-force, all-powerful god and throw it at tasks". It's being more of a scientist about things, even if they don't perform amazingly at first.
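To make those measurement statistics concrete, here is a toy statevector sketch (my own, not the experiment's code) of reading off the probability of the all-zeros outcome on the question wires while simply not measuring (discarding) the other wires.

```python
import numpy as np

rng = np.random.default_rng(0)

n_qubits = 3                       # e.g. 2 question-relevant noun wires + 1 irrelevant wire
state = rng.normal(size=2 ** n_qubits) + 1j * rng.normal(size=2 ** n_qubits)
state /= np.linalg.norm(state)     # stand-in for U(theta_Q)^dagger Text(theta_T)|0...0>

amps = state.reshape(2, 2, 2)      # one index per qubit, |0>/|1> along each axis

# Discarding the last wire = never measuring it = summing over its outcomes.
# Probability that the two question wires both read 0:
p_all_zeros = np.sum(np.abs(amps[0, 0, :]) ** 2)
print(p_all_zeros)                 # this number is the question-answering overlap
```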
You have a question? Yes.

Audience: So this type of circuit verifies a statement; it's the dot product of two states, so it's like a fidelity. If all zeros enter and I am sure to measure all zeros at the output, it means the statement is true. But, as far as I know, if the system is too large this doesn't work, because of the orthogonality catastrophe.

So I think what you're saying is that the bigger the circuits, the more random they will look, the more they will scramble, and getting the all-zero outcome becomes exponentially unlikely.

Audience: No, this holds also for a noiseless circuit.

I'm also talking about noiseless circuits.

Audience: Noiseless, okay. Yes. If the system is large enough, the orthogonality catastrophe ends up falsifying the statement even though in reality it is true, because all the states appear orthogonal when the system is large enough. I don't know if I've expressed myself correctly.

Audience: Maybe I can ask it this way. My understanding of the orthogonality catastrophe is that when you solve a generalized eigenvalue problem you end up with lots of linear dependencies, and then your matrix inversion becomes unstable. Here you're just solving a variational problem, and the basis is naturally orthogonal, I think, because you're using qubits.

Well, before we say anything about "variational", I need to explain the two experimental approaches, because the word variational is important there. I think this conversation has some depth; let's have it later. I think it's basically about how likely it is to even measure zero. It's true that if the Hilbert spaces grow too much, and the circuits U(theta_T) and U(theta_Q) are random, generic, typical, and I keep growing the number of green wires, then on average it will be exponentially unlikely, in the number of green wires, to ever measure all zeros. I think this is similar to what you're saying. But remember what I said before: I don't want to grow the number of green wires. My questions are always local, and the scaling parameter of my problem is the context text; the only text that grows here is T, not Q. I mean, if what you were describing were a problem, then the whole BQP setup would be a problem: the whole way we define quantum Turing machines and decision problems with quantum circuits is like this; you throw all of the qubits into a circuit that grows, and you measure just one. And here I'm not measuring one, I'm measuring, say, five; but I'm not measuring a number of qubits that is a function of the T text, I'm always measuring, say, five, and that's it. So the number of shots you need is finite for a given additive error on this probability P(all zeros), independently of how everything else scales.
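To spell out that shot-count point (a standard sampling estimate, not something specific to the talk): estimating P(all zeros) to additive error epsilon with confidence 1 - delta takes a number of repetitions that depends only on epsilon and delta, not on the size of the context-text circuit, e.g. via the Hoeffding bound.

```python
import math

def shots_for_additive_error(eps, delta):
    """Hoeffding bound: repetitions needed so the empirical estimate of
    P(all zeros) is within eps of the true value with probability >= 1 - delta.
    Note: independent of how many qubits the context-text circuit uses."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

print(shots_for_additive_error(eps=0.01, delta=0.05))  # about 18,445 shots
```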
Now, I never said what this U(theta) actually is; I only said it is some parameterized unitary. In practice, how do you choose it? You pick circuits that people have studied in quantum machine learning and that they like because they are expressive. Expressive means that, for random choices of their control parameters, they explore the Hilbert space on which they are defined quite effectively, as if they were random. Here the parameterized gates are the rotation gates and the controlled-rotation gates; the thetas are not shown, but they should be implied. Hadamard, of course, and CZ are not parameterized, but Rx, controlled-Rx and Ry are parameterized; they have the thetas inside. One choice of U(theta) is the thing on the left. I've drawn them going downwards, to stay in the spirit of all these language circuits, where everything is read from top to bottom, but usually people draw quantum circuits in their papers going from left to right, so here I've rotated them a bit; sorry about that. These are actually the circuits we use in our experiments. Another choice is the thing on the right: it's three layers of a block consisting of a layer of Hadamards, a layer of controlled-Zs, and a layer of Rx's, and then this block is repeated three times. How many layers of this block I choose is my choice; it's a hyperparameter. How thick these things are is my q_n; again a hyperparameter: I choose how many qubits I want to assign to every wire.
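A small sketch of the layered ansatz just described for the right-hand choice (my own reconstruction from the verbal description, Hadamards, a ladder of CZs, then parameterized Rx's, repeated a hyperparameter number of times; the exact circuits used in the experiments may differ).

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CZ = np.diag([1, 1, 1, -1]).astype(complex)

def rx(theta):
    c, s = np.cos(theta / 2), -1j * np.sin(theta / 2)
    return np.array([[c, s], [s, c]])

def kron_all(mats):
    out = np.array([[1.0 + 0j]])
    for m in mats:
        out = np.kron(out, m)
    return out

def ansatz(thetas, n_qubits=2, n_layers=3):
    """U(theta): n_layers repetitions of (H on all qubits, CZ on neighbours, Rx(theta))."""
    U = np.eye(2 ** n_qubits, dtype=complex)
    thetas = np.asarray(thetas).reshape(n_layers, n_qubits)
    for layer in range(n_layers):
        U = kron_all([H] * n_qubits) @ U
        for q in range(n_qubits - 1):   # nearest-neighbour entangling layer
            U = kron_all([np.eye(2)] * q + [CZ] + [np.eye(2)] * (n_qubits - q - 2)) @ U
        U = kron_all([rx(t) for t in thetas[layer]]) @ U
    return U

state = ansatz(np.random.rand(6)) @ np.eye(4)[:, 0]   # U(theta)|00>
print(np.round(np.abs(state) ** 2, 3))
```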
So, I said I would talk about the approaches we use for experiments. One approach is: train everything in task. Build my text circuit, leave all the thetas free, pick a task; the task defines a cost function, and now I'm doing quantum machine learning. However, I didn't pick some random circuits that I found somewhere: my text circuit is informed by the problem. The problem here is language, and the structure that the circuit inherits from the problem is syntactic structure, the whole structure of the diagram. It's not some random black-box circuit. You make a cost function for your task and you evaluate performance.

One of the things we do is binary classification, the easiest task you can think of. You have a bunch of sentences; each sentence gets its own sentence circuit. Most of the words appear in more than one sentence, which is what allows the thing to train at all. And then you can do supervised learning: you have a train set and a test set; you keep the test set aside and you train the thetas such that the quantum circuits predict the labels. Every sentence has a label, zero or one. This is a pregroup diagram, like before. If you measure the qubit on the sentence wire, you get some probability that it's zero and some probability that it's one: zero for one class or label, one for the other. You train the thetas such that you measure the correct labels, i.e. the labels that the train set says these sentences have. After you have trained, you evaluate: with the word parameters trained in task, you execute the circuits on the test set and look at your accuracy, i.e. how well the model generalizes to unseen data; unseen because you kept the test set aside. And this is the usual variational loop: the train-set sentences become quantum circuits, as I showed you; they go to some quantum processor; you measure out the class-label probabilities; and if you're not happy with how they match the train-set labels, you iterate: you update your parameters, the thetas, with some optimizer, and you loop around until you're happy. Then you evaluate on the test set, again on a quantum computer. You can do all of this with DisCoCat diagrams at the sentence level.
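A schematic of that in-task variational loop (my own sketch; run_sentence_circuit is a placeholder for actually executing the sentence circuit, the toy labels are invented, and the finite-difference gradient is an assumption, not necessarily the optimizer used in the experiments).

```python
import numpy as np

rng = np.random.default_rng(1)

def run_sentence_circuit(thetas, sentence):
    """Placeholder for executing the sentence's circuit on a (simulated) quantum
    processor and returning P(label = 1) measured on the sentence wire."""
    idx = [hash(w) % len(thetas) for w in sentence.split()]
    return 0.5 * (1 + np.tanh(np.sum(thetas[idx])))      # toy stand-in

def loss(thetas, train_set):
    # Binary cross-entropy between measured label probabilities and train labels.
    eps = 1e-9
    total = 0.0
    for sentence, label in train_set:
        p1 = run_sentence_circuit(thetas, sentence)
        total -= label * np.log(p1 + eps) + (1 - label) * np.log(1 - p1 + eps)
    return total / len(train_set)

train_set = [("sun melts gelato", 1), ("monk dissolves ego", 0)]   # toy labels
thetas = rng.normal(size=8)

for step in range(200):                       # the variational loop
    grad = np.zeros_like(thetas)
    for k in range(len(thetas)):              # finite-difference gradient estimate
        shift = np.zeros_like(thetas)
        shift[k] = 1e-3
        grad[k] = (loss(thetas + shift, train_set) - loss(thetas - shift, train_set)) / 2e-3
    thetas -= 0.5 * grad                      # optimizer update
print(loss(thetas, train_set))
```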
Let me open a parenthesis and show you what we are working on now, because with DisCoCat we have those two experiments published in those two papers, but they are toy experiments, in the sense that the data is artificial: 100 or 300 sentences, not that big, toy by NLP standards, and the vocabulary is very small. But it works; it was a proof of concept. Now, if you change things and you don't use Lambek pregroups but just CCG trees, you don't need to post-select any more; you just have tree structures. They're easier to train on a quantum computer: there are no barren plateaus. I'm not going to dwell on this, it's work in progress, but it's very exciting. With this we can tackle thousands and tens of thousands of sentences; they can be movie reviews, clickbait news titles, or even DNA sequences. If you don't have syntactic information, you can use this middle thing, a binary tree with disentanglers, inspired by condensed-matter theory for critical systems. And you can combine the two ideas, syntactic information with this entanglement-filtering idea from condensed matter (if you have heard of them, they're called MERA tensor networks, or quantum convolutional neural networks): we can combine them and make syntactic quantum convolutional neural networks, and so on, and all of these things work very nicely. Soon we'll run them on actual machines as well, and then there will be a paper, so stay tuned.

Now, when we go to DisCoCirc, this is the interesting thing I want to close on, because it's something we have also seen working in task. I will tell you exactly what I mean. Instead of training in task, instead of doing quantum machine learning, instead of doing supervised learning, you can say: I want to do question answering, but I don't want to train my thetas in task; I will pre-train my thetas. To pre-train your thetas, you do basically what classical NLP does, which is to make word embeddings that are pre-trained and task-agnostic. The usual methods you may have heard of are called word2vec or GloVe, things like that. We can use exactly these methods; specifically, I can talk about GloVe. You don't need to know too much for now; I can show you what the cost function looks like, and you can ask me about it later.

The point is that you can pre-train things. You can pre-train the component circuits that go into a big text, and then you trust your compositionality, so that when you stick them together into a big text diagram, which is hard to evaluate classically, the meanings compose in a meaningful way, i.e. they don't generate gibberish. That's why I say "trusting compositionality": you trust the model. But of course this is not blind trust. What we are doing is pre-training on some corpus; there is a kids' book corpus that we found, and it's very nice because it has a huge vocabulary but the sentences are small and the texts are small; you can take paragraphs and pre-train on co-occurrence matrices. And then you can take a question-answering task, which your pre-training had nothing to do with, and use it as a test of compositionality. This is something we are working on now.

The most important thing to say here is that when we pre-train, we can do it classically, because we train small components, and I can simulate and evaluate small components of a big thing. But once I compose them together into a big quantum circuit, I cannot evaluate it any more; then I do need a quantum computer.

This is the GloVe cost function. The important parts of this cost function are the following. For word i and word j I have a similarity measure d_ij. This d_ij I can evaluate classically: it is, for example, the overlap of two noun states, or the overlap between two adjective processes, or the overlap between an adverb and another adverb; I can even compare things of different kinds, an adverb with an adjective, or an adjective with a noun. I can take all of these overlaps just by putting the components together compositionally and substituting in their quantum circuits, but these overlaps are small: I can simulate 20 qubits on my laptop, that's fine, I can train these. And I want to train such that these overlaps match the other important quantity in the equation, which is minus log X_ij. This X_ij is a huge matrix, a co-occurrence matrix that you gather from a huge corpus, like this kids' book dataset, or you could even do it from Wikipedia: take all of Wikipedia and count. X_ij is basically the frequency with which word i appears in the context of word j. That's it. This is basically realizing John Firth's quote inside Hilbert space: I'm making my quantum processes have overlaps that match this minus-log-X_ij quantity, i.e. that track how likely the words are to appear in the same contexts in the corpus. Eh, you pick a window, yeah; it doesn't matter, these are details of how you train. People usually fiddle about with these hyperparameters and just find something that works. That's the black-magic machine learning, right?
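As a schematic of that pre-training idea (my own reconstruction of what was described, not the actual cost function used in the experiments): pairwise overlaps d_ij of the small word circuits, which are cheap to simulate classically, are trained to match a target derived from the co-occurrence counts X_ij gathered from a corpus.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = ["sun", "melts", "gelato", "monk", "dissolves", "ego"]
V = len(vocab)

# Toy co-occurrence counts X_ij from a corpus (hypothetical numbers).
X = rng.integers(1, 50, size=(V, V)).astype(float)
X = (X + X.T) / 2

def word_state(theta):
    """One-qubit word 'circuit': Ry(theta)|0>, standing in for U(theta_w)|0...0>."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def d(theta_i, theta_j):
    """Overlap between two word states; cheap to evaluate classically for small circuits."""
    return abs(word_state(theta_i) @ word_state(theta_j)) ** 2

def cost(thetas):
    # GloVe-style objective (schematic): make overlaps track the co-occurrence statistics.
    total = 0.0
    for i in range(V):
        for j in range(i + 1, V):
            target = np.log(X[i, j]) / np.log(X.max())   # normalized log-count in [0, 1]
            total += (d(thetas[i], thetas[j]) - target) ** 2
    return total

thetas = rng.uniform(0, np.pi, size=V)
print(cost(thetas))   # this is what the classical pre-training loop would minimize
```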
And here we also use that black-magic approach: you just fiddle about and see what works, and then you don't care, as long as it works, because all you care about is that these overlaps are representative of the co-occurrences. Now that I have trained my quantum states and processes to actually be quantum word embeddings, what I can do with them is stick them into a question-answering task. There are datasets for question answering out there. We take them, we take the context text that states a bunch of facts, we get a text diagram that gives us a text circuit, I plug in my pre-trained word embeddings, and now I have a big text circuit. I cannot evaluate it; I need a quantum computer for it. I do the same for my question, and then I just measure on a quantum computer, as a test of whether compositionality actually works.

So this is what I mean: it's not trained in task. If, however, the whole thing is small (and there exist very simplified, toy question-answering tasks), you can also train everything in task: don't do any pre-training, create your circuits like this, and for every different question you get a different circuit to run; you train all of these in task so that the correct answer comes out. You can do this, we have done this, and it works. It works even with one qubit per wire; it was a bit of a WTF moment, it was pretty cool.

So what you get is this: you need exponential resources to do this classically if the thing is big; on a quantum computer you can do it in polynomial time. I will be bold enough to say that this is an exponential advantage against simulating the model classically. And then, on top of that, let me go back and remind you of something. For this sentence-sentence overlap there is a very nice paper by Will Zeng and Bob, which started this whole QNLP business, and there they said: if I have two vectors to compare, and I want to see which effect has the highest overlap with my vector, I can invoke the closest-vector quantum algorithm, which is basically Grover under the hood: you Grover-search over the possible effects and you get a quadratic speedup. Okay, it was cool in 2016; today not many people care about quadratic speedups, because they believe that the error-correction overheads will be quadratic too, so the two will kind of cancel each other out. However, on top of my very bold statement that I get an exponential speedup when I do question answering at text level with quantum DisCoCirc, I can also get an extra bonus quadratic speedup, because I can also Grover over the answers. This is something we are writing down in a theory paper now, so stay tuned.

Now I think Richie will give you a very nice demo of lambeq, which is about sentence-level QNLP and automates all of the experiments I talked about; you can reproduce all of our papers, and he will also show you how you compose diagrams to make text circuits. And yes, have you joined the Zoom?