Hi, I'm Rithvich. I run a company called Spiky. We build data-heavy visualizations for our customers. What we've tried to do with PyQuery is solve a very fundamental problem with data-heavy user interfaces: how do you fetch data from your database, how do you query in the browser, and how do you do data-driven interactivity in the browser?
Before I talk about what PyQuery is, I want to talk about what it is not. PyQuery is not an Angular or React alternative. PyQuery does not help with data binding. PyQuery is not a monolithic framework. PyQuery is not an isomorphic framework like Meteor: it has a back-end component, but you can write that component in any language you prefer. And finally, PyQuery does not force you to choose a certain database. You can use any database.
So what is PyQuery? PyQuery is essentially a conceptual framework to help organize complexity in data-driven single-page apps. What is a conceptual framework? The idea is to focus more on the concepts and less on the code. The code for PyQuery is open source. You can use it, or you can write your own. What matters more is how we organize the code around these concepts. Data-driven interactivity is essentially filtering, searching, dashboards, visualizations, maps, et cetera.
So why should we use PyQuery? There are two main reasons. From a developer's point of view, you want to write maintainable code in JavaScript, which becomes a really big problem when you have really large single-page apps. From a business point of view, you want to spend the maximum amount of time getting the design right, not writing boilerplate code.
So let me show you some examples. This is a cricket scorecard app that we built for Network18. It streams real-time data to the browser every 30 seconds, and it's a data visualization.
And we are getting around a million hits per day, so it's a high-traffic site. At the other end of the spectrum, you have this visualization. I don't know how many of you have seen the Gapminder video where Hans Rosling talks about world data. It's similar to that. It's for 500 Startups, a VC firm. This is a fairly large amount of data being animated on the screen, but it is consumed by a smaller number of people. It's an internal tool. PyQuery handles both of these use cases. It can be used for mobile or laptop front ends, for internal consumption or for high-traffic sites.
So what are the types of interactivity? You have visual interactivity: you click on one DOM element, and the visual properties of another DOM element change, hide, show, et cetera. And then there is data interactivity: you click on one DOM element, and the data behind a DOM element changes. Is data interactivity only for visualizations? No. It is also for consumer-facing sites like this Flipkart search page. You click a filter in the sidebar and the data in the center changes.
So let's take a very simple example. You have two drop-downs, country and language. How would you build this? You need to expose at least two APIs in your back end. The first gives you a unique list of countries. The second gives you a unique list of languages for a selected country. In your front end, you make an Ajax call to fetch data from the countries API and render the countries drop-down. On the change event of the countries drop-down, you call the second API and then render the languages drop-down. This is a fairly simple piece of code. But the moment we start building really complex front ends, it becomes really difficult to manage.
So this is a Google Spreadsheet kind of front end that we've built for ourselves to explore data. I'm changing this... I don't know if I have a network connection.
So we have lots of interactivity here. There's a histogram, like in OpenRefine. The grid is changing. When you build extremely data-interactive front ends, code management becomes a big problem.
So what is the first big problem with data interactivity? Your back-end code gets extremely messy. For every drop-down that you need to render, you need to write an API. It takes five minutes to write a drop-down, and 15 minutes to write an API and publish that code to production. You might change or remove the drop-down in the front end, but the API might remain in the back end. So you end up with dead code.
What are we doing with PyQuery? We've written an API layer. You download that piece of code, connect your DB to it, and switch on the server. And then we have something called a PyQuery JSON object. It's a query written in JSON format. So your front-end developer can write any kind of select query, even really complex select queries, right in the front end, call this one API, get the data back, and render it. It turns out most of the data fetching we do in the front end is select queries: group by, count, et cetera. So you end up with one optimized, scalable back-end API, which is going to serve 70% of your data. Your back-end team can keep optimizing that one API. The front-end team can write whatever they want without ever disturbing the back-end team.
What is the next problem with data interactivity? How do you propagate where clauses? Since we have the query right in the front end, the front-end developer can propagate where clauses on the on-change event. It's as simple as changing the PyQuery object of the languages drop-down.
What's the third big problem? Managing really complex interactivity. This is a front end from BookMyShow. Let's abstract it.
You basically have cities, movies, cinemas, and shows. When you change cities, cinemas and movies change. When you change movies, cinemas and shows are impacted, et cetera. So what's happening here? Theoretically, this is a graph problem. You have n nodes, and you have n times (n minus 1) interactions or relationships to manage; with four nodes, that is 12. The moment you have so many relationships to manage, your code becomes tough to manage.
So what have we done about managing complex interactivity? We have a PyQuery object type called the connector object. It is basically a JSON object in your front end which simply holds two things: where clauses, and the relationships between nodes. So you have cities impacting the connector, and the connector impacting movies. Those are the "impacts" and "impacted by" relationships. The connector object does not have a DOM element associated with it.
What is the advantage of this approach? One, the number of relationships has dropped from 12 to 8. Two, when you want to debug your code, you end up with only one object that you need to debug. Your code is not all over the place.
How does the code look? This is how it looks. Cities impacts the connector object. Is it a cyclical relationship? No, it's just a one-way relationship. The connector object impacts movies and cinemas. Is it a cyclical relationship? Yes. Movies impacts the connector object, and cinemas impacts the connector object. And finally, the connector object impacts shows. So we've shrunk those eight relationships down to just three lines of code. Now when you change cities, all the other elements change automatically. You don't need to write any code that does the change. The moment the data changes, your DOM element can change.
So let me show you some live examples. These are really simple examples. As I'm clicking on the charts, everything is changing. This is again similar.
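The connector idea described above can be sketched in a few lines. This is only an illustration of the concept, written in Python for brevity rather than in the library's own JavaScript; the `Node` class, `add_impacts`, `set_filter`, and the `cyclical` flag are hypothetical names chosen for this sketch and are not PyQuery's actual API.

```python
class Node:
    """Hypothetical query node: a drop-down, a chart, or the connector itself."""

    def __init__(self, name):
        self.name = name
        self.impacts = []    # nodes that receive this node's where clauses
        self.filters = {}    # column -> selected value (the where clauses)
        self.refreshed = 0   # stand-in for "re-run the query and re-render"

    def add_impacts(self, targets, cyclical=False):
        """Declare that filters flow to `targets`, and back again if cyclical."""
        for target in targets:
            self.impacts.append(target)
            if cyclical:
                target.impacts.append(self)

    def set_filter(self, column, value):
        """What an on-change handler would call: store the clause, then propagate."""
        self.filters[column] = value
        self._propagate({column: value}, seen={self.name})

    def _propagate(self, clause, seen):
        for target in self.impacts:
            if target.name in seen:   # visit each node once, so cycles terminate
                continue
            seen.add(target.name)
            target.filters.update(clause)
            target.refreshed += 1
            target._propagate(clause, seen)


cities, movies, cinemas, shows = (Node(n) for n in ("cities", "movies", "cinemas", "shows"))
connector = Node("connector")   # holds filters and relationships only, no DOM element

# The three relationship declarations from the talk.
cities.add_impacts([connector])                          # one-way
connector.add_impacts([movies, cinemas], cyclical=True)  # two-way
connector.add_impacts([shows])                           # one-way

cities.set_filter("city", "Mumbai")
movies.set_filter("movie", "Some Movie")
```

After these two changes, `shows` and `cinemas` carry both the city and the movie filter, while `cities` is untouched by the movie selection because its link to the connector is one-way.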
This is the cinemas example, the similar UI. What we want to achieve here is: how can I get the data fetching and the data interactivity code written as quickly as possible, so that I can focus on the design part? Something like this would take around half an hour to get up and running.
This is another user interface, where the rendering is done with React. It turns out PyQuery works really well with React. With Angular, it becomes extremely slow, so we're still figuring out how to integrate with Angular. But with React, it's extremely fast.
What is the next big problem with data interactivity? How do you query in the browser? There are various open-source libraries that try to address this issue, like Lovefield from Google. We've tried to solve this problem too. It matters even more on mobile, since you always have bandwidth issues. So let's look at that. You have the DB. You write a PyQuery object and send it to the DB, and it returns the data. Once the data is in the browser or in IndexedDB, you can use another PyQuery object, change the adapter to in-browser, and query the data that is inside the browser. So had we implemented the movie ticket booking app in this mode, the moment you select Mumbai as the city, it makes a DB call, fetches all the data, and puts it in the browser, and then all the other queries happen on the fly inside the browser. You do not need to keep going back to the DB.
So why is PyQuery a better way of doing in-browser querying? There's Underscore, Crossfilter, and Angular. Angular does not support this, except for free-text search. With Underscore and Crossfilter, some of the more complex use cases, like group by on multiple columns, are a little tough to handle. With PyQuery, we've tried to build a near-SQL interface right inside your browser.
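To make the in-browser mode concrete, here is a minimal sketch of evaluating a JSON-style query against rows that have already been fetched, including a group by on multiple columns. It is written in Python for brevity; the function name `run_in_browser` and the query keys `select`, `where`, and `group_by` are assumptions made for this illustration, not PyQuery's actual query format.

```python
from collections import defaultdict


def run_in_browser(query, rows):
    """Evaluate a small select / where / group-by / count query over a list of dicts."""
    # Keep only rows that satisfy every equality condition in "where".
    conditions = query.get("where", {})
    matches = [
        row for row in rows
        if all(row.get(col) == val for col, val in conditions.items())
    ]

    group_by = query.get("group_by")
    if not group_by:
        # Plain select: project the requested columns.
        return [{col: row[col] for col in query["select"]} for row in matches]

    # Group by one or more columns and count the rows in each group.
    counts = defaultdict(int)
    for row in matches:
        counts[tuple(row[col] for col in group_by)] += 1
    return [dict(zip(group_by, key), count=n) for key, n in counts.items()]


# Rows that an earlier DB-mode query is assumed to have placed in the browser.
cached_shows = [
    {"city": "Mumbai", "cinema": "Cinema 1", "movie": "A"},
    {"city": "Mumbai", "cinema": "Cinema 1", "movie": "B"},
    {"city": "Mumbai", "cinema": "Cinema 2", "movie": "A"},
    {"city": "Pune", "cinema": "Cinema 1", "movie": "A"},
]

per_cinema = run_in_browser({"group_by": ["city", "cinema"]}, cached_shows)
mumbai_movies = run_in_browser(
    {"select": ["cinema", "movie"], "where": {"city": "Mumbai"}}, cached_shows
)
```

Both calls run entirely against the cached rows, which is the point of the in-browser mode: changing a filter does not require another round trip to the database.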
So you write your PyQuery JSON object exactly how you write it in DB mode, change the adapter from DB to in-browser, and it starts working.
What is the next problem with data-interactive front ends? You want to save the state of your interactivity. Let's imagine a use case: say I've built a hospital discovery app, and my friend's father needs this data as soon as possible. So I'm helping search for the right hospitals for his ailment. As I'm filtering, how do I take the end state of that interactivity and share it with my friend? Right now, people do this by appending parameters to the URL. Even when you filter in Flipkart, the URL keeps changing. But there's a limit to how much you can hold in the URL. With PyQuery, you just need to serialize the JSON objects, save them in the DB, associate a unique tiny URL code with them, and you can recreate the whole scenario.
Let's take it a bit further. We did a funny experiment. We saved every single PyQuery JSON object while a person was interacting with the front end, and then we could make an animated movie out of it. The whole session could be replayed, showing how he was actually pressing buttons on the front end. The next big feature this allows is undo. Implementing undo with interactivity is otherwise really complex.
So I showed you some of the examples. What's the objective with PyQuery? It's a fairly technical implementation that we've done. The objective was: can we get the data fetching and the interactivity part out of the way as soon as possible and focus as much as possible on the design aspect?
So what is the philosophical construct behind this? Let's say you're running a company, say a telecom company or a travel company. Your data remains the same.
As you run your business across the next five to 10 years, your data will remain the same. But if you use an MVC framework, you're going to closely tie your model layer to the view layer. You'll not have the flexibility to keep building new user interfaces on top of it very quickly. What's stopping a travel company from building a microsite which reuses the data only for a specific event, let's say the Kumbh Mela or the World Cup? You can have a microsite which is optimized for that. So if your main data is a river, is there a way to build a lot of these tiny front ends which draw from it and serve specific use cases? This is the objective of PyQuery.
How does it stand up against the various JS frameworks? Again, PyQuery is upstream of Angular and React. It is not competing with the existing frameworks. Its scope is everything to do with fetching the data and managing interactivity. We do not bind the data. For that, you can use whatever existing frameworks there are. So it's a completely different use case.
The code is open source. We've used it for multiple projects in the past few months. It's still a work in progress, of course. If you want to use the library, we are ready to provide free tech support. We can help you understand the concept and how to apply it to your use cases. If you need another back-end adapter, we can build it for you. Whatever is required for you to use PyQuery, we are ready to help you implement it. These are the GitHub links. I'll share the presentation. Questions?
On the slide, I just saw the database list, and it had only RDBMSs. Do you also plan to support any NoSQL databases? Or are you tightly bound to SQL querying?
Think of PyQuery as a JSON object which can be translated into any other database's query language. All we need is for the back-end layer to support another DB.
So let's say I'm writing the back end in Ruby or Scala. Typically all of these languages have a gem or a jar, a connector, which helps me connect to that database. I need to translate the JSON into that query language. It's not that tough.
Thanks.
Is it audible?
Yes.
So my question is, how much back-end or schema awareness needs to be there in the front end?
So here's the thing. We have two options in front of us today. The first option is that for every front end we need to build, the back-end programmer sits with the front-end programmer. He writes the M and the C part and exposes the APIs for you, and then you can start working. The alternative that we support with PyQuery is that the back-end programmer comes, writes a few JSON objects with you very quickly, and you can get moving. So even if the front-end developer is not aware of the schema, you can use the back-end developer to help you understand the data. It takes 15 minutes to write a few queries and get moving. You don't need to push code to production every time.
OK, got it. Sorry, I probably didn't understand that. One more question: is the schema exposed to the client side? Is your storage structure reflected in your PyQuery objects?
Yes, it is. But in many of the use cases we've worked on, we typically built views in the database to reduce the amount of data that's exposed.
Got it.
Hello. Something similar: how much schema information is exposed to the front end? I'm asking in terms of SQL injection. The moment you expose the back-end schema, there is a chance of people manipulating the actual query and getting others' information, which they are not supposed to get. It also requires ACLs, access controls. How has that been addressed?
We've used this in... I'll give you three different use cases.
The first use case is when you are building a customer-facing application, like Flipkart or any consumer-facing site. In that case, you are going to expose that data to the front end anyway. The whole objective of the data is to be exposed to the front end. And with views, you can restrict it.
The second use case is dashboards inside the organization. If it's inside the organization or it's password-protected, you don't even expose your queries to external users.
And finally, if it's an access-control-based case, you can append where clauses to these PyQuery objects, which will restrict the data. How do you do it? Not in the front end, but in the back end. The API layer can say: based on this token, I will superimpose a where clause. Let's say I'm looking at a credit card transaction list. The same query is being fired, but when the back end sees the token that's coming in, it injects one where clause into it.
With respect to SQL injection, the back-end code which converts the JSON object to a query has a lot of checks that restrict what it can do. You can't do a delete, for example. Based on a little bit of text search, we restrict a lot of operations. So it's not based on prepared statements. The SQL injection protection is not done with the prepared statements that most database drivers provide. It's done by checking for certain restricted characters. We are already using the gems that these languages provide, and on top of that we are doing an extra check.
OK, thanks.
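As a rough illustration of the back-end side discussed in this answer, the sketch below translates a JSON-style select query into SQL and superimposes a where clause derived from the caller's token. Note that it deliberately uses a different technique from the character checks described in the talk: identifiers are validated against an allow-list and values are passed as bound parameters. The `ALLOWED` table, the `to_sql` function, the query keys, and the sample data are all hypothetical, and Python's standard `sqlite3` module stands in for whatever database the real API layer talks to.

```python
import sqlite3

# Hypothetical allow-list: the only tables and columns the API will expose.
ALLOWED = {"transactions": {"id", "user_id", "merchant", "amount"}}


def to_sql(query, forced_where=None):
    """Translate a JSON-style select query into (sql, params)."""
    table = query["table"]
    if table not in ALLOWED:
        raise ValueError(f"table not allowed: {table}")

    where = dict(query.get("where", {}))
    where.update(forced_where or {})   # server-side clauses override the client's

    columns = list(query["select"])
    for col in columns + list(where):
        if col not in ALLOWED[table]:
            raise ValueError(f"column not allowed: {col}")

    # Identifiers come only from the allow-list; values are bound parameters.
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in where)
    return sql, list(where.values())


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE transactions (id INTEGER, user_id INTEGER, merchant TEXT, amount REAL)")
db.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [(1, 7, "Books", 250.0), (2, 7, "Food", 90.0), (3, 8, "Books", 400.0)],
)

# The same query object is sent by every client.
client_query = {
    "table": "transactions",
    "select": ["merchant", "amount"],
    "where": {"merchant": "Books"},
}
user_id_from_token = 7   # assumed to come from a verified session token
sql, params = to_sql(client_query, forced_where={"user_id": user_id_from_token})
rows = db.execute(sql, params).fetchall()
```

Because the `user_id` clause is added on the server, a client cannot widen the result by editing the query object, and a query that names a table or column outside the allow-list is rejected before any SQL is built.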