Okay, so today is our last session. Luckily this particular session is not very long, so we should finish by about 7:10 or so, and we will conclude with an overview of what we call service-oriented architecture, the modern way of constructing systems. To understand those things, we will begin by looking at distributed systems and web-enabled applications.
So far we have been looking at information systems: conceptualization, creation, design and so on. We have looked at database technologies, we have looked at the internet, we have looked at analysis and design, and we have understood SQL, but we have not really looked at very large systems. We hinted at many places that there could be applications where part of the application runs on one server here, part runs on some other server there, and you need to make these parts talk to each other. When multiple applications, or a single application, are deployed at multiple places, not necessarily replicating the same feature but doing different parts of the work, you call them distributed applications. We shall see what exactly distributed applications are, what peculiar problems are associated with them, and how those problems are handled.
Since most modern applications provide a front end that is purely browser based, these applications are now called web-enabled applications. In short, most of the modern applications you will see are web-enabled distributed applications, and this is what you will have to construct, design, manage or use when you go out in the field. So this is what we are going to look at today. We will first look at web enabling and the concept of e-commerce, which is an important notion. Subsequently we will look at distributed applications, their general nature, and particularly the web services model.
The latter part we will do in the next session, but first let us understand web-enabled applications, particularly the example of e-commerce. Web-enabled applications have a very straightforward definition: they are applications which exploit the web. The web has two components. One is the connectivity component, which is provided by the internet. I hope you appreciate that the World Wide Web and the internet are two different things. The internet is the basic connectivity between different computers across the world, and it came about a long, long time ago, well before HTTP and HTML. It was at the heart of the early boom in email and other applications. The web came into wide use around 1993, but it depended on and exploited the internet. So the web has connectivity, which is essentially internet connectivity. On top of that, the web provides the ability to access websites and to use search engines such as Google and Yahoo; you are all familiar with those.
E-commerce, on the other hand, refers to commercial activities carried out electronically. Amongst web-enabled applications in general, we will look particularly at e-commerce applications as an example, to illustrate how widespread the use of such systems is. In the process you will also understand the need for a proper distributed system.
So let's begin the discussion on e-commerce applications. Many of you would be familiar with most of these things if you surf the web and see what is available there. Advertisement is the first thing that hits you. Even if you do a Google search, typically on the right-hand side you will find Google sponsored links, right? What are those? Nothing but advertisement. In fact, many of you should be wondering how it is that Google and Yahoo are able to provide such huge services to the whole of humanity: unlimited email facilities, email exchanges and so on. After all, these things cost money. So who pays money to Google?
If you and I don't pay, and all of us have either a Yahoo account or a Gmail account, that Gmail account must still sit on some server. Whenever I send an email, that email must go through the internet to that server, and from there go somewhere else. When people upload photographs, those sit on huge server farms. When people make searches, do you pay for the search? You don't. You just search. So who pays for that infrastructure? And that infrastructure, as I think I once mentioned, consists of a server farm of upward of 5000 servers, 5000 nodes, all running Linux and the MySQL database, but on a very, very sophisticated architecture which permits a query to be broken up and sent to multiple places. Some servers are doing web crawling, some servers are answering queries, some servers are handling your email. Who pays for all that? Somebody must be paying more than the cost, because Google is a hugely profitable company. So where does that money come from? Any guess? Advertisement. Advertisement revenue is upward of a trillion dollars across the world. So it is a huge revenue, and the web is capturing that revenue.
Just as an aside, a non-technical discussion: do you know how the rates of advertisement are fixed? You know the newspaper rates. If you want to advertise in a newspaper, you pay so much money per column centimeter for one appearance, and so much for repeated appearances. If you want the Times of India editions in New Delhi, Chennai and Kolkata also to be included, so much more money, and so on. That is how people ordinarily pay. Now, on the web, what is the notion of size? There is none. So on what basis would people pay for advertising? The number of clicks. Clicks by whom? By the end user. This is called eyeballs. Eyeballs, as in eyes: how many eyes are likely to visit my advertisement or see my advertisement? After all, what is the advertiser's wish? More and more people should know about my products and services. That is the purpose of advertisement.
So how many people are likely to see my products and services? If there are going to be more people, I will be willing to pay more. If there are going to be fewer people, I will be willing to pay less. It is just like inserting a single advertisement in the Times of India, Mumbai, versus inserting an advertisement in the Times of India in Mumbai, Kolkata, Delhi and so on: more people would see it. The whole thing is probability based, because many people purchase newspapers but not many read all the advertisements. Still, that is the probability. So the number of eyeballs is what is important. Now the Googles and Yahoos of this world have convinced all major suppliers of products and services across the globe that they have the maximum eyeballs. You can easily see that this is trivially true. Just look at the number of people who have Gmail or Yahoo accounts, or the number of people who search using Google or Yahoo. Using that, they attract the advertisement.
Advertising on a website is not a very straightforward matter. Whatever you print on paper is not the same thing as putting it on screen. You require screen layouts, graphic designers and so on. And you require mechanisms by which the server displays not only static information, as it appears in a newspaper, but also dynamic information as changes happen. If the product changes, you don't have to redo the whole artwork. You may just maintain an application system which displays the artwork and inserts the appropriate data values from a back-end server. You can see very clearly a back-end database working in cohesion with the web front end.
Coming back to transaction processing on the web. First of all, you should understand a payment transaction. You are familiar with payment transactions. You go to your canteen or the cafeteria here, you buy one dosa or whatever, eat something, and then pay money. When you pay money, that is a transaction.
In a cafeteria, it is not uncommon that you eat first and then pay. But if you go to a grocery store, when that fellow gives you 1 kg of sugar, you pay him almost simultaneously. In short, services or goods are exchanged against money. The payment transaction is the transfer of money from the giver to the receiver. This may appear simple. I take out a 20 rupee note and hand over the 20 rupee note, and the matter is over. However, as you would be familiar, payments are not made only using currency. You write checks, right? So there is check payment. Do you know what check payment implies? Say you give me a check for, whatever, 500 rupees. Do I get those 500 rupees immediately? No. What will I do? I will have to go to the bank and deposit that check in my account. That check might have originated in your father's bank in Kolkata, in which case somehow a settlement will happen (we briefly discussed that kind of scenario once) and finally the money will land up in my account.
There are a large number of possibilities. The check may get lost somewhere. The check may reach there but the signature may not tally, so that fellow will definitely return it. Or the signature tallies but there is no money in the bank account of the person who gave me that check. A variety of things may happen, and therefore my 500 rupees are at risk. This is called a credit risk. Now, this is a risk I have taken. The bank is not responsible for it, because I took this risk when I accepted the check from him, and I have other legal recourse, like going to court, asking the police to arrest him, et cetera. But I may still not get those 500 rupees. Any regulator in any nation which supervises payment transactions wishes to ensure that if a risk is taken by any organized segment, then that risk is covered somewhere. If the risk is not covered, there is a possibility of a loss. Consider, for example, credit card payment.
I pay by credit card, let's say a monthly bill of 5000 rupees to my telephone company or grocer or something like that. When I pay by credit card, even in a hotel, there is no cash transaction happening. Instead, I am getting an authorization from my credit card banker or credit card supplier. The fellow who collects that transaction, which is a digital transaction, actually gets paid within 24 or 48 hours by the bank which has issued me the credit card. Have I paid that bank yet? No. When will I pay? When I get my monthly bill. At that time I pay the total to that particular banker, saying: okay, on this credit card I have spent so much money, and I am paying you this.
Again, here is a small issue. How does the bank earn from the credit card business if it permits me to spend like that throughout the month and charges me only at the end of the month? How does it manage to pay money out earlier but collect from me later? Money costs money, right? There is no free lunch. These charges are typically collected from the suppliers of goods and services. All the merchant establishments typically pay a charge of about 2% per transaction. But they are not supposed to charge you and me; this is a facilitation. The idea through which the banks convince the merchant is: look, if Phatak comes to your hotel and has to pay cash, then depending on how much money he has, he may eat only this or that. But if he comes with a credit card, he may eat a lot of things and you will make more money, because credit card limits are much more than what you and I can eat at one time. So this is the supposed attraction.
However, the banker is at risk. Suppose the bank has paid all the bills throughout the month, and at the end of the month I throw up my hands saying I don't have any money, or I won't pay you. What happens? Of course it will charge me interest, penalties, whatever.
The banks have to evaluate that risk on the totality of how many people will not pay at the end of the month, or, if they can't pay on time, how many people will not pay interest subsequently. That is their business model. Luckily, in a country like India, people pay. In fact, the Citibank credit card chief told me 10 years ago that this is a lousy business in India because 95% of his customers pay in time. They are not supposed to pay in time; they are supposed to pay interest by saying, okay, I'll pay only this much per month, and so on. Americans do that very well. They never pay in time, they always pay late, they pay interest, and many of them don't pay interest and many don't pay anything at all later. So there is a huge problem. That society lives on credit; it lives on future money. Indians, being part of the oriental culture and philosophy, live on yesterday's money: we generally don't spend unless we have money in our pocket. There is a very good thumb-rule distinction. The orientals live on yesterday's money, the Americans live on tomorrow's money, and the Europeans live on today's money. So you have the maximum number of debit cards in Europe: I have money in my account, I spend it, that kind of thing.
Coming back to transactions on the web, some of you would have conducted an internet credit card transaction. So what do you do? Typically you have a form to fill in, in which you give the credit card number, some three-digit number which is a credit card authorization code, and so on. Many credit card suppliers now insist that you should have an internet password to do an internet transaction, and then of course you make the payment. However, whom will you make this payment to? You are not making a payment to the bank itself. You are making a payment to somebody else, say the railways. Now, if you are going to make a payment to the railways, you will first go to the railways site to make a reservation.
When the reservation is almost done (almost, in the sense that you have figured out that a berth is available, you have given your passenger names, et cetera), the railways site will ask: now do you want to pay? Everything is fine, but when I say yes, the railways cannot collect money from me directly. They have to go to my card-issuing bank. So they will ask me through which credit card I want to pay. If I say a Citibank credit card or a State Bank credit card, they will take me to what is known as a payment gateway. "They will take me" means who? The railway server.
Look at it this way. The railway server is located somewhere in Delhi. My credit card server may be located in Chennai. I am in a transaction with the railways for a reservation. It is a connection-oriented transaction; I am in that session. Suddenly, while that session is on, without disconnecting me, they are required to transfer me temporarily to a credit card session somewhere else. So imagine the information system and network messaging scenario that must be happening in tandem. They must go to Citibank or State Bank with, at the least, these details. First, a transaction number, some ID by which they can recognize later what the details were. Second, my name, that is, who is paying the money. Third, the amount that they want me to pay. These details they will send to the Citibank server. The Citibank server, through its secure payment gateway, will then collect an authorization from me that yes, I want to pay this money. When I complete that, Citibank will hand the connection back to the railway server, confirming that for this transaction this much amount has been paid on this date, at this time, and that this is my transaction ID, or some such. You can understand what must happen internally.
So, essentially, executing transactions on the web involves information collection, which may require form filling. It may also require prior registration.
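The hand-off just described (reservation server to payment gateway and back) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how any real railway or bank system is built: the class names `ReservationServer` and `BankGateway`, the field names, and the password check are all hypothetical, and a real gateway would add encryption, signed messages and network calls.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional
import uuid


@dataclass(frozen=True)
class PaymentRequest:
    """What the merchant sends to the bank: ID, payer and amount."""
    transaction_id: str   # lets the merchant match the reply to the booking
    payer_name: str
    amount_rupees: int


@dataclass(frozen=True)
class PaymentConfirmation:
    """What the bank sends back once the customer has authorized payment."""
    transaction_id: str
    amount_rupees: int
    paid_at: datetime
    bank_reference: str


class BankGateway:
    """Hypothetical stand-in for the card issuer's secure payment gateway."""

    def __init__(self, passwords: Dict[str, str]) -> None:
        self._passwords = passwords  # card number -> internet password

    def authorize(self, request: PaymentRequest, card_number: str,
                  password: str) -> Optional[PaymentConfirmation]:
        # The customer authenticates to the bank, never to the merchant.
        if self._passwords.get(card_number) != password:
            return None
        return PaymentConfirmation(
            transaction_id=request.transaction_id,
            amount_rupees=request.amount_rupees,
            paid_at=datetime.now(timezone.utc),
            bank_reference=uuid.uuid4().hex,
        )


class ReservationServer:
    """Hypothetical stand-in for the merchant (railway) server."""

    def __init__(self) -> None:
        self.bookings: Dict[str, dict] = {}

    def start_payment(self, payer_name: str, amount_rupees: int) -> PaymentRequest:
        # Remember the pending booking under a fresh transaction ID.
        transaction_id = uuid.uuid4().hex
        self.bookings[transaction_id] = {
            "payer": payer_name, "amount": amount_rupees, "status": "PENDING"}
        return PaymentRequest(transaction_id, payer_name, amount_rupees)

    def complete_payment(self, confirmation: Optional[PaymentConfirmation]) -> bool:
        # Confirm the booking only if the bank's reply matches what was asked.
        if confirmation is None:
            return False
        booking = self.bookings.get(confirmation.transaction_id)
        if booking is None or booking["amount"] != confirmation.amount_rupees:
            return False
        booking["status"] = "CONFIRMED"
        return True
```

The point of the transaction ID is visible in `complete_payment`: when control returns from the bank, the merchant has nothing else with which to tie the confirmation back to the half-finished reservation.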
Payment on the net could be a credit card payment, or it could be a general banking transaction. You can actually transfer money from one account to another account, you can make payments from your bank account, and so on. Multi-party transactions are what we just saw, a railway or airline reservation. There is a user involved, there is a reservation server, and there is a payment gateway. This type of transaction is called a business-to-consumer transaction. Unfortunately the field of information technology is full of acronyms; I think we are aware of that by now. So it is called B2C: capital B, the numeral 2, capital C. You can immediately construct multiple such acronyms. B2B means business to business. B2C means business to consumer. What is G2C? Government to consumer. So there is government to business and government to consumer. Of course multiple interpretations are possible like that. What could C2C be? Consumer to consumer. What kind of business is that? Well, someday I should be able to do a transaction between two individuals, even without an intermediary. Today, even when I transfer money from my bank account to your bank account, it is still called a B2C transaction, because the bank is involved as an intermediary. So these terms are commonly used.
Web-based share trading. Does anybody do that? Day trading or something? You do that; wonderful, at least one fellow. In Korea, it seems, in every university half the class is doing day trading on mobile while sitting in class. That is of course very distracting. I hope you don't do that. Luckily our schedule is such that trading closes before he comes here, so that's okay. So you would be familiar with web-based share trading. Effectively, it is not just payment alone, and it is not just a reservation kind of thing alone. I am buying and selling securities. I have so many shares of Larsen & Toubro to be sold, so many shares of Infosys to be purchased, and so on.
Obviously the application systems involved in such a scenario would include the exchange itself, the share exchange: the Bombay Stock Exchange or the National Stock Exchange. There will also be a broker involved. Till today India does not permit direct transactions by the end consumer with the National Stock Exchange; it is only through a broker. Direct access will happen slowly, and the reason again is risk. Suppose I place an order for 10 Infosys shares directly with the National Stock Exchange, and the exchange says: okay, at this price, 10 shares sold to Phatak. Tomorrow, when the 10 shares have to be delivered to Phatak, Phatak has to pay. He cannot back out now, because the seller has already sold; it was written in his books yesterday. And imagine if the Infosys share falls significantly the next day: who will purchase those shares? These are all financial risks that need to be covered, and therefore consumer transactions usually happen through brokers. If you want an end-to-end transaction, a bank would also be involved.
In India, for example, there is an entity called National Securities Depository Limited (NSDL), which actually maintains all the stocks in dematerialized form. So there are no share certificates with me or with anybody; the shares are all with NSDL in dematerialized form. Do you know the value of the shares that the NSDL servers hold? These are IT systems of the kind we have discussed here, which maintain the entire data for all the stocks held by all the people. Of course there is another depository, CDSL, but CDSL holds something like 20% of the total stocks and NSDL holds 80%. Do you know the value? Almost 1 trillion dollars. One trillion dollars of net value is sitting physically in some servers. I won't tell you where they are, otherwise you will get some nasty ideas.
But you can see the importance not only of the systems, but of the security associated with such systems, and of their high availability and guaranteed 24 by 7 operation. That system cannot fail. That system must work, whatever happens. So these are some of the things that we need to worry about in the larger picture.
Now, even when connectivity is guaranteed, that is, you assume that the internet is perfect and network connectivity is okay, there are still several issues that need to be addressed. Some of the issues, those with financial or commercial implications, have been mentioned. The issues related to information systems are what we are primarily concerned with. Let's look at some of these.
First of all, for transactions in physical space, the transacting persons know each other, or can otherwise verify authenticity as also the authorization to transact. What is the difference between authenticity and authorization? Authenticity is to prove that I am Deepak Phatak. Authorization is to prove that I indeed have the power to execute that transaction. For example, if I go to the bank and sign a withdrawal slip to collect 2000 rupees, first I have to prove that I am Deepak Phatak; nobody else should be masquerading as me. Second, my signature, which is the authorization, has to be verified as my signature. The point being made here is that when transactions happen in physical space, when I go physically to the bank, invariably some bank clerk or other will know me, and invariably they will physically cross-check my signature and so on, because this is a physical document. It is unlikely that somebody could surreptitiously forge my signature, masquerade as Deepak Phatak, and sign. There is physical security for all of these things. Goods and money change hands almost simultaneously, as I said, so less risk cover is needed. And signed contracts for goods and services are legally valid.
Say you place a purchase order for a computer with a vendor, Maximum Computers. When you place the purchase order, he starts assembling things and delivers the machine after 3 days or 5 days. IIT Bombay cannot say after 5 days, "No, I never ordered this." There is a physical purchase order which is legally valid, and he can take IIT Bombay to court. IIT Bombay of course will never do such fancy stuff, but the point is that no party should be able to renege on the deal. Whether it is a contract or a payment, the execution of all these transactions has to be non-repudiable. No party should be able to repudiate: "I never said that", or "I never paid that", or "I never wanted to pay that". Nothing of that sort can be claimed. This works in physical space because of signed contracts, legal systems, everything.
How do you do that in the digital space? What is the validity of a contract? You send an email order to Maximum Computers asking for a computer. He supplies it, and then you claim you never sent that email. How will he prove that you sent the email? You can repudiate it. So plain email is not a non-repudiable mechanism. You need something stronger than that. What could that be? Consider every transaction that you conduct in the digital space, whether it is for services, goods or payment. Sometimes there are no physical goods at all, only electronic goods. Say, for example, you want to buy an electronic book and pay electronically. You pay through a credit card and the electronic book does not come to you. Subsequently that fellow says, "No, no, I dispatched it on that day; you just have not received it." So there has to be an end-to-end transaction guarantee of some kind in the digital space. How do you provide a guarantee of authenticity, a guarantee of authorization, a guarantee of non-repudiation? These suddenly become important questions. All of these are termed e-security questions for transactions in the digital space. Security in e-commerce is ensured through, for example, firewalls.
So, for example, from the outside world you typically cannot get on to your departmental server, because there are firewalls there. Confidentiality is ensured by encryption. There are typically two types of encryption. One is called symmetric key encryption. The other is called asymmetric key encryption. By the way, encryption, or e-security, network security or transaction security, is itself the subject of one full-semester course at most places, including IIT Bombay. So please do not assume that this small description will make you experts in security. It is just to give you a flavor.
Some of you would have done this kind of encryption. Given a message, standard encryption says you have a key, and the scheme is called a cipher. Say there are numbers in the message, and you say: I replace 2 by 5, 3 by 7, 1 by 9. Or there are letters, and you say every A is replaced by Z, every F is replaced by P. This table is the key. Using this key you take a message and encrypt it. Nobody can understand the encrypted message, because it will be all jumbled up. But if you have the key, you can understand the message. Now what I do is send the encrypted message one way, and through a secure mechanism I send the key to my friend. My friend receives the message, looks at the key, and deciphers it. That is called decryption. Since the same key is used for encryption and decryption, this is called symmetric key encryption.
Some of the better known algorithms for symmetric key encryption are DES, the Data Encryption Standard, and 3DES, or triple DES. It is called triple DES simply because the DES algorithm is applied three times. You encrypt using DES, then the encrypted message you again encrypt using DES, and that encrypted message you encrypt once more using DES. Why all this extra labor? Please remember that no encryption can be guaranteed to be foolproof. You have heard of spies who decrypt your messages. They guess the key: if this were the key, what would the message be?
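The letter-replacement scheme described above is small enough to try out. The following Python sketch is a toy under stated assumptions: the function names (`make_key`, `encrypt`, `decrypt`, `triple_encrypt`, `triple_decrypt`) are made up for illustration, the key is derived from a seed purely for convenience, and this is not DES. In particular, stacking three substitution tables only produces another substitution table, so unlike real triple DES the layering here adds no strength; it is shown only to illustrate the "encrypt, then encrypt the result again" idea and that decryption must undo the layers in reverse order.

```python
import random
import string
from typing import Dict, List


def make_key(seed: int) -> Dict[str, str]:
    """Build a substitution table: each capital letter maps to another letter.

    The seed stands in for the shared secret (an assumption for this toy).
    """
    letters = list(string.ascii_uppercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encrypt(message: str, key: Dict[str, str]) -> str:
    # Letters are replaced via the table; digits and spaces pass through.
    return "".join(key.get(ch, ch) for ch in message.upper())


def decrypt(ciphertext: str, key: Dict[str, str]) -> str:
    # The same key is used, read in the opposite direction (symmetric key).
    inverse = {cipher: plain for plain, cipher in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext)


def triple_encrypt(message: str, keys: List[Dict[str, str]]) -> str:
    # Apply the cipher once per key, feeding each output into the next round.
    for key in keys:
        message = encrypt(message, key)
    return message


def triple_decrypt(ciphertext: str, keys: List[Dict[str, str]]) -> str:
    # Undo the rounds in reverse order.
    for key in reversed(keys):
        ciphertext = decrypt(ciphertext, key)
    return ciphertext
```

Both parties must hold the same table, which is exactly the key-distribution problem mentioned in the lecture: the key has to travel to the friend over some separate secure channel.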
Earlier, people would guess the key by hand calculation, when messages were encrypted by hand. Today, with digital messages, you cannot figure out a key easily. But suppose you deploy large computers on which you perpetually run algorithms that keep trying what the key could be. Maybe you can break that key, and so break that cipher. The purpose of the security people, therefore, is to ensure that the encryption is as unbreakable as possible, and that even if it is broken, the time it takes is sufficiently long that the entire transaction is completed before the message can be decrypted. In this sense triple DES is supposed to be more secure than DES. But neither of them is completely free of the possibility of cracking.
Among asymmetric key algorithms, the most well known is the RSA algorithm. RSA is named after the three people who created it: Rivest, Shamir and Adleman. This algorithm is based on a very interesting concept, which uses the notion of relatively prime numbers and modular exponentiation. There is a key pair, a public key and a private key. I encrypt using my private key and send the message to you. If you have my public key, you can decrypt that message, and that tells you it came from me. My private key is not known to you. Similarly, if you want to send me a message, you encrypt it using your private key and I decrypt it using your public key. (For confidentiality the roles are reversed: you encrypt with my public key, and only my private key can decrypt.)
Now these key pairs have to be issued to the transacting parties, and the public keys distributed. There are third parties which provide such keys. Usually the keys used to encrypt such messages are certified digitally by an agency, so there is something called a digital certification agency. Digital certificates are issued by mutually trusted parties. You and I may not trust each other, but the two of us may both trust, let us say, the National Informatics Centre. The National Informatics Centre will issue digital certificates to you and to me.
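The private key and public key roles can be seen with deliberately tiny numbers. The sketch below is a toy under stated assumptions: the primes 61 and 53 and the exponent 17 are a common textbook choice, the function names are made up for illustration, and messages are plain integers smaller than n. Real RSA uses primes hundreds of digits long together with padding and hashing, none of which is shown here. It relies on Python 3.8 or later for the modular inverse via `pow(e, -1, phi)`.

```python
from math import gcd
from typing import Tuple

Key = Tuple[int, int]  # (exponent, modulus)


def make_keypair(p: int, q: int, e: int) -> Tuple[Key, Key]:
    """Return (public_key, private_key) from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    # e must be relatively prime to phi, otherwise no private exponent exists.
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p-1)*(q-1)")
    d = pow(e, -1, phi)  # private exponent: (e * d) % phi == 1
    return (e, n), (d, n)


def sign(message: int, private_key: Key) -> int:
    # "Encrypt with my private key": only the key owner can produce this.
    d, n = private_key
    return pow(message, d, n)


def verify(message: int, signature: int, public_key: Key) -> bool:
    # Anyone holding the public key can check who produced the signature.
    e, n = public_key
    return pow(signature, e, n) == message


def encrypt(message: int, public_key: Key) -> int:
    # For confidentiality the sender uses the RECIPIENT's public key.
    e, n = public_key
    return pow(message, e, n)


def decrypt(ciphertext: int, private_key: Key) -> int:
    # Only the recipient's private key recovers the message.
    d, n = private_key
    return pow(ciphertext, d, n)
```

With `make_keypair(61, 53, 17)` the modulus is 3233 and the private exponent works out to 2753. A signature made with the private key verifies against the public key, and it fails to verify for any other message, which is the property that digital signatures and non-repudiation build on.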
Using those digital certificates you can sign a transaction, which can then get encrypted. When I get your message, I will decrypt it using your public key and so on, but to ensure that it is really you who sent it, I will check the certificate which accompanies it. It is a very complicated process of merging everything together, but that process works. Asymmetric key algorithms are considered far superior to symmetric key algorithms and are generally considered unbreakable in practice. Theoretically they are breakable, but the amount of compute power required is very large: you would need to run a supercomputer for many, many hours to break somebody's private key. So it is not considered worth it.
In every country there is a certification authority, an authority which issues such certificates: the trusted common party. And there is an apex body which in turn certifies all these certification authorities. The certification authorities which can issue certificates are called CAs. In India, for example, Tata Consultancy Services is a CA, the National Informatics Centre is a CA, and so on. The apex body is the Controller of Certifying Authorities, or CCA. The CCA today is one Dr. Vijayaditya, who sits in Delhi, and his office monitors all these things.
This whole setup is called PKI. That is another terminology you should note: PKI means public key infrastructure. The country has established a public key infrastructure, and you will be glad to know we did that ahead of many other countries. Of course the United States has a full PKI, but India is amongst the first few countries to establish the legality of digitally signed documents. Do you know, for example, what is ordinarily required in every country if you want to show proof of, let's say, ownership of a flat, or proof of a contract signed by both parties?
The law requires that actual physically signed copies be submitted. In our country, if these signed copies have been scanned, or if a digital order which is otherwise valid has been sent to you in encrypted form, appropriately certified by digital certificates, then such certified copies of digital records are considered valid legal proof. This was done through the Information Technology Act of 2000. That is not very old, but we are amongst the first few countries in the world to guarantee the legality of digital content in this fashion.
Authentication means both parties prove their identity to a trusted third party before starting the transaction. Then there is non-repudiation. So these are the four things: authorization, confidentiality, authentication and non-repudiation. Remember I mentioned non-repudiation? It means proof that the item or document originated from you and you only. That is where digital signatures come into the picture.
So here are some of the nomenclatures that I briefly mentioned. B2B is business to business. Any inter-bank transaction would fall into that category. So would transactions between the auto industry and its ancillary suppliers. Suppose somebody makes carburetors, or somebody makes fuel pumps for cars, and all those fuel pumps are sold to Maruti. Maruti is a large business organization. The carburetor maker is a smaller but still reasonably large business organization. When transactions occur between these two businesses, whether it is placing purchase orders, sending invoices, collecting money digitally, or transferring an amount from one bank to another, all of these can now happen digitally. These kinds of transactions are called B2B transactions.
Government-to-business transactions I mentioned: corporate tax returns, company filings. Are you aware of this? When you and I pay tax, we file an income tax return. When Tata Consultancy Services pays tax, or when Reliance Industries pays tax, that tax is of course huge, and it is not paid in the ordinary way.
So there is a tax filing which is required to be done on a regular basis. All that can be done digitally now. Then there are company filings. There is a department of corporate affairs; earlier it was called the department of company affairs. It requires by law that every legal business entity registered in the country must submit information regularly to that department, and, if it is a listed company, must also submit some information regularly to SEBI, the Securities and Exchange Board of India. The information required to be submitted to the department of corporate affairs consists of your annual reports, changes in directorships, appointment of new directors, any instructions that you issued, any corporate actions, all of these. Annually, companies are required to file these things. Again, India is one of the few countries to have revolutionized this filing, and now all such filings by all companies can be done electronically, again using digital signatures and so on.
Government to citizens. You have seen many such things. You want to apply to the municipal corporation for a water connection to your house, or you want to register, let's say, the birth of somebody. All of these transactions can be done digitally. Applications can be filed digitally. Certificates can be issued digitally; they can be printed on some special paper if required. But all the transactions, the applications, the payment of money, all of these can happen digitally. In fact, municipal corporations have identified something like 37 different services which they would like to give digitally to people. One municipal corporation which won laurels all across the country for being the first to offer the maximum number of such services online, including complaint registration and so on, was the Kalyan-Dombivli municipal corporation. Not the Bombay corporation, not Pune, but Kalyan-Dombivli. It's exciting. If somebody lives in that area, I request you to go and visit that place. It is a remarkable IT implementation.
Completely computerized systems, applications developed and running. Of course they outsourced application development, but that activity is done. Now let us consider distributed applications. These have been evolving since much before the advent of the web. The moment connectivity happened, there were bound to be distributed applications. Consider the first network, ARPANET. You remember ARPANET? In 1969 the United States Department of Defense connected four mainframe computers and then provided connection to various universities and research labs. So if a research professor in some place is running part of a program on one mainframe and part of the program on another mainframe, trying to correlate results, isn't that application a distributed application? Nobody called it a distributed application then. But the purpose of making this observation is that distributed applications came much before the web. In general, in the real world, organizations have multiple applications. Each company will have multiple applications; take for example TISCO. TISCO will have an enterprise resource planning application. If it does not have an ERP application, it will have an HR application, an accounting application, inventory control. In fact, traditionally all small and big companies who started using computers started with these three applications: payroll, accounting and inventory. These are the areas where maintaining information is most painful. So each organization has multiple applications, and these applications almost always have to run on multiple sites. Take a bank. Say a State Bank of India branch deals with 50,000 customers. It must be able to run that application autonomously. Earlier, when connectivity in the country was not good, the State Bank of India Powai branch had a server connected over a local area network to terminals, through which all the customers were serviced. Similarly, the Ghatkopar branch of SBI would be running the same application on another server.
There may be a branch, let's say the Bombay main branch, which would be running not just that application but some other application as well, because that branch does government business. The State Bank of India IIT Powai branch does not do any government business. So payment by people of income tax, payment by people of this tax or that tax, whatever, would be handled by certain branches. So there could be some applications which are run in only some of the offices, and a common application which is run in most of the offices, but they all run autonomously and independently. And the organization now has to collect all the data from all these distributed sites and make sense of it at a corporate level. Additionally, these applications running for an organization need to exchange information with other organizations. Go back to the example of railway reservation as an application. Imagine that I am doing a booking from Mumbai to Delhi and then Delhi to Calcutta, and imagine that the railway has two different servers, one in Mumbai and one in Delhi. The two legs are confirmed by two different servers, let's imagine. Yet I would like the facility that I seamlessly get the reservation for both legs and make a single payment for both legs. I don't care whether Central Railway collects the money, Western Railway collects the money, or Eastern Railway collects the money. So you need these organizations to exchange information among the servers of their own offices, plus with the servers of external organizations, right? These, then, are the requirements of a distributed application. In this context, the databases that you have studied are all singular databases. You define a schema. You know that schema will exist on a single server. You populate data into those tables. The data will exist there. There may be multiple users, but they are all accessing a single database, physically as well as logically.
A distributed database may exist physically in a fragmented form on multiple servers, but it is expected to give a single logical schema. So as far as you and I are concerned, we deal with the schema. You and I are not bothered about whether this table is on that server, or whether five columns of this table are on one server and the remaining seven columns are on a server in Chennai, or whether some rows are at this site and some other rows are at some other site. So you and I want a single logical schema, but the data may be spread across servers at different locations. This can be achieved in multiple ways. One is schema partitioning, either horizontal or vertical. Horizontal means some rows at one location, some at another, like employee information. Imagine IIT is a single system. IIT Mumbai has employees, IIT Chennai has employees. Ordinarily there is no reason for IIT Mumbai employees' information to be available in IIT Chennai, because 99% of the time the activities at each site will pertain to its own people. So I can keep the same logical schema, with data about IIT Mumbai employees on the IIT Mumbai server and data about IIT Chennai employees on the IIT Chennai server. If the minister for some reason wants to know the professors across both servers, he, I and the Chennai fellow are all looking at a single logical schema and writing a single query. The query should then pull out some data from the IIT Mumbai database and some from the IIT Chennai database, join or whatever, and supply a single result. So this is horizontal partitioning of the schema. Vertical partitioning, in practice, does not usually mean you take a table and split off a few columns as I mentioned; that is rarely worth the trouble. Typically you would store some tables at one place and some other tables at another place. And then there is data replication. Why replication? Suppose the database goes down.
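Before going further into replication, the horizontal partitioning idea described above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not how any particular distributed database product works: two in-memory SQLite databases stand in for the Mumbai and Chennai servers, and the table name `employee`, its columns, the sample rows and the helper `query_all_sites` are all hypothetical names invented for this example.

```python
import sqlite3

# Assumption: each site holds the SAME schema but only its own rows
# (horizontal partitioning). In-memory SQLite stands in for a site server.
SCHEMA = (
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, designation TEXT)"
)


def make_site(rows):
    """Create one site's fragment of the employee table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Hypothetical sample data for the two fragments.
sites = {
    "mumbai": make_site([(1, "Asha", "Professor"), (2, "Ravi", "Lecturer")]),
    "chennai": make_site([(101, "Meena", "Professor"), (102, "Kumar", "Registrar")]),
}


def query_all_sites(sql, params=()):
    """Run one query against every fragment and merge the rows,
    so the caller sees a single logical employee table."""
    merged = []
    for site_name, conn in sites.items():
        for row in conn.execute(sql, params):
            merged.append((site_name,) + tuple(row))
    return merged


# The "minister's query": professors across both servers, asked once.
professors = query_all_sites(
    "SELECT emp_id, name FROM employee WHERE designation = ?", ("Professor",)
)
for site_name, emp_id, name in sorted(professors):
    print(site_name, emp_id, name)
```

The person asking writes one query against one schema; the fan-out to the fragments and the merging of rows is hidden inside the helper. A real distributed database does this in its query processor and also has to deal with joins, failures and performance, which this sketch ignores.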
So what people will typically do, if they have a distributed database at five locations, is replicate the data of one location at some other location, so that the second location contains its own data plus the replicated data of Mumbai. Kharagpur may contain its own data plus the replicated data of Chennai. Which means that if the Chennai server goes down for some reason, the system knows that the Chennai data is also available in Kharagpur, and maybe Bombay data is available in Chennai, something like that. So this is like the RAID arrays you have for disks. Here we are talking about redundancy across a large number of servers. Of course, when you have replicated data, there is a problem. Suppose you replicate IIT Mumbai data in Mumbai and Chennai, and let's say a new faculty member joins IIT Bombay. When that person's record is added to the employee records of IIT Bombay, it must also be updated simultaneously in Chennai. So whenever you have replicated databases, the responsibility of maintaining consistency of data across the replicas becomes an additional problem. Finally, all the multiple applications which might exist will need to talk to each other online in real time. What do you mean by online real time? Here is an example of an academic institution. In IIT Bombay, the moment you register or enroll yourself with the academic office upon your admission, even if I have a library application which is independent, the library application must be updated with my information. The hospital application is independent; it must be updated with my information. The Hostel Coordinating Unit application is independent, but it must be updated with my information. So that is the desire: these multiple applications must talk to each other online in real time. A business example: payroll output should update accounts. Payroll is calculated, money is paid, salary is paid to people, and then somebody else makes accounting entries in the ledger. That is not correct.
The payroll system itself should directly update the accounting system. To provide a single consolidated application covering several domains, people developed enterprise resource planning packages. You are familiar with those; I think I mentioned them. They will typically take care of inventory, accounting, sales, marketing, whatever. In spite of having ERP packages, because these do not answer all questions, there are specialized applications which keep coming up in these organizations even after implementing ERP. So people who have implemented SAP or PeopleSoft or whatever such ERP packages routinely have 8 to 10 other applications, which are either locally developed or bought from elsewhere, because ERP does not provide all functionality. In this case the ERP, which itself is supposed to be an integrated single application, has to further integrate with all these applications, because all these applications must interact online in real time to give the true advantages; otherwise there will be no advantage. Please note what online real time means. It is not that if you don't have the applications talking to each other directly, you cannot do anything. That is not true. Even in the old days, when payroll checks were prepared and paid to employees, there were other accountants sitting and doing accounting entries. Now if I have a payroll application which generates checks and an accounting application which takes accounting transactions, the worst I can do is take the printout of the payroll and start entering it in that application. The best I can do, short of direct integration, is this: as the payroll application generates checks, I prepare a file of data records with all the payment details, and I upload that file into the other application in batch mode overnight, updating all my accounts. So it is possible for applications to talk to each other through intermediate files: at one end you export data from an application onto a file, and at the other end you import that data and upload it.
But that is not considered adequate when you want all operations, not only across your organization but across all organizations, to be online real time, in which case the applications must talk to each other online in real time. This job is now defined as the job of enterprise application integration, or EAI, and a variety of tools and techniques have evolved to provide it. Basically it means ensuring that multiple applications which were developed in silos can talk to each other, and the talking is done through an intermediary software called middleware. Enterprise application integration is usually built around a middleware framework. Middleware is a piece of software. Transactions across applications can, for example, use what is known as a remote procedure call, or RPC, or they can use persistent messaging. What is persistent messaging? Imagine that you go to the academic office and register yourself. You are enrolled. Your data is inserted there. There is a library application which is running independently, and you don't want the overnight update by batch file. So what the academic office application can do is this: the moment you register, it will not only update its own database, but it will also create a message for the library and send it to the library. That message will contain enough details about your transaction. If that message is persistent, it means you guarantee that the message will not be lost and will definitely reach the other application. Now suppose in the library application you have written a piece of software, a listening kind of software, which keeps listening for these messages; the moment it sees a message from the academic office, it uses that message to update its own information. So there are two completely distinct ways of carrying an update of academic office information across to the library.
One is that when you update academic office information, that transaction is not committed till you also run a transaction on the library server. It is online real time, and it is a connection-oriented protocol. A connectionless protocol is like a post office. You complete this transaction, generate a message, and send the message to the library; if the library server is running and the connection is up, then within milliseconds of that message arriving there, the library system will update your information. So it will be almost real time. The advantage of the second approach is this. Suppose for some reason the two applications are disconnected. Then the first approach will not work. In the first approach you are updating the academic office database while simultaneously logging into the library and doing a transaction, which will be committed only when both databases are updated. Imagine the connection goes kaput. Then even the academic office database cannot be updated, because the transaction is atomic only if it updates both online. So it is hazardous. By contrast, connectionless transactions permit the luxury of committing the remote update at a later time, as long as it is guaranteed to be done. This guarantee is given by a mechanism which provides persistent messaging. So in short, there are two mechanisms to execute transactions across applications: a remote procedure call or RPC, which is a connection-oriented mechanism, or persistent messaging, which is a connectionless, message-based mechanism. These are the names of some of the pieces of software. Tuxedo, which was built around the CORBA standard. CORBA is a very old distributed object standard which was essentially providing that middleware. There is MQSeries from IBM. You might have heard of TIBCO, which stands for The Information Bus Company.
So theirs is the concept of an information bus on which these messages travel. Messages are kept persistent by TIBCO, and the TIBCO bus provides pieces of software called adapters, which send messages and listen for messages. The best example of a TIBCO implementation in India is Life Insurance Corporation's implementation. If you pay, let's say, a premium in Chennai or Coimbatore or any place, that is done through a server located in that branch, and that application is not a distributed application in our sense but an autonomous application running in Coimbatore, running in Chennai, wherever. Independent, branch by branch. Now when that data is updated, that you have paid the premium, a message is sent through the TIBCO bus to a central server in Mumbai called the operational data store, where that transaction arrives and is accumulated. Every day, millions of transactions done across 2,000 branches of LIC come in nearly online real time, because as long as connectivity is there it is only a message passing thing, and they come and sit here. So the corporate office knows every evening how much money has been collected as premium, how much money has been paid out in claims, how much money has been paid to agents. All financial transactions come in near real time. Now imagine that, out of the 2,000 branches, 20 branches are disconnected from the network. All that it would mean is that the data from those 20 branches will not come immediately, maybe will not even come today, but if connectivity is restored tomorrow, then the moment it is restored the persistent messaging, which is a part of all messaging middleware, will guarantee that those messages are delivered to my central server, and I will get it updated tomorrow. So what is guaranteed is that no transaction is lost. That's the advantage of message-based middleware. So, as usual, there are multiple ways in which you can integrate applications.
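The store-and-forward behaviour just described, where nothing is lost while a branch is disconnected and everything is delivered once the link comes back, can be sketched in a few lines. This is a toy illustration only and not the API of MQSeries, TIBCO or any other product: the class `PersistentQueue`, the `outbox` table, the `link_up` flag and the sample student records are all hypothetical names assumed for this example, and SQLite stands in for whatever durable store real middleware uses.

```python
import json
import sqlite3


class PersistentQueue:
    """Toy store-and-forward queue: a message is written to a table first
    and deleted only after the receiver has accepted it."""

    def __init__(self, path=":memory:"):
        # Assumption: pass a file path instead of ":memory:" if messages
        # must survive a restart of the sending application.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
        )

    def send(self, message):
        # Persist first: the sender's own transaction can complete
        # whether or not the receiver is reachable right now.
        self.db.execute(
            "INSERT INTO outbox (body) VALUES (?)", (json.dumps(message),)
        )
        self.db.commit()

    def deliver_pending(self, receiver, link_up):
        """Hand every stored message to the receiver, oldest first."""
        if not link_up:
            return 0  # nothing delivered, but nothing lost either
        delivered = 0
        rows = self.db.execute(
            "SELECT id, body FROM outbox ORDER BY id"
        ).fetchall()
        for msg_id, body in rows:
            receiver(json.loads(body))
            self.db.execute("DELETE FROM outbox WHERE id = ?", (msg_id,))
            self.db.commit()
            delivered += 1
        return delivered

    def pending(self):
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]


library_members = {}


def library_listener(message):
    # The "listening" software on the library side. Writing by key makes
    # it safe to receive the same message twice.
    library_members[message["roll_no"]] = message["name"]


queue = PersistentQueue()
queue.send({"roll_no": "S001", "name": "Student A"})
queue.send({"roll_no": "S002", "name": "Student B"})

# Link is down: the academic office has already committed its own work.
sent_while_down = queue.deliver_pending(library_listener, link_up=False)
pending_while_down = queue.pending()

# Link is restored: the stored messages now go through.
sent_after_restore = queue.deliver_pending(library_listener, link_up=True)
print(sent_while_down, pending_while_down, sent_after_restore, queue.pending())
```

The design choice to notice is the order of operations: the message is stored before any delivery is attempted and removed only after the listener has processed it. If the sender crashed between those two steps, the message would be delivered again later, which is why the listener here is written so that a repeat delivery does no harm.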
The early efforts to integrate applications across businesses were made through electronic data interchange, or EDI. There were EDI standards. They were very peculiar because each application was peculiar. Everybody had their own data format and nothing was common. So people designed connectors between one application and another, almost all proprietary. So it was fun. I mean, lots of programmers and programming companies were making a lot of money, first writing individual applications with proprietary data formats for different people, and then providing one-to-one connectivity. If there are n applications which all need to talk to each other, how many connections need to be built? Of the order of n squared: n(n-1)/2, the number of edges in a fully connected graph. Is that clear? The number of edges in a fully connected graph. So first you do n jobs, ensuring that each one of those has a completely different data format. And then you get n(n-1)/2 more jobs to do. It's a beautiful business, isn't it? Now of course everybody realized, including programmers and companies, that this is nonsense. So some standardized messaging came out. Banks may have independent data formats, but when they exchange financial transactions, all banks in the world agreed that they will use a standard messaging system. It is not a single message, because there are hundreds of types of transactions that can happen. SWIFT is an organization which provided the standardized messages. Almost all financial transactions, particularly cross-border transactions across banks, happen through the SWIFT network. So the reason you can wire transfer money from here to a New York branch, or from a New York branch to a Tokyo branch, is that although these branches may belong to different banks, they are all connected through SWIFT standardized messaging. So whatever application is running in one bank, it will generate a SWIFT message in the standard format.
It will send that message there, and the other bank will receive the SWIFT message, decode it, convert it into its own form of data and do the transaction. Such standardized messaging emerged in SWIFT for banking transactions, but there were still several proprietary formats and protocols which continue. After all, TISCO has a format in which it maintains data. It is not likely to change that just because State Bank of India wants it to change. Similarly, if TISCO has a certain format, and if it has 200 ancillary units supplying different things, out of which 10 are big ancillaries with their own computerized systems, they will find it very difficult to change their internal data formats to suit TISCO. Or if they make those changes to suit TISCO, they may also have to supply to some other steel maker, who may have a different format. In general it is a messy situation. In order to respond to this messy situation, the world has come up with new standards and new protocols, and these are still emerging. All of these standards are web enabling standards, because they have come after the emergence of the web, and these are known as XML (we already saw XML), WSDL, UDDI and so on. Some of these terms we shall see in the next session. I will quickly read through this transaction. This transaction will remain on the web so you can contemplate it. The purpose here is to make you understand how a customer like you and me will expect the world to provide services through the web. Here is a customer. His name is Anand Deep Singh Pannu. Anand Deep Singh lives in Ludhiana. He is a high net worth client of a stock brokerage firm called Netcorp. It is a broker company. Anand wishes to sell 12,000 shares of Infosys on the National Stock Exchange. Now you know why they call him a high net worth individual? 12,000 shares he wants to sell. So how many must he have? Rich fellow, right? He wants to sell 12,000 shares of Infosys on the National Stock Exchange in Mumbai.
He wishes to purchase 1,000 shares of Siemens AG on the Berlin exchange and 5,000 shares of Cisco on NASDAQ. He further has wishes about the remainder of the proceeds. Proceeds is what? He sold 12,000 shares of Infosys. He bought 1,000 shares of Siemens and 5,000 shares of Cisco. Obviously he is still going to be left with a lot of money. So he wants the remaining money to be disposed of as follows. Now, what is Anand Deep Singh actually doing? He is defining a transaction. But he is defining a single transaction. Sell 12,000 Infosys shares here, buy 1,000 shares there, buy 5,000 shares there, and with the remaining money do the following. Pay the semester fees of his daughter to IIT Bombay. Let's say she is studying here. And because she is staying here, credit rupees 50,000 to her account in State Bank of India. So pay the fees to IIT Bombay and pay 50,000 rupees to State Bank of India. His son has gone to New York or somewhere and says, Papa, $1,500. So remit $1,500 to his son's bank account. His wife says, look, I want to go and stay with my daughter in Bombay. So book a railway wagon to transfer some of the household goods to Mumbai. With the remaining money, he wants this, this, this to be done. He also wants to pay the annual premium for a life policy at Life Insurance Corporation, Ludhiana. He has purchased a new BMW or whatever, and he wants to purchase a car policy at the Delhi branch of New India Assurance. And if any amount remains, he wants it to be transferred to a charitable trust set up by his alma mater. Obviously he is an IITian, and therefore he would like to give money to IIT Bombay. High net worth individuals at all times in the past have been doing these kinds of things. Usually they did them as independent transactions: sell the shares, get the money, buy this there, pay the daughter something, send a cheque to the daughter, whatever. This is how they did it.
Now the Anand Deep Singh Pannu of future India will ask you: you guys are wizards, you are all information scientists, you have this web, you have computers, please do all of this as a single transaction for me. Do you understand the implication now? Let's quickly go back to these transactions and see what all is implied. Selling shares here and buying there and there is a financial transaction you can understand, but paying the semester fees of his daughter to IIT Bombay implies that the IIT Bombay system is willing to take an online transfer of fees of, whatever, 23,460 rupees from somebody called Anand Deep Singh Pannu and credit it to the daughter's account. State Bank of India should be willing to take 50,000 rupees into the savings account of the daughter online from the remainder of the transaction. A bank will take that money, no problem, but the Railways should be willing to book a wagon, collecting money in advance from him, for this date, this time, etc. Life Insurance Corporation should not expect Anand Deep Singh Pannu to stand in a queue to pay his premium. It cannot now even expect Anand Deep Singh's servant to come and stand in a queue. It cannot even say, send me a check and I will credit it. It must say, yes sir, online real time. So its server must then talk to the server of the broker, who is receiving all these instructions. Same thing with New India Assurance. Same thing with IIT Bombay. All of this will have to be done online, real time, through whatever persistent messaging, and you can now see that a remote procedure call with binding is nearly impossible in this situation. Persistent messaging is the only possible way of guaranteeing execution of this complex transaction. There are other implications of this. Let's assume that Netcorp has a customer relationship management system which has to automatically evaluate the change in the portfolio's net asset value and is required to inform Anand's car loan provider of the implications. Let me tell you something.
I mentioned the BMW or Mercedes that he has bought. Please understand that rich people do not buy cars by paying cash. They take a bank loan and buy cars. Now the bank loan which has been given is not given just against the car; it is also given against the creditworthiness of Anand Deep Singh Pannu. That creditworthiness, let's say, is defined by the net asset value of all his holdings, how many shares he has. That is his worth. The bank thinks: if I can't get my money back, I should have some recovery. Now imagine that Anand Deep Singh Pannu has been selling his shares left, right and centre for his daughter to study in IIT, whatever. Suddenly his shareholding has no value. Then what will happen? The arrangement in such a situation is that if Netcorp has given a guarantee to that bank that the net worth of its client is so much because of his holdings, then every time he sells any shares the holding has to be re-evaluated and the bank has to be informed, more or less. So these are additional complications. The broker's system must do that. Things don't end there. Consider this trust at IIT Bombay. This is our IIT Bombay, what you call the donation fund. Imagine that IIT Bombay has to execute a standing instruction to forward different percentages of the donation to identified activities, also copying the action to the relevant professors and teams. So here is an example. Let's say I had a chance meeting with Anand six months ago, and I told him that so many bright students come for my Masters and PhD programs, and that there should be encouragement for people who, after finishing M.Tech, would like to do a PhD in IIT itself. Apart from the government scholarship, I would like to give them a top-up scholarship of a substantially high amount. How much money? 25,000 rupees per month, so that they feel comfortable. So how many scholarships do you require? I said maybe I have 10 top students who are interested in doing this kind of thing, and he said all right, so this much money multiplied by this.
I said no, no, no. Why don't you give me an endowment, so that I can perpetually keep paying 10 such students every year? I negotiated, discussed this, etc. When he finds that Fatah is going overboard, he says, there is no money right now; when it comes, we will talk. And then it is forgotten, right? Now I come back here. Six months later he executes this transaction. He remembers what he has told Fatah. So he tells Netcorp, okay, send this money to IIT Bombay with this instruction. Now imagine that in the evening he calls me saying, Fatah, are you happy now? And I have no clue about it. Now depending on how my day has gone, I will tell him whether I am happy or not. But I will look like a perfect duffer; I have no clue. Do you think he will be enthusiastic to talk to me again later in life? Now imagine the IIT Bombay system. The system has, you know, money comes, it is recorded, it is accounted, etc. But additionally imagine that you people design the system such that it does the following. The moment Anand Deep Singh Pannu's money comes, it finds the entry saying that Fatah had approached Pannu for this purpose, and these are the students who had said they would like to continue for a PhD. Automatically the system calculates the amount of money, allocates that money for these 10 people, sends an email to them saying your PhD scholarship top-up of 25,000 rupees is announced, sends an email to me and, to make sure that I take action immediately, sends me a long SMS saying, Pannu's 2 crores received, for ten students eligible for scholarships, and Pannu's phone number is this. Get the point? This happens within 2 minutes of his issuing instructions there or the shares getting sold. Imagine now that in the evening, when he is having a cup of tea with his wife, I call him and say, Anand, fantastic. What a wonderful thing you have done. Let me read out the names of 10 people who are very grateful to you for taking care of them, and I read out those 10. Don't you think he will come back to me again next year?
This is how businesses will work in this world and this is what you have to facilitate through actual construction of such systems which will work end to end seamlessly. We will stop here.