The third and last important component of the web services model is called SOAP. SOAP is a protocol. You would be familiar with many Internet protocols, TCP/IP, UDP and such things. TCP/IP and UDP are fundamental protocols that simply describe how bits move from one place to another. On top of them there are additional protocols which are application specific. SOAP is a generic, XML-based specification for the exchange of information in a decentralized, distributed environment. SOAP defines a message envelope, it has some encoding rules, and it has a remote procedure call convention defined; that is, how to call a remote program, say a COBOL program, is described as part of SOAP. So when a message arrives, if you just let that message act in an interpreted way, it will automatically invoke your COBOL service, do whatever is required, and form the reply. It is a very, very powerful mechanism for doing this kind of integration. A SOAP message has an envelope with a header and a body, and it can carry MIME attachments, the same standard attachments you have with any email. The message exchange model has an initial sender, an ultimate receiver, and any number of intermediaries. I might actually route a SOAP message through multiple intermediaries: my message has to go to an application in Chennai, but it may go through another application of a bank which is in Delhi, because I might want that intermediary to do something with the message, add value to it, and then send it onward. That is how SOAP is defined. Let us go back very quickly to these three names; I just want to confirm that we understand them. UDDI is one standard; it is primarily about the registry, that is, publishing and discovering services. WSDL is the Web Services Description Language.
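As a concrete sketch of that message structure, here is a minimal SOAP 1.1 envelope built and parsed in Python; the operation name `getBalance` and its namespace are hypothetical, used only to illustrate the envelope/header/body layout:

```python
# A minimal SOAP 1.1 envelope; the service name "getBalance" and the
# http://example.com/bank namespace are hypothetical illustrations.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

envelope = f"""<?xml version="1.0"?>
<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Header>
    <!-- intermediaries may read or add headers here -->
  </soap:Header>
  <soap:Body>
    <getBalance xmlns="http://example.com/bank">
      <accountNo>12345</accountNo>
    </getBalance>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(envelope)
body = root.find(f"{{{SOAP_NS}}}Body")
request = list(body)[0]       # the RPC-style request carried in the body
print(request.tag)            # the operation being invoked (namespaced)
print(request[0].text)        # its parameter value
```

A receiver that interprets this body is exactly what the lecture describes: the message itself names the operation and its arguments, so the service can be invoked mechanically.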
You use this description language to describe your services. You create your services, you describe them, you register them; you can also discover other services and invoke them. All of this is enabled through WSDL. So working with WSDL effectively becomes essential for anybody who wants to integrate applications. And last but not least is the basic protocol, SOAP. SOAP is not necessarily the most efficient protocol; in specific situations you may use some other protocol as well. Let me give you an example. I said multiple applications can be integrated in this model, and I said that these applications could belong to different people. Now, just as I mentioned that SWIFT can be used even for transactions between local entities, why can't I use this model for integrating the different applications that I have within my own organization? That way, if I am moving a particular application to a new platform, such as a new database technology, I can take my own time deciding which application I move to another platform and when, as long as the existing applications connect to each other using this protocol. So consider this: you have two servers physically sitting side by side, one running application A, the other running application B, both on a local area network. There is no reason why these two applications cannot integrate with each other using exactly the same model. You have a UDDI registry somewhere, in the same place or elsewhere, and you have described the services on either side in WSDL. To connect them you can use SOAP; two machines talking on a LAN can use IP protocols and SOAP, but SOAP will be very costly. Imagine further that these two applications are not even on two different physical servers; they are running on the same server, where communication is possible through the motherboard backbone itself.
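A rough sense of why SOAP is "very costly" for such local integration: even a trivial payload gets wrapped in a comparatively large XML envelope before it ever reaches the wire (the service name and namespace below are hypothetical, and HTTP headers would add still more):

```python
# Sketch: byte overhead of wrapping a tiny payload in a SOAP envelope.
# Service name and namespace are hypothetical illustrations.
payload = "12345"  # the actual data being exchanged

soap_message = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Header/>'
    '<soap:Body><getBalance xmlns="http://example.com/bank">'
    f'<accountNo>{payload}</accountNo>'
    '</getBalance></soap:Body>'
    '</soap:Envelope>'
)

overhead = len(soap_message) / len(payload)
print(f"{len(payload)} payload bytes carried in {len(soap_message)} "
      f"message bytes ({overhead:.0f}x overhead, before HTTP headers)")
```

On a WAN this overhead is usually negligible next to latency; on the same machine or a fast LAN it can dominate, which is exactly the motivation for swapping in a leaner protocol locally.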
Between two servers on a local area network, communication goes over a gigabit network; only over a wide area network will it be slow. So when you have a LAN, or when you can use just the internal infrastructure of a machine, why would you use a costly protocol like SOAP? In that case you can specifically say: instead of SOAP I will use a faster protocol here; I will use plain, simple TCP/IP, or I will modify TCP/IP slightly. But these are all issues of making things perform better. Conceptually there is no reason why I cannot use a single logical architecture for any application that I write. Going forward, and this is the purpose of this whole discussion trail, that logical architecture is the web-based application architecture, and the three standards I talked about are part of what is called service-oriented architecture. Service-oriented architecture means that any programmed service performed by my application is now capable of being registered, capable of being discovered, and capable of being invoked using these standards. So service-oriented architecture is nothing fancy; it is an architecture which essentially permits integration of applications. But we no longer talk of integrating applications; we talk of developing applications which are compliant with this service-oriented architecture. Any new application should preferably be designed in this fashion, and if there are performance-related issues, they should be sorted out by localizing those performance problems and solving them there. Making the architecture generic may bring initial performance issues, and you can often simply throw hardware at those; in terms of flexibility, however, the gain will be enormous. Going forward, the scramble you face today of integrating multiple different applications will just not exist.
Since I mentioned performance issues, we will now very quickly discuss how you measure performance in the context of these applications, and particularly in the context of databases. What is the notion of performance? How is it measured? How can it be tuned? Can performance be increased by adding hardware, or does it require modification of my software? These are some of the questions we need to answer.

First, some basics: what is performance? Let us say here is an end user sitting at a machine; as you can see, many of our users face the same situation, and they are not in a very happy place. What he sees is called the response time: he has given a query, or asked for a report, or completed a transaction and pressed the return key, and he is waiting for the answer, the report, or the next screen to appear. The difference between those two instants is, typically, the response time. When he sends a query to a back-end system, there are n other users also sending queries, say 1,000 users. Quite independent of what each individual sees as response time, for the totality of the system the important thing is how many of these queries are answered per day, per hour, per second, how many transactions are executed per second. That is called throughput. By and large, throughput and response time are the two fundamental measurements in any performance work. Throughput could be in orders per day, orders per minute, requests per second, whatever; these are the measurable and understandable parameters from the end user's point of view. If the end user is an ordinary clerk or officer of that application, then for him or her it is response time that matters; if the end user is the manager of the data center, then throughput is the important one. Take the railway reservation system.
So the wait time for a customer, or the time a clerk takes, is the response time; how many total reservations are handled per day is what is important to the chief general manager of that region. He is not particularly concerned whether the customer stands in line for one minute or two minutes; he knows the customer was standing for two hours anyway. But for him, whether 10,000 reservations are handled in a day or 20,000 are handled in a day is what matters. Overall, both are important: response time is important and throughput is important.

In the context of performance there are some additional terms, particularly for computer system performance. One is capacity. Capacity is measured in terms of peak throughput, and it is measured in terms of concurrent users: this particular system, hardware, software, and application all together, permits how many concurrent users to operate? If a system permits 2,000 users and you put 10,000 users on it, that system will crash. If you want to go from 2,000 to 10,000 users, which additional components to add makes the difference. Sometimes you just add hardware, but you must know which hardware to add; sometimes no amount of added hardware will make any difference. We should understand why, and whether, that is so.
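The average-versus-peak distinction that comes next can be made concrete with a small calculation. The 200 transactions-per-second average and the 70% of the day's load between 10:00 and 12:00 are the lecture's banking numbers; the 8-hour working day is an assumption of this sketch:

```python
# Sketch: peak vs. average throughput requirement.
# Assumption: the 200 tps average is taken over an 8-hour working day.
day_seconds = 8 * 3600
avg_tps = 200
total_txns = avg_tps * day_seconds              # 5,760,000 transactions/day

# 70% of the day's transactions arrive between 10:00 and 12:00.
peak_window_seconds = 2 * 3600
peak_tps = 0.70 * total_txns / peak_window_seconds
print(f"peak requirement: {peak_tps:.0f} tps vs {avg_tps} tps average")  # 560 vs 200
```

Sizing the system for 200 tps when the 10-to-12 window actually demands 560 tps, nearly three times the average, is precisely the mistake the lecturer warns about.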
So capacity is an associated term where we talk about peak throughput. Take banking transactions, for example: we say so many transactions per second, say 200 transactions per second, is my requirement, and I calculated that from the total number of transactions executed in a day. However, suppose I forget to tell the person who is developing the system for me, not just the application software but the hardware, everything, that this 200 transactions per second is the average, and that between 10 o'clock and 12 o'clock I execute 70% of my day's transactions. That is altogether different information; that is called the peak throughput requirement. So these are important parameters that I must know. At the individual level we talk of resource consumption. What are our resources? In terms of processing it is CPU, in terms of I/O it is disk, in terms of memory it is memory utilization, and network bandwidth is the other resource. On all these resources, what is the rate of consumption for the system? This consumption rate might have to be specified both as an average and as a peak, as in the case of capacity. So that is the description of all the basic parameters: response time, throughput, capacity, and resource consumption. Is that understood?

Now we go to some theory, simple but elegant, for understanding how a system performs. We look at what we call a work-conserving system and do some operational analysis. The blue line here indicates the arrival of jobs: a job arrives, and after some time, the time it takes for the job to be completed, called the service time, the job is done and goes out. Another job arrives and departs, a third arrives and departs, and so on. Sometimes jobs are completed slowly, sometimes they arrive faster; there could be a moment when the job that came has just gone and a new job has not yet come. Can there be a time when the red line (departures) goes above the blue line (arrivals)? By the nature of a work-conserving system there cannot be: there cannot be a departure before an arrival; somebody must arrive before somebody departs. So this is likely to be the shape of the graph. The symbolism we use is: A(t) is the total number of arrivals up to time t, and D(t) is the total number of departures up to time t. The difference A(t) − D(t) is the number in the system at any point in time. That number could be the number of SQL queries in the system, or anything; a system is something which provides service, and the clients waiting for service make up this number. Now take any point: at this point, what is n? Zero. At this point, what is n? Two, because there are two requests waiting. A(t) − D(t) is the number in the system at time t. Is that okay? The number itself is very easy to understand.

Now we do a slightly more involved analysis: what is the average number in the system? Go back to the previous slide: here there were two, here there was one, here there was zero, here again one, and so on. What is the average number in the system over time? I am using the term "number in system"; whether that number is SQL queries or user orders does not matter. It is the activity waiting to be serviced. How do I calculate the average? With a simple observation: this area represents requests in the system. Here there are zero, here one, here one, here two. If I take the complete area under this curve, that area represents the total request-time across the whole interval; divide it by T and I get the average. Those of you who remember mathematics will recall that the integral of a curve gives you the area under it. So if I do an integration, this is the formula; if you are not mathematically inclined you can skip it, but if
you recall your calculus, you will remember that this number n, given by (1/T) times the integral of n(t) from 0 to T, gives me the average number in the system. This integral can also be calculated as a summation: I sum the r_i. What is r_i? It is the response time of request i: this was the arrival of a request, and the request was sorted out at this later time, so that interval was its response time; then the second response time, the third, fourth, fifth, and so on. Ordinarily the response time is r_i = d_i − a_i, the departure time minus the arrival time. So if I take the sum of r_i for i from 1 to D(T), where D(T) is the total number of departures up to time T, I get the same thing as the integral. There is some juggling of mathematical symbols done here: first we show that the integral equals (1/T) times this summation, and then we multiply and divide by D(T), the total number of departures up to time T. What is the advantage? Once I do this, D(T)/T has a special significance and the sum divided by D(T) has another: D(T)/T is nothing but X, the throughput, and the sum divided by D(T) is R, the average response time, and the product of the two equals n.

What if the differences are not always positive? Let me explain. We did say that a departure cannot come before its arrival, but the departure of service request 2 can happen before the departure of service request 1: request 1 came earlier, but that query may just be hanging while request 2 gets answered first. So if I list the arrivals and departures in a simple array, a1 a2 a3 a4 and d1 d2 d3 d4, then d1 need not always be smaller than d2; d2 may be smaller than d1 because request 2 gets sorted out first. So the individual differences could have a problem. However, because I am summing over all arrivals and all departures, I can logically conclude that the sum will be the same, by simply rearranging them: if the request that arrived at a1 completes at d2, I mark that d2 as d2' and call it the completion of job 1; so d1 = d2', d2 = d1', d3 = d4', and so on. Consequently the response time for any job i is d_i' − a_i. Ultimately I am saying that in a work-conserving system the sum remains the same: the shaded area, which is the sum of (d_i − a_i), is the same as the sum of d_i minus the sum of a_i, which equals the sum of d_i' minus the sum of a_i, which equals the sum of (d_i' − a_i), which is nothing but the sum of the response times. So if I sum up all the response times, that sum is the shaded area. Get the point? The shaded area equals the total response time.

Now we come to Little's Law. If R is the average response time, n is the average number of requests in the system at any time, and X is the throughput, say requests per second, then Little's Law states that n = X × R. This is a very fundamental law in modeling performance. It is very unfortunate that in no programming course is this fundamental law even mentioned. As a result, programmers, who are often responsible for maintaining the performance of a system, and often responsible for determining what the server configuration and user configuration should be, are denied a very simple thumb rule.
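Written out compactly, the derivation just sketched is (with A(t) arrivals, D(t) departures, n(t) = A(t) − D(t), and the departures re-ordered so that the i-th completion is matched to the i-th arrival):

```latex
% Operational derivation of Little's Law over an interval [0, T]:
% n(t) = A(t) - D(t),  r_i = d_i' - a_i .
\bar{n}
  = \frac{1}{T}\int_{0}^{T} n(t)\,dt
  = \frac{1}{T}\sum_{i=1}^{D(T)} r_i
  = \underbrace{\frac{D(T)}{T}}_{X \text{ (throughput)}}
    \cdot
    \underbrace{\frac{1}{D(T)}\sum_{i=1}^{D(T)} r_i}_{R \text{ (mean response time)}}
\quad\Longrightarrow\quad
\bar{n} = X R .
```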
So we will look at that thumb rule in a moment, but let us first absorb Little's Law in words. It states that the average number in the system is equal to the system throughput multiplied by the average response time. Please note that originally we thought response time and throughput were two independent entities. They are independent, but they are related to each other in the following sense: the average number in the system at any time is equal to the system throughput multiplied by the average response time. Let us understand the significance of the individual components of Little's Law. For example, n, the average number of requests in the system, could be interpreted in real life as the number of concurrent HTTP sessions or the number of concurrent SQL queries. There may be a thousand users sending SQL queries, but only 300 queries executing concurrently; then 300 queries are in the system at any time, and it is in the context of these 300 that I speak of response time and throughput. It could be the average number in a queue, the average number of people in a cafeteria, the average number of people in an airport, the average number of projects in hand. Little's Law is not only for computers; it is a general law of operations wherever response time, throughput, and the number in the system are related like this. Look at the interpretations of throughput. "Requests per second" is the nomenclature we use: it could mean HTTP hits per second, SQL queries per second, database requests per second, validations per second, the departure rate from a cafeteria per hour or from an airport, the number of projects completed per year, anything. Notice that database requests per second need not automatically translate into business transactions: one business transaction may do two database updates, one database insert, and three database reads, and the total may come to seven requests, but that mapping can be made by looking at the queries. Once I know these numbers, I can roughly map them into my business numbers. You get the point? So ultimately Little's Law simply states that whatever my average response time and whatever my throughput, their product represents the average number of requests within the system. And the average number of requests within the system is invariably an end-user requirement: he will say "a thousand concurrent users", or "not more than so many people waiting in the queue", and so on. In that context, let us look at a very simple, almost trivial yet non-trivial, application of the law.

There is one more thing. So far we imagined that a service request comes from a user, it is immediately answered, and the user goes away. Invariably, in real life and in programming particularly, requests do not keep coming continuously from the same user. Suppose I complete my HTTP transaction; I will not instantly ask again. If I send a query, I will not instantaneously send another query. For example, if I am recording a premium payment from a customer, I will have a screen; when the customer comes I enter the data for that screen, and only when the next customer comes do I go in again. Meanwhile I collect his money, count it, account for it, and so on. This time is called the think time. Even with n users in the system, each user is not doing transactions continuously; there is a think time followed by a transaction. That is what happens in real life with a computer system, and that is how it is modeled; you can still apply the law to business transactions. So here is a system, our system, with n users. Each user sends a request and gets a response; the response time is R. But now we add some think time Z to that response: the user, although a concurrent user, is unlikely to send another request before this
think time has elapsed. So requests will now be placed not at every R interval but at every R + Z interval. If R is the response time, then response time plus think time together is the equivalent of a virtual response time for that user, and therefore n is now equal to X × (R + Z) rather than X × R. All we have done is add this think time to make the model realistic; otherwise 300 concurrent users could actually drive your server to 100% usage, with no idle time for the server at all, requests just pouring in continuously. That never happens in real life, whatever the number of concurrent users. What we actually derive from this concerns capacity planning, and it works the other way round from what you might expect: if you do not take Z into account, the required capacity comes out very high, but if you take Z into account, you do not require that much capacity. If you use n = X(R + Z) and you drop Z, then n is smaller for the same system; but once you add Z, the number of users the system can support becomes larger. So take a given system: if you assume no think time at all, that system according to you permits, say, 20 concurrent users; but if you add a think time of three seconds for everyone, the number of supported users increases. The same system can support more users in practice, although n = X × R alone would give you a different picture. Calculating Z realistically is therefore very important; if you calculate it wrongly there can be a huge problem. There is a very nice example in the slide deck, which will be made openly available. This is a real-life story. There was an RFP, and it said there are so many concurrent users, and the transaction rate, that is, the throughput we want, is this much; and they found
that the throughput was impossible to achieve. Let me first give you the next example, and then in conclusion I will come back to that particular story. So here is the example. First, some simple additional analysis; we will come back to this issue of bottleneck analysis. If you have a pipeline, you need to understand that the computations in serial mode and in pipeline mode are different. Here is an example. Suppose, hypothetically, a car is made by first making a chassis in two minutes, then a door in four minutes, then a window in two minutes, and then painting it in two minutes. The first car that comes out takes eight minutes. But if you have a pipeline, you keep pushing cars in from one end; at what frequency will the cars come out? You will not get one every two minutes, even though the paint stage is capable of finishing a car in two minutes, because there is a bottleneck: the door stage requires four minutes. The window maker will finish in two minutes and then wait for two minutes; the pipeline is choked at the door. So the choke point determines the total throughput in pipeline mode. Ideally I should get four cars per eight minutes, but because of this choke point I will get only one car per four minutes, that is, two cars per eight minutes.

[Student:] Translating this example into our real IT world, the door could perhaps be the network bandwidth?

Yes, it could be the network bandwidth.

[Student:] Then how should the network bandwidth be estimated, taking concurrency and response time into consideration? Is there any research available from the network perspective?

There should be, but what you need to construct is an end-to-end scenario for an end user: an end user sitting in Jharsuguda, working on a network of 64 kbps, with 20 users there; you know, the average bandwidth required versus
what bandwidth is available. And it is not the bandwidth alone; the delays are important too. Let me give an example. This happened in Unit Trust of India when they were using satellite communication. Satellite communication has a certain bandwidth but an enormous delay, because the signals have to go 30,000 kilometers up and come back. Now, they had an application where the end user logs in to the Unix server and does a transaction. Unfortunately, the application, even for login, was using plain Unix terminal services. In Unix, if you log in from a terminal, every character you type actually goes to the Unix server, comes back, and only then is echoed on your screen. Every character has to go and come back. When you do this from a terminal connected over RS-232 or a LAN interface, no problem. Do it over satellite, and the user types a character and sits, types a character and sits; login would take two minutes, and every data-entry screen would take that kind of time. People stopped using that system, and the application had to be rewritten. How do you solve this bottleneck? The bottleneck here is not the bandwidth; you have good bandwidth but these delays. To reduce the effect of the delay, you do not transport one character at a time; you take a bunch of a thousand characters and push them all together. That is the way such a problem is solved. But you are quite right that bandwidth can become the bottleneck where everything else is fast: no matter how fast you make all the other systems, the choke will remain there. In fact, identifying the choke in an end-to-end system is a fundamental requirement of good performance diagnosis. Very good point.

Here is another example of a similar type: 20 seconds, 20 seconds, 20 seconds. In serial mode, one order per 60 seconds; in pipeline mode you can get three orders per 60 seconds. However, if the burger takes 30 seconds instead of 20 seconds, then that stage becomes the choke. So identifying the choke point is important. Whenever you have a pipeline, and you will invariably have one, particularly in a distributed setting, this matters. Take a simple example: somebody in a branch wishes to do a transaction for a customer. After verifying the authentic signature of the customer, the person logs in. He has one part which is a local application; he has a document retrieval system which takes him somewhere else; and say he has a requirement to confirm something from the bank. So one application is local, another requires, say, a zone-level server, and a third is somewhere on the internet. If the choke point is the internet, no matter how fast you make the other parts, it will remain the choke. So the identification of choke points in a pipeline is important.

Let us go back to the slides and come to our conclusion about Little's Law. A very important point: n is the average number of requests in the system, which could be concurrent HTTP sessions, concurrent SQL queries, whatever; and this n is related by Little's Law to the product of throughput and average response time. All we added is a think time, which should be included with the response time to calculate the total number in the system. In short, if you have a system which gives throughput X, whose response time requirement R is being met, with think time Z, then that system is sustaining n = X(R + Z) concurrent users. That is the conclusion. I will give one example here to show how you can do a simple validation of inputs. Consider program trading and automatic order processing, the requirement of a broker. You know what program trading is: trading on the internet, where you type out the code and say I want to purchase 20 shares, sell 30 shares, or something like that. This is what investors like you and I will do.
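Before going on to the trading example, both the pipeline arithmetic above and the Little's Law feasibility check used in that example can be captured in a few lines. The stage times are from the car example; the broker's numbers (10 concurrent users, 1000 orders per second, 100 ms response time) are the ones worked through next:

```python
# 1) Pipeline choke point: in pipeline mode the slowest stage,
#    not the sum of the stages, determines steady-state throughput.
stage_minutes = {"chassis": 2, "door": 4, "window": 2, "paint": 2}
first_car = sum(stage_minutes.values())   # 8 minutes for the first car
bottleneck = max(stage_minutes.values())  # the door: 4 minutes
cars_per_8_min = 8 / bottleneck           # 2 cars per 8 minutes, not 4
print(f"one car every {bottleneck} minutes once the pipeline is full")

# 2) Little's Law with think time, N = X * (R + Z). Solve for the
#    think time Z implied by the broker's requirement; Z must be
#    non-negative for the requirement to be feasible.
N, X, R = 10, 1000.0, 0.1                 # users, orders/s, seconds
Z = N / X - R                             # implied think time
print(f"implied think time Z = {Z:.2f} s")
if Z < 0:
    print("infeasible: no amount of hardware can satisfy all three numbers")
```

The same three-line check, solve for Z and test its sign, can be applied to any requirement that specifies concurrent users, throughput, and response time together.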
What do day traders do? Day traders are people who sell and purchase during the same day, without having either the money or the shares, but at the end of the day they square up; that is the speculation which keeps the market going. Now, a day trader also originally types his orders. But suppose he has written an algorithm which, based on some parameters, automatically places a buy order, places a sell order, and so on. He wants a response time of 100 milliseconds, he wants a processing throughput of 1000 orders per second, and he says he will have 10 concurrent users in total. So I have 10 concurrent users, a throughput expectation of 1000 orders per second, and a response time of 100 milliseconds. Forget the think time entirely; whatever it is, Little's Law says n = X(R + Z). In this particular case, putting in the values without knowing what think time is permitted: 10 = 1000 × (0.1 + Z). Since Z must be non-negative, because people think only in physical time, not negative time, I cannot say I did my thinking yesterday, this equation can never be satisfied, whatever the value of Z. So if a broker makes such a request, "build me such software", there is no point spending time on system analysis, design, writing SQL queries, implementing it, and then wondering why on earth the system is not working. Should you advise him to buy 3 more servers, 10 more servers? It does not matter: any number of servers, any kind of bandwidth, gigabit, 10 gigabit, nothing will help, because Little's Law says 10 = 100 + 1000Z, and whether Z is 0.1 or 0.2 or anything else, this is possible only if R is different. So I have to tell the broker what this means in practical terms: if you want 10 concurrent users and you want to process 1000 orders per second, I cannot guarantee a 100 millisecond response time. There are many more examples; the slide deck of Dr. Rajesh Mancharamani contains some additional ones as well, and it is worthwhile for you to look at what they say.

What we will do right now is conclude with some practical aspects of performance evaluation, just as an introduction. How do we evaluate performance? Little's Law is useful to you in doing a broad analysis, and as I said, these thumb rules should be applied right at the outset. Take the example I was giving you: there was an RFP, and it said the throughput must be so many orders or transactions per second, and the total number of concurrent users is so many. They found, using Little's Law and assuming a reasonable think time, that the whole thing was unrealistic: to give that kind of response time and throughput, the system was becoming extraordinarily large. Dr. Rajesh Mancharamani, while helping the TCS team preparing this tender, said this looked very odd to him: whatever you do, they will hit the bottleneck sooner or later if these parameters change slightly. He did another calculation. Given that throughput and the think time, he knew the total number of requests the department handles in a year; let me not name the department, but say it handles so many million cases per year. Using the same law, he proved that, given the stated number of users and the throughput requirement, the entire job of the department for one year would be completed in seven working days. The user had goofed up; the user had not put in a realistic requirement. So they actually asked the question: look, is that what you want? Do you want us to design a system which will complete all your department's work in
seven days only? The answer was no. It was then pointed out that it could be done, but humongously large hardware would be required, which was not really needed. The RFP was actually cancelled by that department and a new RFP was floated, which saved a very large amount: the system which came in was less than one fourth the cost of what it would otherwise have been. That is a large saving, achieved just by validating the user's requirement. So Little's Law truly is not of little significance; it is of great significance, and you should try to use it in whatever way you can; at the very least, a thumb-rule calculation about performance should always be done. The other problem we have is this: we already have some software, it is working, we have developed it, and now we want to see how it performs. Does it give me so many transactions per second? What we talked about earlier was capacity planning for hardware and so on; here the question is, given my software, how does it perform on a given hardware? Or, to make this software give me so many transactions per second, what hardware should I use? That is capacity planning: should I have four processors, eight processors, twenty processors, whatever? The typical methodology adopted for this is called benchmarking. In a benchmarking exercise you have a benchmark in mind, so many transactions per second of throughput, so much response time, and so on; you have your software, and you say: I want to run this software; show me whether your hardware will give me this performance, or give me hardware which will. Ideally, I should go to a vendor and say: give me some hardware and let me see what it does; if it is not satisfactory, add more processors, add more memory, add more disk, whatever, and measure again. But for that I need to run this software in a simulated mode. In actual practice, say, 1000 concurrent users are going to use this software; I cannot make 1000 people physically sit and run it, so I will have to simulate such end users.
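The thumb-rule reasoning in the RFP story can be sketched in a few lines. This is only an illustrative calculation; the actual RFP figures are not given in the lecture, so the throughput, response time, think time, and yearly case volume below are invented numbers chosen to show the shape of the check.

```python
# Thumb-rule sanity check using Little's Law: N = X * (R + Z), where
#   N = concurrent users, X = throughput (transactions/second),
#   R = response time (seconds), Z = think time (seconds).
# All figures below are illustrative, not the actual RFP numbers.

def littles_law_users(throughput_tps: float, response_s: float, think_s: float) -> float:
    """Concurrent users implied by sustaining this throughput."""
    return throughput_tps * (response_s + think_s)

def days_to_finish(yearly_cases: float, throughput_tps: float,
                   hours_per_day: float = 8.0) -> float:
    """Working days needed to clear a year's workload at the stated throughput."""
    seconds_per_day = hours_per_day * 3600
    return yearly_cases / (throughput_tps * seconds_per_day)

# Illustrative RFP-style figures:
X = 100.0              # demanded throughput, transactions/second
R = 2.0                # demanded response time, seconds
Z = 10.0               # assumed think time, seconds
cases_per_year = 5_000_000

print(f"Users implied by Little's Law: {littles_law_users(X, R, Z):.0f}")
print(f"Days to clear a year's cases:  {days_to_finish(cases_per_year, X):.1f}")
```

With these invented numbers the demanded throughput would clear the whole year's caseload in under two working days, which is exactly the kind of mismatch the calculation in the story exposed.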
In fact, simulating even the functionality of my requirement is one important attribute of this benchmarking exercise. I want to simulate the number of users, and I want to identify the transaction types that will be executed. Then I have to set up the software under test, whatever application I have: how to configure that application, how to configure one or more database servers, application servers, reporting servers; or, if I don't have a separate reporting server, I may want to include the load of copying to an operational data store. One thing is often important to remember here. People run their benchmark like this: they take your software, run the database, push a lot of transactions through, and show so many transactions per second; at that time the servers are doing nothing but running SQL queries. The moment somebody tries to run a report, the whole thing sinks, and that is because while measuring you did not run any reports. Even simple things like the spooling services for the printer are switched off in the operating system when the benchmark is run. This is cheating. What you have to do is simulate real life: whatever services your operating system runs, whatever the database runs, whatever actual activities happen, all of that should be part of your load simulation; only then will you get realistic values. Then you have to set up the load, and you have to decide the measurement parameters. Finally, you have to conduct multiple tests. In general, even when your software is ready, it takes anywhere between two and six months to actually conduct these tests. It is not easy to do, and that is why most of the evaluations we do in real life are not based on actual benchmarks. Since that is the fact, the industry has come up with a cooperative venture called the Transaction Processing Performance Council (TPC). These benchmarks are, by the way, nothing new; the earliest benchmark was the debit-credit benchmark, just a bank's debits and credits. Jim Gray was
the one who introduced it. Jim Gray is no more; he is considered one of the greatest gurus of databases. He was a creator of IBM System R, known as the mother of DB2. He died in a freak accident: he took a yacht out to sea in the Bay Area and simply disappeared, never to come back; it was a very sad thing. He was a Turing Award winner, a great computer scientist of the last century. He first introduced this debit-credit benchmark, and this was the first time the notion of think time was introduced: the clerk at the cash counter always measures the cash and writes something; it is not as if he just fires off transaction after transaction. The Transaction Processing Performance Council was set up as a cooperative venture, as I said; all the major vendors of the world support it, and the TPC has a large number of user members as well. This council released the first versions of its benchmarks, TPC-A and TPC-B, which were the first standard benchmarks; subsequently came TPC-C and TPC-D, and TPC-H is the current benchmark used for data warehousing applications. If we in LIC use such a benchmark, we modify it to reflect our business considerations; you have to map your performance requirements onto the standard benchmark, and this task is not trivial; even this takes two to three months. Ideally, if you can run your own application, that is best; if you cannot, you map your application transactions onto a standard benchmark and run the standard. The advantage of these standards is that all hardware manufacturers publish their benchmark results from time to time, and internally in the lab they have results for different configurations of each model, so you get very good thumb-rule figures to go back to. This was the last thing I wanted to mention. Although this course has been packed with content, I want to emphasize that it is just a glimpse of the cutting-edge technology that is available today.
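As a rough sketch of using published benchmark figures as thumb rules, one can derate a published rating (benchmarks are best-case numbers) and pick the smallest configuration that still meets the requirement. The configuration names, ratings, and derating factor below are all invented for illustration; real figures would come from the vendors' published TPC results.

```python
# Illustrative thumb-rule sizing against (hypothetical) published
# benchmark ratings. All names and numbers here are invented.

published_tpm = {        # configuration -> published transactions/minute
    "4-cpu": 40_000,
    "8-cpu": 75_000,
    "16-cpu": 140_000,
}

def pick_configuration(required_tps: float, headroom: float = 0.6):
    """Smallest configuration whose published rating, derated by
    `headroom` to allow for real-life load, meets the requirement."""
    required_tpm = required_tps * 60
    for config, tpm in sorted(published_tpm.items(), key=lambda kv: kv[1]):
        if tpm * headroom >= required_tpm:
            return config
    return None  # nothing adequate; a larger configuration is needed

print(pick_configuration(300))   # 300 tps = 18,000 tpm -> '4-cpu'
```

The derating factor is exactly the kind of thumb rule the lecture describes: published numbers come from stripped-down benchmark runs, so a real system carrying reports, spooling, and other background load will deliver less.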
I hope you have now understood that while databases are extremely important and useful, they are not the only thing; they are just one of the n things that need to be deployed to build your applications. So, first of all, on databases themselves: please read the textbook for more details and try to solve the exercises. There is no shortcut other than solving problems, and the problems can be difficult, so it is better to try solving them jointly. Now, this is an important suggestion: form groups of two or three persons and execute group projects. As Mrs. Sukumar said the other day, we are planning a follow-up course. The dates, incidentally, are fixed in August, because that is the only time we have the guest house rooms; that gives you June and July, two months. In these two months I would like your small groups to do projects: take a subset problem from LIC's own domain, with schema definition, schema design, and so on. A good point to start with (this was Seema's suggestion) is the logical data model prepared for your data warehouse, which is a domain schema. From that domain schema you can take the subset of the schema for which you have the relevant data in terms of your fepses, and make group projects like that. In these projects, do some part of a proper analysis and design, to the extent that we have discussed; if possible refer to some standard engineering methodology, or follow your own organizational methodology; then design your schema, design some transactions, and at least identify the reports and queries that you would like to run. This is a project we would like you to do in a coordinated fashion. The idea is that these projects should actually be formally submitted, because if you just do them and forget, the tendency may be to forget them altogether. So do these projects and submit them. I would expect one of you to become a coordinator for the whole group, and this coordinator, through email, should set up the groups and see the projects through. We will not do a very heavy evaluation of those projects, except just to get a
feedback on how well you have been able to use whatever knowledge you have gained. But if you do this, then the next course in August, which will be on Java-based DBMS application development, will be much more meaningful. Towards that end, I would also suggest you simultaneously start reading something on the object-oriented paradigm; there are enough books on it, and to begin with you can read them just like a storybook. Of course, you have to continue working in your development centre to support your applications. I hope you enjoyed this course. I wish you very happy database programming. Thanks, and goodbye.