We will now discuss, first of all, the principles of software engineering. What is that field? Where does it get its name from, and what does it include? We will describe very briefly the software engineering activities which are covered by this name. There are measures which indicate the maturity of the process that you follow in software engineering. The most important of these is the Capability Maturity Model, or CMM levels, as described by the Software Engineering Institute. We will briefly discuss what those levels are and what they imply. We will do a very quick review of the ER model, because that forms the crux of modeling the static characteristics of the data that you handle. We will then discuss the functional model, which is called the data flow model. In the data flow model we describe not only how data flows between the different processing modules that we have envisaged, but also how exactly the processing of that data occurs. This model has a pictorial representation, just like the ER model, and that is called the data flow diagram or DFD.
These are classical analysis techniques. Today in the field you will find people using what we call object-oriented analysis and design methodology. Subsequently we will have a lecture by my colleague Professor Umesh Belour, where he will explain to you the fundamentals of OO modeling. However, since you have to represent your entities and data, and since you have to represent processing, the fundamentals are the same whether you represent them in this model or that model. The OO model differs mainly in that the terminology and conceptual framework it uses are beneficial when you finally implement the solution in an object-oriented programming language such as Java. All the fundamentals still remain the same, as I mentioned. We will very briefly discuss user interface issues, so that you get an idea of what to capture during the systems analysis phase.
We shall have a larger discussion on these issues at a later time. And finally, as I said, we will discuss the software requirement specification, which is a formal document that needs to be prepared at the end of the first phase, the gathering of functional requirements from the end users. It is only on the basis of this document that software design will be done later, and it is on the basis of that software design document that the actual coding will happen. So there is a formal process, like any other engineering process. In fact that is the reason why the whole thing is called software engineering.
This is just a quick review of some fundamentals which we have already understood. In the modern computerized world, business functionality depends mainly on software. Whether you can register for your courses easily or with difficulty, whether all validation rules are followed or not, and so on, is defined by software. Consequently application software becomes extremely important. You do require good system software, because that forms the basis of any application software. If you recall, we mentioned operating systems, programming languages, other tools, and the database management system, which is itself system software of a kind. All these things together form the basis on which you will write application software. You are not going to write application software yourselves. You are at a still higher level, defining what functionality that application software should give. But in order to define the requirements and to design the system, you should generally be aware of what technology components are ultimately going to be used to implement that system.
You require tools to build software, and these keep evolving. For example, we discussed 4GLs. A fourth-generation language is called a specification language: you don't prescribe algorithms. SQL, as you already know, is a 4GL. But there are other tools, for example tools which are called RAD tools.
RAD stands for rapid application development. So anything which permits you to develop applications more rapidly than any older technology is called a RAD tool. For example, when Fortran, COBOL, Pascal, C and other such programming languages were developed, these were called 3GL tools. In the era in which they first came, all of these were rapid application development tools; otherwise you had to program in machine language or assembly language. When SQL came, SQL was regarded as a rapid application development tool. Something else which is now available is what are called code generators. That means you write the specifications in a suitable form, either using a formal language or using some formal modeling tool, and the tool itself is able to create Java programs or C programs, whatever you want. So these are all rapid application development tools.
So for PLCs there is a code generator? Of course, we call it a code generator. So for PLCs, we program the controllers, the programmable logic controllers. You are talking about the hardware? No, we specify or define the controller behaviour, either through some sort of diagram or through logic. Yes. And then there is a code generator. Correct. So a code generator is a rapid application development tool for that specific application.
See, this is a problem with terminology. What is rapid today was actually very slow just 5 years ago. And what was very rapid 5 years ago was extremely slow 10 years before that. I will tell you about a still older technology. When I was doing my masters we had to design a flip-flop using 2N404 transistors. So you had to actually work out what resistance you would put between collector and emitter and so on. When I first constructed a flip-flop, after 10 days of design and 2 days of actual soldering, the damn thing would either flip or flop.
And that was because, as we figured out, the equipment I had used to measure the hFE had been showing an incorrect hFE for 2 years, which I didn't know. These are things that people today will laugh at. In fact I don't think today's electronics engineers can actually design a flip-flop, because they get not only flip-flops readymade, they get decade counters readymade, they get a variety of other circuits readymade, they get the whole goddamn computer on a chip readymade. These things will continuously happen, and that is why the notion of rapid is sensitive to time. What is rapid today was not rapid earlier, is all that I mean, and what is rapid today will appear as a very slow process 5 years from now. The ultimate, you know, is something like when God said let there be light, and there was light. That is the fastest. You have to approach the level of God someday.
Anyway, we digress, so let's come back to this. The typical life cycle of application software is somewhere between 10 and 15 years. Now this is something you must understand and appreciate. First of all, no software remains static over these 15 years. You take 6 months or 1 year to develop some software, and by the time you complete the development the end user will say, no, this thing is to be done this way, that thing is to be done that way, this additional information is required to be handled, and so on. Consequently, throughout the life cycle the software will keep changing, because the functional requirements keep changing. These changes have to be incorporated in the software through what we call the software maintenance activity. Maintenance is a term routinely used in engineering to mean just maintaining the existing functionality of a machine. That is not true in the case of software. Software maintenance in fact requires all the skills of new software development, because you are actually adding functionality.
In other words, you are adding functionality to a piece of software about which you may not know anything at all, because somebody else has designed and written it. So software maintenance is the most difficult part of the software activity, sometimes more difficult than the original design and writing. Now, the software evolves. I might maintain some software for 2 years, then I may go somewhere else, and somebody else will come and maintain it. What is maintenance? I am basically writing new lines of code to include new functionality. I have my own style; somebody else will have a different style. So why is the life only 15 years? After 15 years the shape of the software is such that it is not easily maintainable beyond that. A machine or a car can run for 30 years. A bicycle can run for 50 years. An old wall clock can run for 50 years. Software can also run for 50 years, but it will have two problems. One is that it will do exactly what it was doing 50 years ago, which you don't want anymore. So the life cycle of software is taken as 15 years not because the software stops working, but because the functionality that you require cannot be delivered by that software unless you add to it. And the second problem is that adding to a system which has been maintained for 15 years is very difficult. In fact it is much easier at that stage to rewrite the whole damn thing starting from scratch. That's the reason why the software life cycle is typically 10 to 15 years. And in these 15 years the software keeps undergoing changes through and through, as I mentioned.
What are the characteristics of software? Software is developed or engineered. It is not manufactured, unless it is what we call a shrink-wrap product. An operating system, a word processor, these are called shrink-wrap products because once the software is built, millions of people use exactly the same software. That is because somebody has had the mindset to examine the common functionality that everybody requires.
They put that functionality into software, give that software a name like Microsoft Office or OpenOffice, or Linux for an operating system, and release that product. Then millions of copies can be produced, but that production is hardly any production. You copy it, burn it onto a CD, package it and send it. The manufacturing cost, so to say, which is strictly only for making multiple copies of the software, is trivial. The real cost is in developing that software. So, barring such shrink-wrap products, most software needs to be developed or engineered, and that includes the original of the shrink-wrap product as well. This is a very fundamental difference from other things.
Second, software does not wear out. I just mentioned it. Software will continue to run perpetually, if you are happy with whatever functionality that software provided. But if you want different functionality, or if the software is not easy to maintain, then you need to change it. Hardware, however, does wear out, and that is another problem that modern software faces. If you have written software to run on particular hardware, and that hardware goes out of date, which means the underlying operating system, the underlying tools, the underlying processors, memory, disks, all of those undergo changes, then you may have a serious problem. That is the reason why you need at least an abstraction layer. New hardware comes, for example, but it is guaranteed that your Unix or Microsoft Windows operating system will run on it. It is guaranteed that Oracle or any other database will run on it. It is guaranteed that your Java or C++ compilers will work exactly the same way. Then the entire software is called portable software, because you can port it across to the new hardware. So the fundamental fact is: software does not wear out, but hardware does wear out. Most software is custom built, as I said.
There is a limited role for packaged products, which are called shrink-wrap products. Shrink, the English word shrinking, is, you know, you shrink them onto a CD and package them, and wrap, W-R-A-P, is putting them in some nice carrier labelled Microsoft Windows 2003 or some such thing and selling them. Except for such applications, you require custom-built software. And even if you buy packaged software, you might have heard of packaged software called ERP packages, that is enterprise resource planning packages, or human resource management packages such as Siebel, or accounting packages, most of these will require customization for each individual organization.
So software development is a process which requires people who can write software, and those people need to spend time. That is the time which you measure in person-months. More person-months are required if the functionality is very complex, or if the tools which these people use are very low level, or both. If somebody has to write programs in COBOL or C or something, they will take longer. If somebody has to write programs in SQL, they will take a shorter time. If somebody has a code generator where you just put in the model, it generates the code, and you just have to test it, it will take still less time. But the fact of life is that software is developed or engineered, and therefore you require people. The more the world goes towards automated systems, the more software is required, and therefore the more people are required. Now you can understand why India has emerged as a talent pool for developing software for the rest of the world: the world demands more and more functions to be automated. This process is unlikely to slow down over the next two decades at least. So for the next two decades it is guaranteed that more and more people will be required to write programs. Let me give you an example of what would happen if the tools for rapid application development were not there.
Well, it was estimated about 20 years ago, when COBOL programming was the only thing for the business applications that we discussed. The prediction was for 25 years from then, which means around this time now. It said that if programs continued to be written in COBOL and continued to be maintained in COBOL, then the number of people required in the United States to write those COBOL programs 25 years later would exceed the total population of the United States. That was the prediction. That means every American would be writing COBOL programs for someone else who is writing COBOL programs for someone else, and so on. A stupid situation. So the reason why the globe can actually use a very small percentage of people to write programs is that these tools are there. And still there is tremendous pressure, because the functional requirements are increasing day by day.
With this background we come to the notion of software engineering. All of you are engineers or applied scientists, so you should know what engineering is. This is the IEEE definition from 1993: the application of a systematic, disciplined, and quantifiable approach to the development, operation, and maintenance of software. This is the application of engineering to software, because this is what engineering is all about: the application of a systematic, disciplined, and quantifiable approach. If your approach to solving a problem is systematic, that means there is a definition. Disciplined means that everybody follows that definition. Notice that on a shop floor, if somebody is, let's say, machining certain parts of a larger thing like an automobile, that worker is not permitted the luxury of saying I might put this here or that there. That person is required to produce exactly as per the specification, within the permitted tolerances, otherwise that part is useless. That is called discipline. The people who actually write code have to follow the discipline of implementing exactly what is there in the design document. That is the application of engineering.
Engineering itself is the analysis, design, construction, verification, and management of technical or social entities. I don't know whether you have heard the term social engineering. In fact the word engineering is applied to represent these attributes of the activity. So: analysis, design, construction, verification, and management. You agree? These are the activities that you have to handle. Consequently, if you are talking about software engineering, it must follow exactly the same thing. And therefore we must define a development process. First you have to do analysis, then you have to do design, then you have to construct that software, then you have to verify whether the software works or not, and then you have to maintain that software during its life cycle of operations.
And there is an iteration, which begins the moment you have to maintain it. For example, somebody says, change this functionality. Take your own case, when you write programs and your guide or somebody says, no, I want this additional functionality. What do you do? The easiest way is that you simply add some more lines of code directly. In an engineering methodology you can't do that. The original code that you have written was written only after there was a design document. And the design document came out only after there was a functional specification or analysis document. So if some change has to be made on engineering principles, you have to go back to the original specification and say, this is the new functionality. Then you have to do an impact analysis: how does this functionality impact the other parts? Then you introduce the modified design. Then, as per that design, you actually write the new code. And after writing that new code you have to test the entire software again, in exactly the same rigorous way that you tested it the first time.
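The impact analysis step described above can be pictured with a small sketch. It assumes that the design document records which module uses which other module; the module names and the `USES` table below are purely hypothetical, and a real impact analysis also involves reading the specification and design, not just following call relationships.

```python
from collections import deque

# Hypothetical module dependency map taken from a design document:
# each key uses (calls) the modules listed against it.
USES = {
    "loan_statement": ["interest_calc", "customer_master"],
    "interest_calc": ["rate_table"],
    "savings_statement": ["interest_calc"],
    "customer_master": [],
    "rate_table": [],
    "branch_report": ["customer_master"],
}


def impacted_modules(uses, changed):
    """Return every module that directly or indirectly uses `changed`."""
    # Invert the map so that we can walk from a module to its callers.
    used_by = {name: [] for name in uses}
    for caller, callees in uses.items():
        for callee in callees:
            used_by.setdefault(callee, []).append(caller)

    # Breadth-first walk over the callers, collecting each one once.
    impacted = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in used_by.get(current, []):
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted


if __name__ == "__main__":
    # A change to rate_table reaches interest_calc and both statement modules.
    print(sorted(impacted_modules(USES, "rate_table")))
```

The point of the sketch is only that a change in one place reaches modules that nobody mentioned in the change request, which is why the design is revisited before any code is touched.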
Just to give you an example, in banking software, if some request comes for making changes in interest calculation or something like that, they will appear to be trivial changes. They can be done in a single day by writing 10 lines of code somewhere. But to figure out which 10 lines to change, and then to figure out which other portions of the software these 10 lines impact, and then to rigorously test them, is a huge process. No banking software modification is released without one and a half months of rigorous quality assurance or testing, no matter whether the changes made are small or big. And there are generally teams of 40 to 50 QA engineers working on just rigorous testing. So that's the kind of protocol that you have to follow for engineering. You can imagine it with any engineering artifact, say a bicycle. Somebody says, okay, change the functionality of the bicycle, and someone casually changes the way the chain works between this and that, and you release 500 bicycles and they don't work. What will happen? So that's exactly what we mean by applying engineering to software.
How do you build information systems in the engineering context? First you must come up with a functional specification. The next phase in building information systems is actually called systems engineering. It's very difficult to expose you to the notion of systems engineering in a course like this, because it involves a feasibility study. What will you seek from a feasibility study? Somebody gives you a functional definition: this is the functionality I want. The feasibility study will say, first, is it feasible to build software to do this? Second, it will say how long it will take, with some rough estimates of time and cost. Imagine that I want some functionality for a limited, immediate application, and I must have that functionality in three months.
If you come up with a software analysis, and later on design and code and everything, and it takes one year to complete, then it's an infeasible proposition. You must immediately say that in this time period I cannot deliver this. Or suppose there is a small businessman whose annual turnover is, say, 20 lakh rupees, and he says, I want this software, an enterprise resource management package and so on. Will you develop it? I don't mind waiting for one year or two years. You put 500 people on it, you develop the best software in two years, and you say, yes, now I have developed this software, give me 18 crore rupees. That fellow will go bankrupt. He can't do that. So this basic feasibility must be ascertained right at the beginning. Won't you agree that this is a very fundamental engineering requirement for undertaking any venture? That is what we mean by systems engineering.
In this phase you also allocate different functionality to hardware, software, and people. What do we mean? Consider that you are automating a weighing scale for trucks. You know trucks move on the roads, and you have to weigh them because they have to pay some duties or whatever. Now you may decide on an automated process where the trucks just go onto a diversion. You would have seen those on the roads. The trucks go through, there is a weighing scale, they stand on it. In the worst case some person manually notes the reading. In a more automated case the reading is generated automatically, passed on electronically to a computer, and an invoice is printed. Now in the second case you are allocating more functions to the embedded hardware. The weighing machine has embedded hardware, and embedded software in that hardware, which will automatically calculate these values, transmit them, and so on. Even for embedded hardware you may need some software functionality. But the basic objective of weighing will not be done by software. It will be done by some kind of a scale.
So when such functionality is described to you, you must start assigning it. These jobs will be done by hardware. These jobs will be done by other gadgets. These jobs will be done by software. And yet there will be some jobs which will have to be done by people. For example, people might actually tear off that invoice and give it to the driver, maintain an additional paper record, or call up someone if a truck just runs away. Whatever. So there will be functionality assigned to people, hardware, and software. This has to be done as part of systems engineering.
The next phase is the systems analysis phase, where you gather all the functional requirements of the proposed system. This is the phase that we are going to concentrate on in this session. After you have completed the analysis, the end product of analysis is a formal document, as I described, called the system requirement specification document. By the way, for meaningful systems the software requirement specification document typically runs anywhere between 100 pages and 5000 pages. And there is not a single line of Java code or C code or SQL code or anything of that kind in that document. That is the rigor with which system requirement specifications must be defined. That is the rigor with which you will define such specifications for any engineering activity. Now that is something which we don't traditionally do for software which is made by individuals amongst us for a quick assignment or something. Have you ever written a system requirement specification document for any program that you have written? Never. So this is a new activity as far as you are concerned. But this is what professional software must be preceded by. Without this document there is no sanctity for any professional software product.
System design is where you first do what is known as software architecture design, and then go ahead and do module-level design.
So let's say that during the design you decide that the architecture of your software will look like this: there will be so many programs, this program will call that one, that program will call another, and so on. And let's say there are 500 program units that you decide on. Each of those 500 program units will have to be described in terms of the algorithm that the program will have to execute, the data structures that will be used, the parameters with which other programs will call it, everything. Again no coding, no Java and no C++ at this time. And if the software requirement specification document is, let's say, 300 pages, it is not uncommon to have almost 500 to 700 pages of design document for it. You get the point: that is what a rigorous engineering process is.
After this you start coding. Not a single line of the program is ever written in this process till the software design document is ready. You might write some lines of code as an experiment or a prototype, as we shall see later. But none of that code will ultimately go into the final system. The actual coding starts only after the design document is ready, not before. Observe that in all your experience of writing software, whether you wrote simple programs for your assignments or larger programs for your MTech projects or PhD projects, you never did these steps. That's a practice that needs to be corrected. Going forward, this is something that needs to be done for any non-trivial software which you will require in your organization, or which you are in charge of producing for someone else. Observe that these steps, up to system design, do not really require you to be a programmer at all. So any person from any field who understands information science, who understands the basic underlying technology, and who understands the methodology of software design can actually do these things.
That is one reason why, all over the world, the profession of software engineering is filled by people from a variety of fields: actual programming knowledge is not required. Traditionally people have been taught some programming, and they become programmers first and then become analysts or designers, but that is not essential at all.
Coding is one activity, testing the code, as I mentioned, is another activity, and then there is integration of the different modules, much like in any engineering system. You make 10,000 parts, but all of them have to be assembled and put together so that the motor car will finally run. That job is assembling, and after integration there is something called integration testing. You have to test the final product. Now let's say you have been given the task of developing an information system, let's say a grade processing system for IIT, or an accounting system, or whatever, and you have gone through all these phases. And let's say I have given you that task. Now when you come back to me saying, this is your software, I have tested it, it is my responsibility to test that software. That testing is called user acceptance testing.
Sir, I think the document also has to be tested before going for coding. Yes. That is not called testing, that is called document review. We shall see that when we discuss the documents. Testing is a word applied to actual code testing and functionality testing.
So what is the difference between this testing and acceptance testing, or user acceptance testing? User acceptance testing typically stresses the functionality. I wanted this function to work like this, this, and this. Does your software do that or not? I have stipulated in my requirement specification that interest on a bank loan of this type should be calculated like this, and interest on some other type of long-term loan should be calculated like that. Does your system calculate it that way or not?
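As a rough illustration of user acceptance testing for the loan example, the sketch below treats the end user's test cases as a table of inputs and expected results taken from the requirement specification. The simple-interest rule, the loan figures, and the function names are all hypothetical; a real specification would state its own calculation rules.

```python
from decimal import Decimal


def simple_interest(principal, annual_rate_percent, years):
    """Supplier-side code under test (hypothetical rule: simple interest)."""
    interest = principal * annual_rate_percent * years / Decimal(100)
    return interest.quantize(Decimal("0.01"))


# Acceptance cases written by the end user from the requirement specification.
# Each row: (description, principal, rate in percent, years, expected interest).
ACCEPTANCE_CASES = [
    ("short-term loan", Decimal("100000"), Decimal("8"), Decimal("1"), Decimal("8000.00")),
    ("long-term loan", Decimal("250000"), Decimal("7.5"), Decimal("3"), Decimal("56250.00")),
]


def run_acceptance(cases, calculate):
    """Return the descriptions of the cases whose result differs from the expected value."""
    failures = []
    for description, principal, rate, years, expected in cases:
        if calculate(principal, rate, years) != expected:
            failures.append(description)
    return failures


if __name__ == "__main__":
    failed = run_acceptance(ACCEPTANCE_CASES, simple_interest)
    print("accepted" if not failed else "rejected: " + ", ".join(failed))
```

The expected values come from the user's side and not from the supplier's code, which is the design choice that separates acceptance testing from the supplier's own testing.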
You would say that you have tested this thing precisely. But there your responsibility ends. For me as an end user, the responsibility only begins. Because if I accept your software and sign off and say, yes, your software is okay by me, that amounts to saying that I have tested all the facilities that I required you to provide. And if later there is a problem, you will tell me that I have to pay you more money to even look at that problem. So there is testing by the supplier, and there is testing by the end user. It is very much like buying a TV. The TV manufacturer would have tested that TV. But when the TV is installed in your home, you definitely want to check whether it switches on when you press the switch, whether the remote works, whether the various screens come, the various formats, and so on. You do check, right? That is nothing but user acceptance testing. In the case of a TV or other such engineering products, it is trivial. In the case of software, it is very real.
That is why, take for example banks, insurance companies, trading companies or whatever, whose IT systems are developed by outsiders like TCS, Wipro, or whoever, these organizations have large information technology teams within themselves. And much of the time they are doing this quality assurance or user acceptance testing. In fact, there are now companies which specialize in doing user acceptance testing on your behalf. So let's say you have got a requirement specification and you have decided that Tata Consultancy Services will develop that software. You have given all the specifications. Incidentally, when you write a system requirement specification document, at that time you have to specify how you will do the user acceptance phase. You have given that, and Tata Consultancy Services have delivered the product after 6 months. Now you do not have the internal capability to test the software from a user perspective.
So you have other companies which specialize in testing, and they will send their people. They will understand your user requirement specification. They will prepare test cases, they will execute those test cases, and they will certify to you that, yes, this meets your functional requirements. So you can see that this is a very large activity, and a large number of people are required for doing it.
What is the process through which all these steps are taken? The first, and the simplest to understand, is the linear sequential model, which is also called the waterfall model. A waterfall here does not mean water coming and falling in one shot. Consider instead that there is a staircase and the water is falling down it. The water will first come to the top and fall onto the next step. It will fill that up, fall onto the next, and the next. So it goes step by step. This linear model is a process where you say that first you do the requirement specification, that is, systems analysis. Only after the entire systems analysis is complete do you do design. Only after the design is complete do you do coding. Only after the total coding is over do you do testing. And finally you do the user acceptance test.
Unfortunately the linear model, or waterfall model, is not considered good, because after I have told you my requirements I will usually get to see the software one year later. And when I see the software I will say, no, this is not what I meant. And you will point out, no, this is what you have written. The problem starts because the English language is often inadequate to describe my requirement correctly to you. I myself might not have understood that requirement correctly. Apart from the fact that my requirements could undergo a genuine change in that one year, the very fact that I could not convey the requirements well, or that you did not understand them well, could cause a lot of problems. That is why this model is a theoretical model. It is not considered a useful model for real-life applications. Consequently you have a prototyping model.
In the prototyping model you take the major business processes and data and quickly construct a prototype. This is where those rapid application development tools are useful. With this prototype you show the end user that this is how your screens will look, this is how your reports will look, this is how the data fields will come, and so on. It is very impressive to see the statistics of how often users who gave you certain requirements themselves modify those requirements, or the wording of those requirements, when you show them how things will work. They say, no, this is not what we meant, this is what we wanted. And the prototype can be put in place very quickly, say in about two months' time. Then at the end of two months you have refined user requirements, which both you and the user understand better. You may then follow the linear model. So the prototyping model says: model the business data and processes, generate the application quickly, test, and turn over. Turn over means show it to the user. But now you want to put a rigorous system in place. If your prototyping itself has followed good practices, you may be able to use at least part of the design. But you may decide to redo the whole design and do a perfect design for industry-grade software. So usually a prototyping model involves early prototyping, and then either following the linear model from that point, or revising the prototype again and again, which results in another model which we shall see later.
Rapid application development I have already mentioned, where you do almost code-less programming. Then there are the evolutionary models, which have become more popular of late. One is called the incremental model. It is sequential plus prototyping, in a way. You do a prototype, then you modify it and follow the linear model for some time, then again you do some prototyping at some place. The spiral model is the most popular. You know how a spiral goes, right?
It starts from one point, makes a circle, but ends up somewhere else, makes another circle, and so it goes. The spiral model in software development means I will do either quick prototyping or a quick pass of the linear model for one cycle. Once that is there, any modification that is required is absorbed, and you again do analysis, design, code generation, etc. You keep doing that and the software then spirals. The spiral model is considered very useful for the longevity of the software because you have followed rigorous processes. You have a variety of other models. There is one called the concurrent development model. There is the formal methods model. I will not go into the details of these because they are not part and parcel of our course material here. What are the umbrella activities which form part of the software engineering process? First of all, project tracking and control. All of you are familiar with project management? Not formally, perhaps. If four of you are doing a project, how do you manage the project? What does management mean? How do you ensure that the project is completed successfully in the stipulated time of six months? Distribution of work, tracking and control. So you distribute the work and you say this is my weekly timeline, and at the end of the week you find nothing has been done. If you don't have a tracking mechanism, you won't even know until four months are over, and then you find that absolutely nothing has been done. So project tracking is part of software project management. Then control. If nothing has been done, you shout, shout, shout at two or three people, saying now run faster and catch up. These are the fundamentals of project management. Next, formal technical reviews. You asked about the document. Any document which is produced must be reviewed, and it must be reviewed by somebody other than the one who wrote it. Now you'll understand why software engineering has to be a team effort, like any other engineering.
So in a team of say nine or ten or eleven people, if you have three groups and one group writes specifications for something, that document must be reviewed by another group. This review covers everything from English language mistakes in a plain requirement specification document to completeness of functionality, etc. Second, consider code. You see, the SRS is written first, then there is a design document. When you write the code, there is a code document, the actual code. If I write some code, say a Java program, this Java program must also be read by someone else. Not just read and executed by a machine. It must be read by someone else. This is called code review or code walkthrough, where people look at my code and say whether I have followed standards or not, whether my algorithm looks okay or not, whether I have missed something. So you have a very, very rigorous process. That is why it is called engineering. By that measure the software that we write would be termed amateur software. It still works. The software itself may not be amateurish, but the process is not an engineering process, and that is the difference. Software quality assurance, QA, is a very, very important step in any engineering. Here also it is important. Unfortunately the word testing or QA is not considered to represent an individual with a very high status or stature. So in the software business, people are generally reluctant to participate in QA or testing activity. But you will understand that this activity is as important as coding and as important as any other systems analysis or design activity. Configuration management. This is an important item when we are talking about large software. A large software would typically have hundreds of modules, and these modules may be independently maintained. So one module is at version 3.4, another module is at version 2.1, a third module is at version 1.13, modules 4, 5, 6, 7 are all at 1.1, whatever.
Now you are saying that this product, which is version 2 of my total application, say an accounting application, comprises this module, this module, this module and so on, each at its own version. In the next release of that version 2 of my software, which I call 2.1, five of these modules might have migrated to a new version. For some other modules I might have gone back to the previous version because they had bugs. A new module might have come up. This is called the configuration of the whole software. Configuration management is a big thing. You might also have one particular configuration for State Bank of India, another configuration for Bank of Baroda, a third configuration for someone else. Managing that and maintaining these releases and versions is a very, very important activity in large and complex software. Document production, for example the SRS document. I said that the SRS document will undergo a change if some change is requested by somebody later. How will you actually manage that document? Say you produce an SRS document for the project that you will do here. You will review it, you will finalize it, you will print it in whatever format and you will give it a date. Say you produce that document on 13th March 2008. Now we continue to use the software which is so generated, and within the year there are three modifications which are required. There are new functionalities which are introduced. How will you reflect that in a new version of the system requirement specification document? How will you do that? Will you insert paragraphs or pages in the original document? If you have made 20 copies of the original document, how do you make sure that the recipients of all 20 copies actually have the same paragraph changes, additional pages, etc.? And what do you do when you distributed 20 copies but five of your colleagues have made another 10 copies each and distributed them to someone else?
In short, how do you maintain the updated version of a document consistently across all possible readers at all possible times? It's not a trivial exercise, you agree? It's not a trivial exercise because software undergoes changes and documents will change. Producing those documents, releasing those documents and making sure that all recipients have the right version of the document is itself one of the activities that must be considered under the umbrella of software development. Reusability management. This applies to large software houses. Consider Tata Consultancy Services, who have developed some application software for Bank of America in New York. There is another project which TCS is undertaking for, let's say, Deutsche Bank in Germany. You will agree that many of the functionalities could be very common because both are banks. So why should the team working on Deutsche Bank rewrite everything if the company already has certain libraries or functions from the earlier project? The reason object-oriented analysis and design methodologies are more popular is that object libraries are more easily reusable than the function or subroutine libraries that you write for specific things. Reusability significantly reduces the cost and time of developing software if you already have some software ready. It's not different from copying an assignment from someone. You are reusing somebody else's assignment. So in software also it is the same thing. Another important activity in software engineering is measurement and risk management. What is the risk of delays? What is the risk of errors? How many errors could be permitted in software? What is the impact of one error? How will you make software completely error-free or bug-free? By thorough testing. Can you do that? The answer is no. Thorough testing does not guarantee that the software has no error. Thorough testing guarantees only that you have not been able to find an error in the software.
Please understand the significant difference between these two statements. Take the most thorough testing. You write a simple program which is, let's say, a 200-line program. It has a lot of if statements and iterations. You can actually draw a graph representing the branches at every if, the do loop iterations, etc. And suppose you have to do testing which will test every path, with all possible values of the variables with which that path can be traversed, to ensure that you still get correct answers. The number of tests that exhaustive testing will require will be close to infinity. You can write a 200-line program in 5 days and you can spend 5 years testing it exhaustively. Consider simply all the possible integer values that a variable can take. You want to test whether it works for n equal to minus 2 raised to the power 31, or plus 2 raised to the power 31, or whatever, for each value and for each combination of values of all variables. That is what exhaustive testing is. You can't do that. So there are testing methodologies which test what you call the borderline. Where a certain combination can create a problem, you want to test it. A simple example: you test whether numbers can be added. You test it with 2 numbers, 3 numbers, 4 numbers. At a certain point each individual number is within the representation capability of the programming language or computer, but the sum total is not. You will have an overflow. So these are the kinds of problems that are associated with testing. And therefore you need to do some statistical measurement of risk, and also statistical measurement of the impact of errors in your software. That is another huge topic by itself. We are not going to cover any one of these, by the way. I thought I would just mention them, so you understand that software engineering is not just writing programs. You have the requirement specification phase, analysis phase, design phase, coding, testing, acceptance testing and management.
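To make the two testing points concrete, here is a minimal sketch in Python. It assumes 32-bit signed integer variables; the function names `exhaustive_case_count` and `add_int32` are made up for this illustration and are not from any library. The first function counts how many test cases exhaustive testing would need, and the second shows the borderline case where each number fits in 32 bits but the sum does not.

```python
# Assumption for this sketch: variables are 32-bit signed integers.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def exhaustive_case_count(num_variables: int, bits: int = 32) -> int:
    """Number of input combinations if every variable takes every value."""
    return (2 ** bits) ** num_variables


def add_int32(values):
    """Add numbers the way a 32-bit machine would, failing on overflow."""
    total = 0
    for v in values:
        if not INT32_MIN <= v <= INT32_MAX:
            raise ValueError(f"{v} is not representable in 32 bits")
        total += v
        if not INT32_MIN <= total <= INT32_MAX:
            # Each input was fine, but the running sum is out of range.
            raise OverflowError("sum does not fit in 32 bits")
    return total


if __name__ == "__main__":
    # Three 32-bit variables already give 2**96 combinations.
    print(exhaustive_case_count(3))
    # Ordinary cases pass.
    print(add_int32([2, 3, 4]))
    # Borderline case: both numbers are representable, the sum is not.
    try:
        add_int32([INT32_MAX, 1])
    except OverflowError as exc:
        print("borderline test caught:", exc)
```

The point of the second test is that a handful of well-chosen borderline values finds the overflow, whereas trying all combinations is out of reach even for three variables.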
But even other than that, all of these activities constitute part of software engineering. So it's a vast field. Software project management requires basic project management like in any other project. It requires the formation of software teams. And the biggest challenge is coordination and communication. These facts were understood only over a period of time, as software evolved to become a juggernaut. Earlier, when people used to have machines and used to write small programs in machine code, the notion of software project management did not evolve, because an individual was constructing a program and that individual used to be almost like an artist. Then large programs were required. And how do you measure programs? You measure them by size. The size is measured by what we call lines of code. 10 lines of code, 100 lines of code, 1000 lines of code, 50,000 lines of code. Has anybody written a program of 50,000 lines of code? It is non-trivial. Take the development agency project which our programmers have been working on; the programs have been evolving and that software has been kept alive for almost 15 years now. Its size is considered not trivial, but it is a small project by industry standards. The size is about 150,000 lines of code. And there are about 11 programmers; 10 of them maintain that code and one fellow writes new code. You understand the complexity there. And this is nothing. Typical banking software, for example, would have 3 to 4 million lines of code. The software which controls the space shuttles would have 20 million lines of code. These are unfathomable numbers. It is not possible even for a single individual who is the greatest artist and the greatest programmer ever to write such code. It's not possible. That is why software development is teamwork, and the fundamental problem in any team is coordination and communication. Consider this. You have formed groups of 3 each.
Now you will figure out between today and tomorrow how long it takes to find 2 other groups to match with you and work out the details. And now you have a team of 3 groups. The groups that you formed would obviously be based on some kind of nearness amongst people, either wingmates or classmates or friends who are in perpetual mobile contact. But what about the other group which is part of your team? It may be in another hostel. If you have to hold a meeting to discuss some common document, you have to make sure that those 3 people also come. In order to make sure, you have to see that they are informed about it early enough. And you have to make sure that they don't have a football match that evening when you are planning this. It's not easy. And if the numbers increase, the communication hazards increase rapidly. Take the standard mathematical formulation for communication cost between N people. How many paths are there between N people? If every individual is required to communicate with every other individual, the order of magnitude is N squared, because each node in that graph is connected to every other node; it's a complete graph. And a complete graph on N nodes has N(N-1)/2 arcs. Now imagine N is 100: that is nearly 5,000 paths, and you're dead. Your project will not progress beyond basic communication. Hi, hello, how are you? That's all. You can't do anything else. And it is not uncommon for very large projects to have 2,000-people teams. A 100-people team or a 50-people team is very common in the industry. These 50 people cannot work in this free-for-all manner. So what do you do? You form hierarchical teams. And within the hierarchical teams, the hierarchy becomes so watertight that if one individual at a leaf-level node has to communicate with another person at another leaf-level node, they have to go all the way up. Any kind of team formation and team control is not easy at all.
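To put numbers on this, here is a small Python sketch. If every one of N people talks to every other person, the paths are the edges of a complete graph, which is N(N-1)/2. The hierarchical count below is only an illustration under a simplifying assumption that is not from the lecture: one level of teams, where people talk freely inside their own team and only the team leads talk to each other. The function names are made up for this example.

```python
def pairwise_paths(n: int) -> int:
    """Edges of a complete graph on n nodes: everyone talks to everyone."""
    return n * (n - 1) // 2


def hierarchical_paths(n: int, team_size: int) -> int:
    """Paths when n people are split into teams of at most team_size.

    Assumption for this sketch: one team lead per team, and only the
    leads communicate across teams.
    """
    full_teams, leftover = divmod(n, team_size)
    sizes = [team_size] * full_teams + ([leftover] if leftover else [])
    within_teams = sum(pairwise_paths(size) for size in sizes)
    between_leads = pairwise_paths(len(sizes))
    return within_teams + between_leads


if __name__ == "__main__":
    for people in (3, 9, 50, 100):
        print(people, "people, free-for-all:", pairwise_paths(people))
    print("100 people in teams of 10:", hierarchical_paths(100, 10))
```

With 100 people the free-for-all count is 4950 paths, while teams of 10 under this assumption bring it down to 495. The price is the one described above: two people in different teams can no longer talk directly and have to go up through their leads.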
This problem was best articulated by a giant in software engineering, at a time when software engineering was not yet the name given to the field. His name is Frederick Brooks. He's a Turing Award winner. He used to work for IBM. He was responsible for the operating system of the IBM System/360, which is OS/360. Several multi-million dollar mistakes were made in writing that operating system because of goof-ups in the allocation of work. The mistake was realized only after it was made, and IBM spent a lot of money. But the understanding that the whole world got out of it was phenomenal. He has written a beautiful book, one of the most beautiful books to date on software engineering. It is called The Mythical Man-Month. That's the name of the book. A man-month or person-month is the unit by which you routinely measure the cost of software development today. But he claimed that the notion of the man-month is mythical, and he wrote that book. It's a collection of essays on software engineering. This book was published in 1975. Everything stated in every essay of that book is still valid today. The Mythical Man-Month. He is the one who came up with the most popular adage in the software industry. When you know a project is getting delayed, the standard project management response is to put in more resources and try to bring the project back on track. He found that the statistical evidence was contrary to this faith, and he came up with a line: adding manpower to a late software project makes it later. That's the famous line. So if you add people to a late project, it actually gets delayed further. If instead you just let the original set of people do it, they will do it well. That was the million, not million, hundred million dollar mistake made in the OS/360 project. Not by him, but by his senior management. He was actually the leader of a particular team which was advising to the contrary. Beautiful essays. I think that book should still be available.
It is worth reading for each one of you, quite independently, as a storybook. Process and project metrics. A metric is a measurement, a number. Remember, the basic definition of engineering is that your processes must be quantifiable. Quantifiable means you must measure. You can't define a recipe in an engineering fashion where you say "just put in a little salt". You have to say exactly one-fourth of a teaspoon. Things have to be measurable, because you want to guarantee repeatability of whatever you do. [Student:] Sir, earlier you told us about measurement. So does that measurement include these metrics? Yes, it would. But the process and project metrics I am describing here, based on lines of code and function points, represent only one specific kind of measurement. Measurement and metrics is a very large topic. A software engineering book will tell you all the kinds of measures and metrics. Here I am just describing measures which make immediate sense to us. One measure which makes sense to us is lines of code. If a program is a thousand lines long, as opposed to ten lines long, it ought to be much more difficult to write, etc. Function point oriented analysis works with function points which are allocated to the functionality. So for each report that the software is required to produce, you classify that report as complex, average (or medium) or small. And then you say, if this report has so many files to look at, if this report has so many queries to answer, then let's say I allocate twenty-five function points to it. You look at each screen, you look at each query, you look at the number of inputs, you look at the number of outputs, and you arrive at the conclusion that this software requires, say, two hundred and fifty-three function points. Now there is a separate mechanism to convert function points into an estimate in person-months.
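The function point tally described above can be sketched in a few lines of Python. Please note the assumptions: the weight table is invented purely for illustration and is not taken from any standard counting manual, and the productivity figure of five function points per person-month is just an example rate. Real organizations calibrate both from their own past projects.

```python
# Hypothetical weights per item kind and complexity (illustration only).
WEIGHTS = {
    "report": {"small": 4, "average": 5, "complex": 7},
    "screen": {"small": 3, "average": 4, "complex": 6},
    "query": {"small": 3, "average": 4, "complex": 6},
}


def count_function_points(items):
    """Sum the weight of every (kind, complexity) item in the system."""
    return sum(WEIGHTS[kind][complexity] for kind, complexity in items)


def effort_person_months(function_points, fp_per_person_month):
    """Convert function points to effort using an assumed productivity rate."""
    return function_points / fp_per_person_month


if __name__ == "__main__":
    system = [
        ("report", "complex"),
        ("report", "small"),
        ("screen", "average"),
        ("query", "average"),
    ]
    fp = count_function_points(system)
    print("function points:", fp)
    # Assumed rate: 5 function points per person-month.
    print("effort in person-months:", effort_person_months(fp, 5.0))
```

The useful part of this style of estimate is that the count comes from the requirements (reports, screens, queries) and not from code, so it can be done before a single line is written.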
Say five function points will require one person to work for a certain period, perhaps one month, depending upon the technology that you choose, the familiarity those people have with that technology, etc. Software engineering therefore has to do with estimates of cost and time. And please remember that when you do an effort estimate in terms of person-months using any one of these metrics, that estimate does not automatically translate into calendar months by simply aggregating everything. Twenty person-months does not mean that twenty people working for one month will produce everything. The minimum time required for any project may be six calendar months. Twenty person-months also does not mean that one person will take twenty months to do the entire thing. One person may do it in twelve months. So you have to take this person-month with a pinch of salt and translate it methodically, for which again there are very good models established on the basis of statistics from previous experience, etc., and those models are used effectively in software engineering practice. Project scheduling and tracking, as I mentioned, is an activity that you need to undertake as part of project management. In short, software engineering envisages a quality focus, a well-defined process (any model that you have), the methods that are used, and the tools which we will use for doing software engineering. The Software Engineering Institute model, which I mentioned earlier, was developed by the institute at Carnegie Mellon. Just like our school of IT, there is a Software Engineering Institute there which is now world famous. This institute does not do anything other than software engineering. We had one of its star faculty members visiting us here for almost eight months, and he interacted with us. New developments in software engineering models, etc., are the fundamental objective of this institute.
This institute many years ago came up with a model called the Capability Maturity Model. So what is the maturity of the capability of any group to develop software and deliver software? That is what this model measures. They have defined five levels of maturity to measure the effectiveness of any organization in its software development practices. These levels are called CMM levels. So CMM level 1, level 2, level 3, level 4, level 5; there are five levels. You might appreciate the fact that IIT Bombay's team is already considered to be at level 1. So we have CMM level 1. As to how good or bad that is, I will just read out the level 1 description. It is called the initial level. The process is ad hoc. That means nobody knows what anybody is doing during the software development phase. Very few processes are defined. So there is no definition of how the process will be carried out. And success depends entirely on individual effort. So you will agree that all the software that IITians like you write is CMM level 1 software.