Welcome to a session on Mechanisms for Building Distributed File Systems and the Design Issues. This is Dr. Nita Poojar, Professor in the Computer Science and Engineering Department at Walton Institute of Technologies, Hulapur. By the end of this session, students will be familiar with the name resolution approaches used in the design and implementation of a distributed file system (DFS).

In the last session we started with this same topic, the design and implementation of a DFS, and covered file naming approaches. In this session we take up name resolution. Name resolution is done by a name server: it is the process of mapping the names specified by clients to the stored objects, such as files or directories, in the namespace. There are different approaches to name resolution.

The first approach uses only a single name server. As you know, the job of the name server is name resolution, that is, mapping the file names used by clients to the files and directories stored as objects. Having a single name server is the easiest approach, but it may become a single point of failure or a performance bottleneck: a single name server cannot handle a very large number of file requests, and if it fails, the whole name resolution process collapses.

The second approach uses multiple name servers maintained on different hosts, that is, on different computers, which entirely avoids the single point of failure. Here each server is responsible for mapping the objects stored in a different domain. When a name used by a client, say /a/b/c, is to be mapped to an object, the local name server is always queried first. Suppose the local name server resolves the first part, a; the remaining parts, /b/c, do not exist on its local file system but on some remote file system, so the local server in turn queries the remote server to map the /b and /c parts of the name. This process is repeated until the complete name, all the way up to /c, is fully resolved; a small sketch of this iterative resolution follows.
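Here is a minimal sketch of that iterative resolution. The table contents, the server names, and the resolve function are hypothetical illustrations, not part of any real DFS: each name server is assumed to map a single path component either to a stored object or to a referral naming the remote server responsible for the rest of the path.

    # Hypothetical name-server tables: each server maps one path component
    # to a stored object or to a referral to a remote name server.
    NAME_SERVERS = {
        "local":    {"a": ("referral", "server_b")},
        "server_b": {"b": ("referral", "server_c")},
        "server_c": {"c": ("object", "file-object-42")},
    }

    def resolve(path, server="local"):
        """Resolve a name like /a/b/c component by component,
        starting at the local name server and following referrals."""
        for component in [c for c in path.split("/") if c]:
            kind, target = NAME_SERVERS[server][component]
            if kind == "object":
                return target        # the complete name is resolved
            server = target          # referral: query the remote server next
        raise LookupError(path + " did not resolve to an object")

    print(resolve("/a/b/c"))         # -> file-object-42

In a real DFS the referral step would be a network round trip to another host rather than a dictionary lookup, but the component-by-component structure is the same.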
Now let us see the next issue in the design and implementation of a DFS: caching on disk versus caching in main memory. The benefits of caching files at clients have been observed in distributed systems, and this was also touched on in the previous session. The data can be cached either in the local main memory or on the hard disk at the client side.

First, caching in the client's main memory. The advantages are faster access, because when a client needs the same file again and again it can retrieve it from its own cache instead of going all the way to the file server; the ability to use diskless workstations, which are cheaper, since the files are cached in local memory; and a single design for the caching mechanism at both the server and the client side, which makes it much easier to coordinate the caches on both ends.

The disadvantage is that there is contention between the cache and the virtual memory system for physical memory space, because virtual memory also uses physical memory and the cache is simply a part of physical memory; hence a mechanism is needed to deal with this contention. Furthermore, large files cannot be cached in main memory in their entirety, so caching must be block oriented: since space in the cache is limited, a large file cannot be accommodated at one time, so only the blocks that are needed first are cached, then the other blocks, and so on. The disadvantage of block-oriented caching is that it is more complex and imposes more load on the file servers when an entire file needs to be cached, because the file is downloaded from the server block by block, and it requires a more complex cache manager and memory management system.

Now, caching on the local disk instead of in main memory. The advantages are that large files can be cached, which was not possible in main memory, without affecting the workstation's performance; virtual memory management remains very simple; and it facilitates the incorporation of portable workstations.

Next, the writing policy. The writing policy decides when a modified cache block at a client should be transferred back to the server: when a client caches a file using block-oriented caching and makes some modification to a block, when should that modification be sent to the server? There are two basic policies: write-through and delayed writing.

In the write-through policy, all writes requested by applications at the clients are also carried out at the servers immediately, without any delay. As soon as a client makes a modification to a file block on its own machine, the change is sent to the server so that the server makes the same change in its copy of the block. The advantage is reliability: even if the client crashes, very little information is lost, because write-through has already applied the changes on the server side. One more point: the write-through policy takes no advantage of the cache for writes, because it sends every change to the server immediately. A small sketch follows.
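Here is a minimal sketch of a block-oriented client cache using the write-through policy. The Server class, the 4 KB block size, and all method names are hypothetical stand-ins for a real file server interface, chosen only to illustrate the mechanism.

    BLOCK_SIZE = 4096  # the cache works on fixed-size blocks, not whole files

    class Server:
        """Stand-in for the remote file server."""
        def __init__(self):
            self.blocks = {}                     # (file, block_no) -> data
        def read_block(self, file, block_no):
            return self.blocks.get((file, block_no), b"\x00" * BLOCK_SIZE)
        def write_block(self, file, block_no, data):
            self.blocks[(file, block_no)] = data

    class WriteThroughCache:
        """Client-side block cache; every write also goes to the server."""
        def __init__(self, server):
            self.server = server
            self.cache = {}                      # (file, block_no) -> data
        def read(self, file, block_no):
            key = (file, block_no)
            if key not in self.cache:            # miss: fetch only this block
                self.cache[key] = self.server.read_block(file, block_no)
            return self.cache[key]               # hit: no server round trip
        def write(self, file, block_no, data):
            self.cache[(file, block_no)] = data
            # write-through: the server copy is updated immediately, so a
            # client crash loses very little, but writes gain nothing
            # from the cache.
            self.server.write_block(file, block_no, data)

    srv = Server()
    cache = WriteThroughCache(srv)
    cache.write("notes.txt", 0, b"hello")        # server updated immediately
    assert srv.blocks[("notes.txt", 0)] == b"hello"

Note how reads benefit from the cache while writes do not: every write costs a server round trip, which is exactly the trade-off described above.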
The second writing policy is delayed writing: modified blocks are returned to the server after some delay, not immediately. What are the advantages here? Many writes can be performed on a block in the cache first. Some of the data written may be intermediate results that are deleted again a short time later, in which case they need not be written to the server at all: because the changes are sent to the server only after a certain delay, any intermediate results deleted within that delay, being no longer required, are never written to the server under this policy. The disadvantage is that in the event of a client crash, a significant amount of data will obviously be lost: the modifications have not yet been sent to the server, and if the client crashes before sending them, all the modifications made to the data are lost. Hence there is no reliability.

The third writing policy is to delay updating the files at the server until the file is closed at the client. This is a small change to the second policy: the files are still updated in the client's cache, but the updates are sent to the server only when the file is closed at the client side. The traffic at the server therefore depends on the average period for which files remain open. Suppose a file is open for 10 milliseconds; only after those 10 milliseconds, when the file is closed, are its updates sent to the server. So traffic at the server arises only when files are closed.

Now, cache consistency. Multiple clients may have cached the same data on their own machines and may modify it simultaneously, so the caches can go into an inconsistent state. There are two approaches to ensure that the data returned to clients is valid: the server-initiated approach and the client-initiated approach.

In the server-initiated approach, the servers inform the cache managers whenever the data in a cache becomes stale, that is, out of date because the data has been modified elsewhere. The cache managers at the clients can then retrieve the new data or invalidate the blocks containing the old data in their caches. This approach requires the server to maintain reliable records of which blocks of data are cached by which cache managers.

In the client-initiated approach, it is the responsibility of the cache managers at the clients to validate data with the server, checking whether it is valid or invalid, before returning it to the clients. The disadvantage of this approach is that the communication costs are higher.
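Before the question that follows, here is a minimal sketch contrasting the delayed writing policy with the write-on-close (third) policy. The Server stub, the delay value, and the method names are hypothetical; the point to notice is that a buffered block overwritten before the flush never reaches the server.

    import time

    class Server:
        """Stand-in for the remote file server (as in the earlier sketch)."""
        def __init__(self):
            self.blocks = {}
        def write_block(self, file, block_no, data):
            self.blocks[(file, block_no)] = data

    class DelayedWriteCache:
        """Buffers modified blocks; flushes them after a delay or on close."""
        def __init__(self, server, delay_seconds=5.0):
            self.server = server
            self.delay = delay_seconds
            self.dirty = {}          # (file, block_no) -> (data, write time)
        def write(self, file, block_no, data):
            # Repeated writes simply replace the buffered block, so
            # short-lived intermediate results are never sent to the server.
            self.dirty[(file, block_no)] = (data, time.time())
        def flush_expired(self):
            # Delayed writing: push blocks older than the delay to the server.
            now = time.time()
            for key, (data, written) in list(self.dirty.items()):
                if now - written >= self.delay:
                    self.server.write_block(key[0], key[1], data)
                    del self.dirty[key]
        def close(self, file):
            # Write-on-close (third policy): flush everything still dirty
            # for this file when the client closes it.
            for key, (data, _) in list(self.dirty.items()):
                if key[0] == file:
                    self.server.write_block(key[0], key[1], data)
                    del self.dirty[key]

    srv = Server()
    cache = DelayedWriteCache(srv)
    cache.write("report.txt", 0, b"intermediate result")  # buffered only
    cache.write("report.txt", 0, b"final result")         # replaces it
    cache.close("report.txt")   # only b"final result" reaches the server

A crash before flush_expired or close runs would lose every buffered block, which is exactly the reliability disadvantage of delayed writing described above.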
Now let us pause the video for a while: think about and write down the approach in which the writing of intermediate results to the server is avoided. As per our discussion, three options are given here: the immediate (write-through) writing policy, the delayed writing policy, and the third writing policy.

We saw that in the delayed writing policy, where the updates to the file server are sent after a certain amount of time, it is possible for intermediate results to be deleted on the client side: the modifications made to a file generate intermediate results that stay around only for a certain amount of time, and before the file updates are sent to the server these intermediate results are deleted on the client side, so they are not sent to the server at all. The correct answer here is therefore the delayed writing policy.

These are some of the references used for this session. Thank you.