Two concepts, replication and caching, come up frequently in content delivery networks. What is the relationship between the two, and how can we integrate them to get better CDN performance? That is what we shall discuss in this module.

Today's internet-based businesses are concerned with the quality of service of content delivery, in terms of latency and performance. Two options, proxy servers and CDNs, are both considered integral to content delivery. Proxy servers are used for caching, whereas CDNs are used for replication. A CDN, our main focus area, provides dedicated servers for handling a large number of user requests and content of multiple types. This requires copies of the content, known as replicas, to be kept across multiple locations for a fairly long period of time. Proxy servers, on the contrary, treat cached content as transient: what is cached changes according to the usage pattern and the cache replacement algorithm.

Here is a quick summary of some features on which proxy servers and CDNs are compared. The key practice in proxy servers is caching, marked by C; in CDNs it is replication, marked by R. The cached content in proxy servers changes very frequently and is driven by what users request, for instance the users of an internet service provider, whereas in CDNs the content is predefined by the content providers that the CDN supports. You see, the perspective is altogether different. Scalability is low in the first case and high in the second. As for performance, proxy servers suffer from flash crowd events, whereas CDNs stay fairly stable.

Now, how can we integrate caching and replication? For that we have a reference, the diagram on the left-hand side. On the x-axis we have the frequency of changes.
If the frequency of changes is too high, that content cannot be cached; it has to be replicated every time. If the content is rarely updated, then we have a question: is the change predictable or not? If it is predictable, we can make a trade-off between caching and replication, because the fixed part can be replicated and the changing part can go in the cache. If it is not predictable, we have no option but to replicate again. In other words, whether the content is rarely or frequently updated, the change is either periodic or non-periodic, and for non-periodic change it is not advisable to use caching; go for replication.

Now, based on this discussion, we can look at the diagram on the left-hand side, which gives us a nice relationship between the two. Let's say we have the origin server, and its content is changing dynamically. That means part of it has to be replicated and part of it has to be cached. How much has to be replicated? Well, it depends on the replication strategy. For instance, there is a replication strategy called IL2P, which is a two-phase strategy. In the first phase, the placement, that is the location that will hold the replica, is identified. In the second phase, the viability, or business sense, of that location is assessed against the user requirements.

So if we have a replication strategy, we correspondingly need a caching mechanism as well. From operating systems we know the least recently used and least frequently used policies for page replacement. In a similar way, we can update our cache using these policies. So once we have a replication strategy and a corresponding caching strategy, we can think about assigning percentages to them.
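The decision rule described earlier, based on change frequency and predictability, can be written down as a small sketch. This is only an illustration of the rule as stated in the lecture; the names `Strategy` and `choose_strategy` and the two boolean inputs are assumptions made for the example, not something taken from the reference.

```python
from enum import Enum


class Strategy(Enum):
    # Hypothetical labels for the two outcomes discussed in the lecture.
    REPLICATE = "replicate"
    MIXED = "replicate the fixed part, cache the changing part"


def choose_strategy(frequently_updated: bool, predictable: bool) -> Strategy:
    """Pick a strategy from how often content changes and how predictably."""
    # Content that changes too often is replicated rather than cached.
    if frequently_updated:
        return Strategy.REPLICATE
    # Rarely updated and predictable: trade off caching against replication.
    if predictable:
        return Strategy.MIXED
    # Rarely updated but unpredictable: fall back to replication.
    return Strategy.REPLICATE


if __name__ == "__main__":
    for freq in (True, False):
        for pred in (True, False):
            print(freq, pred, choose_strategy(freq, pred).value)
```

Only the rarely updated, predictable case ends up with a mix of the two; every other combination falls back to replication, which is how the rule was stated.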
Taking the total storage as one, we can write C plus R equals one, where C is the fraction used for caching and R is the fraction used for replication. That is, the two shares add up to one. For instance, 50% replication means 50% caching. By combining the two in the right proportion, we can best fit particular user requirements. This integrated use of replication along with caching is expected to perform best, keeping the cost low while maximizing the user's quality of service. The reference is again Rajkumar Buyya, Content Delivery Networks.
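The storage split, with the caching and replication shares adding up to one, can be sketched as follows. This is a minimal illustration under assumptions: storage is counted in whole object slots, the cache part uses a least recently used policy, and the names `split_storage` and `LRUCache` are hypothetical helpers for this example, not part of any CDN product.

```python
from collections import OrderedDict
from typing import Tuple


def split_storage(total_slots: int, replication_share: float) -> Tuple[int, int]:
    """Split storage so that the cache share plus the replica share equals 1."""
    if not 0.0 <= replication_share <= 1.0:
        raise ValueError("replication_share must be between 0 and 1")
    replica_slots = round(total_slots * replication_share)
    cache_slots = total_slots - replica_slots  # the remainder goes to caching
    return replica_slots, cache_slots


class LRUCache:
    """Cache portion of the storage, evicting the least recently used item."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        if self.capacity <= 0:
            return  # no cache space was assigned
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used


if __name__ == "__main__":
    replica_slots, cache_slots = split_storage(10, 0.5)
    print(replica_slots, cache_slots)
    cache = LRUCache(cache_slots)
    cache.put("index.html", "<html>...</html>")
    print(cache.get("index.html"))
```

With 50% replication, half of the slots hold long-lived replicas and the other half behave like a proxy cache. Changing `replication_share` moves the balance between the two.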