Welcome, everybody, to this Celebration of Faculty Careers series, which was started in 2013 and is really an outcome of the strategic plan and the Faculty of 2020's emphasis on professional development of faculty through the different stages of their careers. This particular series focuses on full professors who are at least seven years past in rank, and it is really an opportunity for these faculty members both to provide a reflective overview of the achievements of their careers and to look ahead at the exciting areas in which they hope to continue their innovative path forward. And today, of course, it is my great pleasure to introduce Dr. Arif Ghafoor. Dr. Ghafoor received his PhD in electrical engineering from Columbia University in 1984, after which he was at Syracuse University in electrical and computer engineering, and he came to Purdue in 1991. He has been with us since then, with a focus on multimedia information systems, database security, and parallel and distributed computing. So without further ado, let's welcome Dr. Ghafoor to offer us his colloquium here today. Thank you very much. Well, thanks for attending the talk. Let me go ahead and start, because I have a long way to go. The title, as indicated here, is "Models and Architectures for Multimedia Systems and Information Security." Here is an outline of my talk. It primarily consists of two major topics. One is related to multimedia systems, which comprises roughly my first 15 years of research, starting from the mid-80s all the way to the end of the 90s, and then I had the opportunity to move into the area of information security, which I have been working on for almost 20 years or so. Here are some of the topics I'll be discussing. Let me go ahead. As everybody is probably now familiar, multimedia is basically a combination of different modalities of data: audio, video, images, text, and so forth. I don't have any video right now; otherwise this would be a nice video to play.
So the question is how you compose all this data together, how you store it, and how you communicate it over the network. If I look at a distributed multimedia information system, it is a multi-disciplinary technology consisting of many sub-areas, ranging from the computational aspects of different media all the way to AI and knowledge representation for the contents of the different types of data sets. My primary work has been in the area of distributed databases, with some work on networking and distributed control for the transmission of data. So let me briefly mention the requirements for building a multimedia database system. Why do we need a system that manages different types of data? The issues involve dealing not only with text but with different types of data together, for example audio with text. The questions are how you model the information, that is, the semantic models of the contents of the different data sets; how you index the information for fast searching; how you combine information from different modalities together to prepare what we call a multimedia document; how you store this information; and how you communicate it over the network. Based on these requirements, we proposed an architectural model for a multimedia database system. It is essentially a three-layer abstraction of the system. At layer number one, the bottom layer, you are dealing with individual data types: for example an audio database, a video database, a text database, an image database system. The middle layer deals with the composition and integration of the different types of data together, that is, how to prepare a multimedia document, and we'll discuss that in some detail. And the top layer, which is close to the user interface, allows you to structure documents, do media editing, and do different types of querying and searching of the information.
At layer number one, the question is, if we are dealing with an individual media type, for example video or audio or text, how can we do the indexing of information for searching purposes? Should we automate the process of indexing images or video, or should it be done manually? For example, if you are working on a video-on-demand server and you have to index a large volume of videos, can you do it manually? Should it be done automatically based on the contents of the information? So there is a trade-off of cost versus complexity and the robustness of identifying the contents of the data when preparing your indexing mechanism. The main driving force behind automation is that the volume of data acquired for inspection and indexing can be extremely large. So the question is how we process the media data, video data, or images, identify their contents, and represent them. Our initial work focused on video data, where we looked at motion-based indexing of events, and there are three approaches. The first is spatio-temporal logic and other models that were presented in the mid-90s. The second is a trace-based graph model, which we are discussing briefly today, abbreviated as the VSTG model. The third is the trajectory model, which is a kind of sub-case of that model. The idea in trace-based modeling is that you identify an object, bound it with a box in two or three dimensions, trace the movement of the object over a certain span of video frames, and then attach time durations and other information to the object so that you can subsequently use it to develop an indexing structure. Now, there can be multiple objects you are interested in, so the question is how object relations can be identified and represented.
This can be done by image processing, which I am assuming can be done at the frame level, where you can identify objects, tag them, and trace them over a sequence of frames. Then you can represent this space-time information in the form of a directed graph. The graph is basically a bipartite graph where nodes represent objects in the video clip, and you can map nodes to durations, that is, how long an object has appeared based on the number of frames. You can also look at the motion vector information, that is, how an object or set of objects moves across many frames. Here is an example of a VSTG model taken from sports video data. In the first part of the video, the first video segment, you have two objects being tracked: two players that have been bounded in boxes and are called O1 and O2. You can label them and track them over a long period of time. As new objects appear, say two more, you can also represent them in the graphical model; they are called O3 and O4. And in the third segment of the video, another object can appear in addition to what you already have from the previous segment, and you can track all this information in the model itself. So the idea is to extract the motion information of objects and store it in a graphical model. The question is what we do with this VSTG model. What we do is look at the relationships among these trajectories, that is, the appearances of the objects: they can appear one after another, or two objects can appear simultaneously. So in this diagram we are looking at binary relations between processes, where a process can be the appearance of an object in the video frames. They can appear together, their appearances can overlap, or they can be sequential.
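To make the bookkeeping concrete, here is a minimal sketch of a VSTG-style structure: each tracked object maps to the segments (frame ranges) in which it appears, plus a coarse motion vector per segment. The class and method names are illustrative, not taken from the original system.

```python
class VSTG:
    """Toy video semantic graph: objects mapped to their frame spans."""

    def __init__(self):
        self.appearances = {}  # object id -> list of (start_frame, end_frame, motion_vector)

    def add_appearance(self, obj, start, end, motion=(0, 0)):
        self.appearances.setdefault(obj, []).append((start, end, motion))

    def duration(self, obj):
        # total number of frames in which the object appears
        return sum(end - start + 1 for start, end, _ in self.appearances[obj])

    def co_appearing(self, a, b):
        # True if objects a and b share at least one frame
        for s1, e1, _ in self.appearances[a]:
            for s2, e2, _ in self.appearances[b]:
                if s1 <= e2 and s2 <= e1:
                    return True
        return False

# Example mirroring the sports clip: O1 and O2 tracked from the start,
# O3 and O4 entering in a later segment.
g = VSTG()
g.add_appearance("O1", 0, 120, motion=(3, 0))
g.add_appearance("O2", 0, 120, motion=(-2, 1))
g.add_appearance("O3", 80, 200)
g.add_appearance("O4", 80, 200)
```

Queries over this structure (total screen time, co-appearance) are exactly the kind of motion-based index lookups the model is meant to support.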
Sequential appearance, for example: the before operator tells you that one of the objects appeared for some amount of time, the other appeared for some amount of time afterward, and this is a sequential appearance. Similarly, the meet operation means the two appearances have no gap in between. The first one, before, means there is a gap between the appearances; the third one, overlap, means the appearances have some common frames; and similarly you have the during operation, equals, and so forth. Based on this information, we proposed a spatial modeling of these events and then carried those spatial events into the temporal domain. This is, sorry, a rather busy slide, but the idea here is that based on these binary relations, I can identify the spatial appearances of objects relative to each other as well as their temporal appearances over time. Here is an example: suppose we have a video frame with an arrow formation of four planes. The question is how I can represent the semantics of the arrow formation in terms of these binary relations. What we do is put a bounding box around each object and then take projections on the x and y axes, or even the z axis if you have depth information about the objects. Then you can use these binary relations, which we also call operators, to develop spatial relations with AND/OR logic in them. That gives you a spatial logic. You can then extend this logic over time as well, which gives you a spatio-temporal model for representing events. Based on this quick review, here is the architecture we proposed for video databases. You have raw video data; you look at the sequence of frames, and at the same time you look at each frame and identify individual objects using standard image processing techniques, and I think a lot of work has been done on the image side.
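The operators above are the classic binary interval relations (before, meets, overlaps, during, equals, and so on). As a sketch, this function classifies the relation between two appearance intervals, each given as a (start_frame, end_frame) pair:

```python
def interval_relation(a, b):
    """Classify the temporal relation of interval a to interval b.
    Each interval is (start_frame, end_frame), start <= end."""
    s1, e1 = a
    s2, e2 = b
    if e1 < s2:
        return "before"    # gap between the two appearances
    if e1 == s2:
        return "meets"     # no gap and no overlap
    if s1 < s2 and s2 < e1 < e2:
        return "overlaps"  # partial overlap of common frames
    if s1 == s2 and e1 == e2:
        return "equals"
    if s2 < s1 and e1 < e2:
        return "during"    # a falls strictly inside b
    if s1 == s2 and e1 < e2:
        return "starts"
    if s2 < s1 and e1 == e2:
        return "finishes"
    return "other"         # inverse relations, handled symmetrically
```

Combining such relations on the x, y (and z) projections of bounding boxes yields the spatial logic described above, and applying them along the time axis yields the temporal half of the model.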
But then, after identifying an image of desired interest, for example if you are looking for a particular player, or an individual such as the president, or even an object like the Capitol building, you can trace it back through the frames, and then over a period of time you look at the objects' motions with respect to each other: within a frame, how they are related, and over frames, how these relations evolve. That gives you at least a low-level semantics of the events associated with these objects. We did a small implementation of this concept on biological data sets with the help of Paul Robinson, who is with the Purdue Cytometry Lab. The idea here is that if you look at cells and see how they interact with each other over time, we can identify events of interest. For example, it can give you insight into some of the challenges biologists are interested in: how we can detect a disease outbreak, what the effect of a drug is at the cell level, and how cellular biological processes evolve over a period of time. At the cell level, the information is not only about standard features; it is a much higher-dimensional data set, since you can look at spectral information about the objects, which in this case are cells. The current environment is HTS, high-throughput screening. The idea is that if you look at these images of cells, they are very dense, and the question is how you can extract their features and determine the state of a cell as it is being treated with different drugs. So what we developed, using our spatio-temporal model, is a finite-state representation; we call these event-detection finite state machines. Each state itself, at the lower level, is defined by the spatio-temporal relations I briefly discussed in the previous slides.
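As a concrete sketch of an event-detection finite state machine: an event is a fixed chain of stages, each stage meeting the next in time, and the machine accepts only observation sequences that walk the chain in order to completion. The stage names and durations here are illustrative, not the paper's exact labels.

```python
# Illustrative stage chain for a staged cellular event; each consecutive
# pair of stages is connected by a "meet" relation (no gap in time).
STAGES = ["approach", "contact", "engulf_start", "engulf_complete", "diffused"]

def detect_event(observations):
    """observations: list of (stage, duration) pairs in temporal order.
    Returns the total elapsed time if the observations walk the stage
    chain in order with positive durations and reach completion;
    otherwise returns None (no event detected)."""
    elapsed = 0
    expected = iter(STAGES)
    for stage, duration in observations:
        if stage != next(expected, None) or duration <= 0:
            return None              # out-of-order stage or bad duration
        elapsed += duration
    if next(expected, None) is not None:
        return None                  # chain never completed
    return elapsed

# One complete observation sequence (durations in frames, illustrative):
obs = [("approach", 4), ("contact", 2), ("engulf_start", 6),
       ("engulf_complete", 3), ("diffused", 5)]
```

Running the detector over tracked bounding-box pairs is how the per-cell state (and its accumulated time) is determined.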
So here is an example; I am still trying to remember these terminologies. The event is called phagocytosis, which is basically the engulfing of a microbe by a cell in a biological process, and the question is how you track this process of phagocytosis. It can be broken down into five different stages over time: the cell comes close to the organism, makes contact with it, enters inside it, and is diffused inside it, which is the completion of the process. So the finite state model can represent these five distinct events, their durations, and their ordering with respect to each other. Basically, we are looking at the meet operation; M stands for meet, which means one event follows another, with or without a specified duration, and meet means they are exactly connected together in time. As for the parameters: D represents the duration of an event, and S represents the accumulated time of the process up to that point. You are dealing with two sets of objects, each put in a bounding box; for example, this bounding box represents the main organism, and the small one is the cell that has entered the first object. So what we did is some experimentation with about 500 cells from HeLa cervical cancer cells, which were treated with the drug camptothecin, and the question here is that we want to see the effect of the drug on these 500 cancer cells. The cell states are represented as: the cell is live, the early apoptotic state, and then the late state, when the cell is dead as a result of the treatment with that drug. The question is whether the traditional cytometry experiment is sufficiently reliable in predicting the effect of the drug.
So the diagram on the left-hand side shows the population distribution across these three states. What we did is say, well, if you have already tracked the state of each cell based on this model, you can correct your distribution, which turned out to be better, and that is given on the right-hand side. On the left-hand side we had overestimated the live-cell count, overestimating the ineffectiveness of the drug; the drug was really more effective than it was thought to be. And the question is, what is the cause of the errors? Well, there can be several reasons: photobleaching of the dyes can introduce some error, inconsistent illumination of the cells can cause some error, and autofocusing of the slide on which these cells are present may cause some error. So the process can remove these errors, give you a better count, and indicate the real effectiveness of the drug. This kind of summarizes the first layer of the architecture, especially the video-data part. Now let me move to the second layer of the architecture we proposed. The idea here is that you take different data sets together and compose documents. For this purpose, what we did is propose a synchronization model. I'm sorry, this is a kind of fuzzy slide; it is a really authentic slide, directly from the paper, and I promise I'll give you more colorful slides in a minute. So what happened is that we can define the synchronization of different media together in time and space by using Petri nets, and we call it the Object Composition Petri Net (OCPN). I believe this is one of our main contributions in this area. A Petri net is basically a graphical representation in which you have places, transitions, and tokens, which together describe the process of presenting the information to the end user.
The way we did it, the fundamental concept here, is that based on the temporal relations proposed by James Allen in the AI area, you can translate every relation into a corresponding Petri net, and if you do that, you can represent the presentation process of an entire document, including images, text, and so forth, by a complete Petri net using these temporal models. So, for example, it is very easy to understand: you can say, okay, we are going to have audio, video, and an image together for so long, the text will come later, and the video can start after a brief delay from the start of the process. So this model gives much more detail about the temporal synchronization that must be satisfied at the user side. The question is, if this document is stored far away from the user, how do you have synchronized delivery of the document to the end user? This model has been analyzed extensively and has also been applied and modified by many researchers. Basically, the model not only captures the end-user requirement of how information should be presented; it can also give you an idea of how to develop a synchronization protocol for the network in order to ensure timely delivery of the data. You can do storage I/O management, that is, decide how this data should be pulled off the storage devices without extensive buffering at the operating system level. You can also introduce security mechanisms into the model, and we'll describe that in a later part of the presentation.
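The OCPN's firing discipline can be sketched with a toy Petri net: a transition fires only when every input place holds a token, which is exactly how a synchronization point waits for all of its media streams. Place and transition names here are illustrative.

```python
class PetriNet:
    """Minimal place/transition net with token-based firing."""

    def __init__(self):
        self.tokens = {}        # place -> token count
        self.transitions = {}   # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def enabled(self, name):
        # a transition is enabled only when every input place has a token
        inputs, _ = self.transitions[name]
        return all(self.tokens.get(p, 0) > 0 for p in inputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert self.enabled(name), "transition not enabled"
        for p in inputs:
            self.tokens[p] -= 1
        for p in outputs:
            self.tokens[p] = self.tokens.get(p, 0) + 1

# Synchronization point: audio and video must both finish before the
# text caption is shown.
net = PetriNet()
net.add_transition("sync", inputs=["audio_done", "video_done"],
                   outputs=["show_text"])
net.tokens["audio_done"] = 1   # audio has finished; video still playing
```

Attaching playout durations to places turns this skeleton into an OCPN-style presentation schedule.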
So the idea here is that you look at the document itself: we can annotate the OCPN document with information like how much reliability you want to ensure for video, audio, text, and so forth; what resolution you want to provide to the end user (if you have multiple copies of the same image at different resolutions, you can choose among them); and at what rate you want to present the video data, and so on. Given this information, which is a document specification model, the question is how you translate these requirements at the network level, the operating-system level, the storage level, and the security level, and how you go end to end, delivering right from the server all the way to the end user: what are the implications for the resources required in the operating system as well as the network? Now, there is a study done by IBM in Heidelberg which says that when you are synchronizing information, especially audio and video, there are synchronization requirements: you should not have out-of-sync delivery of video and audio data. So, for example, if you are presenting video with animation, the correlation skew between the two streams cannot be more than 120 milliseconds; for video with audio, the slippage, which is called the lip synchronization requirement, cannot be more than 80 milliseconds; and so forth. Similarly, for audio with audio in a tightly coupled stereo system, 11 milliseconds is the requirement. These experiments were done with a whole group of users to understand the synchronization requirements between different modalities such as audio, video, and so forth. So what we did is say, okay, based on the model, we can extend the model and attach more attributes to each object.
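The synchronization tolerances quoted above reduce to a simple skew check per stream pair. A minimal sketch, using the threshold values cited from the study (media-type labels are illustrative):

```python
# Maximum tolerable skew per stream pair, in milliseconds, following the
# thresholds quoted above: 120 ms for video with animation, 80 ms for
# lip sync (video with audio), 11 ms for tightly coupled stereo audio.
SKEW_LIMIT_MS = {
    ("video", "animation"): 120,
    ("video", "audio"): 80,    # lip synchronization requirement
    ("audio", "audio"): 11,    # tightly coupled stereo channels
}

def in_sync(kind_a, kind_b, timestamp_a_ms, timestamp_b_ms):
    """True if the two streams' current presentation timestamps are
    within the tolerable skew for that pair of media types."""
    limit = (SKEW_LIMIT_MS.get((kind_a, kind_b))
             or SKEW_LIMIT_MS.get((kind_b, kind_a)))
    return abs(timestamp_a_ms - timestamp_b_ms) <= limit
```

A delivery protocol would run such a check continuously and pause or skip a stream that drifts past its limit.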
These attributes include, for example, the duration of each object (this video should be 50 seconds long), the size (a 12-megabyte storage file), the display area on the screen, a certain deadline with respect to the start of the document, the rate and the reliability, content information (you can note that this video is about sports, and so forth), and the schema itself. Then, over time, when you look at how this delivery happens, you look from the networking point of view at what bandwidth is required to transmit this data without excessive buffering on the user side. It is a variable-rate model: it transmits video at a high rate, then it transmits some text at a lower rate, and so forth. The question is how the network can guarantee delivery of the information without any dropping of the data. Should the network look at every slot here, based on what type of data is being transmitted, or should it look at an average bandwidth requirement, in which case anything above this bandwidth must be prefetched and stored on the client side so that data is not slipped, skipped, or dropped, and the full document is delivered?
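To make the average-bandwidth question concrete, here is a small sketch (with illustrative numbers) of how much data must be buffered ahead at the client when a variable-rate document is delivered at a constant allocated rate:

```python
def max_backlog(profile, allocated):
    """profile: per-slot bandwidth demand of the document (units/slot).
    Returns the peak shortfall of demand over the constant allocated
    rate, i.e. how much data must be prefetched into the client buffer
    so that no slot starves."""
    backlog = peak = 0
    for demand in profile:
        backlog = max(0, backlog + demand - allocated)
        peak = max(peak, backlog)
    return peak

# Video-heavy slots followed by text-only slots; the network grants
# exactly the average rate (5 units/slot here).
profile = [9, 9, 1, 1]
```

With slot-by-slot allocation the buffer need would be zero but the reservation would have to track every peak; with average-rate allocation the reservation is flat and the peak shortfall tells you the buffer to provision.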
So the QoS parameters for end-to-end resource allocation: delay is a common thing; jitter is basically the delay between successive packets; skew concerns the synchronization of multiple streams; and reliability is whether any droppage is allowed, for example, can I skip some frames or some audio packets? And what happens if you are dealing with a mobile user: how are these quality parameters translated into resource allocation at, let's say, the base station or cell service station level? I'll skip some of the topics. What we did is look at the bandwidth requirement, and this is basically a graph showing how different technologies have evolved over a period of time in terms of bandwidth, and also how these technologies span the scope of coverage. Low mobility means, for example, fixed devices as receivers, or portable laptops and so forth; from pedestrian to vehicular movement, what you have is a high-mobility environment. You are dealing with two broad kinds of technologies, one at the cellular base-station level and one at the Wi-Fi level, and what we did is develop technology for transmission of documents for both types of mobility. Okay, so the question is, when you are talking about resource allocation in order to meet the QoS requirements at the low-mobility or high-mobility level, what parameters should you be interested in? We did some experimentation with the reliability requirement, where droppage of information can be tolerated to some level. So the question is, if I am remotely browsing different documents, or parts of a document, on a server, and the documents keep coming to you, you see a changing bandwidth-requirement profile from one browsed item to another.

So what I need is some information about the bandwidth requirement up front, before I start delivering the document from the server side. The issue becomes trickier when you are dealing with a mobile user. You are looking at a model where you have base stations, a user is in the coverage area of a certain base station and keeps streaming a document over the network, and then, as he or she continues moving from base station to base station, the channels should keep switching over the internet so that you continuously receive the desired document without any droppage in quality. Okay, so what we did is propose a 4G solution, what is called a 4G all-IP mobile wireless access infrastructure. The idea here is to develop an overlay protocol at the access point so that I can aggregate the network resources for different users. One user may have a high quality requirement, for example, if you are streaming video versus text only, you might be interested in getting more RF channel bandwidth than another user who may be pulling only text or low-quality audio. So the question here is how resource allocation can be performed at the base station level. And so what we did is say, okay, at the cell level, can we have a buffering mechanism where the cell itself can be part of the IP infrastructure, which can keep streaming data and buffering it, and then, depending on the allocation of bandwidth, deliver the data to the end user? The idea here is that there is a speed mismatch: the internal network has delay, while on the other hand the speed on the radio-frequency channel is much faster, with essentially no delay there, so you have to smooth out this mismatch by providing a buffering mechanism at the cell level. So the question is how different resources can be allocated to different types of users. What we did is some theoretical analysis: okay, if the total capacity of the bandwidth
is, say, C, available from the base station, and you are transmitting different objects for different users, how can we distribute the bandwidth among the users? Depending on the number of users, their bandwidth requirements, and the available bandwidth, it is quite possible that you may have to drop some information for every user so that everybody can be accommodated. So we can formulate this optimization problem as a quadratic assignment problem and then basically ask, can we distribute the droppage of information across all the users? We did this work in the 90s with Professor Prabhu from industrial engineering; he was in reliability and scheduling. This is work done by one of my students, who graduated in the 90s, and it was the first time research was proposed on how multimedia communication should happen in a mobile environment. The idea is a simple protocol you can design for admission control: you run your NLP, a non-linear programming optimization problem; if a solution exists for the newcomer, fine, you go ahead and allocate the bandwidth; otherwise the connection can be rejected. Now, one issue here is that you may be treating everybody equally. There may be high-paying customers who are willing to pay a high price for high-quality data transmission versus lower-paying customers. What you can do here is define multiple service classes and carry out this computation for resource allocation within each class, rather than doing it uniformly across all the users. So the question is how this whole synchronization protocol and network support fit together. This is the architecture we proposed: you look at the broadband multimedia network layer; on top of that you have QoS-based routing and the end-to-end resource allocation protocol we just studied; and then the
issue here is, once I have bandwidth available, suppose I have been given an average bandwidth of one megabit per second or ten megabits per second, is it sufficient to communicate my document? What happens if the document has a high bandwidth requirement at some times and a low one at others? What should be done once the bandwidth has been allocated? So what we look at, the thing we were just discussing, is a middle-layer configuration where you have been given a bandwidth. Based on this allocation, whether at the cellular level or the internet level, the question is how I can guarantee end-to-end synchronization: whatever bandwidth I have, I have to live with it. Then you can look at the document's stream of packets, and one option is to dump the whole document onto the end user; if he or she has enough buffer available, that is great, and this is called downloading. On the other hand, if the buffering is limited, you have to slowly release the document over a period of time so that the consumer can process the document at its own pace, without any interruption in the presentation, especially for the time-dependent data, which is video and audio. So we looked at different mechanisms from the server's point of view for how the document should be released: the document is chopped into segments, which we call SIUs, synchronization interval units, I believe that is the name, and there are many criteria for designing these synchronization protocols on the server side once the bandwidth has been allocated. Again, the quality of each of these protocols is determined by the chances of buffer overflow, because you are dumping data onto the client side, or, if you are slowly releasing the data and underestimating the network delay, of buffer underflow, in which case you are
already out of sync with the presentation requirements of the document. Okay, so this is some of the work that has been done and published in the literature, and I am glad to share the citations. Let me move on to the next topic, which is the second part of my work, which I started somewhere in, I believe, the late 90s, when the CERIAS center was established and we had the opportunity to start our research in the information security area. So this is a slightly different topic, which we started about 15 to 20 years ago. Here my emphasis has been on information security, and in particular on authorization mechanisms from the user's perspective for information, where we are dealing with access to databases. So one strategy for authorization used in the literature is called role-based access control (RBAC), which came from NIST; NIST developed this model. The idea here is that you have a set of objects, or assets, that you want to share with users, and a user has to have some privileges to access those objects. So the question is how you control these privileges. You can do it at the user level, but if there are a lot of users, you have a real complexity problem managing these accesses. So you can start defining roles: a user may be a physician, for example, or an administrator in certain cases. A role can be assumed by many users, and roles can perform certain operations on the set of objects. So basically you have the user-to-role and role-to-permission assignment challenge, and the question is how you can define policies for who can assume which roles and what privileges are assigned to different roles. The main advantage of the RBAC model is that it gives you an efficient management scheme; you can have a hierarchy of roles, which is very natural in any enterprise setting, and there are also constraints.

For example, separation of duty constraints are there to prevent fraud: conflicting roles should not be assumed by the same user; for instance, a person who approves a check should not also be the person writing the check. And there are many areas in which access control has been implemented. What we did is say, okay, there should be more than just simple accesses: you can add more context information into this model, based on time and based on location, for example, where you are accessing the information and at what time you are accessing it. So we developed a model called the generalized temporal RBAC (GTRBAC) model. The idea here is that a role has to be enabled before it can be assumed, and disabled afterward. So, for example, you can have different types of constraints on a role: a nine-to-five role is not available outside that time, so the user cannot have access to the information, and you can even constrain role activation, and so forth. We proposed this model, which has also been extensively expanded and analyzed by different researchers. As an example, in a county office you can have a tax exemption processor, a tax payment processor, or a tax refund processor role, and you can specify when these roles can be activated and what type of access they have on certain types of data sets: reading, writing, changing, and so forth; the types of privileges can be assigned. You can also assign activation constraints, for example that a role can only be activated for a limited duration, say 120 minutes, and so forth. So the issue now is, if you are dealing with a federation of agencies, how can different agencies operate when access privileges are limited? How can information be shared? There are basically three ways you can do it. One is to have tight coupling among the agencies in terms of policy sharing, so there is less concern about autonomy;
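A minimal sketch of the GTRBAC enabling idea, with hypothetical role names, assignments, and hours (none taken from the actual model specification): a role can be activated only if the user is assigned to it and its temporal enabling constraint currently holds.

```python
# Hypothetical temporal enabling constraints: role -> (start_hour, end_hour).
ENABLING_HOURS = {
    "tax_payment_processor": (9, 17),  # a "nine-to-five" role
    "tax_refund_processor":  (9, 12),
}

# Hypothetical user-to-role assignments.
ASSIGNMENTS = {
    "alice": {"tax_payment_processor"},
    "bob":   {"tax_refund_processor"},
}

def can_activate(user, role, hour):
    """True only if the user is assigned the role AND the role is
    currently enabled by its temporal constraint."""
    if role not in ASSIGNMENTS.get(user, set()):
        return False
    start, end = ENABLING_HOURS[role]
    return start <= hour < end
```

The point of the model is that this temporal gate sits in front of the ordinary role-to-permission check, so the same policy yields different effective privileges at different times.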
more concern is the sharing information in a real sense and this the idea here is if there are multiple agencies they have their own policies can we get a global data policy which is consistent can we develop it and the point here is if you do that there's you have to kind of lose in your autonomy but that is okay especially for the government agencies because the emphasis is more on sharing rather than protection information okay versus other enterprises for example when hospitals are collaborating okay you want to protect your policies and have it more autonomous um so the first where the first type of collaboration is called federated collaboration in which you can say all right all the agencies or our organization they must disclose their policy map things together roles should be mapped if I'm let's say an administration in one organization and a covenant boss in another organization there should be a covenant mapping should happen in terms of the roles and then start merging the policies and if they're adding conflict try to resolve the conflict and coming up with a global scheme so that when you do the integration the question remains is you want to make sure whatever you can access originally in your own domain when things are all merged together you still continue doing it and also if you cannot access some information in your domain the global integration should not allow you to do the same thing okay so the security should not be violated in any form okay so there are set of policy conflicts can happen role assignment can change separation of duty can change and user specific SOD separation of duty can also change when you do the merging so question is how you resolve how you identify these conflicts and how you resolve these conflicts so here's an example of sharing among county offices you may have a county billing sorry county clerk office interacting with the county trade at office and there are some information which are common for example social 
security numbers and names of the taxpayers can be common. On the other hand, there is information that can be shared across the offices, for example the tax amount or property redemption costs, and there is other information that is private within each office. These are the kinds of environments we are looking at, and the question is: if these two offices would like to collaborate, what is the issue? As I start connecting things together, a role in the county tax office (CTO) gets mapped to a role in the other office, and so forth; this is a graphical model of the RBAC policies, where a user can access information after roles have been properly mapped. But there is a problem. Take this user, say u3: assuming this role is permissible here, but this role cannot access that information locally. What happens is, if I go across to the other site and come back, each individual hop is permissible, and I end up with access that is not permissible locally. So the integration causes this violation, and you can see that other violations can happen similarly. A separation of duty violation can also occur: if a user can assume this role, he or she should not be able to assume that conflicting role; but by going across the other domain and coming back, the user can access that information, violating the SoD constraint. So the question is how you identify these conflicts and how you resolve them. We need a resolution mechanism, and when you are resolving a conflict you must keep in mind what the implications are and what the best way to resolve it is. There are many notions of optimality, and we can map this resolution problem into an optimization problem.
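The cross-domain violation just described can be sketched as a simple reachability check over the role-mapping graph. This is an illustration, not the actual detection algorithm from the work; the role names and edges are hypothetical.

```python
from collections import defaultdict

def reachable(edges, start):
    """All roles reachable from `start` by following role-mapping links."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
    seen, stack = set(), [start]
    while stack:
        r = stack.pop()
        if r in seen:
            continue
        seen.add(r)
        stack.extend(graph[r])
    return seen

# Local policy of one office: from r1 a user may only reach r2.
local = [("r1", "r2")]
# Cross-domain role mappings introduced by the federation (s1 is foreign).
cross = [("r2", "s1"), ("s1", "r3")]
local_roles = {"r1", "r2", "r3"}

local_only = reachable(local, "r1")
federated = reachable(local + cross, "r1")
# Any local role reachable only via the foreign domain is an induced violation.
violations = (federated - local_only) & local_roles
print(violations)  # {'r3'}: r1 gains r3 by going through the other domain
```

The same check works for SoD: if two mutually conflicting roles both become reachable for one user after adding the cross-domain edges, the merge has introduced an SoD violation.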
There can be various objective functions: for example, resolve the conflicts while maximizing data sharing, or maximizing the number of role-mapping links, or prioritizing certain accesses, and so on. You take the graphical representation of the RBAC models from the two domains and translate the whole RBAC model into an integer programming (IP) problem; the constraints within the individual policies are mapped into IP constraints. We proposed this approach; I am not going to go into too much detail here. You are optimizing a weighted objective, where the weight vector defines the optimality criterion, A is the constraint matrix, b is the constraint constant vector, and binary variables are used to define user-role accesses. Whatever your optimization criterion is, once you translate the problem into an IP problem you can solve it and identify the best solution with respect to that criterion. But there is a trade-off: you want to maximize interoperation among the agencies, but that can cause a loss of autonomy and a reduction in the protection of local accesses, since new constraints can arise across multiple domains. So there is a trade-off that needs to be examined. That was the tightly coupled federation of agencies. In the other extreme case, a domain says: I will keep my policy the way it is; you tell me when you want to use my resources, and I will see whether I can accept your request. So if I am running an overall workflow, I can identify from the workflow that I need certain accesses.
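The IP formulation above can be sketched on a toy instance. A real system would hand the problem to an IP solver; here brute-force enumeration over the binary variables stands in for the solver, and the weights and constraint rows are made-up numbers, not from the actual work.

```python
from itertools import product

# Three candidate cross-domain role-mapping links (hypothetical).
w = [5, 3, 2]            # benefit (e.g. data-sharing value) of keeping each link
A = [[1, 1, 0],          # links 0 and 1 together would violate an SoD constraint
     [0, 1, 1]]          # links 1 and 2 together would violate local autonomy
b = [1, 1]               # at most one link from each conflicting pair

best, best_x = None, None
for x in product([0, 1], repeat=len(w)):
    # Feasible iff every IP constraint row satisfies A·x <= b.
    if all(sum(a * v for a, v in zip(row, x)) <= bi for row, bi in zip(A, b)):
        value = sum(wi * vi for wi, vi in zip(w, x))
        if best is None or value > best:
            best, best_x = value, x
print(best_x, best)  # (1, 0, 1) 7: keep links 0 and 2, drop the conflicting one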
Some of those accesses are in domain one, some in domain two, and so forth, so that I can carry out my full workflow process. For that case we developed another methodology, which I will come to in a minute; this kind of model is much more suitable across private enterprises. For example, IBM has a Municipal Shared Services Cloud that enables application integration across many agencies in e-government processes; similarly, Industry 4.0 is an emerging standard for manufacturing and production composition across many enterprises. Each task of the overall workflow needs to access certain resources, those accesses are controlled by the individual organizations' policies, and I need to verify whether I can run my workflow across all these agencies or enterprises. Here is an example of a tax redemption process across, say, a county clerk office, a county tax office, and a district clerk office. As a user, I make an initial assessment request; that request is then transmitted to the district clerk office as well as the county tax office, each of which takes its own time, and the information comes back. If I am making an urgent request, can this whole thing be done within, say, 400 minutes? You then start looking at a workflow like this for the urgent request process, and the question is whether it is mappable to the individual policies across these domains. The domains will only tell you whether they can serve you or not; they do not disclose their internal policies. So we look at this interface specification of the policies, and the question is: is it possible to run the workflow across these domains, and can we verify workflows in the absence of a global meta-policy, which was the previous model we studied? As for how a domain policy is specified, we assume the GTRBAC model, where every
domain has its own time-dependent internal policy for providing resources at different times. Assuming GTRBAC is used to capture the policy requirements of each individual domain, the question is whether I can map my workflow across these three county offices. For example, the initial assessment request comes to the county clerk office, which can then transmit information to the other sites, and they can tell you whether they can finish the job according to your deadlines. In order to verify whether the urgent-request workflow process can be composed, what we need to do is break the workflow into individual workflows for the individual domains, what we call projected workflows, and check whether each projected workflow can be supported by its domain. Here is a simple state-transition picture where one part of the workflow has been handed to a domain, which looks at its own internal state model and tries to map the projected workflow onto it. The idea is that you have multiple asynchronous finite state machines running together, and you go through these models to see whether you can coherently verify a global workflow across these multiple finite state machine models, which represent the GTRBAC policies. Given your workflow, and this is the definition of the workflow, the question is whether you can satisfy two conditions: yes, I can support the workflow, and I can meet the timing requirement; that is basically the detail of this slide. And this is the architecture we have implemented: it takes your request for the overall workflow, divides it into segments for the different agencies, say domain one, domain two, and domain three, and sends each projected workflow to the corresponding domain.
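The projection-and-verification step can be sketched as follows. This is a simplification: it treats each domain's interface as a set of tasks it will serve and sums task durations sequentially against the deadline, whereas the actual method runs the projected workflows against finite state machine models of the GTRBAC policies. All task names, domains, and durations are hypothetical.

```python
# Hypothetical workflow: (task, domain, minutes). The overall workflow is
# projected onto each domain; a domain accepts its projection only if its
# interface will serve every task in it.
workflow = [("initial_assessment", "CCO", 60),
            ("tax_lookup",         "CTO", 120),
            ("redemption_record",  "DCO", 100)]

capabilities = {          # tasks each domain's interface policy will serve
    "CCO": {"initial_assessment"},
    "CTO": {"tax_lookup"},
    "DCO": {"redemption_record"},
}

def verify(workflow, capabilities, deadline):
    projected = {}
    for task, dom, mins in workflow:
        projected.setdefault(dom, []).append((task, mins))
    for dom, tasks in projected.items():
        if any(t not in capabilities.get(dom, set()) for t, _ in tasks):
            return False      # a domain cannot support its projected workflow
    # Condition two: the composed workflow must meet the timing requirement.
    return sum(m for _, _, m in workflow) <= deadline

print(verify(workflow, capabilities, 400))  # True: 280 minutes, under deadline
```

The two return paths mirror the two conditions in the talk: every projected workflow is supportable, and the timing requirement of the urgent request is met.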
The information comes back on whether each domain can support the workflow. Okay, I have to speed up; there are demos available for both of these systems. The last topic is privacy versus access control. The idea is: if you are sharing information, are you still protecting information from each other? The question is what the trade-off is between privacy and access control, and I will go over it quickly. You have access control over the information that needs to be disclosed, and you want to protect individual information. What you are looking at is the process of what we call generalization, in which you take individual values and put them into bigger blocks, so that individual identities cannot be disclosed. For privacy purposes we use the k-anonymization technique, which is a generalization approach from relational database systems. Given an access control policy over the data, you want to generate generalizations so that privacy can be preserved. So we look at this model, and the question is: can I enforce certain privacy requirements at the cost of losing some access to information, or at the cost of increasing the scope of my access beyond what I am entitled to, and still ensure privacy? That is the trade-off: you should not increase the scope of access too much; my access should stay limited, and yet I should still be able to meet the privacy requirement. The solution turns out to be a complex problem in terms of finding these anonymization blocks; we proposed a set of algorithms, because the original problem is NP-complete. Let me summarize quickly: k is basically the number of records merged into each block.
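The generalization idea can be sketched on a single attribute. This is a toy version of k-anonymization, not the algorithms from the work: it simply widens sorted values into ranges until every block covers at least k records; the ages are made-up data.

```python
def k_anonymize_ages(ages, k):
    """Generalize sorted ages into ranges so each block holds >= k records."""
    ages = sorted(ages)
    blocks, i = [], 0
    while i < len(ages):
        j = min(i + k, len(ages))
        if len(ages) - j < k:   # fold a too-small remainder into this block
            j = len(ages)
        blocks.append((ages[i], ages[j - 1], j - i))
        i = j
    return [f"[{lo}-{hi}] x{n}" for lo, hi, n in blocks]

ages = [23, 25, 27, 31, 36, 38, 41]
print(k_anonymize_ages(ages, 3))  # ['[23-27] x3', '[31-41] x4']
```

Each published range now hides at least k individuals, which is the privacy gain; the wider the ranges (the larger k), the less precise any access to the data becomes, which is the trade-off plotted on the slide.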
The higher the value of k, the stronger the privacy, but at the same time you have to give up more access, which is what the vertical axis indicates; that is the trade-off we want to highlight in this slide. Here is the system we implemented for this anonymization technique for relational data, and we have extended it to streaming data as well as graph data. Graph data is very important for us because currently most social networks are graph data, and you want to protect your privacy as much as you can, not only in terms of your IDs but also in terms of the association structure: who your friends are and how far they are connected to you, and so forth. Now, to put multimedia document security together, the first part and the second part: the idea is that I can take the OCPN model presented earlier and protect individual scenes in the video or audio frames, looking at different levels of resolution of the images. We can modify the OCPN model and put the security requirements inside the model: if the viewer is an adult, the presentation can go one way; if it is a child, it takes a different path, with different objects structured accordingly. This is the structure of the multimedia database system with the security model implemented. The last thing is a tool we developed in which users can control access to their own information: whether you are a physician, a patient, or a health provider, all of these parties can access the information, but as a patient I am the owner of the data, and I should decide who can and cannot access it, and where. So we have this tool available. We originally designed it for Facebook; we filed an initial patent, but somehow the
university did not encourage us to go forward with it, and after we had the paper ready, Facebook came out with this same model for protecting individual information on Facebook. Currently we are looking into building resilient cyber-based systems, and more into threat management across the system: how to detect threats, and what the response and recovery mechanisms are. Some of the challenges we are looking into are the scalability aspects of large-scale cyber-physical systems and real-time response and recovery. We started with some work on cloud data centers as part of such systems and recently published some work in this area. At the end, I would like to thank all the sponsors and collaborators, and most importantly the graduate students, as well as colleagues in the department, across the campus, and from other institutions. I would be glad to take questions. Sorry for taking more time than I should have, and I am sorry for rushing through the whole presentation; there was too much to cover, but I tried to condense it into some of the key projects we did, which I believe have had some major impact. Any questions? (An audience member asks, partly inaudibly, about which area he will continue working in.) As I said, my research has shifted more toward security. I think there are some very interesting challenges in security, and it has been a new learning experience for me, so at least for the near future I will continue working in the security area. (Another audience member asks how the shift came about.) As I mentioned, when CERIAS was established, the center started a seed funding program, and I looked at the modeling work we had been doing in multimedia, the OCPN model and the finite state models, and saw an opportunity for modeling quite a few things, especially the GTRBAC model. We were successful in getting that seed funding, and it then panned out into many NSF-funded projects down the road. It is sheer luck, I guess, that we have constantly been
getting funding from NSF. The current work we are doing has been with Northrop Grumman for the last few years, developing a kind of theoretical basis for resilient cyber-physical systems, which can be CPS, IoT systems, or cloud data centers. I find it very exciting that we may have access to some real data through that partnership. (An audience member observes that the Facebook-style work was such cutting-edge work that the people evaluating it did not know enough about it: a good idea can be ahead of its time, and then Facebook comes through with it.) I think two things happened like that. First, the student, Arjuno Samuel, who did this project, found a very good job with Microsoft, so he had to leave right in the middle, and with the student gone it was a major issue. We also got some funding from the university administration, but when we sent a proposal to the VCs, they said there was no hope and no potential in this technology, and that sentiment was another setback for us. Second, the VC funding people gave us discouraging remarks, although we saw Facebook doing essentially the same thing six months later: every user can protect their information, and access can be given to their friends or relatives, which is essentially the model we developed in the IPM system. Great, we are at time; let's give Dr. Kapoor a hand. Thank you, thank you very much for your time. Thank you for coming. Absolutely, I will be glad to. Thanks, Sabu, for coming.