Hi everyone. I'm Oscar. Welcome to Protocol Research at Status. I'm Dean. I also work on Protocol Research at Status. Today we're going to talk to you about vacp2p. So Vac is a family of protocols for developing secure, private, and censorship-resistant p2p solutions. What that means in practice for us at Status right now is essentially secure messaging. When people talk about secure messaging, they usually focus on end-to-end encryption, forward secrecy, avoiding man-in-the-middle attacks, and so on. But there are other security properties, such as privacy and protecting metadata, as well as censorship resistance. Together, these properties help you protect against various types of threat models that we see around the world today, including those faced by certain high-risk individuals. One way of looking at the problem of secure messaging is in terms of a messaging stack, where each layer is roughly orthogonal to the others. Starting at the very top, you have end-user semantics: a private encrypted group chat, say, or a public forum with some moderation mechanism. Below that you have the data sync layer, which ensures that all the nodes have the same idea of what the state of some context or conversation is. Below that you have your secure transport, which gives you properties like confidentiality, forward secrecy, and so on. Right now we use the Double Ratchet at Status, but there are others, like Messaging Layer Security. Below that you have the metadata protection layer, where right now we use Whisper at Status, but there are also others, like PSS, Tor, and mix nets, which are somewhat different but share some common design goals and hold up differently against various threat models.
And then below that you have the overlay network: overlay routing, or network virtualization, abstracting away the actual transports like TCP, UDP, and so on. This layer helps a lot when it comes to censorship resistance, because this is where you guard against various types of traffic filtering. A somewhat orthogonal piece is trust establishment, which is how you establish end-to-end trust. There are various models here, like trust on first use, and web of trust as in PGP and Keybase. Talking briefly about some of the design goals we have at Vac: we're trying to create interfaces at each layer that allow for both existing and new protocols at every layer of the stack. We don't want to reinvent the wheel here, because there are a lot of protocols and components being developed by people at this conference, as well as in adjacent ecosystems and communities, that solve parts of these problems. So it's more about defining these interfaces and then allowing these various protocols to work together. We very much take the design philosophy of projects like libp2p and Substrate in enabling that, with protocol selection depending on your specific design goals. Let's say you're extremely privacy conscious: then you might want to use a mix net, such as the one developed by Nym. But if you have different constraints, maybe you care a lot about bandwidth, or latency, or censorship resistance, then you might make a different choice at that specific layer. Another design goal is better specifications. At Status, we started off with an ad hoc protocol defined in the app.
And now we're moving toward the other extreme, where we look at the protocol as an entity in its own right and try to make it maximally useful for other parties as well. This also includes making more rigorous analyses of the exact guarantees and behaviors of these protocols. Another design goal is providing the tools to do simulation and testing, both at scale and in adversarial environments. This is with things like Whiteblock and the Jepsen framework, to make sure that a design actually makes sense, will scale, and will have the properties we want at scale, so it doesn't fall over, which is what we're seeing with some of the stuff we have in production with Whisper, for example. Another big one is being mostly offline. When it comes to messaging, users are often on mobile phones, and usually that's an afterthought, which leads to people accessing the network through gateways and so on. We want to think about this from the get-go, because what makes mobile phones a bit special is that, for example, you can't count on background execution, which is very limited on Android, and it drains your battery. So you have to think about these things: how you natively connect to a network and make sure things work properly with resource-restricted devices. And finally, a lot of people use things like WhatsApp and Telegram; we do internally at Status as well. People are not going to use this technology unless you actually enable an excellent user experience, so you have to make sure you support the user stories of the end users for people to actually use this kind of stuff. With that, Dean will talk a bit about our data sync layer. Yeah, so what we started off with at Status is a very naive Whisper transport.
And Whisper comes with various issues in the way messages are sent: they expire at a certain point, so we need to save these messages on a mailserver so that, with mostly-offline devices, a message actually reaches its destination user. To solve for that, we built a data sync protocol on top of Whisper to add more reliability guarantees to the lower Whisper transport layer. This is a general problem in peer-to-peer systems with unreliable transports: you need to add some form of synchronization protocol on top to get the availability guarantees that are usually only achievable through centralized servers. So how does this look? It's a very TCP-inspired protocol, in that we split up the way we send a message into multiple steps. Initially, we offer a message to a specific user. So Alice offers a payload to Bob. Once Bob has received that offer, he requests the message from Alice. Alice then sends the message, and Bob returns an ack saying that he received it. At each of these steps, things are retransmitted. So if Bob is offline when Alice sends an offer, Alice will, with exponential backoff, offer that message again and again until she receives a request from Bob. And that continues at every stage. So we know this entire process has happened, and that a message was transported successfully in the end, because we've retransmitted at each step until seeing the next one. That's how the interactive mode works. We also have the batch data sync mode. Here we don't have any offers and we don't have any requests; we simply send the message payload repeatedly until we have received an ack. This is the bandwidth-heavy version of the protocol.
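As a rough sketch of the interactive flow just described: each record (offer, request, message, ack) is retransmitted with exponential backoff until the peer's reply for the next step is observed. The names and structure here are illustrative only, not the actual Status/MVDS implementation.

```python
import hashlib

OFFER, REQUEST, MESSAGE, ACK = "offer", "request", "message", "ack"

class Node:
    def __init__(self, name):
        self.name = name
        self.outbox = {}    # message id -> [record type, body, next send epoch, backoff]
        self.received = {}  # message id -> payload
        self.offered = {}   # message id -> payload, held until the peer requests it

    def send_message(self, payload, epoch=0):
        mid = hashlib.sha256(payload).hexdigest()
        self.offered[mid] = payload
        self.outbox[mid] = [OFFER, None, epoch, 1]
        return mid

    def tick(self, epoch, wire):
        # Emit every due record; acks are one-shot (duplicate messages get re-acked).
        done = []
        for mid, rec in self.outbox.items():
            rtype, body, due, backoff = rec
            if epoch >= due:
                wire.append((self.name, rtype, mid, body))
                if rtype == ACK:
                    done.append(mid)
                else:
                    rec[2] = epoch + backoff
                    rec[3] = backoff * 2  # exponential backoff
        for mid in done:
            del self.outbox[mid]

    def on_record(self, rtype, mid, body):
        if rtype == OFFER and mid not in self.received and mid not in self.outbox:
            self.outbox[mid] = [REQUEST, None, 0, 1]      # ask for the payload
        elif rtype == REQUEST and mid in self.offered:
            self.outbox[mid] = [MESSAGE, self.offered[mid], 0, 1]
        elif rtype == MESSAGE:
            self.received[mid] = body
            self.outbox[mid] = [ACK, None, 0, 1]          # confirm receipt
        elif rtype == ACK:
            self.outbox.pop(mid, None)                    # stop retransmitting

def run(alice, bob, epochs, drop=lambda epoch: False):
    """Lock-step simulation; `drop` models epochs where the network is down."""
    for epoch in range(epochs):
        wire = []
        alice.tick(epoch, wire)
        bob.tick(epoch, wire)
        if drop(epoch):
            continue
        for sender, rtype, mid, body in wire:
            target = bob if sender == alice.name else alice
            target.on_record(rtype, mid, body)
```

Even if Bob is offline for the first epochs, Alice keeps re-offering with growing intervals, so the payload eventually makes it through and both sides drain their outboxes.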
The interactive mode instead has higher latency until you've received the actual message, because there are multiple steps in between. This still has certain problems, because it requires us to keep forwarding these messages repeatedly. One way we can solve that is to add a remote log on top of the protocol, which Oscar will talk about. Yeah, so the idea with the remote log is very straightforward. You have a local log, where you have your messages on your local machine, and then you replicate it on some decentralized file storage, which acts as a kind of highly available caching layer. So you have essentially two new components: a CAS and a name system. The CAS is content-addressable storage: you upload some content and it's addressed by its content hash, so it's an immutable store. The name system is something like DNS, or ENS, or Swarm feeds, or IPNS in the IPFS world, and it gives you location addressing, or mutable references. So when Alice uploads something to the CAS, she gets back an address for it, and then she updates the name system. When Bob comes online and knows about this conversation, this sync scope between Alice and Bob, he has a fixed point: he can look at the name system even with Alice being offline. So he checks the name system, sees what messages he's been missing, and then he can go to the CAS and get those without Alice being online. In terms of data format, it's very straightforward. It's essentially a mapping from the native message identifier, which is determined by the upper layers, to the address at the CAS, plus an address pointing to the next page.
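As a toy illustration of that flow and page format, with in-memory stand-ins for the two components (hypothetical interfaces, not the Swarm or IPFS APIs):

```python
import hashlib
import json

class CAS:
    """Toy content-addressable store: content is keyed by its hash (immutable)."""
    def __init__(self):
        self.store = {}
    def add(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self.store[address] = content
        return address
    def get(self, address: str) -> bytes:
        return self.store[address]

class NameSystem:
    """Toy mutable name system: a name points at the latest page address."""
    def __init__(self):
        self.heads = {}
    def update(self, name: str, address: str):
        self.heads[name] = address
    def resolve(self, name: str):
        return self.heads.get(name)

def publish(cas, ns, name, entries, prev_page=None):
    """Upload payloads to the CAS and publish a new head page under `name`.
    A page maps native message IDs to CAS addresses, plus a next-page link."""
    pairs = [[message_id, cas.add(payload)] for message_id, payload in entries]
    page_address = cas.add(json.dumps({"pairs": pairs, "next": prev_page}).encode())
    ns.update(name, page_address)
    return page_address

def fetch_missing(cas, ns, name, have):
    """Walk the pages from the head, fetching payloads the reader lacks,
    without the writer needing to be online."""
    missing, address = {}, ns.resolve(name)
    while address is not None:
        page = json.loads(cas.get(address))
        for message_id, payload_address in page["pairs"]:
            if message_id not in have:
                missing[message_id] = cas.get(payload_address)
        address = page["next"]
    return missing
```

Alice publishes pages as she goes; Bob later resolves the name, walks the linked pages, and pulls only the payloads he hasn't seen.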
So you essentially have these pages of logs because, if you have a lot of messages, you want to split them apart. As an alternative, you can also embed the actual wire payload, the encrypted content, in the name system itself. This is just a trade-off in terms of what the specific name system supports: it might only allow very small records, in which case you can't embed the content, and then the trade-off is that you have to traverse the linked list and go to the CAS multiple times, and so on. This is a very general design. It works with Swarm, it works with IPFS, and it also works if you want to use a USB drive for backups, or any other thing that implements these interfaces. In terms of privacy properties, this very much depends on the underlying CAS. If you have, for example, a mix net, and you do posts or gets through the mix net, then you actually get sender and receiver anonymity that way. You can also do other things, like ratcheting the name system: you change its location, similar to how ratcheting works in forward secrecy schemes. Dean will talk a bit about how we achieve certain consistency guarantees with the metadata format. Yeah, so a problem with all this messaging stuff is that messages are usually related, so we need some way to provide some form of consistency for them. We need to be able to provide ordering among messages: not only a linear ordering of my own messages, but some kind of DAG if I'm in a conversation where messages are linked to each other, to show which message depends on which. So here we have an example of this DAG, where what we want is linear lists of messages, saying I sent this message, then this is the next message, and I keep linking the previous message that I sent.
But then we also start linking messages that other people have sent me. So I start linking the last seen message that I've received into the next message I'm going to send. Additionally, we can start linking in things like remote log information, so that when we go to the log, we know exactly what position a message is placed at in the remote log. If I've missed, say, 10 messages before my last message, I know exactly which page to go to, and from which message number to which message number to sync. The way we do this is relatively easy right now. We just add a protocol buffer to the message as a header, where we have a repeated byte array of parents, which is just the parent IDs of messages. Those would be messages that have been in this group context before my message; they can be messages I've received or messages I've sent. Then there's the sequence number, which relates to the actual number we're seeing on the remote log, and then we have something like previous message, which is my last sent message. Additionally, we add a boolean to indicate whether a message is required to be acked or not. This is important for things like typing notifications, or a notification that a user came online. Those are things we don't really need to be consistent about, because if such a message gets dropped, I don't really care; I'm not going to retransmit it. Those are one-time notifications that I'm just going to send out, and if they're received, that's good. If not, I don't really care, because they're not important to the actual conversation occurring in a chat context. Cool, so in terms of problems and rough priorities as next steps: our initial focus has been on data sync. There are still some things we have to improve with it, specifically in terms of scalability and semantics for large sync contexts.
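The metadata header just described might be sketched like this, together with a causal-delivery check over the DAG. The field names are guesses mirroring the talk, not the exact protobuf schema used by Status.

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Header:
    parents: List[bytes]       # IDs of messages seen in this group context before mine
    sequence_number: int       # position of this message in the sender's remote log
    previous_message: bytes    # the sender's own last message ID (b"" if none)
    ack_required: bool = True  # False for ephemeral events like typing notifications

def deliverable(header: Header, received: Set[bytes]) -> bool:
    """A message can be shown in causal order once every message it links to
    (its parents and the sender's previous message) has been received."""
    deps = list(header.parents)
    if header.previous_message:
        deps.append(header.previous_message)
    return all(dep in received for dep in deps)
```

A receiver would buffer a message whose dependencies are missing, use the sequence number to find the right remote log page to fill the gap, and only then display it.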
So imagine a group with a thousand participants: you want to reduce the chatter and have efficient joining of logs, so you don't overwhelm the network. Another big priority is a better transport. Currently we use Whisper, and it has a lot of issues when it comes to scalability. It uses proof of work for spam resistance, which is not great for heterogeneous devices or nodes, because your phone can easily be overwhelmed by someone spinning up an AWS node. It's also not incentivized: there's no reason to actually run a Whisper node, and it doesn't map cleanly to accounting for the resources you're actually consuming. So we're looking at various alternatives, and that's at a very early research stage right now. Related to that is the concept of adaptive nodes. Right now, generally speaking, you have full nodes that participate fully in the network, and then you have light nodes, or even gateways, that are kind of like leeches and don't actually contribute to the network. We think this is a bit backwards, because if you look at BitTorrent, as one of the most successful peer-to-peer deployments, everyone contributes to the network and the health of the network. That's a very nice property. Not all nodes will be able to contribute equally, but you have the option to do so. Swarm is also working on this idea. And you can imagine that if you have a mobile phone with a limited data plan, you probably aren't going to contribute to the network, as you might pay for it by some other means.
But let's say you come home, you're on wifi, and you plug your phone in for charging: then you can start to use it for relaying messages. Or if you have an old laptop lying around, you can use that to store messages and help the network that way. So there's this idea of creating a continuum between light nodes and full nodes, where you have some capabilities and you try to contribute them to the network, to make the network stronger. That's something we're looking into, and we would love it if more people were also thinking in these terms, because it makes for a better network in general, and it's more in line with how we generally see successful societies and systems, where all nodes contribute to the extent that they can. So yeah, that's it for us. Are there any questions? Yeah, and you can check out these websites, where we have research logs; you can read in more detail about data sync, the remote log, and the general stack. We also have specifications. And yeah, feel free to reach us on Twitter. [Audience question] Yeah, so for MVDS there's already a spec, and it's also in the Status version 1 app. The remote log is at the spec and proof-of-concept stage; it's still not deployed in the app. The other protocols are still in various research stages. Oh, and the metadata format is at the spec and proof-of-concept stage, but not yet in the app. [Audience question] There is some work there; I'm not going to promise anything, because that's also research. There are some interesting ideas when it comes to using zero-knowledge proofs and so on, where you can limit it that way, but it's too early to tell if those things will actually work out, so we still need to spend more time on that, essentially. I would also point out that this is specific to multicast networks, where you have this amplification factor.
So for other types of networks that might not be as big of a problem, but yeah. Any other questions? All right, cool. Thanks, Ron.