Hi, I'm Andrea.

Hi, I'm Lot. Today, conversion measurement relies on third-party cookies, which can also be used to identify a user across sites. The Attribution Reporting API enables these measurements in a privacy-preserving way, without third-party cookies. The API does so by moving the attribution logic into the browser. With a single API surface, you can produce two types of reports: event-level reports and summary reports. Event-level reports associate a particular ad click or view on the ad side with data on the conversion side, but the conversion-side data is very limited. Summary reports offer more detailed conversion data and more flexibility for joining click or view data with conversion data, and they are best suited for reporting use cases. In this video, we will focus only on summary reports.

OK, so you said the API moves the attribution logic into the browser. How does this work exactly?

OK, let's assume that I am an ad tech company. On the impression side, I can store some metadata in the browser called the source key piece. I use this to encode the dimensions I want to measure, my measurement goals, and so on. Similarly, on the conversion side, I can store some metadata as well, called the trigger key piece, which I use to encode the dimensions I want to measure. Additionally, I also store an aggregatable value, which represents a conversion count or a purchase value: essentially, my measurement goal. When a user converts, the browser combines the source key piece and the trigger key piece to produce an overall aggregation key, associated with the aggregatable value for that key. Later on, the browser sends the report to my server. This is one aggregatable report.

OK, and the attribution flow you just walked us through happened in one single browser, correct?

Yes, that's correct. As the name "summary reports" suggests, a summary report is made up of batches of aggregatable reports. Aggregatable reports are encrypted, then sent to the ad tech servers with a timing delay, and the output data only contains aggregate results with noise, but we'll get back to that a bit later. And because of these multiple built-in privacy protections, the API can be very powerful and flexible.

To understand all these details, let's take a look at the structure of a single aggregatable report. An aggregatable report consists of keys and values. This is very generic, and that's a good thing. The context on both the impression side and the conversion side is encoded; as a result, the browser has no idea what a key represents. Only the ad tech understands what it means. An aggregation key is an identifier that represents the dimensions I want to measure, for example metadata or categories. The key is defined by the ad tech company, held on-device in the browser, and is the unique identifier for which values are aggregated. Let's say that for my use case, I want to track the following dimensions: geography with four different options, campaign with two, and product with four. This is what our final key will look like. Now, with the dimensions defined and encoded, the last piece is setting what is to be measured. This could include the purchase value, the conversion count, or other pre-purchase conversions like add-to-cart. Now, looking at the bigger picture, such aggregatable reports are collected from multiple users, or multiple browsers, encrypted, and sent to the ad tech servers.
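For readers who want to see this key construction concretely, here is a minimal Python sketch. The dimension names and bit layout are hypothetical choices an ad tech might make, not part of the API itself; what the API does specify is that the browser combines the source key piece and the trigger key piece with a bitwise OR, and that the pieces are registered via the Attribution-Reporting-Register-Source and Attribution-Reporting-Register-Trigger response headers.

```python
# Minimal sketch of aggregation-key construction. The bit layout
# (2 bits for geography, 1 bit for campaign, 2 bits for product) is a
# hypothetical choice by the ad tech; the API treats key pieces as
# opaque 128-bit integers.
GEOGRAPHIES = {"north_america": 0, "europe": 1, "asia": 2, "other": 3}
CAMPAIGNS = {"campaign_1": 0, "campaign_2": 1}
PRODUCTS = {"tshirts": 0, "shoes": 1, "hats": 2, "bags": 3}

def source_key_piece(campaign: str) -> int:
    # Set at impression time: the ad tech knows which campaign served the ad.
    return CAMPAIGNS[campaign] << 4

def trigger_key_piece(geo: str, product: str) -> int:
    # Set at conversion time: the ad tech knows where and what was bought.
    return (GEOGRAPHIES[geo] << 2) | PRODUCTS[product]

# The browser combines both pieces with a bitwise OR into one key.
key = source_key_piece("campaign_1") | trigger_key_piece("europe", "tshirts")
print(f"{key:#06x}")  # 0x0004 for this combination
```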
As an ad tech, I will store these reports, batch them, and decide when to send them to the trusted execution environment, or TEE, where the aggregation service will decrypt the individual reports, aggregate the data, add noise, and finally send back the aggregated results. Aggregatable reports are encrypted in the browser and can only be decrypted by the aggregation service in the TEE. I have no visibility into the actual content of the aggregatable reports, which is good because this is a privacy protection mechanism; I only have visibility into the summary reports.

Okay, you just mentioned that noise is added to the summary reports in the aggregation service. Can you explain a little why noise is needed?

Sure. Adding noise helps protect user privacy. The noise is added in a way that makes it difficult to identify any individual record or user within the aggregated data. During this video, we will also be talking about buckets, so just keep in mind that "bucket" is another name we will use for an aggregation key. Small buckets are more likely to have higher noise ratios, and this is by design. We want the relative impact of noise to be higher on small buckets to protect user privacy. If the blue bar is really small, like on the right of this diagram, for example one conversion by one user, you have an individual who is alone in that bucket. So to protect their privacy, we want relatively more noise in this bucket, and the orange bar, which is the noise, will have a larger relative impact on this small bucket.

Now, when we say noise, we are referring to a random value. More specifically, this random value is drawn from a Laplace distribution. To define this distribution, a few parameters are needed: first, a location parameter mu, which has the value zero; second, a scale parameter b, which is directly proportional to the contribution budget, set to 65,536 (2^16) in the API, and inversely proportional to the epsilon value in the API. The noisy summary value is the result of adding this noise to the true aggregate output.

Nice. So you explained how noise is generated. What can you do to make your data less noisy? Do you have any control over that?

The value of the noise that is added is independent of the individual report values, the aggregate report values, and even the number of reports. What this means is that as an ad tech company, I can't control the noise value, the orange boxes here, but I can control the blue boxes. By stacking more blue boxes, or by using larger blue boxes, I can make the orange boxes proportionally smaller, so I can get better signal-to-noise ratios.

So, in order to start experimenting with summary reports, I need to: first, decide what I want to measure; second, decide what dimensions I want to track; third, define my aggregation keys; and finally, collect aggregatable reports and send them to the aggregation service at the frequency of my choice to receive final insights from summary reports. So I pick the measurement goals, the dimensions and aggregation keys, and the batching frequency. All of these parameters have an impact on noise.

Now, let's take a step back. As an ad tech, is there any way I could see what the potential noise impact looks like? How can I define, or maybe even tune, my configuration to ensure I get high accuracy, which means lower noise ratios? Well, this is where Noise Lab comes in.
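To make these noise mechanics concrete, here is a small Python sketch under the parameters just described: mu = 0, and scale b directly proportional to the contribution budget and inversely proportional to epsilon. The epsilon value and the bucket totals are illustrative assumptions.

```python
import numpy as np

CONTRIBUTION_BUDGET = 2**16  # 65,536, the API's contribution budget

def sample_noise(epsilon: float) -> float:
    """One noise draw from Laplace(mu=0, b=budget/epsilon)."""
    b = CONTRIBUTION_BUDGET / epsilon
    return np.random.laplace(loc=0.0, scale=b)

epsilon = 10  # assumed for illustration; tunable in Noise Lab
noise = sample_noise(epsilon)

# The noise draw is independent of the bucket's true value, so its
# *relative* impact shrinks as buckets grow ("bigger blue bars").
# The mean absolute value of Laplace(0, b) is exactly b:
b = CONTRIBUTION_BUDGET / epsilon
for true_value in (500, 5_000, 50_000):
    print(f"bucket {true_value:>6}: expected noise ratio ~ {b / true_value:.0%}")
```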
With Noise Lab, you can, one, simulate the noise ratios you would get for your use case, and two, develop strategies to reduce those noise ratios, all of this without writing code. Noise Lab is a web application that's publicly available at this URL. So let's open it, and let me first give you a quick tour of the tool.

Today we're going to be using the advanced mode, but there's also a simple mode. The simple mode is less powerful, but it's useful for getting familiar with the tool. Now, back to the advanced mode. Here on the left is the panel where you enter your simulation parameters. Also note all the little helpers, the little question marks here, that you can click if you're not sure what a parameter is about. Once you've entered your parameters, you click the Simulate button. And here on the right is the main area where you can see the results of your simulation. For each simulation, you first see here a summary of your input parameters, and you also see the actual simulated summary values. This here is a simulated summary report: each row in this table displays an aggregation key and the summary value for that specific key. In Noise Lab, we hide this table by default just because it's a lot of data. What we do display for you, however, is the average noise for a given summary report. It's important to note that nothing is persisted in Noise Lab, so if you would like to keep your simulation data, you'll want to download your simulation results by clicking the Download All button here.

Okay, it's great to see the average noise, but how can you know the noise? In a real system using the aggregation service, I don't know how noisy my data is.

Correct, and that's intentional, to protect user privacy. But in Noise Lab, we can calculate and display these noise ratios for you, because it's all a simulation. Now, noise is drawn from the exact same distribution every time, but it's random. So for a given set of input data, the exact noise values you get will differ from run to run. But a simulation in Noise Lab should still give you an idea of the noise ratios you should expect, because it simulates, in a mathematically correct way, the noise that would be added in the aggregation service.

What does the noise average mean? How is it calculated?

Okay, first, what matters is not the absolute noise values you get, but rather how much noise you get relative to the true measurement data: how big the orange bars are relative to the blue bars. We call that ratio the noise-to-signal ratio, or noise ratio for short. Now let's zoom in and look at one entry in this report. One entry is one bucket and its value. Let's assume that this bucket, this key, represents the total purchase value for campaign 1, geography Europe, and product category t-shirts. And let's say that for this campaign in Europe, the total purchase value for t-shirts was 50 euros. Now, this is the summary value, so some of it is actually noise; the true purchase value is not 50 euros. Let's say that in this case, the random noise that was added in the aggregation service has a value of 10, and the signal is 40. So for this bucket, the noise-to-signal ratio is 10 over 40, which is 25%. This ratio is really how big the orange bar is relative to the blue bar. And that's for one entry in my summary report; what Noise Lab does is calculate the noise-to-signal ratio for each bucket and then average these ratios over the whole summary report.
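Here is the same calculation as a Python sketch. The first bucket reproduces the 10-over-40 example above; the other bucket values are made up for illustration.

```python
def noise_ratio(noisy_summary_value: float, noise: float) -> float:
    """Noise-to-signal ratio for one bucket: |noise| / true signal."""
    signal = noisy_summary_value - noise
    return abs(noise) / signal

# Hypothetical buckets: (noisy summary value, noise that was added).
# The first bucket is the example above: summary 50, noise 10 -> 25%.
buckets = [(50, 10), (120, -6), (300, 15)]

ratios = [noise_ratio(value, noise) for value, noise in buckets]
average_noise_ratio = sum(ratios) / len(ratios)  # the average Noise Lab displays
print(f"{average_noise_ratio:.1%}")
```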
What I've just described here is the simplest way to measure noise. That noise metric is called average percentage error, or APE for short. APE is a nice way to measure noise because it's quite easy to reason about: the higher the APE value, the more noise you have relative to the true measurement data. However, APE has a few limitations, and this is why we have a more advanced noise metric here, RMSRE_T. You can learn more at this URL, but for today, let's focus on APE.

Okay, and with this, we are now ready to start simulating in Noise Lab. Let's enter our parameters. So, what would you want to measure?

For my use case, I only care about one conversion type: purchases. I have two measurement goals: one, how many purchases took place, and two, what's the total purchase value? A purchase is on average 100 euros and at most 1,000 euros. I'm also okay with measuring only one purchase per impression or click.

What about dimensions? What would you like to track?

I want to track three dimensions. I'm measuring across six geographic regions, four different campaigns, and two product categories. I also know that the total conversion count per day, over all buckets for all these dimensions, is roughly around 500.

Okay, so let me add this to the parameters. What about the batching frequency and that epsilon parameter there?

We'll come back to batching frequency, and as for epsilon, it will be fixed in the API, but you can still adjust it in Noise Lab to assess its impact on noise. All right, with this, let's simulate and look at the results of the simulation.

Now, these noise ratios are okay, but let's say that for your use case, you want to reduce them further. Remember, if you want to reduce noise ratios, what you need is higher values in your buckets, so bigger blue bars. So far, we were measuring conversions across six different geographies, but maybe for your use case it's an acceptable trade-off to split your geography into two or three regions instead of six. We'd be using larger geographic regions, and so fewer, less precise buckets, but we'll have more conversions in each bucket. So let's try that and simulate. Nice, this has decreased the noise ratios significantly.

Okay, that's better, but what if I want even lower noise ratios? Is there anything else we can do?

Definitely, because so far we've assumed that you, the ad tech company, wanted daily insights, daily summary reports. A simple way to increase bucket size, in order to lower noise ratios, is to wait longer. In the context of summary reports, waiting longer translates into reducing the frequency at which you request summary reports from the aggregation service, and this is called the batching frequency. So let's try that. Let's set our batching frequency to weekly instead of daily, meaning we only get weekly measurement data, and simulate. And here, all noise ratios went down to below 1%. So by reducing the batching frequency, you're making a trade-off: less frequent insights, but better signal-to-noise ratios.

I see here a parameter that we haven't used yet. Scaling, what's that about?

Ah, yes, scaling is an important one. Scaling is a technique you, as an API user, can use to reduce noise ratios. You explained that the noise distribution depends on a few parameters, and one of them is the contribution budget, right? Well, scaling is a technique you can use to ensure that you make the best use of the contribution budget that's available to you. All other things held equal, the better you use the contribution budget, the better your signal-to-noise ratios will be.
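To illustrate the idea behind scaling, here is a Python sketch. It assumes a hypothetical epsilon and a single measurement goal that uses the whole contribution budget (with two measurement goals, as in the scenario above, you would first split the budget between them). Values are scaled up before reporting so the maximum purchase value maps to the full budget, and the summary values are scaled back down afterwards; because the noise scale stays fixed, its relative impact shrinks.

```python
import numpy as np

CONTRIBUTION_BUDGET = 2**16
EPSILON = 10               # assumed for illustration
MAX_PURCHASE_VALUE = 1_000  # from the scenario above: at most 1,000 euros

# Scale factor chosen so the largest possible per-event contribution
# uses the full contribution budget.
SCALE = CONTRIBUTION_BUDGET / MAX_PURCHASE_VALUE  # 65.536

def noisy_total(true_total_euros: float, scale: float) -> float:
    """Simulate one aggregated bucket, with a given scaling factor."""
    b = CONTRIBUTION_BUDGET / EPSILON  # noise scale is fixed by the API
    noisy = true_total_euros * scale + np.random.laplace(0.0, b)
    return noisy / scale               # scale back down after aggregation

true_total = 50_000  # hypothetical true purchase total for one bucket
print(noisy_total(true_total, scale=1.0))    # unscaled: larger relative noise
print(noisy_total(true_total, scale=SCALE))  # scaled: relative noise ~65x smaller
```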
Now, how you set up scaling depends on what you're measuring, and in a real system with a real API integration, you would need to write a bit of code to set it up. But in Noise Lab, you can simulate what would happen if you set up scaling: you only need to check a checkbox, and Noise Lab will perform basic scaling for you. Scaling is recommended, and it's turned on by default in Noise Lab. You can try turning it off and see that the noise ratios get much higher.

And with this, we're done with our demo. But we've really just scratched the surface here; there are many more parameters for you to explore and tweak in Noise Lab. For example, first, we just talked about scaling, but by default Noise Lab performs quite basic scaling; in a real system, you may want to use a more advanced scaling strategy. Second, you can play with various key strategies. Third, we assumed that no buckets were empty, but you may want to simulate what would happen if you had buckets with zero conversions, because this can have a significant impact on noise ratios. You can learn more about all these different parameters here.

So, let's summarize. With Noise Lab, I can quickly test parameters and strategies to reduce noise ratios. This is exactly what I was looking for.

Great. Noise Lab is well suited for this specific use case, but there are also a few things to keep in mind. First, Noise Lab is not an end-to-end testing tool. It does not integrate with ad tech systems; it only simulates the API functionality. Also, what you input into the tool to run your simulation is not real data. You can configure it to look roughly like your real measurement data, and you actually have some pretty flexible controls, but this doesn't capture the full complexity of what real data may look like. That's still okay, because it gives you an idea of the noise ratios you can expect, but it's good to keep in mind. Finally, once you've used Noise Lab to onboard onto the API, you will likely reach a stage where you want to quickly test a lot of different parameters, or even experiment with your historical measurement data. In that case, try the simulation library. It's a powerful tool, but keep in mind that it may require some engineering know-how.

Nice. What if I have questions or feature requests for Noise Lab?

Well, Noise Lab is experimental, so your input is very much welcome. Please file your ideas and questions on our developer support repository for Privacy Sandbox at this URL. One last thing: Noise Lab is open source, so you can check out the source code there.

So, thanks for tuning in. Give Noise Lab a try and tell us what you think. Ciao.