Hello everyone, so maybe we can start. I see that most of today's audience is Chinese, and we have simultaneous interpretation today, so I will give this talk in Chinese. Others, for example attendees from abroad, can pick up an interpretation headset.

So let's get started. It's the final session today, and I think you've had a long day, so I hope I can give you some more information in my session. I'm from Intel NPG, the Network Platform Group. You might know that TLS is a very common protocol. So first I want to ask: how many of you know anything about TLS? I can briefly introduce TLS.

Here is the logic of my presentation. First I will introduce TLS; some of you may know it, but some of you may not be familiar with it. Then I will move on to the next topic: TLS is a protocol, so how could we improve the performance of TLS? We use Intel QuickAssist Technology to accelerate TLS. When we did the acceleration, we found that we need not only hardware but also software, so we designed an asynchronous framework and enabled the OpenSSL asynchronous acceleration framework. We received a lot of feedback from users, many of whom have deployed this framework, and we found that we could enhance it. So we enhanced it, and then, as the final step, we applied it to the VPP framework to improve performance. We also found that users might have some requests around encryption, decryption and so on; we will talk about this too. So we found a problem, solved it, found new problems, worked out how to solve them, and then put the result into a new application. That's the logic.

So let's look at the application scenarios. The network today is different. I got this graph from Google; it shows TLS use cases and the current status. You can see that TLS accounted for 40%, and this year it accounts for 80% to 90%. Apple, for example, requires all apps to support TLS; otherwise they will not let your app appear on the App Store. Or when we do a service mesh, with Envoy and this kind of service mesh, you also need TLS, for certificates. So there are quite a lot of application scenarios for TLS. You convert internal
traffic to external traffic, or internal storage traffic to external network traffic. So the TLS protocol is widely applied.

Let me briefly introduce what TLS is. If we use the OSI model to describe it, it is like a presentation layer on top of TCP. TLS does not solve the actual application problem; it makes your sessions more secure. You have the handshake, and then you have the TLS session.

First the client says hello: I need to have a conversation with you, the server. I carry the TLS version and the encryption protocols I support, plus a random number, and I send this list to the server. The server gives me feedback so that we can talk, and the server sends its public key to the client. For example, when you visit Google or Alibaba, you get its certificate. You need to do authentication: you need to know whether that is the right party, because you need to talk to the right party, and TLS can verify this public key. I can get Alibaba's certificate as well, but the key point is whether you have the private key.

Then the client sends another random number, encrypted with the public key the server has given. It sounds complicated, but the point is that the random number is encrypted with the server's public key, and because the server has the private key, it can decrypt it. If you use one form, you only have limited processing capability. After both sides have the random numbers, you run the PRF, which creates the key that enables the two parties to talk. That's the process of TLS.

The TLS protocol is very heavy. It needs to transfer a lot of information, and more importantly it needs asymmetric-key calculation: after one side has encrypted, how does the other side decrypt? That involves heavy procedures and steps. If we use software to do the TLS handshake on a current Intel CPU, we can do about 1K handshakes per second, and that is already a very good result. If we use this hardware, the CPS (connections per second) can be 60K, so it's 60 times. There is one way to solve this
problem: we can use hardware to accelerate. For simple operations such as encryption and decryption, we can let the hardware do the work. The CPU is where the true value lies; if you use the CPU to do simple encryption, it's not worth it. So we designed hardware to do encryption and decryption, which saves the CPU for more important tasks and more complicated calculations. And because it is specially designed hardware, it can reduce your total cost as well.

I will now introduce this hardware. We call it Intel QuickAssist Technology (QAT). (The monitor is tired as well.) Intel QuickAssist Technology can take over heavy workloads such as encryption, decryption and compression. Here we focus on encryption and decryption. You have bulk crypto: after the TLS handshake, you use symmetric bulk crypto. Then you have public key encryption, during the handshake, when you need to do the asymmetric operations. It can also do compression; we work with Cloudera to compress data. Those are parallel functionalities. On the right, we can also do secure key management, which helps you protect your keys.

QAT can help you accelerate encryption and decryption. It takes the encryption and decryption workload off the CPU so as to improve the performance of the system. But once you have the hardware, whether you can fully release its performance depends on the software. If you look at the common software architecture, you need to send a request to the hardware. At the application layer, you send the request to OpenSSL, and OpenSSL sends the request to the QAT engine, which fronts the hardware. That completes the hand-off from software to hardware without changing the software. But the software then waits for the hardware to complete the work, and while the software is waiting, the CPU may sleep. That's not efficient, because the CPU wastes time. So we need to fully utilize the hardware: while the hardware is working, the software should do something else at the same time. So we use this
asynchronous model, in which the hardware and your software can do their tasks in parallel. Think of the hardware as a computing unit: we want to make sure the computing unit keeps working and the CPU does not stop either, so that both work efficiently at the same time and you effectively have more hardware resources. That's the asynchronous mode.

When the application has a request, it first calls the functions, and these calls are sent to the engine for the hardware. The QAT engine here is software, so the request goes first to the software and then to the hardware. There is then a period in which the hardware is working; if the software is still waiting, resources are wasted, so the software should be freed to do other tasks. That is why there are non-blocking calls here, to fully utilize all the resources in both software and hardware.

So what should we do to achieve this asynchronous mode, and what are its advantages? The advantage is that it lets the software and the hardware work simultaneously and separately. As for performance, if we use 14 cores to run TLS in software, it's about 10K. If we use the hardware in synchronous mode, the performance is not good, because the CPU spends longer waiting; it's about 6K. But if we use asynchronous mode, we can reach 90K. So you can see that performance improves by a large margin.

How did we do it? We worked with the OpenSSL community to build an asynchronous framework. It uses a very basic building block, the fiber. A fiber is small and can be managed by the user. This is the main design. The design is implemented through the async start job function (ASYNC_start_job): a sub-task is generated, and a descriptor is created for communication. The sub-task does its work, and then the request is pushed to the device, the hardware. Once the request has gone to QAT, the async pause job function (ASYNC_pause_job) pauses the current task, and
then the workflow leaves the current job, goes back to ASYNC_start_job, and gets the FD. The main program can monitor the FD to see whether any messages are coming. If there is a new message, it calls ASYNC_start_job again. ASYNC_start_job is very smart: it can identify whether there is a previous, unfinished task. If there is, it automatically switches to the point where ASYNC_pause_job left off and continues with the unfinished task. This is a very natural and fully automated process. It's as if we start a program, the program runs in a certain sequence, then there is a pause job that hands control back to the main program, and with this sequence all resources are leveraged at the same time.

We have already implemented this. Based on this logic, we implemented it in OpenSSL as an asynchronous function. After that we put it into Nginx, and we set up our own GitHub tree, customized for our asynchronous needs. Nginx sends requests, each request is pushed to the OpenSSL stack, and the work starts from there. With this GitHub tree we get seven times better performance than before.

The previous picture was before our adaptation; this one is after our adaptation, where we have added the notification mechanism. When the program starts, we need to initialize the async module to connect and communicate with the hardware. We made some adaptations, and the new program uses some FDs, which also bring some overhead in the communication with the application, which here is Nginx. As I mentioned, the FD is a data structure in the Linux kernel, so with it some information is sent to the kernel, and a kernel-based FD creates overhead. When contexts are switched, the kernel reads and writes and the state transitions can be a problem, because the performance is not good enough. So we have a new mechanism. When the hardware finishes a task, it has
to send notifications to the upper layers. Our previous design was the notification through the FD. So why do we need this notification? Why notify at all, instead of directly telling you what to do? If I have the go-ahead, I can just do it directly. So we added an enhanced callback mechanism. The callback is integrated into the QAT engine: it no longer sends a notification, but instead a callback is used. There is no switch between user space and kernel space, so there is no writing to or reading from the kernel. When the QAT engine needs to report back, there is a callback; in the new design we think the kernel does not have to be involved in the callback.

We also designed a new status. If too many requests are sent to the hardware, the limited processing power of the hardware creates latency, so we provide a function that lets the application get the status and be notified. With these enhancements we get a 20% performance improvement. We work with other teams and with OpenSSL, and now there is the 3.0 version, so this will be integrated into the OpenSSL 3.0 master branch. We also have the QAT engine on GitHub, and you can get the code from this link.

We made another improvement, which involves VPP. Do you know VPP? It is a very good project for vector packet processing. There is a host stack implemented in VPP, and this host stack can handle TCP/IP and TLS, so there is also a TLS layer in VPP. We can integrate this function into VPP: with the new functions I just mentioned, there is a new TLS layer with async support. So when VPP is doing TLS, VPP can send the request to the hardware. VPP is also the first application to use the callback function. We did make some modifications: the application side needs to be asynchronous as well. Once these functions are realized in VPP, we don't need to change the code itself; we can pull the code into VPP and just use the normal tools. For the client it is still a TLS link, but the link is done in Linux. It's a
normal message. So all of this can be accelerated by asynchronous mode. For example, if you want to accelerate Apache, you just need to change to the VPP stack.

Last, we'll talk about Key Protection Technology (KPT). We use QAT in actual production scenarios. Some customers put their servers at certain nodes, and those servers need to do the handshake. But the key asset of TLS is the certificate, or you could say the public key and the private key. In this case your private key might be compromised or seen by others, and KPT can protect your private key so that it is never present in the clear in your system. When you simply give the key to your hardware, it is actually not safe, because your private key may be exposed in memory. With KPT, the private key still appears, but it is encrypted. When QAT sees this private key, it comes with some description telling it that the private key is actually encrypted. You hand this private key to KPT, and QAT takes the private key and does the decryption. So during the whole process the private key is not exposed in memory. QAT itself does the decryption to protect the key's privacy, and this does not impact performance. That is basically the situation.

To summarize: we presented the TLS model and found that TLS needs to be accelerated. We provided a hardware solution, and found that hardware alone cannot solve the problem; we need coordination with software. So we provided the OpenSSL asynchronous mode. We found that it was maybe not the ideal situation, so we enhanced the code and put it into OpenSSL 3.0. And then we provided KPT so that security and privacy can be ensured. Those are the key takeaways. Thank you.
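
As a rough illustration of the asynchronous model described in the talk, the sketch below uses Python generators as stand-ins for fibers: a job submits a request, pauses, and is resumed later from where it stopped. It is a toy simulation only. `FakeAccelerator`, `handshake`, `run_sync` and `run_async` are hypothetical names invented for this example, not part of OpenSSL or the QAT engine, and a "tick" is an abstract time unit rather than a measured number.

```python
from collections import deque


class FakeAccelerator:
    """Hypothetical stand-in for crypto hardware; each request takes a fixed number of ticks."""

    def __init__(self, latency_ticks=3):
        self.latency_ticks = latency_ticks  # assumed to be >= 1
        self.inflight = {}  # request id -> remaining ticks
        self.done = set()   # ids of completed requests
        self.next_id = 0

    def submit(self):
        """Queue one request and return its id without waiting for completion."""
        rid = self.next_id
        self.next_id += 1
        self.inflight[rid] = self.latency_ticks
        return rid

    def tick(self):
        """Advance time by one tick; all in-flight requests progress in parallel."""
        for rid in list(self.inflight):
            self.inflight[rid] -= 1
            if self.inflight[rid] == 0:
                del self.inflight[rid]
                self.done.add(rid)


def handshake(conn_id, hw):
    """One handshake as a fiber-like job: submit the expensive operation, then pause."""
    rid = hw.submit()
    while rid not in hw.done:
        yield  # pause this job; the caller decides what to run next
    return f"conn {conn_id} established"


def run_sync(n_conns, hw):
    """Synchronous mode: the caller waits for each request before starting the next."""
    ticks, results = 0, []
    for conn_id in range(n_conns):
        job = handshake(conn_id, hw)
        while True:
            try:
                next(job)
            except StopIteration as stop:
                results.append(stop.value)
                break
            hw.tick()  # nothing else runs while we wait
            ticks += 1
    return ticks, results


def run_async(n_conns, hw):
    """Asynchronous mode: while one job is paused, other jobs are started or resumed."""
    jobs = deque(handshake(i, hw) for i in range(n_conns))
    ticks, results = 0, []
    while jobs:
        for _ in range(len(jobs)):
            job = jobs.popleft()
            try:
                next(job)         # start the job, or resume it after its pause
                jobs.append(job)  # still waiting on the hardware
            except StopIteration as stop:
                results.append(stop.value)
        if jobs:
            hw.tick()
            ticks += 1
    return ticks, results
```

With four connections and a latency of three ticks, `run_sync(4, FakeAccelerator(3))` takes 12 ticks, while `run_async(4, FakeAccelerator(3))` takes 3, because all four requests are in flight on the simulated hardware at once. The real framework resumes jobs on FD events or callbacks rather than by polling in a loop; polling is used here only to keep the sketch short.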