Hello everyone. It is time for me to start, so thank you for attending our session, "Understanding Scalability and Performance in the Kubernetes Master". My colleague and I, Xingyu Chen and Fansong Zeng, will deliver the talk. We are both from Alibaba Cloud, and we will talk about how we use Kubernetes in our work and what we have learned in practice.

The talk has four parts. We will cover the background: why we need Kubernetes and the history of how we came to use it. We will also cover how Kubernetes is used at Alibaba. Then we will share our technical experience: the problems we ran into when using Kubernetes, and what we did about them.

First, the history. In 2013 we were using a system the recording renders as "AIS". In 2015 we moved on to another system [name unclear in the recording]. In 2017 we used Sigma. In 2018 we built a new resource management system with Kubernetes as its new core.

Within Alibaba we have more than 10,000 applications, including Tmall, Taobao, and others, running in a very large number of containers. The peak load is very high, for example during Double 11. A single cluster has to reach the scale of 10,000 nodes [the rest of this passage is unintelligible in the recording].

Many people may not be familiar with kubemark. We use kubemark to simulate hollow nodes, up to the 10,000-node scale. We assume about 200,000 pods and about 1,000K other objects, including services, ConfigMaps, and Secrets. In this simulation we found latency problems: latency on get pods and get nodes requests, and scheduler latency, of about 10 milliseconds [unit as transcribed]. For a cluster of this size, we believe such latency is a big problem.

There are many reasons for this latency. Open source Kubernetes is like an ordinary car. When it has to run at very large scale, say on the motorway, its components no longer meet the requirements of the new situation.

This diagram shows the Kubernetes architecture. We found that etcd becomes a bottleneck for our large-scale applications. When etcd runs at very large scale, its read and write latency increases sharply, and clients start to see "too many requests" errors. With a large number of nodes this can amount to a denial of service. The recommended etcd storage space is also very limited.
If you want to store a large amount of data or metadata, you reach the maximum storage size, and at that point etcd stops providing service.

Apart from etcd, the scheduler has to schedule every container, so its efficiency becomes very low and latency appears. The controller cannot keep up with node updates, and we see failures such as resource version errors. Last but not least, when you reach such a large scale and one of the components fails, its restart takes some time, which lowers efficiency and lowers your service level.

In response to these problems, we made optimizations to the components inside Alibaba. Let's look at etcd first.

For starters, we used what we call the etcd 1.0 design to deal with the large amount of stored data. Through an etcd proxy, the node values and pod values are saved in a KV cluster, which inside Alibaba we call a Tair cluster. Large values can be stored this way, so we could handle the storage problem. But there are other problems. If you run this Tair cluster, there is data migration and very high latency. And when all the data is stored within one etcd cluster, it is easy to run into data isolation problems: if one resource sees a spike in operations, it affects the other resources.

That is why we have the design on the right-hand graph, which we call the 2.0 structure. The API server handles different resource types, which means their key prefixes are different, so we divide them into different etcd clusters. By doing this, etcd as a whole gets a higher storage capacity, because the data is spread over different etcd clusters, and each cluster is a different Raft group. Data isolation is handled well this way: events, pods, and nodes can each be separated into their own cluster, and every resource type can adopt this approach.

With 2.0 finished, look at the left-hand graph: when one etcd cluster is not enough, we can use additional etcd clusters to handle the problem.
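To make the 2.0 layout concrete, here is a minimal sketch of routing keys to separate etcd clusters by key prefix. The endpoint URLs and the prefixes are assumptions chosen for illustration; the talk does not give the real values, and this is not presented as Alibaba's implementation.

```python
# Hypothetical endpoints and prefixes, for illustration only.
DEFAULT_CLUSTER = "https://etcd-main:2379"
CLUSTER_BY_PREFIX = {
    "/registry/events/": "https://etcd-events:2379",
    "/registry/pods/": "https://etcd-pods:2379",
    "/registry/minions/": "https://etcd-nodes:2379",
}


def cluster_for_key(key: str) -> str:
    """Return the etcd cluster that should store `key` (longest matching prefix wins)."""
    best = ""
    for prefix in CLUSTER_BY_PREFIX:
        if key.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    # Keys with no dedicated cluster fall back to the default cluster.
    return CLUSTER_BY_PREFIX.get(best, DEFAULT_CLUSTER)


if __name__ == "__main__":
    for k in ("/registry/pods/default/web-0", "/registry/services/specs/default/web"):
        print(k, "->", cluster_for_key(k))
```

Because each cluster is its own Raft group, a burst of event writes in this layout would only load the events cluster. Upstream kube-apiserver has a per-resource setting in the same spirit, the `--etcd-servers-overrides` flag; whether the speakers used that flag is not stated in the talk.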
But there is another approach: improve the capability of a single etcd cluster. If we can find the performance bottleneck, we can focus on it. That is our 3.0 version.

With testing we found the bottleneck. As the right-hand side shows, it is in the underlying storage, in page allocation. When a user needs to save new data in etcd, the storage layer internally maintains a list of free pages, the freelist. I don't know whether that is clear to you, so look at the structure on the slide. The red parts are in use, which means data is stored there; you can think of each one as a small block of data. The other parts are not in use.

When we need to allocate two contiguous pages, the algorithm scans from left to right, and pages 51 and 52 are returned to the user. For etcd this is an O(n) algorithm. When n is very big, that is, when the stored data is very large, the search time keeps increasing. Users see that when the storage is large, the storage latency climbs. That is the problem we had.

So we proposed the segregated hashmap algorithm. With this algorithm, the size of a run of contiguous free pages is the key of a hashmap, and runs of the same size are kept together. The sequential structure can be transformed into this map, which we call the free map. For example, contiguous size 1 has first page 42, size 2 has 51, and size 3 has 47. With this new data structure, when we need new pages the lookup is very fast. With the sequential scan the lookup is slow; with the hashmap it goes from O(n) to O(1). That is why the performance improved a lot.

The same applies to page recycling. Say pages 45 and 46 are released. We use the hashmap here too. When pages are released, we first check whether they can be merged: forward with the run before them, and backward with the run after them, so that neighboring runs combine into one big run. In the example, the merged run goes back into the map under size six. You can have a look at the slide; the structure is very clear.
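The segregated hashmap freelist described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from etcd's storage engine: the class and method names are invented here, the page numbers are illustrative, and the fallback that splits a larger run when no exact size exists is a choice made for this sketch (it is not O(1)).

```python
from collections import defaultdict


class SegregatedFreelist:
    """Free runs of contiguous pages, indexed by run size (illustrative names)."""

    def __init__(self, runs=()):
        self.by_size = defaultdict(set)  # run size -> first pages of free runs
        self.forward = {}                # first page -> run size
        self.backward = {}               # last page  -> run size
        for start, size in runs:
            self._add(start, size)

    def _add(self, start, size):
        self.by_size[size].add(start)
        self.forward[start] = size
        self.backward[start + size - 1] = size

    def _remove(self, start, size):
        self.by_size[size].discard(start)
        if not self.by_size[size]:
            del self.by_size[size]
        del self.forward[start]
        del self.backward[start + size - 1]

    def allocate(self, n):
        """Return the first page of n contiguous free pages, or None."""
        if n in self.by_size:  # exact size: hash lookup instead of a linear scan
            start = next(iter(self.by_size[n]))
            self._remove(start, n)
            return start
        for size in sorted(self.by_size):  # sketch-only fallback: split a larger run
            if size > n:
                start = next(iter(self.by_size[size]))
                self._remove(start, size)
                self._add(start + n, size - n)
                return start
        return None

    def free(self, start, size):
        """Release a run and merge it with free neighbors on both sides."""
        prev_size = self.backward.get(start - 1)
        if prev_size is not None:  # a free run ends right before us
            self._remove(start - prev_size, prev_size)
            start, size = start - prev_size, size + prev_size
        next_size = self.forward.get(start + size)
        if next_size is not None:  # a free run starts right after us
            self._remove(start + size, next_size)
            size += next_size
        self._add(start, size)


if __name__ == "__main__":
    fl = SegregatedFreelist([(44, 1), (47, 3), (51, 2)])
    print(fl.allocate(2))  # the only size-2 run starts at page 51
    fl.free(45, 2)         # merges with the runs at 44 and 47..49
    print(fl.forward)
```

In this illustrative layout, releasing pages 45 and 46 joins the one-page run before them and the three-page run after them into a single run of six pages.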
With this new etcd version and the corresponding structure, we can go from storing several gigabytes to storing hundreds of gigabytes, and I believe the performance is about twenty times higher than before. That is where I will stop on etcd. Now let me invite my colleague to cover the other optimizations.

Given the time, I will start without further ado with efficient node heartbeats. I don't know which version you use; at Alibaba we use a fairly recent release [exact version unclear in the recording]. One of the biggest problems we are confronted with is how to deal with node heartbeats. With many nodes, the heartbeat load is very high.

As you may know, previously the kubelet updated its node object every ten seconds. On production nodes there are a lot of images and volumes, so the heartbeat is very big, maybe 15 KB for a node with some tens of images and volumes. You can imagine what 10,000 nodes means: in etcd there will be very large transaction logs, and there will be a lot of latency. Because the node object, with its images, is so big and is written so often, we needed to solve this problem in the kubelet.

To tackle the problem, what we really need to know is only the latest state of the node, that is, whether it is ready or not. So instead of the node object we add a Lease object. Each node has its own Lease, renewed every ten seconds. If you look at the controller on the right, it can judge from the Lease whether the node is available or not. The Lease is about 100 bytes and very lightweight, so the transaction log in etcd becomes very small. This feature is enabled by default as of 1.14.

The heartbeat is one aspect. However, we also deploy several API server nodes in an HA cluster. In such a cluster, for different reasons, for example when we upgrade the API servers, you will see all of the pressure land on one node. That is not what we expected. To solve this problem, we thought about whether we need to add a load balancer.
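The heartbeat change described earlier in this part of the talk can be illustrated with a small sketch. The 15 KB and 100 byte payloads, the ten-second interval, and the 10,000 nodes come from the talk; the `Lease` class here is a simplified stand-in rather than the Kubernetes API type, and the 40-second grace period is an assumption for the example.

```python
from dataclasses import dataclass


def heartbeat_bytes_per_second(nodes: int, payload_bytes: int, interval_s: float) -> float:
    """Rough write volume produced by periodic heartbeats (ignores protocol overhead)."""
    return nodes * payload_bytes / interval_s


@dataclass
class Lease:
    """Simplified stand-in for a per-node lease; not the real API object."""
    holder: str        # node name
    renew_time: float  # time of the last renewal, in seconds


def node_is_healthy(lease: Lease, now: float, grace_period_s: float = 40.0) -> bool:
    """A controller can judge the node from the lease alone (grace period is assumed)."""
    return now - lease.renew_time <= grace_period_s


if __name__ == "__main__":
    full_node = heartbeat_bytes_per_second(10_000, 15_000, 10)
    lease_only = heartbeat_bytes_per_second(10_000, 100, 10)
    print(f"node object heartbeats: {full_node:.0f} B/s, lease heartbeats: {lease_only:.0f} B/s")
```

Under these numbers the lease-based heartbeat writes about 150 times less data than updating the full node object.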
The biggest pressure comes from heartbeats, so we looked at a load balancer. I don't know whether you run your own cluster or create clusters in the cloud. In the cloud there is usually a load balancer already. However, that does not really solve the problem, so we thought about another solution.

The reason this problem happens is actually very simple. Kubernetes made a lot of effort to use HTTP/2 for communication, with persistent connections that reduce the handshake overhead. So once the kubelet and the API server establish a connection, the connection stays there and is never broken.

Now that we know the cause, solving it is also simple. On the server side we need a protection mechanism: when the load exceeds a threshold, the API server rejects further requests with a "too many requests" response. On the operations side, when we upgrade the API servers we can use maxSurge. If you upgrade them one by one, the traffic on the last one gets really high, so we set maxSurge to three, because there are three API servers.

To get better load balancing we have another strategy, on the client side, which is to retry. First, if the client sees too many requests rejected, it connects to another API server. Second, it periodically tries another server: after several minutes on the same connection, it establishes a new TLS connection. Does rebuilding connections have a big impact? Actually it does not, because we control the frequency of rebuilding. You can calculate the cost, and it is very acceptable.

You can also see the diagram here. There are four nodes, and the pressure on each node is almost the same. You can also see that the cluster recovers rapidly.

There is one more problem. If you have done Kubernetes development before, you know that the key communication mechanism between client and server is list-watch plus a cache on top of etcd, and all of the components reuse this code.
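The client-side balancing strategy described above (switch servers when requests are rejected, and rebuild the connection periodically) can be sketched like this. It is a toy model under stated assumptions: `BalancedClient`, the `send` callback, and the 300-second interval are invented for the example and are not client-go APIs.

```python
import random


class BalancedClient:
    """Toy client that spreads its load across several API servers (hypothetical)."""

    def __init__(self, servers, reconnect_interval_s=300.0, rng=None, now=0.0):
        self.servers = list(servers)
        self.reconnect_interval_s = reconnect_interval_s
        self.rng = rng or random.Random()
        self.current = self.rng.choice(self.servers)
        self.connected_at = now

    def _switch(self, now):
        """Drop the current connection and connect to a different server."""
        others = [s for s in self.servers if s != self.current]
        if others:
            self.current = self.rng.choice(others)
        self.connected_at = now

    def request(self, send, now):
        """`send(server)` is a hypothetical transport callback returning an HTTP status."""
        if now - self.connected_at >= self.reconnect_interval_s:
            self._switch(now)  # periodic reconnect reshuffles long-lived connections
        status = send(self.current)
        if status == 429:      # server is overloaded: retry once on another server
            self._switch(now)
            status = send(self.current)
        return status


if __name__ == "__main__":
    overloaded = {"apiserver-a"}
    send = lambda server: 429 if server in overloaded else 200
    client = BalancedClient(["apiserver-a", "apiserver-b"])
    print(client.request(send, now=1.0), client.current)
```

Bounding how often a client may reconnect keeps the extra handshake cost small, which matches the speaker's point that the rebuild frequency has to be controlled.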
In this mechanism there is one problem: what happens when the watch is broken and the client is disconnected. To guarantee the consistency of the data, every update carries a resource version. When the client retries, it gives the server the last version it has seen, and the server uses that number to decide where to resume. For example, if you have everything up to version 5, the server continues sending events from there.

Suppose an informer only watches certain pods. Creating and destroying those pods updates the version it has seen. However, updates to other data, for example from other nodes, also move the server forward. If the client is then disconnected, the API server may notice that its cache no longer contains version 5. It cannot tell the client what happened between 5 and, say, 7, so the client gets a "too old resource version" error. In this circumstance the client has to list everything again, and the overhead is really large. This also hits the kubelet and the controllers, because controllers watch a lot of data.

The rationale of the fix is actually very simple. The informer only cares about a certain part of the data. If that part is updated, the client has, say, version 5. But when the version moves forward because of other data, a bookmark carrying the new version is sent as well, so the informer always knows a recent version and can resume from it. Without this, a restart sometimes means pushing several thousand events again. We already released this in 1.15. If you have met this problem, just turn that feature on.

OK, that was the informer. A controller normally uses the informer to get its data. However, in development we sometimes do not want to go through an informer; we want to read the data directly from the API server. Here there is another solution. We usually deploy several API server nodes for high availability, and when data is returned you do not know which node it came from.

Take pods: they are stored by namespace, and the query model is very simple. You can only look things up by namespace or by name. If you want all of the pods on one node, the overhead is very large, because you have to read all of the pods. That is a lot of data, which is why reads are paged to avoid too much overhead at once. But paging leads to multiple round trips for what should be a one-time request.

So we try not to visit etcd. When a request reaches the API server, the server only asks etcd for the current version number.
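The watch bookmark idea described above can be illustrated with a toy simulation. Everything here is an assumption made for the example: the `WatchCache` class, the three-event window, and sending a bookmark on every second version are invented for illustration and do not mirror the API server's actual cache.

```python
from collections import deque


class WatchCache:
    """Toy server-side cache that keeps only the most recent `capacity` events."""

    def __init__(self, capacity):
        self.events = deque(maxlen=capacity)  # (resource_version, owner)
        self.version = 0

    def record(self, owner):
        self.version += 1
        self.events.append((self.version, owner))

    def can_resume(self, client_version):
        """Resuming works only if every event after client_version is still cached."""
        oldest = self.events[0][0] if self.events else self.version + 1
        return client_version + 1 >= oldest


def simulate(bookmarks: bool) -> bool:
    """Return True if a client that disconnects after 10 events can resume its watch."""
    cache = WatchCache(capacity=3)
    client_version = 0
    for i in range(10):
        owner = "mine" if i == 0 else "other"  # only the first event matches the client
        cache.record(owner)
        if owner == "mine":
            client_version = cache.version     # a real event delivered to the client
        elif bookmarks and cache.version % 2 == 0:
            client_version = cache.version     # a bookmark carries only the version
    return cache.can_resume(client_version)


if __name__ == "__main__":
    print("without bookmarks:", simulate(False))
    print("with bookmarks:   ", simulate(True))
```

Without bookmarks the client is stuck at the version of its last matching event, which has long left the cache window, so it would have to list everything again. With bookmarks its version stays close to the server's and the watch can resume.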
Once the API server's cache has caught up to that version, we know the server has already cached the data, which means the read is consistent and the server can answer the client from its cache, so the request does not put load on etcd. Then you can also build an index inside the API server. You do not have to go through all of the objects; you only need to look up the key you care about.

One typical practice: when you describe a node in a 10,000-node cluster, it takes five seconds. Now it is answered from the cache, and with the index it is so much faster, because describing a node involves a lot of interactions, not just a one-time request. This function depends heavily on etcd 3.4 [transcribed as "1.4"], so we will add it in the 1.16 release [transcribed as "1.6"].

Finally, I want to talk about controller optimization. In the Kubernetes model the API server is stateless. However, when the controllers are unavailable, the service is effectively unavailable too: you send a request, but it will not be dealt with, because the controller has to turn a deployment into pods. In a large-scale cluster the controller's informers store millions of objects, which is several GB. When the controller fails over, the standby becomes the new leader and has to synchronize all of that data from the API server. This creates a large overhead and takes several minutes. If you use a lot of CRDs in your daily work, this kind of problem can easily happen.

To solve it, we start the informers in the standby in advance, so that when it becomes the leader it does not need to synchronize the data. In particular, when we upgrade and the controller is killed, the leader releases its lease on shutdown, so a restart of the controller only takes several seconds. In this way a planned restart of the leader controller takes only a few seconds. If it is an abnormal crash, you still have to wait a little longer.

If you have any questions, you can ask now.

Question: On the last slide, when the controller fails over, the standby already has its informers running. Does that mean you need to modify the original code?

Answer: Yes, but we will actually sync these changes with the community.

Question: I have one more question. I know that kubemark and the scalability tests are done by SIG Scalability. Do you use the SIG's tests on your system, or have you prepared your own?
Answer: We have our own workloads, because our scenario is different.

Question: One more question. Thank you so much for your talk. I noticed you talked about Sigma and Kubernetes. What is the relationship between the two?

Answer: The original system is called Sigma. Sigma is based on Alibaba's own work; it is a scheduling engine that we researched and developed independently. In 2018 we decided to embrace the ecosystem, so we are migrating Sigma over to Kubernetes to make it cloud native. It is an iteration in which we replace the underlying layer.

Thank you.