Since we are running short on time, let's get started, and thank you for coming. The title of this presentation is operating an OpenStack cloud with 100 physical servers. My name is Ken Igarashi and I lead the project, together with Hiromichi Ito from VirtualTech Japan and Akihiro Motoki from NEC. This morning DOCOMO and NEC issued a press release, and in this presentation we will introduce the details behind that press release, focusing on OpenStack Neutron.

When you start a project like this you need data, for example about hardware resources, performance, and hardware and software configurations. Unfortunately, when we looked for that kind of data, we could not find it. So in the end we decided to collect our own data and share it. For that we used StarBED, a testbed operated by NICT, the National Institute of Information and Communications Technology in Japan. This environment can be used by any company, and when we found it, it turned out to be very good for this kind of evaluation.

This is the network configuration we built on StarBED. It is a fairly ordinary network with two layers, leaf switches and two spine switches, and the servers sit below the leaf switches.

First, keeping the network cost in mind, we considered two configurations for redundancy. One is multi-chassis link aggregation, MLAG, which is quite popular today: you create a bonding on the server side, configure MLAG across the two switches, and use LACP packets. The alternative we looked at is an end-host approach, ECMP from the servers, which is quite new; it can also balance the load from the server side, but not many products support it yet.

The second point is the tenant network configuration. Virtual networks are created per tenant, which improves isolation and network security, and we use a tunnel-based configuration. For this there are actually two kinds of drivers, VXLAN and GRE, and at the moment we use VXLAN, which is very convenient. As for the agent, we used the Open vSwitch driver with this network configuration and checked the test environment with it.

In the first measurement there is one VM sending and one VM receiving, on different physical hosts, and we measured the throughput. These are the results; can you see them? For the network configuration with Open vSwitch, the end-host ECMP case was the worst, and in the end we selected MLAG with Open vSwitch, considering performance, future potential, and stability.

But there was a serious issue with the throughput. With the VM MTU at 1500, and even when we raised it to 8950, a single VM pair still reached only about 4 Gbps, while the physical network is 20 Gbps. We then increased the number of VMs in this test, using the same number of VMs on the sender side and on the receiver side. As you can see, with an MTU of 1500 we still got only around 2 Gbps even though the physical network is 20 Gbps, and even when we increased the MTU we finally got only about 10 Gbps, so we could use only 50% of the physical network.

So we investigated the slow throughput. The problem is VXLAN: the packet encapsulation is done in software, so the processing on the sender side increases the CPU load, and that limits the network performance. We looked into VXLAN more closely and ran another measurement with two servers on the sender side and two on the receiver side, with small and large packets, and confirmed that the bottleneck is the CPU load of the encapsulation processing on the sender and receiver sides.

In the end, with VXLAN we use two MTU settings, a small one and a large one. We set MTU 9000 on the physical hosts, but the DHCP server hands out the normal MTU of 1500 by default, so users can use either a large or a small MTU. A large MTU sometimes causes a problem, for example for communication over the internet, so we let users change the MTU freely.
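To put a rough number on the MTU headroom discussed here: VXLAN over IPv4 adds about 50 bytes of encapsulation overhead to every packet (inner Ethernet header 14 + VXLAN 8 + outer UDP 8 + outer IPv4 20), so the MTU inside the VM has to be at least 50 bytes smaller than the physical MTU. A minimal sketch of that arithmetic, with an illustrative helper name that is not from the talk:

    # VXLAN MTU arithmetic (illustrative sketch, not from the talk).
    # Overhead when a VM frame is carried over VXLAN/IPv4:
    #   inner Ethernet 14 + VXLAN 8 + outer UDP 8 + outer IPv4 20 = 50 bytes
    VXLAN_OVERHEAD = 14 + 8 + 8 + 20

    def max_vm_mtu(physical_mtu: int) -> int:
        """Largest VM MTU that avoids fragmenting the encapsulated packets."""
        return physical_mtu - VXLAN_OVERHEAD

    print(max_vm_mtu(1500))  # 1450: why MTU 1500 inside the VM is already too big
    print(max_vm_mtu(9000))  # 8950: the jumbo-frame value mentioned in the talk

This is why the physical hosts run with MTU 9000 while the guests get a safe default of 1500 from DHCP.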
So the next topic is high availability. If your system requires people sitting next to it all the time, you usually need ten to twelve more people, and that is a huge investment for us, so it is impossible. However, if we can delay fixing a problem until later, then we only have to work on weekdays. That is why high availability becomes very important, and our decision was to use double redundancy, and triple redundancy for the software.

This is the overall design of our HA. We are still using a commercial product on the load balancer side, and below the load balancer we put MySQL, actually a Galera cluster, and the OpenStack APIs. The reason why we still use a commercial load balancer is mainly that we need to terminate SSL at the load balancer. For the others, like RabbitMQ, the neutron agents, and MAAS, which we use for OS installation, we rely on their own mechanisms.

Let me talk about MySQL HA. This is our configuration: through the load balancer we use only one node for read and write, and in total we have four nodes and one arbitrator. The arbitrator is only used for quorum; the four nodes actually retain the data. As for the health check, we check the TCP connection and the status of each node. As long as the node state stays within this range, the load balancer keeps sending traffic to the node. Some configurations say to check only states 2 and 4, but when a synchronization happens the state changes from here to here, and that is why we watch this state range instead of just those two values.

Anyway, this is the procedure for node recovery. There is a node failure at this moment, and the load balancer detects the failure by checking those states; then it migrates the connections from db1 to db2. After fixing db1 we do state synchronization to db1, using IST, and after that, just before bringing the database back into the cluster, we need to change the priority. If you do not change the priority, then when db1 comes back to the cluster the connections are migrated from db2 back to db1, but we do not want to switch connections frequently; that is why we change the priority just before returning db1 to the cluster, and finally it reaches a healthy state.

We also measured the recovery time for each synchronization. This is the recovery time for IST, and you can see that it takes a long time compared to the others. This is because there is a difference in database performance between the left side and the right side: on the left side the maximum database performance is around 340 TPS, while the right side can do more than 1000 TPS. In this case there is 240 TPS of background traffic, and the recovering node has to handle that 240 TPS and, in addition, receive the missed state from the other node. So ideally you need at least double the database performance of the average traffic; in this case the average traffic is 240 TPS but the maximum throughput is only 340 TPS, so the recovery time gets long. The same thing happens for SST as well, and SST needs to send even more state, so it takes even more time. So it is important to prepare a database with enough headroom when you think about database recovery.

This is the situation for disaster recovery, when you lose all the data; restoring from the backup is the only way. For the first node we just restore the database from the backup, fix everything, run MySQL, and start synchronization. After this synchronization finishes, those two nodes can both act as donors, so the next synchronizations can be done simultaneously. We also measured the time for each stage, and you can see that the restore itself takes almost 100 minutes, but the other stages are quite fast. Because the restore takes 100 minutes, our strategy is to create a database backup every 12 hours, and during those 12 hours we just keep the binary log. Applying 12 hours of binary log to the database takes almost 10 minutes; that time is just spent applying the binary log, and the other stages are quite fast. In the best case we can recover all the OpenStack databases in 3 minutes after a disaster.

DNS, DHCP, and TFTP HA is straightforward. For DNS we just create a master and a slave, and for DHCP there is replication. MAAS itself is quite easy: we just run it in a VM and recover from the VM if there is a failure. We only need MAAS when we deploy a new server, so it does not need to keep running all the time; a simple backup is enough.

Finally, RabbitMQ. If we add multiple RabbitMQ addresses to the configuration file, then RabbitMQ HA is automatically handled by the client side. The problem with this approach is that we need at least three, ideally five, RabbitMQ nodes to protect against split brain. To decrease the number of RabbitMQ nodes we can take another scenario as well: just put RabbitMQ behind the load balancer and use only one node for read and write. This can decrease the number of RabbitMQs. In our case we are still considering the pros and cons of those two options and we have not decided yet.
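As a concrete illustration of the health check described above (a sketch, not the presenters' actual script; the connection details, the pymysql dependency, and the policy of routing only to fully synced nodes are assumptions), reading wsrep_local_state from each node could look like this:

    # Sketch of a Galera health check in the spirit of the one described above.
    # wsrep_local_state values: 1 = Joining, 2 = Donor/Desynced, 3 = Joined, 4 = Synced.
    import pymysql

    def node_is_routable(host, user="monitor", password="secret"):
        """Return True when the load balancer should send traffic to this node."""
        conn = pymysql.connect(host=host, user=user, password=password)
        try:
            with conn.cursor() as cur:
                cur.execute("SHOW STATUS LIKE 'wsrep_local_state'")
                state = int(cur.fetchone()[1])
        finally:
            conn.close()
        # During IST/SST a node passes through states 1-3, so the check has to be
        # aware of the whole transition (not only "2 or 4"); here only a fully
        # synced node receives traffic.
        return state == 4

    for host in ("db1", "db2", "db3", "db4"):
        print(host, node_is_routable(host))

The exact state range the presenters accept is on their slide; the point is simply that the check must account for the intermediate states that appear during synchronization.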
So the next topic is neutron HA, and I will hand it over. OK, I would like to talk about neutron HA. First I would like to explain the network node setup and our basic HA strategy for each network node.

Neutron has several agents: the DHCP agent, the metadata agent, and the L3 agent. These are required for the basic operation of a neutron network, and they support various HA modes.

The DHCP agent supports active-active mode: we can assign a single network to multiple DHCP agents to keep the service available even if one node fails. This is a very simple configuration in neutron, dhcp_agents_per_network, and it is better to set this number to 2 or 3, because DHCP agents are also used as DNS servers, and the resolver mechanism in most Linux or Unix systems only supports three DNS servers.

Next is the L3 agent. At the moment it only supports active-standby mode: a router can be assigned only to a single L3 agent, so if that agent fails we need to migrate the routers on the host to another agent.

The final one is the metadata agent. It is very simple and has no state to keep, so for HA all we need is to keep the metadata agent running on the network nodes.

Let me talk about monitoring and failure detection. HA is roughly categorized into detection and recovery. The first point is detection by monitoring: we need to monitor the network agents from various aspects. One is the data plane. For the internal network we can ping the network node from the VXLAN network. The L3 agent is also connected to the external network, so we need to check connectivity to the external network too. The second part is the control plane: we can check agent health, agent aliveness, by using agent-list through the neutron API. Each agent reports its state to the neutron server through the message queue periodically, so we can check the state by REST API. Another point is control plane reachability, so we also need to check reachability with ping. This is a summary of the monitoring points; we need to monitor various points to detect network node failures.

The next topic is recovery from failure. This consists of three steps. First, we disable the agents on the failed node: we set admin_state_up of the agent to False. By doing this the agent is excluded from scheduling, meaning the allocation done by the neutron scheduler. Second, we migrate the routers on the failed node to another agent. This is simply done through the REST API, with the disassociate and associate L3 agent commands, and then the routers are migrated to another agent. Finally, we shut down the interfaces of the failed node. When the control plane has failed we can no longer control the network node, so we need to shut down its interfaces, or shut down the node itself, to avoid unnecessary confusion.

This is just a tip for checking external network connectivity. The external network is reachable from the internet, and we would like to avoid accessing the node itself, so we use a network namespace: we assign an IP address inside the network namespace and check connectivity from the external network.

I would like to show some results from our experience. The first is the traffic during router migration from one agent to another. We injected a control plane failure; our monitoring system detects the node failure, the control plane failure, and we start migrating routers to another L3 agent. This is measured by iperf from the external network to one VM on the internal network. You can see the traffic for the first ten seconds; at this point the router migration started, and the traffic recovered ten minutes after the migration of the associated router began.

The next one is the progress of the router migration. In this case we migrated 88 routers from one agent to another. The router migration is requested through the REST API, and the request processing in the L3 agent is a bit slow compared to the REST API requests. But in the case of a control plane failure this slowness does not affect the data plane traffic; it only affects the control plane availability, so this slowness is not a big problem.
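As a rough illustration of those three recovery steps (not the presenters' tooling; the endpoint, credentials, and hostnames are placeholders, and this assumes the python-neutronclient v2.0 API of the Icehouse/Juno era), the agent disable and router migration could be driven like this:

    # Sketch of the recovery steps above: disable the failed L3 agent, then
    # move its routers to a healthy agent. Illustrative only.
    from neutronclient.v2_0 import client

    neutron = client.Client(username="admin", password="secret",
                            tenant_name="admin",
                            auth_url="http://controller:5000/v2.0")

    def fail_over_l3_agent(failed_host, healthy_host):
        agents = neutron.list_agents(binary="neutron-l3-agent")["agents"]
        failed = next(a for a in agents if a["host"] == failed_host)
        healthy = next(a for a in agents if a["host"] == healthy_host)

        # Step 1: set admin_state_up to False so the scheduler ignores the agent.
        neutron.update_agent(failed["id"], {"agent": {"admin_state_up": False}})

        # Step 2: disassociate each router from the failed agent and associate
        # it with the healthy one.
        for router in neutron.list_routers_on_l3_agent(failed["id"])["routers"]:
            neutron.remove_router_from_l3_agent(failed["id"], router["id"])
            neutron.add_router_to_l3_agent(healthy["id"], {"router_id": router["id"]})

        # Step 3, shutting down the failed node's interfaces, happens out of band.

    fail_over_l3_agent("network-node-1", "network-node-2")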
The final topic is possible improvements. We tested with Icehouse neutron, and in Juno there are some improvements in the L3 agent and the other agents. The big topic is integration with L3 agent HA. L3 agent HA in Juno improves the data plane availability for internal networks, but at this time it does not monitor the external network, so we still need external network monitoring, and we also still need to monitor the control plane. So L3 HA only checks the data plane availability, and we need to combine various methods.

There are other possible improvement points. Juno neutron supports automatic rescheduling of routers when the network node is down, but how to control that rescheduling from outside through the REST API still needs to be considered. For the DHCP agent, rescheduling, or a rescheduler, is another possible improvement, and we would like to contribute to improving these features.

So lastly, I want to talk about management resources, more or less the resources we need for the management plane. I asked each person to tell me how much resources they need, and they gave me these numbers. For the controller API there are three nodes; for the message queue, three, ideally five; for the database, as I mentioned, four nodes and one arbitrator; neutron, three; monitoring, three; storage, tens of terabytes; and for deployment, two, with MongoDB not included. So the total management resources become like this. But actually there are quite a few nova-compute nodes, so sizing is very important: it means we need to think about which services can be consolidated onto one server.

For this we put a huge load on the test environment and measured the load on each node. One of the typical tests is a scalability test: in this test we increase the number of VMs as much as possible. Here is the hardware resource boundary, but even though we create more VMs than the physical resources allow, OpenStack actually works well. Anyway, this is the traffic we measured at each server. As for the API servers, there was not much CPU load, but quite a lot of memory usage was measured. For RabbitMQ there was almost no load. And for MySQL there was also no CPU load, but MySQL utilizes physical memory as much as possible, so the memory usage is quite big.

TPS is also very important. In the beginning we shared one MySQL between OpenStack and the other services, and the actual DB traffic was around 300 queries per second. If you split MySQL, OpenStack needs at least 150 TPS, and as I said before, if you consider the synchronization, you should prepare 300 TPS for MySQL. But 300 is not a difficult number; it is easy.

This is about monitoring. We are using Zabbix for monitoring, and these are the numbers for the DB size we need: these are the items we are monitoring and how long we retain the data, and in this setup we need 86 GB. And this is the size of the OpenStack database. During these days we did a lot of tests, like the scalability tests, and measured the size difference between the two days. As you know, today OpenStack has many garbage collection mechanisms; for keystone, for example, you can delete expired tokens from MySQL by running this command. So the DB size increase is not so big.

Finally, let me talk about deployment. There are many good deployment tools, but at this moment we are creating our own Ansible-based one. One reason is that we can change the configuration, especially for HA and neutron, where our answers are a little bit different. Another is that we have already been using Ansible for operations, and we do not want to use a lot of additional software. That is why we are currently using Ansible for both deployment and operation. These are some tips we learned from the scalability tests, but we are running out of time, so I will just skip them. Then, the end. Thank you for listening.

So I think we can pick one or two questions. Do you have a question? OK. OK, so the important one, the first one: maybe it is well known, but you need to be careful. When you create a security group, there is also the default security group, and if there are, say, 100 servers sharing that one default security group, then when you create a new VM, entries for this VM are created on all hundred nodes. So if you share the default security group, the OVS agent easily gets stuck. The important thing is: if you want to do scalability tests, you should delete the default rules from the default security group. This is very important; we spent about one week finding this problem, actually.
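To make that last tip concrete (again a sketch, not the presenters' script; the credentials are placeholders, and removing only the remote-group rules is one possible reading of "delete the default rules"), stripping the problematic rules from a tenant's default security group could look roughly like this:

    # Sketch for the scalability-test tip above: remove the self-referencing
    # rules from the "default" security group so each new VM no longer fans
    # out rule updates to every node hosting a member of the group.
    from neutronclient.v2_0 import client

    neutron = client.Client(username="admin", password="secret",
                            tenant_name="demo",
                            auth_url="http://controller:5000/v2.0")

    for group in neutron.list_security_groups(name="default")["security_groups"]:
        for rule in group["security_group_rules"]:
            if rule.get("remote_group_id"):  # rules that reference the group itself
                neutron.delete_security_group_rule(rule["id"])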
Are there any other questions or comments? OK. So until the next session our group will be here, so if you have comments or questions, please come and discuss them with us. Thank you very much.