
Technical Note

Performance Analysis Methods


ESX Server 3
The wide deployment of VMware Infrastructure 3 into today's enterprise environments has introduced a need for methods of optimizing the infrastructure performance. Key to this exercise is the ability to identify the root cause of performance problems on VMware ESX Server systems. This paper discusses the process for identifying performance bottlenecks on ESX Server systems and recommends actions to correct the problems you identify.

Because virtualization enables you to consolidate multiple physical servers onto a single machine, traditional operating system analysis tools that are unaware of virtualization either miss critical data or produce invalid results. VMware Infrastructure 3 provides two ways to monitor ESX Server performance.

- VirtualCenter provides capabilities for a simple, graphical first-pass analysis of host performance.
- The esxtop utility, the command-line performance tool available on ESX Server, offers capabilities for more detailed monitoring of host performance. For information on using esxtop, see "Appendix B: Using the esxtop Utility" in the VMware Infrastructure 3 Resource Management Guide (see "References" for a link).

This paper covers the following topics, with recommendations in each section to match problems you might identify.

- CPU Analysis
- Memory Analysis
- Storage Analysis
- Network Analysis
- References
- Appendix: Counters

Copyright 2008 VMware, Inc. All rights reserved.

Performance Analysis Methods

CPU Analysis
CPU load is generated by:

- The guest operating system running inside the virtual machine
- The applications running in the virtual machine
- ESX Server, as it provides a virtual interface to the hardware

Although the work performed by ESX Server does cause some CPU load, applications in the virtual machines generate the great majority of processing on a system. A solid understanding of the workload profile of those applications, whether they are running in a virtual environment or directly on hardware, can help you analyze CPU usage.

Check CPU Utilization


Start esxtop by entering the command in the ESX Server host's service console. By default, esxtop shows CPU utilization. To ensure this data is displayed, press c. The following screen capture shows example data produced on a test system.

The output includes the following information:

- The PCPU(%) line in the header shows utilization for the physical processors on the host system by core and the total physical CPU usage. The comma-delimited data first displayed shows core utilization followed by "used total," which averages utilization of all cores.
- The LCPU(%) line shows the percentage of CPU utilization per logical CPU. The percentages for the logical CPUs mapping to a single physical core add up to 100 percent. This line appears only if hyper-threading is present and enabled.
- The CCPU(%) line shows the percentages of total CPU time as reported by the ESX Server service console. If you run any third-party software, such as management agents and backup agents, inside the service console, you might see a high CCPU(%) number.
- An idle world is running. In ESX Server, a world is a managed execution entity similar to the operating system concept of a process. The %USED entry of that idle world displays the percentage of CPU cycles that remain unused. If esxtop reports less than 100 percent utilization for the idle world, that means only a fraction of one physical core remains available for additional work. The maximum value for this number can be many hundreds of percent (up to 100 percent for each core); small numbers here represent heavily loaded systems.
- Check the utilization (%USED) of the virtual machines you want to analyze. The virtual machines are reported here with the names specified at the time they were created. As with the idle world's row, utilization for each virtual machine can exceed 100 percent. A virtual machine that uses two virtual CPUs, for example, can show up to 200 percent CPU utilization.
- You can expand the group data for a virtual machine you want to examine in more detail. To do so, press e, then enter the group ID number (shown in the GID column) for the virtual machine. The screen capture below contains a CPU expanded information display for GID 30 from the previous screen capture. When you expand the display, esxtop expands rows and provides counter data for every world in the group. This data includes:
  - vmmX: For each virtual CPU provided to the virtual machine, esxtop displays a virtual machine monitor (VMM) world. This world performs the majority of the work required to execute and virtualize the virtual machine's code (operating system, application, and hypervisor).
  - vcpu-X: ESX Server creates a vcpu-X world to assist the VMM world for each virtual CPU. The primary work of this world is virtualization of I/O devices.
  - mks: This line reports data associated with servicing interrupts for mouse, keyboard, and screen.
  - vmware-vmx: The VMX worlds assist in maintenance and communications with other worlds and generally do not represent a material portion of the group utilization.
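The arithmetic behind the idle world's %USED value and the per-virtual-machine %USED ceiling can be sketched in a few lines of Python. This is an illustration only: the function names and the sample readings are hypothetical and are not part of esxtop.

```python
def spare_cores(idle_used_pct: float) -> float:
    """Convert the idle world's %USED into an equivalent number of unused cores.

    Each physical core contributes up to 100 percent, so 250 percent idle
    corresponds to roughly two and a half cores of headroom.
    """
    if idle_used_pct < 0:
        raise ValueError("idle %USED cannot be negative")
    return idle_used_pct / 100.0


def max_vm_used_pct(virtual_cpus: int) -> int:
    """Upper bound on a virtual machine's %USED: 100 percent per virtual CPU."""
    if virtual_cpus < 1:
        raise ValueError("a virtual machine has at least one virtual CPU")
    return 100 * virtual_cpus


if __name__ == "__main__":
    # Hypothetical readings: idle world at 85 %USED, a two-vCPU virtual machine.
    print("spare cores:", spare_cores(85.0))
    print("max %USED for 2 vCPUs:", max_vm_used_pct(2))
```

An idle reading below 100 therefore maps to less than one spare core, which matches the heavily loaded case described above.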

Evaluate the CPU Data and Correct the System


In general, to evaluate the CPU data that esxtop provides, consider the system's load. Is the system overloaded with too many virtual machines? Is the guest operating system using all of its virtual CPUs and does it require more or faster processors? Are all guest operating systems waiting for I/O? For example:

- Check the PCPU(%) line to see if utilization for all cores is near 100%. If so, the system is saturated. If multiple virtual machines are competing for the CPUs, try to reduce the number of virtual machines on the system or find other means of decreasing the load on the system. See "CPU Saturation of the Host" for more details.
- See if the PCPU(%) line shows an unequal load across processor cores with some at saturation and some remaining near idle. If so, applications within the virtual machine are utilizing all of the cores provided to them. Increase the virtual machine's virtual CPU count, if possible, and verify that the guest operating system is making use of the additional cores. If the application supports horizontal scalability, you can run multiple virtual machines to use the additional cores. See "CPU Saturation of a Virtual Machine" for more details.
- If all CPUs remain underutilized, either the application in the virtual machine is misconfigured or the virtual machine is waiting for I/O operations to complete. See "Low CPU Utilization" for more details.
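The three PCPU(%) cases above can be expressed as a small decision function. The sketch below assumes per-core utilization values have already been read from the PCPU(%) line by hand or by a script. The cutoffs for "near 100%" and "near idle" are assumptions chosen for illustration, since the text does not define them.

```python
from typing import List

SATURATED_PCT = 90.0  # assumed cutoff for "near 100%"; not defined in the text
IDLE_PCT = 20.0       # assumed cutoff for "near idle"; not defined in the text


def classify_pcpu(core_pcts: List[float]) -> str:
    """Map per-core PCPU(%) readings onto the cases described above."""
    if not core_pcts:
        raise ValueError("need at least one core reading")
    saturated = [p >= SATURATED_PCT for p in core_pcts]
    idle = [p <= IDLE_PCT for p in core_pcts]
    if all(saturated):
        return "saturated"      # see "CPU Saturation of the Host"
    if any(saturated) and any(idle):
        return "unequal"        # see "CPU Saturation of a Virtual Machine"
    if all(idle):
        return "underutilized"  # see "Low CPU Utilization"
    return "mixed"              # no clear pattern; inspect per-world data


if __name__ == "__main__":
    print(classify_pcpu([98.0, 97.5, 99.0, 96.0]))
    print(classify_pcpu([99.0, 98.0, 5.0, 3.0]))
    print(classify_pcpu([4.0, 6.0, 3.0, 8.0]))
```

The "mixed" result is deliberately a catch-all: when the cores fall between the two cutoffs, the expanded per-world view is a better guide than the header line.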


CPU Saturation of the Host


You can use both the PCPU(%) and %USED counters to identify systems that are using all physical CPUs. It is possible, however, for the virtual machines on the system to utilize nearly all of the processor cycles without actually requesting the additional cycles that are available. This near-saturation case is the sign of a heavily loaded system.

A better sign of overutilization on a host is ready time (%RDY). When any world's ready time starts to climb, that world is spending the reported percentage of its time waiting for some CPU to become available for work. Ready time above 10 percent is worth investigation and may be a sign of an overutilized host. For a more detailed discussion of ready time, see the VMware document "Ready Time Observations" (see "References" for a link).

Host saturation is a clear sign that too much work is assigned to a single server. This is usually a result of overly aggressive consolidation ratios. Overcommitting CPU resources in this way degrades performance. Consider the following remedies:

- Verify that VMware Tools is installed in every virtual machine on the system. In addition to many other benefits, VMware Tools provides a network driver (vmxnet) that is necessary for efficient virtual machine networking.
- If you are using VMware Distributed Resource Scheduler (DRS), verify that all the systems in the DRS cluster are carrying load when the server you are interested in is overloaded. If they are not, increase the aggressiveness of the DRS algorithm and check virtual machine reservations against other hosts in the cluster to ensure virtual machines can migrate. Increase the number of servers in the DRS cluster so virtual machines from the server you are evaluating can migrate to servers with available resources.
- Increase the CPU resources available to the virtual machines by increasing the number or improving the performance of CPUs or cores on some of the systems in the DRS cluster.
- Set CPU reservations for the virtual machines that most need the processing power to guarantee that they get the CPU cycles they need.
- Ensure you are using the newest version of ESX Server. Newer versions of ESX Server provide better efficiency and CPU-saving features such as TCP segmentation offload, large memory pages, and jumbo frames.
- Reduce the CPU resource footprint of running virtual machines. For example, you can take the following measures:
  - Decrease disk or network activity or both for applications that cache data. You can do this by increasing the amount of memory provided to the virtual machine. Doing so can lower I/O and reduce the need for ESX Server to virtualize the hardware.
  - Take some of the load off the CPU by replacing software I/O with dedicated hardware (such as iSCSI HBAs or TCP segmentation offload NICs).
  - Reduce the virtual CPU count for guests to the minimum number required to execute the workload. For instance, a single-threaded application in a four-way guest takes advantage of only a single virtual CPU. But the hypervisor must maintain the three idle virtual CPUs, wasting CPU cycles that could be used for other work.
    Few applications fully utilize two or more virtual CPUs, and virtual machines are often committed to a special purpose with a single application on each virtual machine. The guest operating system and the hypervisor must expend CPU cycles managing multiple virtual CPUs. If the applications are not using those virtual CPUs, you can improve system efficiency as a whole by reducing the virtual CPU count for the virtual machines.
  - For virtual machines created from physical machines using VMware Converter, analyze the virtual machine resources as well as the applications running inside the virtual machine. Stop any unnecessary services running inside the virtual machine. Also, reduce the number of virtual CPUs and the amount of memory to the minimum required to execute the workload.


In general, the easiest way to address CPU bottlenecks when the virtual machines are correctly configured is to increase processing power at the cluster level. If VirtualCenter reports fully utilized CPUs for all hosts in a cluster, you need to increase cluster resources or decrease the number of virtual machines.
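As a minimal sketch of the ready-time check described in this section, the following Python function flags worlds whose %RDY exceeds the 10 percent guideline. The dictionary of readings and the world names are hypothetical, and the values are assumed to have been collected from the esxtop CPU screen by some other means.

```python
from typing import Dict, List

READY_THRESHOLD_PCT = 10.0  # guideline from this section: above 10 percent is worth investigating


def worlds_to_investigate(ready_pct_by_world: Dict[str, float]) -> List[str]:
    """Return the names of worlds whose %RDY is above the guideline, sorted by name."""
    return sorted(
        name
        for name, ready_pct in ready_pct_by_world.items()
        if ready_pct > READY_THRESHOLD_PCT
    )


if __name__ == "__main__":
    # Hypothetical %RDY readings keyed by virtual machine name.
    sample = {"web01": 2.4, "db01": 14.7, "mail01": 10.0, "batch01": 31.2}
    print(worlds_to_investigate(sample))
```

A reading of exactly 10 percent is not flagged here because the guideline says "above 10 percent"; treating the boundary differently would be an equally reasonable choice.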

CPU Saturation of a Virtual Machine


As with host CPU saturation, you can see virtual machine CPU saturation when the value of %USED for a virtual machine is high. In contrast to what you see with host CPU saturation, the idle world might report a large amount of free computational resources and the virtual machine's ready time (%RDY) might remain low. You can see this behavior when a single virtual machine utilizes all of the processors allocated to it but additional CPUs remain unused on the host. You can confirm the virtual machine's utilization of all of its virtual CPUs by expanding the virtual machine's world on the esxtop CPU screen. If you confirm that the virtual CPUs are saturated, you have the following options:

- Verify that VMware Tools is installed in every virtual machine on the system. In addition to many other benefits, VMware Tools provides a network driver (vmxnet) that is necessary for efficient virtual machine networking.
- If possible, increase the number of virtual CPUs provided to the virtual machine. Because the application in the virtual machine is successfully using all of its virtual CPUs, it may continue to scale as you increase the virtual CPU count. Pay attention to the vmmX world for each virtual CPU after you increase the virtual CPU count to verify that the virtual machine is making use of its newly provided resources. The addition of virtual CPUs imposes additional overhead on the host even when the virtual CPUs are not being used. Thus you should carefully assess the virtual machine's needs to avoid increasing the virtual CPU count unnecessarily.
- If possible, power on multiple virtual machines running the same application. The value of this option depends on how well the application supports horizontally scalable configuration. An application might perform better when running in multiple virtual machines, each with a single virtual CPU, than it does in a single SMP virtual machine.
- Utilize faster processors. Because processor performance is continually increasing, the option of upgrading processors or migrating the virtual machine to a system with newer processors can provide more total throughput to the virtual machine.
- Set CPU reservations for the virtual machines that most need the processing power to guarantee that they get the CPU cycles they need.
- Reduce the CPU resource footprint of running virtual machines. For example, you can take the following measures:
  - Decrease disk or network activity or both for applications that cache data. You can do this by increasing the amount of memory provided to the virtual machine. Doing so can lower I/O and reduce the need for ESX Server to virtualize the hardware.
  - Take some of the load off the CPU by replacing software I/O with dedicated hardware (such as iSCSI HBAs or TCP segmentation offload NICs).

Low CPU Utilization


If you have confirmed performance problems, low CPU utilization is usually a sign of inefficiently designed datacenter architecture. The design might be flawed in the configuration of an individual virtual machine or in the connectivity between various components. The following sections discuss methods for investigating system-level components such as memory and then system-wide components such as networking and storage.

Memory Analysis
Host memory utilization includes all memory used by virtual machines on the host and all memory used by ESX Server itself. The monitoring capabilities in ESX Server do not help you detect improper usage or configuration of memory within a virtual machine. You must use traditional monitoring tools in the guest operating system to identify memory-hungry applications or shortages that lead to swapping inside the virtual machine.

Check Memory Utilization


Start esxtop by entering the command in the ESX Server host's service console. Press m to display the memory counters.

The output includes the following information:

- The header shows host data that affects all virtual machines running on the host. The physical memory row (PMEM) shows the total RAM installed on the system, the amount used by the service console (cos), the memory used by the VMkernel (vmk), and other statistics.
- The next few rows show host-level memory statistics for various ESX Server subsystems:
  - VMKMEM shows memory statistics for the ESX Server VMkernel.
  - COSMEM displays the memory statistics reported by the ESX Server service console.
  - PSHARE displays ESX Server page-sharing statistics.
  - SWAP displays ESX Server swap usage statistics.
  - MEMCTL displays statistics for the memory balloon driver.
- Data for each virtual machine on the host appears as a row in the table at the bottom of the display. The following counters are of particular interest when evaluating virtual machine memory usage:
  - The total memory allocated to the virtual machine appears in the MEMSZ column.
  - The memory that is actively in use by the guest operating system and its applications is reported in the touched and active counters (TCHD for touched memory and %ACTV, %ACTVS, and %ACTVF for active memory). %ACTVS and %ACTVF provide slow and fast averages of the %ACTV counter. Either the %ACTV or the TCHD counter can serve as a good predictor of memory usage. When the actively used memory of one or more virtual machines exceeds the amount of memory on the host, the server starts to swap and performance degrades significantly.
  - Knowing the activity caused by the balloon driver can be useful. When the balloon driver is active in the guest operating system, the virtual machine's MCTL? counter is set to Y. The amount of memory the balloon driver is using in a specific guest operating system is reported under MCTLSZ. ESX Server uses the balloon driver to recover memory from less memory-intensive virtual machines so it can be used by those with larger active sets of memory. ESX Server takes this memory management step before it resorts to swapping memory to disk.
  - The rate at which ESX Server is swapping memory to and from disk is displayed by the swap write (SWW/s) and swap read (SWR/s) counters. These counters should remain near zero for high-performing steady-state operations. A sustained rate of a significant number of MB/s is a certain sign that the host does not have enough memory.
  - For NUMA systems, the NUMA counters displaying migration count (NMIG) and remote and local memory (NRMEM and NLMEM, respectively) can offer information indicating whether the virtual machine is utilizing NUMA memory inefficiently. Memory access across NUMA nodes is inefficient and migrations of memory across nodes slow down execution.
  - The memory overhead required to maintain each virtual machine is displayed by the OVHD counter. Knowing this additional usage can help you plan virtual machine and host configurations.

Evaluate the Memory Data


Analysis of memory utilization on an ESX Server host requires not just investigation of server-side statistics but also a solid understanding of the applications that are running in virtual machines on the host. When memory is short on the host, ballooning and swapping might be visible in esxtop. Swapping has a significant impact on performance. When memory is short within a virtual machine, the guest operating system swaps memory to disk.

To evaluate the data provided by esxtop, consider the following factors:

- Is memory short on the host? Swapping (SWW/s and SWR/s) is a certain sign of this problem. Heavy use of the balloon driver might also suggest a shortage of memory on the host, but ballooning has only a very slight impact on guest performance.
- Can you address memory deficiencies by resizing virtual machines? Check the memory usage of critical applications running in the virtual machines to help you decide whether you can decrease the amount of RAM provided to those virtual machines. Some operating systems expand to utilize all available memory even though this approach provides little or no benefit to the application. Reducing the memory space and correcting oversized caches frees memory for other virtual machines.
- Is the total active memory for all virtual machines (TCHD or %ACTV) consistently exceeding the total available memory? If so, you must either add more memory to the host or migrate virtual machines to another DRS cluster.
- Are the guest operating systems swapping memory to disk? If a virtual machine has too little memory, the guest operating system swaps inside the virtual machine. This swapping appears the same to ESX Server as any other disk activity, but you should investigate it and solve the problem using traditional operating system analysis tools.
- Can you see NUMA migrations (NMIG) on the system? The NMIG column reports total migrations since the virtual machine was powered on. If this number continues to climb, the virtual machine is being migrated from node to node. The repeated migrations certainly degrade performance.
- Does the amount of memory located on a remote NUMA node (NRMEM) remain at a nonzero number? This may be a sign that the memory assigned to a virtual machine exceeds the memory of a single NUMA node. If the virtual machine is using more memory than is available on a single node, some of its memory is certain to be located on a remote node. Remote memory access is quite slow compared to local memory access.
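Several of the questions above reduce to simple comparisons, which the following Python sketch illustrates. The MemorySample structure, its field names, and the sample values are hypothetical. The counters are assumed to have been copied from the esxtop memory screen, and "total available memory" is treated here as a single figure in MB supplied by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemorySample:
    """Hypothetical container for values read from the esxtop memory screen."""
    available_memory_mb: float  # memory the host can give to virtual machines
    touched_mb_per_vm: List[float]  # TCHD for each virtual machine
    swap_read_mb_s: float  # SWR/s
    swap_write_mb_s: float  # SWW/s


def memory_findings(sample: MemorySample) -> List[str]:
    """Apply the host-level checks described above to one sample."""
    findings = []
    if sample.swap_read_mb_s > 0 or sample.swap_write_mb_s > 0:
        findings.append("host is swapping")
    if sum(sample.touched_mb_per_vm) > sample.available_memory_mb:
        findings.append("touched memory exceeds available memory")
    return findings


def nmig_is_climbing(nmig_readings: List[int]) -> bool:
    """True when the NMIG counter grew between the first and last reading."""
    return len(nmig_readings) >= 2 and nmig_readings[-1] > nmig_readings[0]


if __name__ == "__main__":
    sample = MemorySample(8192.0, [3000.0, 3500.0, 2500.0], 0.0, 1.2)
    print(memory_findings(sample))
    print(nmig_is_climbing([4, 4, 7]))
```

A single sample is only a snapshot. The text asks whether the conditions hold consistently, so in practice the checks would be repeated over several readings.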

Correct Memory Configuration on the System


The steps you must take to resolve memory shortages are simple: use less memory or add more to the system. The following recommendations are variations on this theme:

- Verify that VMware Tools is installed in every virtual machine on the system and that the memory balloon driver has not been disabled. (The balloon driver is always on by default and can be disabled manually using text-based advanced configuration tools. It should be disabled only in extremely rare cases.) When it can use the balloon driver to reclaim memory within the guest operating systems, ESX Server is able to take memory from virtual machines that are not using it and make it available to those that do need it.
- Provide more memory to the DRS cluster. As total resources go up, VirtualCenter balances virtual machines across the cluster in a way that provides virtual machines the memory they need.
- Set memory reservations to provide the minimal amount of memory required for the operating system and critical applications. This approach allows for sustained, fast access for critical code and provides hints to VirtualCenter for optimal positioning of virtual machines across the DRS cluster.
- Make sure the amount of memory used by the VMkernel to maintain the virtual machines is acceptable. This value, reported for each virtual machine by the overhead counter (OVHD), is dependent on the memory size of the virtual machine, the number of virtual CPUs provided to the virtual machine, and whether the virtual machine is running a 64-bit operating system. Fewer virtual machines on the host, fewer aggregate virtual CPUs, and lower-precision operating systems (32-bit as opposed to 64-bit) lower this number. You can free resources for all virtual machines in the cluster by reducing any of these factors in the cluster.
- Size virtual machines on NUMA systems to guarantee that each virtual machine's memory fits on a single node. If there is a mismatch, you must either decrease the memory allocated to a virtual machine or increase the node's memory size.
- Size guests appropriately according to their needs. For example:
  - Depending on the access pattern for the data, databases might not benefit from the last doubling of cache size. Experiment with smaller cache sizes and see if performance drops. If the smaller cache size does not degrade performance, decrease the virtual machine's available memory so other virtual machines can use that memory.
  - Check the guest operating system's statistics for swapping inside the virtual machine. Provide memory as needed and pay attention to esxtop statistics to see if providing the additional memory generates a new bottleneck on the host.

Storage Analysis
Storage often limits the performance of enterprise workloads. Traditional means of analysis are sound for evaluating storage performance in virtual deployments. This section introduces tools for identifying heavily used resources and virtual machines that place high demands on their storage systems. You can then apply traditional methods to correct the problems.

This section does not cover iSCSI storage using software initiators. When virtual machines access iSCSI storage through the hypervisor's iSCSI initiator or a software initiator inside the guest operating system, the storage traffic appears on the VMkernel network or the virtual machine's network stack. See "Network Analysis" for more information.

Check Storage Utilization


Start esxtop by entering the command in the ESX Server host's service console. Press d to display the disk adapter information.


On ESX Server 3.5, you can press v to display the storage system information per virtual machine or press u to display the information per storage device. The same counters are displayed in each listing. The output includes the following information:

Each of the three storage views displays information in a particular order. You can expand groups in the disk displays to view more detailed information:

Adapter View
In the adapter view (d), each physical HBA appears on a row of its own, identified by the appropriate adapter name. You can check this short name against the more descriptive data provided through the Virtual Infrastructure Client to identify the hardware type.

Press e to expand the HBAs listed in the adapter view. The expanded display shows worlds that are using the HBAs. Locate a virtual machine's world ID in the WID column to find the data for activity related to that virtual machine.

Virtual Machine Disk View


In the virtual machine disk view (v), each row represents a group of worlds on the ESX Server host. Each virtual machine appears on a row of its own, and esxtop displays rows for the service console, system, and other worlds that are less important when you are analyzing storage. The group IDs (GID) match those shown on the CPU screen, and you can expand the listing by pressing e.

Press e to expand the worlds data for each virtual machine in the virtual machine disk view.

Disk Device View


In the disk device view (u), each device appears on its own row.

Press e in the disk device view to show usage by each world on the host.

Counters to Check
The output includes the following information:

- The ACTV counter provides a snapshot of current activity. The QUED counter lists the number of queued commands that the host will process after ACTV commands have finished. A sustained number of ACTV commands is healthy and indicates continuing disk activities. A sustained number of queued commands indicates a heavily loaded system.
- The LOAD counter provides an estimate of the utilization of a single HBA. It represents the ratio of the number of commands that are active or queued to the total number of commands that can be active or queued at one time. Thus a LOAD value of 1.0 means that both the active buffer and queue are full. At this point, the server begins failing to execute commands.
- The %USD counter provides the percentage of the queue depth used by VMkernel active commands. Very high values indicate the likelihood that commands are queued, and you may need to adjust the queue depths for the system's HBAs.
- You can view the total latency seen from a virtual machine to the array in the virtual machine latency counter (GAVG/cmd), which shows the sum of the latencies caused by the hardware (DAVG/cmd) and those caused by the VMkernel (KAVG/cmd).
- Aborted commands as displayed by the aborted commands per second counter (ABRTS/s) represent commands issued by the guest operating system after it determines that a storage request cannot be fulfilled. Aborts are a sign that the storage system cannot meet the demands of the guest operating system.
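The two derived quantities described above, LOAD and GAVG/cmd, follow directly from their definitions. The Python sketch below restates them. The function and parameter names are hypothetical, and the capacity argument stands for the total number of commands that can be active or queued at one time, which the caller is assumed to know.

```python
def hba_load(active_cmds: int, queued_cmds: int, capacity_cmds: int) -> float:
    """LOAD as defined above: (active + queued) / total that can be active or queued."""
    if capacity_cmds <= 0:
        raise ValueError("capacity must be positive")
    return (active_cmds + queued_cmds) / capacity_cmds


def guest_latency_ms(davg_ms: float, kavg_ms: float) -> float:
    """GAVG/cmd as defined above: device latency plus VMkernel latency."""
    return davg_ms + kavg_ms


if __name__ == "__main__":
    # Hypothetical readings for one adapter.
    print("LOAD:", hba_load(active_cmds=16, queued_cmds=16, capacity_cmds=32))
    print("GAVG/cmd (ms):", guest_latency_ms(davg_ms=4.0, kavg_ms=0.5))
```

With these hypothetical numbers LOAD comes out at 1.0, the point at which the text says both the active buffer and the queue are full.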


Evaluate the Storage Data


It is important to have a solid understanding of the storage architecture and equipment in your environment before attempting to analyze performance data. Consider the following questions:

- Is the host or any of the guest operating systems swapping memory to disk? You can check the guest operating system's swap activity using traditional operating system tools. You can see data on host swap activity in two esxtop counters, SWR/s and SWW/s, as described in "Check Memory Utilization."
- Are commands being aborted? This is a certain sign that the storage subsystem is unable to handle requests as the guest operating systems expect. Corrective action might include hardware upgrades, storage redesign, or virtual machine reconfiguration.
- Is the queue large? Although less dangerous than aborted commands, queued commands are a sign that you need to upgrade hardware or redesign the storage system.
- Is the storage array responding at expected rates? Storage vendors provide latency statistics for their hardware that you can check against the latency statistics in esxtop. When the latency numbers are high, the hardware could be overworked by too many servers. For example, 2-5 ms latencies are usually a sign of a healthy storage system reading data on the array cache, 5-12 ms latencies reflect a healthy storage architecture in which data is being read randomly across the disk, and 15 ms latencies or greater possibly represent an overutilized or misbehaving array.
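The latency guidance above can be turned into a rough lookup, sketched here in Python. It assumes the bands are read as 2 to 5 ms, 5 to 12 ms, and 15 ms or greater. Values outside those bands are not characterized by the text, so the function reports them as unclassified rather than guessing. The function name is hypothetical.

```python
def classify_latency(latency_ms: float) -> str:
    """Place an observed storage latency into the bands quoted above."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms >= 15:
        return "possibly overutilized or misbehaving array"
    if 2 <= latency_ms <= 5:
        # A value of exactly 5 ms sits on the boundary of two bands;
        # this sketch assigns it to the cache band.
        return "healthy: reads served from array cache"
    if 5 < latency_ms <= 12:
        return "healthy: random reads across the disk"
    return "unclassified"


if __name__ == "__main__":
    for sample_ms in (3.0, 9.0, 13.5, 22.0):
        print(sample_ms, "->", classify_latency(sample_ms))
```

The bands are rules of thumb, so vendor-published latency figures for the specific array should take precedence where they exist.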

Correct Storage Configuration on the System


If you confirm that you have storage system problems, consider the following possible methods of correcting the problems:

- Reduce the virtual machines' and host's need for storage.
  - Some applications, such as databases, can use system memory to cache data and avoid disk access. Check the virtual machines to see if they can benefit from increased caches and provide more memory to the virtual machines if appropriate and if resources permit. The additional memory may reduce the burden on the storage system.
  - Eliminate as much swapping as possible to reduce the burden on the storage system. First, verify that the virtual machines have the memory they need by checking swap statistics in the guest operating system. Provide memory if resources permit. Next, as described in "Correct Memory Configuration on the System," eliminate host swapping.
- Configure the HBAs and RAID controllers for optimal use.
  - Increase the number of outstanding disk requests for the virtual machine by adjusting the Disk.SchedNumReqOutstanding parameter. For detailed instructions, see the "Equalizing Disk Access Between Virtual Machines" section in the VMware Infrastructure 3 Fibre Channel SAN Configuration Guide. See "References" for a link to this document.
  - Increase the queue depths for the HBAs. For detailed instructions, see the "Setting Maximum Queue Depth for HBAs" section in the VMware Infrastructure 3 Fibre Channel SAN Configuration Guide. See "References" for a link to this document.
  - Make sure the appropriate caching is enabled for the disk controllers. Use the tools provided by the controller vendor to verify this setting.
- If latencies are high, inspect array performance using the vendor's array tools. When too many servers simultaneously access common elements on an array, the disks might have trouble keeping up. Consider array-side improvements to increase throughput.
- Balance load across the available physical resources.
  - Spread heavily used storage across LUNs being accessed by different adapters. The presence of separate queues for each adapter can yield some efficiency improvements.
  - Use multipathing or multiple links if the combined disk I/O is higher than the capacity of a single HBA.
  - Using VMotion, migrate I/O-intensive virtual machines to different ESX Server hosts, if possible.
- Upgrade hardware, if possible. Storage system performance often bottlenecks storage-intensive applications, but for the very highest storage workloads (many tens of thousands of I/Os per second), CPU upgrades on the ESX Server host increase the host's ability to handle I/O.

Network Analysis
Network analysis is usually a straightforward process for which typical native techniques are valid. Checking for load relative to link throughput and looking for dropped packets can identify all but the most subtle of problems.

Check Network Utilization


Start esxtop by entering the command in the ESX Server host's service console. Press n to display the network information.

The following properties of this display are worth particular attention:

- Each row presents data for one network-related item on the host, for example: a physical NIC (vmnicX), a virtual switch interface (vswifX), a virtual machine (contains the virtual machine name), or the VMkernel network stack (vmk-tcpip-A.B.C.D).
- The network items are organized by the virtual switch to which they are attached. The virtual switch name is listed in the DNAME column.
- Network traffic on the hypervisor's iSCSI initiator appears on the VMkernel network row, which contains the name vmk-tcpip-A.B.C.D, where A.B.C.D is the VMkernel IP address.
- Network traffic on iSCSI initiators configured in the guest operating system appears on the line for the virtual NIC, displayed using the virtual machine's name as shown on the esxtop network screen.
- You can calculate total throughput for each item by summing the total transmitted data (MbTX/s) and received data (MbRX/s) for that item. As the physical hardware becomes saturated, the system begins to drop transmitted and received packets (%DRPTX and %DRPRX, respectively). Depending on the protocol, the system may retransmit the dropped packets at a later time.
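The throughput calculation in the last item above is a plain sum, shown here as a short Python sketch together with a check for dropped packets. The function names and sample readings are hypothetical, and the values are assumed to come from one row of the esxtop network screen.

```python
def total_throughput_mbps(mb_tx_s: float, mb_rx_s: float) -> float:
    """Total throughput for one network item: MbTX/s plus MbRX/s."""
    return mb_tx_s + mb_rx_s


def is_dropping_packets(drp_tx_pct: float, drp_rx_pct: float) -> bool:
    """True when either %DRPTX or %DRPRX is above zero."""
    return drp_tx_pct > 0 or drp_rx_pct > 0


if __name__ == "__main__":
    # Hypothetical readings for a physical NIC row.
    print("total Mb/s:", total_throughput_mbps(400.0, 250.0))
    print("dropping:", is_dropping_packets(0.0, 0.3))
```

How the total compares with the link's rated speed depends on the duplex mode of the link, so the sketch leaves that comparison to the reader.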


Evaluate the Network Data


To evaluate the network data, consider the following questions:

- Do the physical NICs' reported speed and duplex setting match the expectation of the hardware? Hardware connectivity issues might cause a NIC to autonegotiate a lower speed or half-duplex mode.
- Do appropriate network items show a significant load? For instance, is a network-intensive load in a virtual machine actually generating the expected network activity on its virtual NIC? Are storage-intensive loads generating traffic on the virtual NIC or the VMkernel NIC (vmkNIC) when software initiators are used on the hypervisor or in the guest operating system?
- Is the network traffic flowing on appropriate NICs? A typical ESX Server host might have network traffic generated by virtual machines, network traffic from the iSCSI protocol, VMotion-related network traffic, and network activity associated with the service console. You should have separate NICs to handle these different kinds of network packets.
- During periods of saturation, does the total throughput (MbTX/s summed with MbRX/s) match expectations? Either the guest operating system or the other end of the communication link might be throttling the performance.
- Are packets being dropped? When overworked, the hardware refuses packets. Those packets are reported as dropped transmitted (%DRPTX) and dropped received (%DRPRX) packets.

Correct Network Configuration on the System


If you confirm that you have networking problems, consider the following possible methods of correcting the problems:

- Make sure that the hardware is configured to run at its maximum capacity. Verify that 1 Gb NICs are not autonegotiating down to 100 Mb/s because they are connected to an older switch. Similarly, ensure that NICs are running in full-duplex mode.
- When network throughput seems lower than expected, apply traditional network diagnosis techniques to investigate every link in the connection. Low throughput at the ESX Server host is not necessarily caused by server configuration.
- Verify that VMware Tools is installed in all guest operating systems and that TSO, jumbo frames, and 10 Gb Ethernet are enabled, where possible.
- Bond multiple physical NICs to virtual switches that show high utilization.
- Provide separate virtual switches their own physical NICs and separate network-intensive virtual machines on their own virtual switches.
- If virtual machines running on the same ESX Server host communicate with each other, connect them to a dedicated virtual switch so all network transfers occur in memory rather than as packets shipped over the physical network.

References
- Ready Time Observations
  http://www.vmware.com/pdf/esx3_ready_time.pdf
- VMware Infrastructure 3 Fibre Channel SAN Configuration Guide
  http://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_san_cfg.pdf
- VMware Infrastructure 3 Resource Management Guide
  http://www.vmware.com/pdf/vi3_esx_resource_mgmt.pdf


Appendix: Counters
The following tables include descriptive information on counters mentioned in this document:

Table 1. CPU counters

Counter    Description
GID        Group ID.
%USED      The percentage of CPU that is used by a world or group.
NWLD       The number of worlds in a group. When this number is greater than one, the row can be expanded to display information on each world.
%RDY       The percentage of time that a world or group is waiting for a processor to be available to execute its workload.

Table 2. Memory counters


Counter   Description
MEMSZ     The amount of memory (in MB) allocated to a virtual machine at the time of its creation.
TCHD      The amount of memory (in MB) that has been touched (recently used) by a virtual machine. In this case "recently" means within the past minute or two.
%ACTV     Instantaneous view of the percentage of memory pages that have been used by a virtual machine in the previous seconds. Unlike TCHD, which counts pages by following working sets, %ACTV is updated more frequently and is based on a sample of the entire memory pool.
%ACTVS    Slow moving average of the %ACTV counter.
%ACTVF    Fast moving average of the %ACTV counter.
NHN       The NUMA home node. This is the node on which a virtual machine is booted. Migrations that have occurred since the virtual machine started running would result in the virtual machine running on another node or nodes.
NMIG      The number of NUMA node migrations since a virtual machine was booted. The ESX Server scheduler should avoid NUMA migrations, so if this number continues to climb during normal operations, some tuning of the virtual machines may be required.
NRMEM     The amount of memory that exists on a remote NUMA node.
NLMEM     The amount of memory that exists on the local NUMA node.
N%L       The percentage of the virtual machine's memory that exists on the local NUMA node. N%L = NLMEM / (NRMEM + NLMEM)
MCTL?     Set to Y when the balloon driver is active in the guest and N when not.
MCTLSZ    This counter reports the amount of memory that the balloon driver is currently reclaiming for use by other virtual machines.
OVHD      The amount of memory used by the VMkernel to maintain and execute a virtual machine.
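
The N%L relationship in Table 2 can be checked by hand from the NLMEM and NRMEM values. The following Python sketch is illustrative only: the function name and the choice to express the ratio on a 0 to 100 scale are assumptions, not part of esxtop.

```python
def numa_locality_percent(nlmem_mb: float, nrmem_mb: float) -> float:
    """Return N%L computed as NLMEM / (NRMEM + NLMEM), scaled to a percentage.

    The function name and the 0-100 scaling are assumptions made for
    illustration; the table only gives the ratio itself.
    """
    total_mb = nlmem_mb + nrmem_mb
    if total_mb <= 0:
        # No NUMA memory reported, so the ratio is undefined.
        raise ValueError("NLMEM + NRMEM must be greater than zero")
    return 100.0 * nlmem_mb / total_mb


# Example: 3072 MB on the local node and 1024 MB on a remote node.
print(numa_locality_percent(3072, 1024))
```

A value well below 100 means that a noticeable share of the virtual machine's memory resides on a remote node.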

Table 3. Storage counters


Counter   Description
ACTV      The number of I/O operations that are currently active. This number represents operations the host is processing and can serve as a snapshot view of storage activity. When this number hovers near zero, the storage system is not being used. If this number is consistently something other than zero, the system is constantly interacting with the storage.
QUED      The number of I/O operations that require processing but have not yet been addressed. Commands are queued and awaiting management by the kernel when the driver's active buffer is full (see ACTV). Occasionally a queue forms and, as a result, this counter displays a small, nonzero QUED number, but any significant (double-digit) average of queued commands means the storage hardware is unable to keep up with the host's needs.
DAVG/cmd  The average amount of time it takes a device (HBA, array, and everything in between) to service a single request.
KAVG/cmd  The average amount of time it takes the VMkernel to service a disk operation. Because this number represents time spent by the CPU to manage I/O and processors are orders of magnitude faster than disks, it should be much, much less than DAVG.
GAVG/cmd  The total latency seen from the virtual machine when performing an I/O operation. GAVG = DAVG + KAVG.
ABRTS/s   The rate at which disk operations are being aborted. Abort commands are issued by the guest operating system when the storage system has not responded within an acceptable amount of time (as defined by the guest operating system or application).
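
The relationships among the storage counters in Table 3 can be expressed as a short check. The following Python sketch is a minimal illustration under stated assumptions: the class and function names are hypothetical, latencies are assumed to be in milliseconds, and the 0.1 ratio used to approximate "much, much less than DAVG" is an arbitrary placeholder. Only GAVG = DAVG + KAVG and the double-digit QUED guideline come from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageSample:
    """One averaged reading of the Table 3 counters (hypothetical container)."""
    davg_ms: float  # DAVG/cmd, assumed to be in milliseconds
    kavg_ms: float  # KAVG/cmd, assumed to be in milliseconds
    qued: float     # average QUED value over the observation period


def guest_latency_ms(sample: StorageSample) -> float:
    """GAVG = DAVG + KAVG, as defined in Table 3."""
    return sample.davg_ms + sample.kavg_ms


def storage_warnings(sample: StorageSample,
                     qued_limit: float = 10.0,
                     kavg_ratio_limit: float = 0.1) -> List[str]:
    """Return warnings suggested by the Table 3 guidelines.

    qued_limit reflects the "double-digit" guideline; kavg_ratio_limit is an
    assumed stand-in for "much, much less than DAVG".
    """
    warnings = []
    if sample.qued >= qued_limit:
        warnings.append("QUED averages in double digits")
    if sample.kavg_ms > kavg_ratio_limit * sample.davg_ms:
        warnings.append("KAVG is not much less than DAVG")
    return warnings


sample = StorageSample(davg_ms=12.0, kavg_ms=6.0, qued=14.0)
print(guest_latency_ms(sample))
print(storage_warnings(sample))
```

Both limits are parameters so that they can be adjusted to whatever thresholds suit a given environment.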

Table 4. Network counters


Counter   Description
MbTX/s    The number of megabits per second that are transmitted from the network item.
MbRX/s    The number of megabits per second that are received at the network item.
%DRPTX    The percentage of packets for which transmission was attempted but unsuccessful. The packets were dropped.
%DRPRX    The percentage of packets that should have been received but were not. The packets were dropped.
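
The drop percentages in Table 4 can be read as dropped packets over attempted packets. The following Python sketch illustrates that reading; the function name and the raw packet counts it takes as input are hypothetical, and the exact calculation esxtop performs internally is not described in this document.

```python
def drop_percent(dropped: int, attempted: int) -> float:
    """Percentage of attempted packets that were dropped.

    Illustrates the %DRPTX and %DRPRX definitions; the inputs are
    hypothetical raw counts, not values that esxtop displays directly.
    """
    if dropped < 0 or attempted < 0 or dropped > attempted:
        raise ValueError("expected 0 <= dropped <= attempted")
    if attempted == 0:
        # No traffic was attempted, so nothing could have been dropped.
        return 0.0
    return 100.0 * dropped / attempted


# Example: 5 of 1000 attempted transmissions were dropped.
print(drop_percent(5, 1000))
```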

VMware, Inc. 3401 Hillview Ave., Palo Alto, CA 94304 www.vmware.com Copyright 2008 VMware, Inc. All rights reserved. Protected by one or more of U.S. Patent Nos. 6,397,242, 6,496,847, 6,704,925, 6,711,672, 6,725,289, 6,735,601, 6,785,886, 6,789,156, 6,795,966, 6,880,022, 6,944,699, 6,961,806, 6,961,941, 7,069,413, 7,082,598, 7,089,377, 7,111,086, 7,111,145, 7,117,481, 7,149,843, 7,155,558, 7,222,221, 7,260,815, 7,260,820, 7,269,683, 7,275,136, 7,277,998, 7,277,999, 7,278,030, 7,281,102, and 7,290,253; patents pending. VMware, the VMware boxes logo and design, Virtual SMP and VMotion are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds. All other marks and names mentioned herein may be trademarks of their respective companies. Revision 20080311 Item: TN-056-PRD-01-01
