
BP VMWARE TECHNICAL DOCUMENT


Making a Greener IT

Document Status

Version Number:

Version 1.0

Document Authorization
Title: VMWARE TECHNICAL DOCUMENT
Date: 10 September 2009

Type of VMware product supported : ESX with VC

Table of Contents :

1 Versions used
2 Virtual Center Info
3 ESX Servers information and other general Information
4 Important Clusters
5 ESX/VC Monitoring Tools
6 Backup
7 Guest VM
8 VM build
9 Patching
10 ESX Build
11 Post ESX build tasks
12 Commissioning ESX and Decommissioning ESX Servers
13 Commissioning VM and Decommissioning VM
14 Projects under pipeline
15 Access given to ESX/VC for other teams
16 Method of Access to the Virtual Center
17 Log Files Information
18 VC License and database Servers Info
19 H/W Failure - Action to be taken
20 Vendor Support Info
21 BAU - Daily Checks Info
22 ESX Upgrade
23 Storage
24 Latest Info
25 Notes

1 Versions used :
- ESX 2.5 (will be decommissioned soon)
- ESX 3.0.x
- ESX 3.5
- VC 1.3
- VC 2.5 (will be decommissioned soon)

2 Virtual Center Info :


VC Server Name    Version    Location    Server classification
BP1XEUAP183       1.3        EMDC1       Production
BP1XEUAA484       2.5        EMDC1       Development
BP1XEUAA485       2.5        EMDC1       Production
BP1XGBAP027       2.5        EMDC2       Development
BP1XGBAP029       2.5        EMDC2       Production

3 ESX Servers information and other general Information :


- Manufacturer: mainly HP; we also have some Dell servers
- Models used: blade servers, HP 485 series and Dell 6950
- All ESX servers boot from local hard disk
- RAID configuration for ESX servers: RAID 1+0 for the ESX server local hard drives, RAID 5 for the SAN drives
- All users in the VMware support team have to be added individually on all the ESX hosts. They will have login access and have to switch to the root account to perform any administrative task
- No scheduled tasks are configured on the ESX servers
- 15 GB of space MUST be available in every datastore. If less than 15 GB is free, we must not provision any VM or deploy any file/template in that datastore
- If more space is required, we need to raise a request to the Shared Infrastructure group, who will check with the SAN team about increasing the drive space
- EVC is enabled in the latest version of ESX
- All the physical NICs in the ESX servers must run in 1000 Full Duplex mode and must be set to Negotiate
- NIC teaming on the ESX servers is done as follows: one onboard NIC port is teamed with one port from the PCI card for the Service Console, and in the same way the other onboard NIC port is teamed with the second PCI port for the VMkernel port
- All the ESX servers have iLO, and the details are available in the CMDB
- No resource pools are used in the current environment

- NO VDI is present
- Only Fibre Channel SAN is used; NO iSCSI and no NAS
- All the clusters are DRS and HA enabled with fully automated settings
- Affinity rules: yes, these are set, but not CPU affinity; only "run apart" and "run together" affinity rules are set in DRS
- Most of our guest OSes are Windows VMs, and we are now in the process of creating Solaris VMs too
- No RDMs are used by any VM in the entire setup
- The multipathing policy used on ESX is mostly MRU
- Avant Garde is an application/automated setup through which VMs are deployed in our environment; this is run by the Shared Infrastructure team
- None of the ESX servers is part of the LDAP domain
- Building the database for VMware Update Manager is in progress
- The different projects handled currently are upgrades/patching of the ESX and VC servers and VM slice builds
- SAN stack upgrades (HBA firmware/drivers and Emulex version) at the server end are communicated to us by the SAN team, and we need to upgrade the stack as per their recommendations
- There is no monitoring of HBA connectivity status
- Teaming is mostly done for the Service Console and VMkernel port physical NICs (the command sketch after this list shows how to verify the teaming and duplex settings on a host)
- Some VMs have the HA host isolation response "Leave powered off", whereas others have "Leave powered on"
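As a quick way to confirm the NIC duplex settings, the SC/VMkernel teaming pairs and the multipathing policy described above, the standard ESX 3.x service-console commands can be used (a minimal sketch; output formats vary slightly between 3.0.x and 3.5):

    # list physical NICs with link speed and duplex (should show 1000/Full)
    esxcfg-nics -l
    # list vSwitches with their uplink teams for the SC and VMkernel port groups
    esxcfg-vswitch -l
    # list storage paths and the active multipathing policy (mostly MRU here)
    esxcfg-mpath -l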

4 Important Clusters:
Cluster names containing IF, for example EMDC1-CRIT-IF-001, are the clusters whose ESX hosts/VM slices are Internet facing. The Internet-facing cluster is also owned by the SI team, but operational decisions can be taken only with permission from Geoff Ansel.

5 ESX/VC Monitoring Tools :


There are three types of monitoring currently in place.

VC Alarms: alarms are set on all the VC servers, and SMTP addresses are configured to send out alert emails in case the performance of any ESX host/VM turns from yellow to red.

HP SIM: SNMP is configured on the ESX servers so that when a heartbeat is missed, the SIM console server alerts the team by sending out email alerts. SIM also picks up any hardware issues and sends the team an email with the alerts.

Dell servers also report to the SIM console servers; only the SNMP trap settings need to be configured so that the Dell servers report to the HP SIM console server.
HP SIM server information:
- EMDC1: BP1XEUAB578-V, a virtual machine (can be logged on only with the local account - user ID: vmsimadmin, password: VMSIM@vmware)
- EMDC2: BP1XGBAP031, a physical machine
Any issues with the SIM server itself are handled by the Wintel team.

IBM Netcool monitoring: SNMP is configured on the ESX servers with trap settings pointing to the IBM Netcool server. The RMC help desk from EMC monitors this, generates tickets based on their own escalation algorithm, and assigns the tickets to our queue. Netcool picks up VM ping checks, ESX ping checks and missed-heartbeat issues.
Netcool servers: reuxeuus012.bp.com (EMDC1), 149.184.192.160 (EMDC2)

In use for EMDC2 only: EMDC2 servers are also monitored by an application called "ISD Node down monitoring", which does a continuous ping and sends alerts to the ECT team, who act on them after checking the server type (prod/dev).

6 Backup :
- VCB is not being used
- The ESX servers themselves are not currently backed up
- Guest VMs are backed up by installing the backup software client in the guest OS itself (this is decided by the guest OS owners, probably the Wintel team)
- Mostly the Veritas NetBackup client is used to back up the guest OS machines
- The VMware team plays no role in backing up the guest OS machines, nor in restoring guest OS VMs
- Snapshots are NOT used for backing up VMs
- The Virtual Center servers are backed up. The NetBackup client version used is 5.1. The system drive and the data drive (f:\data\MSSQL1) are backed up: the DB backup runs first and saves its file to that location, and the normal file-level backup is taken afterwards. A full backup runs once a week (need to check whether the settings are the same in EMDC2 as well)

7 Guest VM :
- The Environment folder on the shared drive holds the files with the affinity-rule VM information (e.g. \\bp1xeuss001-f01\mor_sme_team\VMware\Environment\ESX_VM_Platform_Info(PRD).xls)
- ISO images for the Windows OS are also available in the datastores. There is no standard location for storing the ISO images; they are simply stored in any datastore where space is available
- No guest VMs are stored on local hard drives; all are stored on shared storage
- All the files belonging to a VM slice are stored in a folder named after the VM slice on the shared storage
- There are NO shares/reservations configured for any VM
- Limits are set for CPU only, since some VM requirements call for .25 or .5 of a CPU
- All the VMs have 2 NICs: one for production and the other for backup. If the backup NIC is not used by the VM, it is disabled
- For the backup network in the VM, a manual route needs to be added, since no default gateway is present for the backup network within the VM. The route info is given in the VM build section of this document
- The Wintel team requests us to update the VM templates used for deploying VMs in the clusters. A version number is attached to the template (most recent version: 008). Internet-facing VMs have a separate template. Templates exist for 2k3 and 2k3 64-bit editions. The procedure to build a template is in the SharePoint portal (https://wss2.bp.com/DCT/GO/teams/GDCSLM/WSHS/Shared%20Documents/Forms/AllItems.aspx?RootFolder=https%3a%2f%2fwss2%2ebp%2ecom%2fDCT%2fGO%2fteams%2fGDCSLM%2fWSHS%2fShared%20Documents%2fBuild%20Documentation)
- When doing a VM build, we need to deploy from the latest template and hand over to Wintel; it is not the VMware team's responsibility to update the patches in the Wintel VMs

8 VM build :
- Default admin account in the Windows OS after the guest OS build: ID bpgdb_administrator, password: password
- Default administrator account for a newly built Internet-facing (IF) VM: ID OGMZ2FXAdmin, password: password
- The information needed for the build (IP, VLAN ID and the port group the VM is to be attached to) is given by the Shared Infrastructure group and is available in the CMDB
- Guest VM machines are handed over to the respective Wintel teams once the OS installation is done. No post-installation Windows OS steps are done by the VMware team
- All the VMs have one system drive with 15 GB of space (the standard being maintained). Some old VMs may have less space on the system drive, but going forward this is the standard to be maintained
- The size of the data drive in a VM depends on the requestor's requirement and the information from the Shared Infrastructure group (but it should not exceed 250 GB)

Whenever a guest VM is built, it must undergo an assurance test, which needs to be passed; this applies even to physical-to-virtual conversions, where the conversion team will start the conversions only once the assurance test has passed. The assurance test simply confirms whether the requested resources have been given to the VM, whether the VM would be running in a resource crunch, and whether the VM would be denied any resource in the current environment. The motive is to ensure the VM runs fine once it is handed over to the respective team.
The template is periodically updated with all the latest security updates and converted back to a template.
Any VM build, even one in a Production cluster, needs a change record.

Post VM Build

Backup routes that need to be set in VMs, based on the backup VLAN (reference: Z:\VMware\Projects_Work\Completed\Backup VLAN Migration\VMware Backup VLANs 20061107.xls):

- VLAN 930: route add -p 172.28.0.0 mask 255.255.254.0 172.28.24.1
- VLAN 934: route add -p 172.28.0.0 mask 255.255.254.0 172.28.20.1
- VLAN 974: route add -p 172.28.22.0 mask 255.255.254.0 172.28.22.1
- VLAN 1907: route add -p 172.28.29.0 mask 255.255.255.0 172.20.152.1
- VLAN 1910: route add -p 172.28.29.0 mask 255.255.255.0 172.30.8.1
- VLAN 1911: route add -p 172.28.29.0 mask 255.255.255.0 172.30.8.1
- All other VLANs: route add -p 172.28.0.0 mask 255.255.254.0 172.28.n.n (where 172.28.n.n is the default gateway for the subnet)

Check that the backup server is pingable, e.g. ping BP1XEUAP163-c-b (172.28.0.13) or reuxgbus010-c (172.28.29.31). A quick verification example follows below.
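To confirm the persistent route took effect, a minimal check run inside the Windows guest (using the example backup server address from above):

    rem show the backup routes currently in the guest's routing table
    route print | find "172.28"
    rem confirm the backup server responds over the backup network
    ping -n 2 172.28.0.13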

9 Patching :
- VMware Update Manager is NOT used for patching; patching is done manually with the esxupdate tool, and patches are downloaded directly from the VMware site
- During installation, the patches are downloaded to the /tmp directory on the ESX server and the installation is then run from there (see the sketch after this list)
- Enough space must be available in /tmp before downloading the patch source files
- Patching can be referred through the patch documents
- No common repository is maintained for patches; all patches are downloaded from the VMware website
- ISO images for installation are also downloaded from the VMware website. There is no customised BP image for any of the VMware products (ESX image / VC image)
- Rollback information: in case of any serious issue with patching, the server has to be rebuilt; there is no specific rollback procedure
- Mostly GOI sends us the information about any critical patch release from the vendor, along with a timeline for when it can be implemented
- Sometimes we may also get a suggestion or note from the Shared Infrastructure group about a patch release and its deadline dates
- It is also good practice to check the VMware website often and be proactive in recommending the updates/patches to be installed, so we stay up to date
- VMTS Patch Manager is not used for patch installation. The document UK-E-VM-G223 ESX 3 Patch Management.doc in the SharePoint portal is obsolete (it was used by the previous team)
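A minimal sketch of that manual flow on an ESX 3.x host, assuming the patch bundle has already been downloaded from the VMware site (the bundle name below is purely illustrative):

    # confirm /tmp has room for the bundle before copying it over
    df -h /tmp
    cd /tmp
    # unpack the downloaded bundle and run the installer from inside it
    tar -xzf ESX350-Update-Bundle.tgz
    cd ESX350-Update-Bundle
    esxupdate update
    # confirm the patch now appears in the installed list
    esxupdate query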

10 ESX Build :
- Refer to the ESX build document
- SNMP settings are done as per the build document so the host reports to the HP SIM / Dell OpenManage central management servers
- Server hardening: no particular hardening is done; only the steps in the build document are followed
- Partition info: available in the ESX build document
- Shared Infrastructure provides us with all the basic information, including licensing, during the build of any ESX host/cluster
- The steps in the SMG Assurance checklist document on the SharePoint portal are not carried out regularly; they may be carried out as a post-installation task (for ESX) to assure that the redundancy setup is correct

11 Post ESX build tasks :


The SmartCentre team (as mentioned in the build doc, BP Reference Build Document VMware ESX Server 3.0.x V2.0.doc) is the Event Control team, which needs to be informed before configuring the SNMP settings on a new ESX server and before adding it to the SIM console.

- The contents of the snmp.conf file can be copied from an existing ESX server to the newly built ESX server
- Once the ESX server build is done, it is our responsibility to add the ESX server to the SIM console
- NTP configuration is done on the ESX server from the command line as part of the post-build ESX tasks (a sketch of both steps follows)
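A minimal sketch of those two steps on the ESX 3.x service console (the source host and NTP server names are placeholders, and the file path assumes the standard service-console SNMP location):

    # copy the SNMP config from an already-built host, then restart the agent
    scp root@<existing-esx-host>:/etc/snmp/snmpd.conf /etc/snmp/snmpd.conf
    service snmpd restart
    # point ntpd at the site time source and open the firewall for NTP
    echo "server <ntp-server>" >> /etc/ntp.conf
    esxcfg-firewall --enableService ntpClient
    service ntpd restart
    # make sure ntpd starts at boot
    chkconfig ntpd on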

12 Commissioning ESX and Decommissioning ESX Servers :


The SI team does all the paperwork and handles everything up to the point where the server is racked in the datacenter (creating the change, completing the paperwork, racking the server with the help of site support, updating the CMDB and obtaining IPR details). They then give us the details and request us to install the server. Once we are done with the installation, we own the server, and normal operations support applies for it.
Decommission requests are raised by the SI team, and there is a separate team that takes care of raising the requests and updating the CMDB; we are involved only in powering off the machine / removing the server from the cluster as applicable.

13 Commissioning VM and Decommissioning VM :


The SI team has all the paperwork done (generating the IPR, updating the CMDB, gathering all configuration information) and sends us the request form attached to the GSMS service request; we then have to deploy the VM from the template as per the document in the SharePoint portal.
In the case of decommissioning, the decomm team takes care of all the paperwork and raises a change requesting us to power down the VM and then delete it a couple of weeks after power-down.

14 Projects under pipeline :


- Most of the G1 585 servers will be moved out, but not immediately, since their EOL is by 2012. There are plans to add newer servers to this cluster so that by the time the old servers move out, not much work remains; SI will plan that, and we will take it up as project work
- ESX 3.5 upgrades are planned for the rest of the clusters; we will be performing the upgrades
- VMware Update Manager is to be implemented for the patching activity
- There is a proposal to run Solaris machines as VM slices in our farms. The Solaris build will not be done by us; we just provision the VM with the recommended configuration, and the Solaris team builds the guest OS (applicable only to Solaris)
- Storage VMotion is currently being tested so that space can be freed up from different datastores and clusters can be consolidated
- The Win2k8 build is expected to be ready by the 3rd week of September

There is also a proposal from the Engineering team to replace the standard rack servers with blade servers.

15 Access given to ESX/VC for other teams :


- The Shared Infrastructure group has read-only access
- The conversion teams and the Wintel teams also have permissions
- No standards have been maintained so far; we need to find out which roles are assigned to which teams by logging into the Virtual Center servers (this information is in a spreadsheet on the shared drive under the Admin folder, file name: VC Access)

16 Method of Access to the Virtual Center :


- The web console is not used to access the Virtual Center server; only the VI Client is used
- Lockdown mode is not enabled

17 Log Files Information :


- VC: c:\winnt\temp\vpx
- ESX: /var/log/vmware/hostd log files and /var/log/vmware/vpx log files (see the example below)
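To follow the ESX-side logs live from the service console (a minimal sketch; the hostd log files rotate, so list the directory first to find the active one):

    # list the rotated hostd/vpx logs
    ls /var/log/vmware/
    # follow the active hostd log
    tail -f /var/log/vmware/hostd.log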

18 VC License and database Servers Info :


- Licensing is installed on the same VC servers so far
- The database used is SQL Server. In EMDC1 the DB is installed on the VC servers themselves; in EMDC2 the DB is installed on a different server from the VC (we are in the process of moving DB support to the DBA team)
- Database information for SDC / EMDC1 / EMDC2 follows below

SDC (decommissioned):
- BP1XEUAP831 - SQL 2000
- BP1XEUAP889 - SQL 2000

EMDC1:
- BP1XEUAP183 - SQL 2000
- BP1XEUAA484 - SQL 2000
- BP1XEUAA485 - SQL 2000

EMDC2:
- BP1XGBAP027
- BP1XGBAP029
Both EMDC2 servers have their database housed on the server BP1XGBAP030 (SQL 2005).

DB configuration info:

EMDC1 DB server BP1XEUAA484 - two databases are used on this server:
- AA484VCDB - Virtual Center database
- AA484VCUM - Update Manager database (still not in use, created for future use of Update Manager)
DSN settings:
- DSN name: VMware VirtualCenter - account used for connectivity: SQL ID vmwaresa, password Pa55w0rd1
- DSN name: VirtualCenter Update Manager - account used for connectivity: SQL ID vmwaresa, password Pa55w0rd1

EMDC1 DB server BP1XEUAA485 - two databases on this server:
- AA485VCDB - Virtual Center database
- AA485VCUM - Update Manager database
DSN settings:
- DSN name: VMware VirtualCenter - account used for connectivity: SQL ID vmwaresa, password He!!0-m0tt*
- DSN name: VirtualCenter Update Manager - account used for connectivity: SQL ID vmwaresa, password He!!0-m0tt*

EMDC2 note: for both VC servers (BP1XGBAP027 and BP1XGBAP029) the database server is the same - DB server BP1XGBAP030.

BP1XGBAP027 - DB names:
- VCDB_EMDC2_DEV - Virtual Center DB
- VCUM_EMDC2_DEV - Virtual Center Update Manager DB
DSN settings:

- DSN name: VMware VirtualCenter2 - account used for connectivity: SQL ID vmwaresa, password vmw4444are
- DSN name: VMware Update Manager - account used for connectivity: SQL ID vmwaresa, password vmw4444are

BP1XGBAP029 - DB names:
- VCDB_EMDC2_PROD - Virtual Center DB
- VCUM_EMDC2_PROD - Virtual Center Update Manager DB
DSN settings:
- DSN name: VMware VirtualCenter2 - account used for connectivity: SQL ID vmwaresa, password vmw4444are
- DSN name: VMware Update Manager - account used for connectivity: SQL ID vmwaresa, password vmw4444are

Alarms are also created in VC; whether alarms need to be created in VC depends on our requirements. GRP VC Admin is the local Windows group that has been assigned the Administrator role in the Virtual Center console. A quick DSN connectivity check is sketched below.
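If a VC service fails to start and a database connectivity problem is suspected, the SQL login behind a DSN can be tested from the VC server with the SQL client tools (a minimal sketch; substitute the real password from the DSN settings above):

    rem test the vmwaresa login against the shared EMDC2 DB server
    osql -S BP1XGBAP030 -U vmwaresa -P <password> -Q "select name from sysdatabases"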

19 H/W Failure - Action to be taken :


- Whenever datacenter support is required, the Wintel team must be contacted; they will help with any physical activity on the server. We also need to coordinate with them for any hardware failure
- H/W replacement/issues in ESX servers: take a screenshot of the error on the system's homepage and send an email to the Wintel team, along with a GSMS ticket assigned to them, and they will take care of the replacement
- Any hardware issue or host crash must be recorded in the Hostcrash spreadsheet on the shared drive
- Coordination with the hardware vendor is handled only by the Wintel team, and they hold the hardware vendor contract info

20 Vendor Support Info :


VMware support contract information: contract number 40228452. Log in to www.vmware.com/support and create a new support ticket (for online logging, our team members' official email IDs have to be added to the relevant group), or call the VMware support team with the contract info.

Since hardware vendor coordination is done by the Wintel team, we do not hold the hardware contract information.

21 BAU - Daily Checks Info :


- Manual checks are to be done every morning (UK time). The checklist file (\\bp1xeuss001-f01\mor_sme_team\VMware\BAU\Monitoring Checklist v2.xls) is on the shared drive under the BAU folder and needs to be updated after the daily morning checkout
- Alerts are sent to the EMDC SME VMware mailbox (the mails are triggered by HP SIM). The mailbox must be monitored and issues attended to regularly. The SMTP address of the mailbox is emdcsmevmware@uk.bp.com (need to find a way to either get added to this mailbox or have the mails forwarded to us)
- VMotion is also checked, but no manual migration is done for checking; the Events and Tasks windows of all the VCs are monitored for VMotion errors to detect any DRS failure
- When alerts/tickets come in from HP SIM, the issues are checked, and once resolved, the alerts are cleared in the SIM console
- If we observe a RED alert for any ESX server / VM during a migration or any other activity and it resolves itself within 4 hours, we ignore it, since DRS handles it automatically. If the RED alert persists for more than 4 hours, we look into it by raising a ticket

22 ESX Upgrade :
- Whenever we upgrade ESX 3.0 to 3.5, we never update the VMware Tools at the same time, since that may require downtime for all the VM slices; we wait until the VM slices enter their patching window. When regular scheduled patching is done by the Wintel team, we coordinate with them to update the VMware Tools
- We are updating all the ESX servers to version 3.5, build 169697 (the build actually running on a host can be confirmed as sketched below)
- If the ESX 3.0.2 build version is 63195, we follow the procedure sent by Tony
- If the ESX 3.0.3 build version is 122206, we use the ISO image to update the ESX host: connect it through the virtual CD-ROM and then update
- Mostly the downloaded image is stored in the /vmimages folder
- The SNMP settings done in the Virtual Center send traps to the Netcool server IP 172.28.59.243
- Model change for upgrading ESX 3.0 to 3.5: CHG72692
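To confirm which build a host is actually running before and after the upgrade (a minimal check, run on the service console):

    # prints the product and build, e.g. "VMware ESX Server 3.5.0 build-169697"
    vmware -v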

23 Storage :
- There are usually two HBA cards installed in each server, and mostly two-port cards are used (a command sketch for reading the WWNs on a host follows)
- Clariion SAN identifier: WWN number ????????? - the middle numbers change
- Hitachi SAN identifier: WWN number ????????????????????? - the last number changes
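To read the WWNs a host's HBAs actually present, the service-console proc nodes can be inspected (a sketch; the proc path depends on the HBA driver in use - Emulex is mentioned above, QLogic is an assumption):

    # Emulex HBAs (lpfc driver) -- one numbered entry per adapter
    cat /proc/scsi/lpfc/*
    # QLogic HBAs (qla2300 driver), if present
    cat /proc/scsi/qla2300/*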

24 Latest Info :
VC servers BP1XEUAP831 and BP1XEUAP889 are being decommissioned. Reference change: CHG000000075123.

25 Notes :
In case of any Java issues when accessing a server's console through DRAC/iLO, RDP to the server BP1XEUAA485 (EMDC2), then open iLO from there and try to connect.
