Setting up limited membership for VServers (part 2)

Category: IT development and operations

After setting up shop in part 1 of this post, it’s now time to get down to business and change our vServers to limited membership on the IPoIB-default network.

In the previous post, I provided some background on Infiniband limited/full membership and why this is important from a security perspective on an Exalogic system. Because vServers are created as full members on the IPoIB-default network, you might have an open door for some unwelcome ‘ssh hopping’ between vServers, and even between vServers and compute nodes, over this network.

Therefore, in part 1, we set up the Exalogic Control vServer with the tooling we need to set things straight on the IPoIB-default network if this is deemed necessary for proper network isolation of our application deployments. Thus, we installed the Infrastructure-as-a-Service API and CLI.

Changing vServers to limited membership

Now that everything is in place, I will demonstrate how changing vServers to limited membership is actually done. Of course, it would be best if Oracle changed the behaviour of Exalogic Control so that it created vServers as limited members instead of full members, just as it does for the IPoIB-vserver-shared-storage and IPoIB-virt-admin networks. For the moment, however, there’s no EMOC patch for this and we only have a workaround in the form of a set of “toggle scripts”. You can download these from MOS note 1908498.1 and stage them in an appropriate place, in my case /u01/common/general/tools/toggle-scripts.

  [root@qel01ec01 toggle-scripts]# unzip MembershipToggle11080830.zip
  Archive: MembershipToggle11080830.zip
  inflating: CheckVServerPkey.sh
  inflating: CheckVServers.sh
  inflating: UpdateVServerPkey.sh
  inflating: UpdateVServers.sh

First, however (there are always more prerequisites), we must give the Exalogic Control host (EC1) access to the Oracle VM repository, as the scripts need to update the vm.cfg files of the vServers we want to “toggle” membership for. Therefore, we need to mount the ExalogicRepo share in read/write mode.

  [root@qel01ec01 ~]# mount -t nfs 192.168.21.5:/export/ExalogicRepo /OVS-Repo -o rw
  [root@qel01ec01 ~]# df -h /OVS-Repo
  Filesystem            Size  Used Avail Use% Mounted on
  192.168.21.5:/export/ExalogicRepo
                         31T  916G   30T   3% /OVS-Repo
  [root@qel01ec01 ~]# touch /OVS-Repo/test.txt ; rm /OVS-Repo/test.txt

Finally, you must set up passwordless ssh from the Exalogic Control vServer to the compute nodes in the vDC. This is described in the MOS note. After this, you can access the compute nodes using the Distributed Command Line Interface (DCLI), which is also used by the “toggle scripts”:

  [root@qel01ec01 ~]# dcli -g all-cnodes uptime
  qel01cn01: 18:43:14 up 34 days, 4:02, 3 users, load average: 0.73, 0.70, 0.68
  qel01cn02: 18:43:14 up 34 days, 2:45, 1 user, load average: 0.50, 0.46, 0.47
  qel01cn03: 18:43:14 up 36 days, 18 min, 1 user, load average: 0.50, 0.59, 0.54
  ...
  qel01cn08: 18:43:14 up 36 days, 1:12, 0 users, load average: 0.68, 0.57, 0.55

The all-cnodes file contains the list of all our compute nodes, of which there are eight in our quarter rack.
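Before running the toggle scripts, it can be worth confirming that key-based ssh really works for every node in that list. The following is only a sketch under my own assumptions: the check_nodes function and the SSH override variable are made up for this illustration and are not part of the MOS note or of DCLI. It relies on the standard OpenSSH BatchMode option, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Hypothetical helper (not part of the MOS note): list the nodes that do NOT
# accept a non-interactive, key-based ssh login.

# SSH can be overridden, e.g. to point at a different ssh binary (assumption).
SSH=${SSH:-ssh}

# check_nodes FILE: FILE holds one host name per line, like all-cnodes.
check_nodes() {
    while IFS= read -r node || [ -n "$node" ]; do
        [ -n "$node" ] || continue            # skip empty lines
        # -n keeps ssh from swallowing the rest of the node list on stdin;
        # BatchMode=yes makes ssh fail rather than ask for a password.
        "$SSH" -n -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null \
            || echo "$node"
    done < "$1"
}

# Example: check_nodes all-cnodes
```

If this prints nothing, every node in the file accepted the key; any host name it prints still needs its passwordless ssh set up.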

Proper authentication

The toggle scripts use the IaaS API and CLI, so they need proper authentication, just like when you log in to the Exalogic Control WebUI. First of all, we need access to the vDC account in Exalogic Control that holds the vServers to be changed. For that, we need to know the vDC account’s ID. We can obtain this from EMOC via the IaaS akm-describe-accounts function:

  [root@qel01ec01 ~]# export JAVA_HOME=/usr
  [root@qel01ec01 ~]# export IAAS_HOME=/opt/oracle/iaas/cli
  [root@qel01ec01 ~]# $IAAS_HOME/bin/akm-describe-accounts --base-url https://qel01ec01:9443/emoc --user jnijhoff
  password for user jnijhoff:

The IaaS API is called over https, so this could also be done from a separate utility/provisioning Linux host or vServer instead of the EC1 vServer itself; you would just have to install the IaaS rpms there. If you have one or more authorised admin users set up in EMOC instead of root, it is best to use those, as in my example.

Find your account ID

The akm-describe-accounts function now asks you to accept its certificate and then provides you with a list of vDC account IDs with their names and descriptions. The account I want to work in for this demonstration is the QEL-TrainingDemo-project account.

  ...
  ...
  Do you want to trust this certificate? [(Y)es/(N)o] (Y) > Y
  Certificate added to truststore /root/.oracle_iaas/truststore
  ACC-c882522f-7240-45ad-830f-cf670ca6bd8b       QEL-FMWDemoCloud               jnijhoff
  ACC-93aafadc-2d66-4ad3-80a6-0a17adec4161       QEL-AWBZ-PoV-project   Exalogic Cloud voor het opbouwen van de AWBZ demo-omgevingen       jnijhoff
  ACC-07a26f61-9e84-4a20-bfda-779540345e4c       QEL-ContDeliveryDemo           jnijhoff
  ACC-db08f972-115b-42db-8055-245d556a9063       QEL-DemoCloud-project          jnijhoff
  ACC-428c9d8e-89dc-42b1-86cf-ea6fa02e8468       QEL-TrainingDemo-project       This will be a demonstration account for the Exalogic Essentials training.
  In this project we wil create 4 vServers : 2 for OTD and 2 for OSB     jnijhoff
  ACC-3722d670-89f4-4096-892d-0da4263a6eee       QEL-MediationUTS-project        jnijhoff
  ACC-82a00b63-4251-4622-9b0c-ce16aa769987       QEL-infratest-project           jnijhoff
  ACC-67cf3446-e182-4ce6-b281-d1bb0940dbae       QEL-Account01   Lab environment of ExaLogic Implementation Boot Camp training. Owner: phuinen     jnijhoff
  ACC-65cf54d9-b527-46a1-b4d6-2b47a1822835       QEL-commonstuff-project        jnijhoff
  ACC-3c32d71d-84c3-47ce-b398-1087cb5af016       QEL-WLSTraining For Qualogy's WebLogic Training   jnijhoff
  ACC-163f6818-d0b7-4b3e-9367-413f33c18534       QEL-PSC-Epiqloud-project  PeopleSoft Campus Demo Qloud  jnijhoff
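With a long account listing like the one above, picking out one ID by eye is error-prone. A small filter can do it instead. This is a sketch under two assumptions of mine: the account_id_for function name is hypothetical, and account names contain no whitespace, so that the name is always the second whitespace-separated column. Since akm-describe-accounts prompts interactively, it is probably easiest to save its output to a file first and feed that file to the filter.

```shell
#!/bin/sh
# Hypothetical helper: read an akm-describe-accounts listing on stdin and
# print the ID of the account whose name (second column) equals NAME.
# Lines that do not start with an ACC- identifier (prompts, wrapped
# descriptions) are ignored.
account_id_for() {
    awk -v name="$1" '$1 ~ /^ACC-/ && $2 == name { print $1 }'
}

# Example, assuming the listing was saved to accounts.txt beforehand:
#   account_id_for QEL-TrainingDemo-project < accounts.txt
```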

Testing an ssh session via IPoIB-default

So the account ID I am looking for is ACC-428c9d8e-89dc-42b1-86cf-ea6fa02e8468. This account holds my vServers. It is a small demo project with four vServers: two of type SMALL for running a Traffic Director cluster, and two of type LARGE for running a WebLogic cluster.

The two WebLogic vServers, called qnl-train1-wls-vs1 and -vs2, are connected to the IPoIB-default network (192.168.10.0/24) because they need to access a database, so I need to fix them by making them limited members of the IPoIB-default network.

Currently they are full members, and I can set up an ssh session from one to the other if I can get hold of the IP address and password, as demonstrated below:

  [root@qnl-train1-wls-vs2 ~]# ssh 192.168.10.56
  The authenticity of host '192.168.10.56 (192.168.10.56)' can't be established.
  RSA key fingerprint is 35:a9:44:9c:47:98:17:bd:54:c5:45:4d:a6:90:89:3c.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added '192.168.10.56' (RSA) to the list of known hosts.
  root@192.168.10.56's password:
  Last login: Mon Mar 10 16:59:36 2014 from 192.168.140.227
  [root@qnl-train1-wls-vs1 ~]# hostname
  qnl-train1-wls-vs1

This “backdoor” can be closed by making them limited members. For this, I need to know their vServer UUIDs so I can put those into a list file called vServers-2btoggled.lst. Unfortunately, the toggle scripts cannot work these out automatically yet. You can find the UUIDs via the EMOC WebUI or from the vm.cfg files.

  [root@qnlel1-emoc ~]# cd /mnt/general/qualogy/tools/toggle-scripts ; ls
  CheckVServerPkey.sh CheckVServers.sh UpdateVServerPkey.sh UpdateVServers.sh
  [root@qnlel1-emoc toggle-scripts]# cat vServers-2btoggled.lst
  0004fb0000060000af022154e701ffc9
  0004fb0000060000378a0f7c969e1bf5
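Looking up UUIDs from the vm.cfg files can be scripted. The sketch below assumes the repository layout used in this post, where each vServer lives in a directory named after its UUID under VirtualMachines and its vm.cfg contains a line of the form OVM_simple_name = 'name'. The uuid_for_name function is a hypothetical helper of mine, not part of the toggle scripts.

```shell
#!/bin/sh
# Hypothetical helper: print the UUID (the directory name) of every vServer
# under REPO/VirtualMachines whose vm.cfg names it exactly NAME.
# uuid_for_name REPO NAME
uuid_for_name() {
    repo=$1
    name=$2
    for cfg in "$repo"/VirtualMachines/*/vm.cfg; do
        [ -f "$cfg" ] || continue
        # -x: whole-line match, -F: treat the pattern as a fixed string
        if grep -qxF "OVM_simple_name = '$name'" "$cfg"; then
            basename "$(dirname "$cfg")"
        fi
    done
}

# Example: uuid_for_name /OVS-Repo qnl-train1-wls-vs1 >> vServers-2btoggled.lst
```

Because the match is exact, a vServer whose OVM_simple_name differs from the name you pass in (for instance after a rename in EMOC) will not be found this way.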

Let’s have a look at the IB partition keys that are currently used by my two vServers, e.g. for the first one:

  [root@qnlel1-emoc toggle-scripts]# cd /OVS-Repo/VirtualMachines/0004fb0000060000af022154e701ffc9
  [root@qnlel1-emoc 0004fb0000060000af022154e701ffc9]# cat vm.cfg | grep simple
  OVM_simple_name = 'qnl-train1-wls-vs1'
  [root@qnlel1-emoc 0004fb0000060000af022154e701ffc9]# cat vm.cfg | grep ipoib
  exalogic_ipoib = [{'pkey': ['0xffff', '0x0005', '0x0003'], 'port': '1'}, {'pkey': ['0xffff', '0x0005', '0x0003'], 'port': '2'}]

Change the pkey

Infiniband membership is determined by the partition key. As we see, the Infiniband partition key (pkey) that is currently used for my vServers is 0xffff. The most significant bit of the 16-bit pkey determines the IB membership: if it is set, we have full membership; if it is cleared, we have limited membership. So we need to change the pkey from 0xffff to 0x7fff for limited membership and restart the vServers for this to take effect.
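The bit arithmetic behind this can be checked with plain shell arithmetic. The helper names below (to_limited, to_full, membership) are my own and purely illustrative; the toggle scripts do their own replacement in vm.cfg.

```shell
#!/bin/sh
# Illustration only: bit 15 (0x8000) of a pkey is the membership bit,
# the lower 15 bits identify the partition.

# Print the limited-member form of a pkey (membership bit cleared).
to_limited() {
    printf '0x%04x\n' $(( $1 & 0x7fff ))
}

# Print the full-member form of a pkey (membership bit set).
to_full() {
    printf '0x%04x\n' $(( $1 | 0x8000 ))
}

# Print "full" or "limited" depending on the membership bit.
membership() {
    if [ $(( $1 & 0x8000 )) -ne 0 ]; then
        echo full
    else
        echo limited
    fi
}

to_limited 0xffff    # prints 0x7fff
membership 0xffff    # prints full
membership 0x7fff    # prints limited
```

Both 0xffff and 0x7fff therefore refer to the same partition; only the membership bit differs.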

We could do this change manually, but the toggle scripts will take care of this for us, and perform various checks. This is especially convenient if we have many vServers to change instead of just the two, as is the case for most of the customer projects I worked on where 20+ vServers had to be “toggled”.

UpdateVServers.sh and CheckVServers.sh

There are two main scripts: UpdateVServers.sh, which makes the changes, and CheckVServers.sh, which can perform pre- and post-change checks. Both scripts take a lot of arguments as input:

  • -n to provide the list of vServers to be changed
  • -c to tell it about all the OVS compute nodes in my machine
  • -url to specify the EMOC url to call the IaaS API
  • -r to tell it where my OVS repository is
  • -ak to specify my access key file
  • -a to specify the account I want to work on
  • -k to tell it which pkey value I want to toggle
  • -t to specify the timeout, how long to wait for a shutdown to complete
  • -log to specify a logfile name
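Since the list file is typed or pasted together by hand, a quick pre-flight check on it can save a failed run. The sketch below assumes that vServer UUIDs are always 32 hexadecimal characters, as in the listings in this post; check_uuid_list is a hypothetical helper of mine and not one of the Oracle scripts.

```shell
#!/bin/sh
# Hypothetical pre-flight check: every line of the list file must be a
# 32-character hexadecimal UUID (assumption based on the UUIDs shown here).
# Returns 0 if the file looks fine, non-zero otherwise.
check_uuid_list() {
    if [ ! -r "$1" ] || [ ! -s "$1" ]; then
        echo "List file $1 is missing, unreadable or empty" >&2
        return 2
    fi
    # -v: select lines that do NOT match, -n: show their line numbers
    bad=$(grep -vnE '^[0-9a-fA-F]{32}$' "$1")
    if [ -n "$bad" ]; then
        echo "Malformed entries in $1:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}

# Example: check_uuid_list vServers-2btoggled.lst && echo "list looks OK"
```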
     

Wrapper script

As most of these values do not change much, if at all, I made a wrapper script called ToggleVServers_TRAIN1.sh to make life a bit easier:

  #!/bin/bash
  # ToggleVServers_TRAIN1.sh
  # Jos Nijhoff, Qualogy, november 2015
  # Usage: ./ToggleVServers_TRAIN1.sh <pkey to toggle, e.g. 0xffff>
  export IAAS_HOME=/opt/oracle/iaas/cli
  export JAVA_HOME=/usr
  SCRIPTS=/mnt/general/qualogy/tools/toggle-scripts
  EMOC_URL="https://qnlel1-emoc:9443/emoc"
  OVS_REPO=/OVS-Repo
  ACCESS_KEY="../iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file"
  ACCOUNT=QEL-TrainingDemo-project

  # Refuse to run without the pkey argument
  [ -n "$1" ] || { echo "Usage: $0 <pkey>" >&2; exit 1; }
  "$SCRIPTS"/UpdateVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url "$EMOC_URL" -r "$OVS_REPO" -ak "$ACCESS_KEY" -a "$ACCOUNT" -k "$1" -t 200 -log ToggleVServers.log

And I created the file with all my hypervisor compute nodes:

  [root@qnlel1-emoc toggle-scripts]# cat all-cn-virtual
  qnlel1cn01
  qnlel1cn02
  qnlel1cn03
  qnlel1cn04
  qnlel1cn05
  qnlel1cn06
  qnlel1cn07
  qnlel1cn08

Now that we have set this up, we can finally go about changing the partition key. Let’s go and run our toggle script now. The pkey to be “toggled” has value 0xffff:

  [root@qnlel1-emoc toggle-scripts]# ./ToggleVServers_TRAIN1.sh 0xffff
  --- /mnt/general/qualogy/tools/toggle-scripts/UpdateVServers.sh (version: 201311080830) ---
  [2015/11/09-10:49:47] Issued command: /mnt/general/qualogy/tools/toggle-scripts/UpdateVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url https://qnlel1-emoc:9443/emoc -r /OVS-Repo -ak ../iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file -a MNEL1-FusionMW-Acceptatie-project -k 0xffff -t 200 -log ToggleVServers.log
  === UpdateVServerPkey.sh (version: 201311080830) ===
  [2015/11/09-10:49:47] Issued command: UpdateVServerPkey.sh -n 0004fb0000060000af022154e701ffc9 -a MNEL1-FusionMW-Acceptatie-project -r /OVS-Repo -c /mnt/general/qualogy/tools/toggle-scripts/all-cn-virtual -url https://qnlel1-emoc:9443/emoc -log /mnt/general/qualogy/tools/toggle-scripts/ToggleVServers.log -ak /mnt/general/qualogy/tools/iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file -k 0xffff -t 200
  [2015/11/09-10:49:47] Updating vServer's vm.cfg
  [2015/11/09-10:49:48] Found vm.cfg of 0004fb0000060000af022154e701ffc9: /OVS-Repo/VirtualMachines/0004fb0000060000af022154e701ffc9/vm.cfg.
  [2015/11/09-10:49:48] Found the exalogic_ipoib line.
  [2015/11/09-10:49:48] Found 0xffff in the line.
  [2015/11/09-10:49:48] The VM is running on qnlel1cn03.
  [2015/11/09-10:49:48] Stopping vserver 0004fb0000060000af022154e701ffc9 (name qnl-train1-wls-vs1 of account MNEL1-FusionMW-Acceptatie-project)
  Stopping vServer(s) [qnl-train1-wls-vs1]
  [2015/11/09-10:49:51] Backing up /OVS-Repo/VirtualMachines/0004fb0000060000af022154e701ffc9/vm.cfg to vmcfgbak/vm.cfg.bak_201511091049470004fb0000060000af022154e701ffc9.
  [2015/11/09-10:49:51] Replacing 0xffff with 0x7fff.
  [2015/11/09-10:49:51] Waiting the VM to finish the shutdown.
  Status after 0 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs1|Exalogic_vServer_host_for_Training_Demo,_to_host_Weblogic__instance_1
  Status after 5 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs1|Exalogic_vServer_host_for_Training_Demo,_to_host_Weblogic__instance_1
  Status after 195 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs1|Exalogic_vServer_host_for_Training_Demo,_to_host_Weblogic__instance_1
  Status after 200 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs1|Exalogic_vServer_host_for_Training_Demo,_to_host_Weblogic__instance_1
  [2015/11/09-10:54:49] ERROR 0004fb0000060000af022154e701ffc9: The VM hasn't finished its shutdown process within 200 seconds. Please restart it manually later on. Exiting...
  === UpdateVServerPkey.sh (version: 201311080830) ===
  [2015/11/09-10:54:49] Issued command: UpdateVServerPkey.sh -n 0004fb0000060000378a0f7c969e1bf5 -a MNEL1-FusionMW-Acceptatie-project -r /OVS-Repo -c /mnt/general/qualogy/tools/toggle-scripts/all-cn-virtual -url https://qnlel1-emoc:9443/emoc -log /mnt/general/qualogy/tools/toggle-scripts/ToggleVServers.log -ak /mnt/general/qualogy/tools/iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file -k 0xffff -t 200
  [2015/11/09-10:54:49] Updating vServer's vm.cfg
  [2015/11/09-10:54:49] Found vm.cfg of 0004fb0000060000378a0f7c969e1bf5: /OVS-Repo/VirtualMachines/0004fb0000060000378a0f7c969e1bf5/vm.cfg.
  [2015/11/09-10:54:49] Found the exalogic_ipoib line.
  [2015/11/09-10:54:49] Found 0xffff in the line.
  [2015/11/09-10:54:50] The VM is running on qnlel1cn07.
  [2015/11/09-10:54:50] Stopping vserver 0004fb0000060000378a0f7c969e1bf5 (name qnl-train1-wls-vs2 of account MNEL1-FusionMW-Acceptatie-project)
  Stopping vServer(s) [qnl-train1-wls-vs2]
  [2015/11/09-10:54:52] Backing up /OVS-Repo/VirtualMachines/0004fb0000060000378a0f7c969e1bf5/vm.cfg to vmcfgbak/vm.cfg.bak_201511091054490004fb0000060000378a0f7c969e1bf5.
  [2015/11/09-10:54:52] Replacing 0xffff with 0x7fff.
  [2015/11/09-10:54:52] Waiting the VM to finish the shutdown.
  Status after 0 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs2|RUNNING
  Status after 5 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs2|RUNNING
  Status after 10 seconds: MNEL1-FusionMW-Acceptatie-project|qnl-train1-wls-vs2|RUNNING
  [2015/11/09-10:55:14] VM shutdown takes 15 seconds to complete.
  [2015/11/09-10:55:14] Restarting vserver 0004fb0000060000378a0f7c969e1bf5 (name qnl-train1-wls-vs2 of account MNEL1-FusionMW-Acceptatie-project).
  Starting vServer(s) [qnl-train1-wls-vs2]
  SUCCESS 0004fb0000060000378a0f7c969e1bf5
  [2015/11/09-10:55:17] The membership of 0004fb0000060000378a0f7c969e1bf5 in the given partition has been toggled.
  [2015/11/09-10:55:17] Finished command: /mnt/general/qualogy/tools/toggle-scripts/UpdateVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url https://qnlel1-emoc:9443/emoc -r /OVS-Repo -ak ../iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file -a MNEL1-FusionMW-Acceptatie-project -k 0xffff -t 200 -log ToggleVServers.log

And remember..

Note that the vServers are fully stopped and then restarted in the process. This can be a problem (downtime for the users of the hosted applications) if there is no application high availability in place, and I have on occasion changed the scripts to delay the stop/restart to a later moment. However, remember that the desired network isolation only kicks in once the VM has actually been restarted. By the way, you can also watch your vServer going down in Exalogic Control.

Little problem

There seemed to be a problem with the script’s shutdown detection for the first vServer: it actually shut down quite fast, but the script did not pick this up and kept waiting until the timeout. I suspect this may have something to do with my rather long vServer descriptions in EMOC, as it always works well with brief or empty description fields. This may also occur when you have renamed a vServer in EMOC, as the OVM_simple_name kept by the OVM Manager will then have a “-renamed” suffix.

Check vm.cfg again

Let’s check our vm.cfg file again:

  [root@qnlel1-emoc 0004fb0000060000af022154e701ffc9]# cat vm.cfg | grep ipoib
  exalogic_ipoib = [{'pkey': ['0x7fff', '0x0005', '0x0003'], 'port': '1'}, {'pkey': ['0x7fff', '0x0005', '0x0003'], 'port': '2'}]

We see that the partition key has now been updated to specify limited membership for my vServer. Note that the script has also created a backup of the vm.cfg file. Of course, checking a single vm.cfg file by hand is fine, but it becomes a lot of work if you have “toggled” many vServers. For this we can use the CheckVServers.sh script, for which I have likewise created a wrapper script named CheckToggles_TRAIN1.sh:

  #!/bin/bash
  # CheckToggles_TRAIN1.sh
  # Jos Nijhoff, Qualogy, november 2015
  # Usage: ./CheckToggles_TRAIN1.sh <pkey that was toggled, e.g. 0xffff>
  export IAAS_HOME=/opt/oracle/iaas/cli
  export JAVA_HOME=/usr
  SCRIPTS=/mnt/general/qualogy/tools/toggle-scripts
  EMOC_URL="https://qnlel1-emoc:9443/emoc"
  OVS_REPO=/OVS-Repo
  ACCESS_KEY="../iaas-keys/ak-jnijhoff-QEL-TrainingDemo.file"
  ACCOUNT=MNEL1-FusionMW-Acceptatie-project

  # Refuse to run without the pkey argument
  [ -n "$1" ] || { echo "Usage: $0 <pkey>" >&2; exit 1; }
  "$SCRIPTS"/CheckVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url "$EMOC_URL" -r "$OVS_REPO" -a "$ACCOUNT" -k "$1"

Run the script and provide the same pkey value as you used for the toggle script. Note that your vServers need to be running for this check to do its work.

  [root@qnlel1-emoc toggle-scripts]# ./CheckToggles_TRAIN1.sh 0xffff
  --- /mnt/general/qualogy/tools/toggle-scripts/CheckVServers.sh (version: 201311080830) ---
  [2015/11/09-11:21:49] Issued command: /mnt/general/qualogy/tools/toggle-scripts/CheckVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url https://qnlel1-emoc:9443/emoc -r /OVS-Repo -a MNEL1-FusionMW-Acceptatie-project -k 0xffff
  === CheckVServerPkey.sh (version: 201311080830) ===
  [2015/11/09-11:21:49] Issued command: CheckVServerPkey.sh -n 0004fb0000060000af022154e701ffc9 -r /OVS-Repo -c /mnt/general/qualogy/tools/toggle-scripts/all-cn-virtual -url https://qnlel1-emoc:9443/emoc -log /mnt/general/qualogy/tools/toggle-scripts/CheckVServerPKey.log -a MNEL1-FusionMW-Acceptatie-project -k 0xffff
  [2015/11/09-11:21:49] Finding vServer's vm.cfg
  [2015/11/09-11:21:49] Found vm.cfg of 0004fb0000060000af022154e701ffc9: /OVS-Repo/VirtualMachines/0004fb0000060000af022154e701ffc9/vm.cfg.
  [2015/11/09-11:21:50] The VM is running on qnlel1cn07.
  [2015/11/09-11:21:50] The VM is attached to VF 0000:19:00.2.
  [2015/11/09-11:21:51] 0004fb0000060000af022154e701ffc9 port 1 is using pkey 0x7fff
  [2015/11/09-11:21:51] 0004fb0000060000af022154e701ffc9 port 1 is using pkey 0x0005
  [2015/11/09-11:21:52] 0004fb0000060000af022154e701ffc9 port 1 is using pkey 0x8007
  [2015/11/09-11:21:52] 0004fb0000060000af022154e701ffc9 port 1 is using pkey 0x0003
  [2015/11/09-11:21:53] 0004fb0000060000af022154e701ffc9 port 2 is using pkey 0x7fff
  [2015/11/09-11:21:54] 0004fb0000060000af022154e701ffc9 port 2 is using pkey 0x0005
  [2015/11/09-11:21:54] 0004fb0000060000af022154e701ffc9 port 2 is using pkey 0x8007
  [2015/11/09-11:21:54] 0004fb0000060000af022154e701ffc9 port 2 is using pkey 0x0003
  [2015/11/09-11:21:54] Found the exalogic_ipoib line.
  [2015/11/09-11:21:54] Found no 0xffff entry in the line.
  [2015/11/09-11:21:54] SUCCESS 0004fb0000060000af022154e701ffc9: The membership of 0004fb0000060000af022154e701ffc9 in the given partition has been toggled.
  === CheckVServerPkey.sh (version: 201311080830) ===
  [2015/11/09-11:21:54] Issued command: CheckVServerPkey.sh -n 0004fb0000060000378a0f7c969e1bf5 -r /OVS-Repo -c /mnt/general/qualogy/tools/toggle-scripts/all-cn-virtual -url https://qnlel1-emoc:9443/emoc -log /mnt/general/qualogy/tools/toggle-scripts/CheckVServerPKey.log -a MNEL1-FusionMW-Acceptatie-project -k 0xffff
  [2015/11/09-11:21:54] Finding vServer's vm.cfg
  [2015/11/09-11:21:54] Found vm.cfg of 0004fb0000060000378a0f7c969e1bf5: /OVS-Repo/VirtualMachines/0004fb0000060000378a0f7c969e1bf5/vm.cfg.
  [2015/11/09-11:21:55] The VM is running on qnlel1cn03.
  [2015/11/09-11:21:56] The VM is attached to VF 0000:19:00.2.
  [2015/11/09-11:21:57] 0004fb0000060000378a0f7c969e1bf5 port 1 is using pkey 0x7fff
  [2015/11/09-11:21:57] 0004fb0000060000378a0f7c969e1bf5 port 1 is using pkey 0x0005
  [2015/11/09-11:21:57] 0004fb0000060000378a0f7c969e1bf5 port 1 is using pkey 0x8007
  [2015/11/09-11:21:58] 0004fb0000060000378a0f7c969e1bf5 port 1 is using pkey 0x0003
  [2015/11/09-11:21:59] 0004fb0000060000378a0f7c969e1bf5 port 2 is using pkey 0x7fff
  [2015/11/09-11:21:59] 0004fb0000060000378a0f7c969e1bf5 port 2 is using pkey 0x0005
  [2015/11/09-11:21:59] 0004fb0000060000378a0f7c969e1bf5 port 2 is using pkey 0x8007
  [2015/11/09-11:22:00] 0004fb0000060000378a0f7c969e1bf5 port 2 is using pkey 0x0003
  [2015/11/09-11:22:00] Found the exalogic_ipoib line.
  [2015/11/09-11:22:00] Found no 0xffff entry in the line.
  [2015/11/09-11:22:00] SUCCESS 0004fb0000060000378a0f7c969e1bf5: The membership of 0004fb0000060000378a0f7c969e1bf5 in the given partition has been toggled.
  [2015/11/09-11:22:00] Finished command: /mnt/general/qualogy/tools/toggle-scripts/CheckVServers.sh -n vServers-2btoggled.lst -c all-cn-virtual -url https://qnlel1-emoc:9443/emoc -r /OVS-Repo -a MNEL1-FusionMW-Acceptatie-project -k 0xffff

The script checks where your vServer is running (compute nodes 7 and 3 in my case) and then checks if it is running with the correct partition key for both the links in the bond (to GW1 and to GW2). It provides a lot of output, so for many vServers you would actually grep the output or grep the CheckVServerPKey.log file.

  [root@qnlel1-emoc toggle-scripts]# ./CheckToggles_TRAIN1.sh 0xffff | grep SUCCESS
  [2015/11/09-11:30:41] SUCCESS 0004fb0000060000af022154e701ffc9: The membership of 0004fb0000060000af022154e701ffc9 in the given partition has been toggled.
  [2015/11/09-11:30:47] SUCCESS 0004fb0000060000378a0f7c969e1bf5: The membership of 0004fb0000060000378a0f7c969e1bf5 in the given partition has been toggled.
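With many vServers, it is often more useful to see which ones did not report SUCCESS than to scan the ones that did. The sketch below compares the list file against saved check output. It assumes the "SUCCESS <uuid>" wording shown in the output above; the not_toggled function is a hypothetical helper of mine and not part of the Oracle scripts.

```shell
#!/bin/sh
# Hypothetical helper: print every UUID from LIST that has no
# "SUCCESS <uuid>" line in OUTPUT (saved check output or a log file).
# not_toggled LIST OUTPUT
not_toggled() {
    while IFS= read -r uuid || [ -n "$uuid" ]; do
        [ -n "$uuid" ] || continue            # skip empty lines
        grep -q "SUCCESS $uuid" "$2" || echo "$uuid"
    done < "$1"
}

# Example, assuming the check output was saved to a file first:
#   ./CheckToggles_TRAIN1.sh 0xffff > check-output.txt
#   not_toggled vServers-2btoggled.lst check-output.txt
```

Run it against the output of a single check run: a file that collects several runs may still contain SUCCESS lines from an earlier attempt.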

We check to see if our network isolation is now in place:

  [root@qnl-train1-wls-vs2 ~]# ssh 192.168.10.56
  ssh: connect to host 192.168.10.56 port 22: No route to host
  [root@qnl-train1-wls-vs2 ~]# ping 192.168.10.56
  PING 192.168.10.56 (192.168.10.56) 56(84) bytes of data.
  From 192.168.10.57 icmp_seq=2 Destination Host Unreachable
  From 192.168.10.57 icmp_seq=3 Destination Host Unreachable
  From 192.168.10.57 icmp_seq=4 Destination Host Unreachable

Thus, we can no longer go over the IPoIB-default network (192.168.10.0/24) to the other vServer, as both are now limited members.

Conclusion

The scripts provided by Oracle Support do a pretty good job if you have many vServers to fix, but it is still some work to set things up, run the procedure, deal with unwilling vServers manually and check the results. Also, you need to collect all the UUID strings of the vServers you want to fix and put them in a file. You cannot get the UUIDs readily from the IaaS CLI, so you need to get a little creative and search the vm.cfg files in the OVS repository with “find”, or do some other cleverness. Moreover, if you treat your vServers as disposable and replaceable (as is best practice), their UUIDs may have changed over time and your list file needs to be updated as well.

Apart from the amount of effort, there’s also the downtime issue as every vServer needs to be stopped/restarted for this. It would have been better if we had “opt-out” security instead of “opt-in”, meaning that Exalogic Control should create our vServers as limited members on the IPoIB-default network to begin with, and then let us opt-out if we need a few vServers to be full members after all.

Jos Nijhoff
About the author Jos Nijhoff

Jos Nijhoff is an experienced Application Infrastructure consultant at Qualogy. Currently he plays a key role as technical presales and hands-on implementation lead for Qualogy's exclusive Exalogic partnership with Oracle for the Benelux area. Thus he keeps in close contact with Oracle presales and partner services on new developments, but maintains an independent view. He gives technical guidance and designs, reviews, manages and updates the application infrastructure before, during and after the rollout of new and existing Oracle (Fusion) Applications & Fusion Middleware implementations. Jos is also familiar with subjects like high availability, disaster recovery scenarios, virtualization, performance analysis, data security, and identity management integration with respect to Oracle applications.
