
Nutanix REST API Browser - How to get Container Info.

Basic Definition:
REST is an alternative to SOAP-based web services. Where SOAP models the exchange between client and server as calls to objects, REST tries to stay faithful to the web domain. So when calling a web service written in SOAP, you might write
productService.GetProduct("1")
while in REST you call a URL with an HTTP GET:
http://someurl/products/product/1
CRUD - Create, Read, Update and Delete - maps to the HTTP requests POST, GET, PUT and DELETE respectively.
You can use curl, links, or any HTTP browser to get the details of a Nutanix cluster.
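
For illustration, here is roughly how that CRUD-to-HTTP mapping looks with python requests (the products URL is the hypothetical example above, not a real endpoint):

#!/usr/bin/python
# Minimal sketch of the CRUD-to-HTTP mapping with python requests.
# http://someurl/products/product is the hypothetical example from above.
import requests

base = "http://someurl/products/product"

requests.post(base, data={"name": "widget"})    # Create
requests.get(base + "/1")                       # Read
requests.put(base + "/1", data={"name": "w2"})  # Update
requests.delete(base + "/1")                    # Delete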

Here's to the most inteRESTing REST API!
Nutanix has just become even more inteRESTing!
Or we can call the Nutanix REST API BeautyREST! (Someone has already called it sexy: http://wahlnetwork.com/2013/07/01/nutanix-and-veeam-spread-rest-api-joy/)

Nutanix REST API Browser: http://any_Nutanix_CVM
Log in as admin/admin. The Prism Nutanix Browser (HTML5) deserves another couple of blog posts to go through its goodness.

To access the REST API browser:

REST API Explorer: https://10.3.101.61:9440/console/api/


To get NFS datastores / Nutanix containers (vstores), send a GET request:
 

jerome@ithaca:~$ curl https://10.3.101.61:9440/PrismGateway/services/rest/v1/vstores  --insecure --user admin:admin

[{"id":2482,"name":"ctr3","containerId":2482,"backedup":false,"protectionDomain":null,"markedForRemoval":false},{"id":2483,"name":"ctr4","containerId":2483,"backedup":false,"protectionDomain":null,"markedForRemoval":false},{"id":1898767,"name":"ctr5","containerId":1898767,"backedup":true,"protectionDomain":"ctr5_1372277619664","markedForRemoval":false},{"id":2430343,"name":"test","containerId":2430343,"backedup":false,"protectionDomain":null,"markedForRemoval":false},{"id":2430528,"name":"testing","containerId":2430528,"backedup":false,"protectionDomain":null,"markedForRemoval":false},{"id":18973731,"name":"testStats","containerId":18973731,"backedup":false,"protectionDomain":null,"markedForRemoval":false},{"id":19113654,"name":"dummyCTR_to_delete1","containerId":19113654,"backedup":false,"protectionDomain":null,"markedForRemoval":false}]


To get a specific container:

 curl https://10.3.101.61:9440/PrismGateway/services/rest/v1/vstores/1898767  --insecure --user admin:admin


{"id":1898767,"name":"ctr5","containerId":1898767,"backedup":true,"protectionDomain":"ctr5_1372277619664","markedForRemoval":false}

Nutanix REST-API using python Requests - Get container info

From:
http://docs.python-requests.org/en/latest/

Installing pip and Requests on a Nutanix Controller VM (or any Linux system):
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
sudo python get-pip.py
sudo pip install requests
sudo pip install requests --upgrade
cd /usr/lib/python2.6/site-packages/; sudo chmod -R 755 requests-1.2.3-py2.6.egg/
PYTHONPATH=${PYTHONPATH}:/usr/lib/python2.6/site-packages/requests-1.2.3-py2.6.egg; export PYTHONPATH
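
A quick sanity check that the requests module is now importable (assuming the egg path above):

#!/usr/bin/python
# Confirm the requests module can be imported and show its version.
import requests
print requests.__version__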


Sample script (test_resp.py) to print container info:

#!/usr/bin/python
import json as json
import requests

def main():
  base_url = "https://10.3.101.59:9440/PrismGateway/services/rest/v1/"
  s = requests.Session()
  s.auth = ('admin', 'admin')
  s.headers.update({'Content-Type': 'application/json; charset=utf-8'})

  print s.get(base_url + 'vstores', verify=False).json()

if __name__ == "__main__":
  main()


Output

./test_resp.py
[{u'protectionDomain': None, u'name': u'ctr3', u'backedup': False, u'markedForRemoval': False, u'id': 2482, u'containerId': 2482}, {u'protectionDomain': None, u'name': u'ctr4', u'backedup': False, u'markedForRemoval': False, u'id': 2483, u'containerId': 2483}, {u'protectionDomain': u'ctr5_1372277619664', u'name': u'ctr5', u'backedup': True, u'markedForRemoval': False, u'id': 1898767, u'containerId': 1898767}, {u'protectionDomain': None, u'name': u'test', u'backedup': False, u'markedForRemoval': False, u'id': 2430343, u'containerId': 2430343}, {u'protectionDomain': None, u'name': u'testing', u'backedup': False, u'markedForRemoval': False, u'id': 2430528, u'containerId': 2430528}, {u'protectionDomain': None, u'name': u'testStats', u'backedup': False, u'markedForRemoval': False, u'id': 18973731, u'containerId': 18973731}, {u'protectionDomain': None, u'name': u'dummyCTR_to_delete1', u'backedup': False, u'markedForRemoval': False, u'id': 19113654, u'containerId': 19113654}]
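
As a small variation on the script above, you could print one line per container instead of the raw JSON dump (same cluster IP and admin/admin credentials assumed):

#!/usr/bin/python
# Sketch: print one line per container instead of the raw JSON dump.
import requests

base_url = "https://10.3.101.59:9440/PrismGateway/services/rest/v1/"
s = requests.Session()
s.auth = ('admin', 'admin')

for ctr in s.get(base_url + 'vstores', verify=False).json():
  print ctr['name'], ctr['containerId'], 'backedup=%s' % ctr['backedup']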


Get the name and size of a storage pool:

#!/usr/bin/python
import requests

def main():
  base_url = "https://10.3.101.59:9440/PrismGateway/services/rest/v1/"
  s = requests.Session()
  s.auth = ('admin', 'admin')
  s.headers.update({'Content-Type': 'application/json; charset=utf-8'})

  data = s.get(base_url + 'storage_pools', verify=False).json()
  spname = data["entities"][0]["name"]
  size = data["entities"][0]["capacity"]              # capacity is in bytes
  print spname, size / (1024 * 1024 * 1024 * 1024)    # convert bytes to TiB

if __name__ == "__main__":
  main()







Creating service center via python REqueSTs

How to find the JSON config:
Python script:

import json
import requests

def main():
  base_url = "https://10.3.101.59:9440/PrismGateway/services/rest/v1/"
  s = requests.Session()
  s.auth = ('admin', 'admin')
  s.headers.update({'Content-Type': 'application/json; charset=utf-8'})

  data1 = {
    "port": 9443,
    "name": "testservice",
    "userName": "support",
    "ipAddress": "10.10.10.10"
  }
  s.post(base_url + 'service_centers', data=json.dumps(data1), verify=False)
  print s.get(base_url + 'service_centers/testservice', verify=False).json()

if __name__ == "__main__":
  main()
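
To double-check the POST, here is a sketch that lists the service centers; I am assuming a GET on the service_centers collection behaves like the vstores collection shown earlier - verify against your Prism version:

#!/usr/bin/python
# Sketch: list all service centers to confirm the POST above worked.
# Assumes GET on the service_centers collection works like vstores.
import requests

base_url = "https://10.3.101.59:9440/PrismGateway/services/rest/v1/"
s = requests.Session()
s.auth = ('admin', 'admin')

print s.get(base_url + 'service_centers', verify=False).json()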

CBRC storage Accelerator stats


~ # vsish
/> cat /vmkModules/cbrc_filter/dcacheStats
CBRC cache statistics {
   Cache chunk size:4096
   Number of LRU lists:100
   Number of hash buckets:5000
   Total buffers:262144
   Buffers with memory allocated:262144
   Minimum requirement of alloced buffers:0
   VMs using cache:9
   VMs using active cache:9
   Stats counters:Data cache counters {
      Active buffers:0
      Buffer invalidations:0
      Evicts of valid buffers:0
      Evicts of valid buffers during getfreebuf:0
      VM issued read io count:1003241
      VM issued write io count:291538
      Read io count within cache limits:1001264
      Write io count within cache limits:291538
      Backend io read count:270005
      Backend io write count:291445
      Abort count:0
      Reset count:170
      Digest not found count:213818
      Skipped read count:1977
      Backened io read failures:0
      Getfreebuf failures:0
      On demand buf alloc failures:0
      Deleted during fetch count:0
      Transit during fetch count:168814
      Transit Pop count:168814
      Transit waits:0
      Timeouts in buffer transit waits:0
      Transit errors:0
      Cache fill io errors:0
      Copy to user buf errors:0
      Races in getting free buffers:56868
      Reclaimed buffers:0
   }
}
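
As a rough illustration, you can estimate the CBRC read hit rate from the counters above; treating "Read io count within cache limits" as cache-eligible reads and "Backend io read count" as misses is my interpretation of these counters:

#!/usr/bin/python
# Rough cache-hit estimate from the dcacheStats counters shown above.
eligible_reads = 1001264   # Read io count within cache limits
backend_reads = 270005     # Backend io read count (treated as misses)

hit_rate = 100.0 * (eligible_reads - backend_reads) / eligible_reads
print "approximate CBRC read hit rate: %.1f%%" % hit_rate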

Nutanix NFS VAAI troubleshooting

vStorage APIs for Array Integration (VAAI) is a feature introduced in ESXi that provides hardware acceleration functionality. It enables your host to offload specific virtual machine and storage management operations to compliant storage hardware. With the storage hardware assistance, your host performs these operations faster and consumes less CPU, memory, and storage fabric bandwidth.

NAS DISK-LIB primitives:
1. Full/Fast File Clone, used to copy or migrate data within the same physical array
2. Reserve Space (Nutanix 3.5)
3. Extended Statistics

To verify that the Nutanix VAAI plugin is installed:

~ # esxcli software vib list |grep vaai
nfs-vaai-plugin                1.1-101                             Nutanix   CommunitySupported  2013-05-12

~ # esxcli storage nfs list
Volume Name   Host         Share    Accessible  Mounted  Read-Only  HW Acceleration
------------  -----------  -------  ----------  -------  ---------  ---------------
NFS-StorageA  192.168.5.2  /NFS-A   true        true     false      Supported


NFS uses the Full File Clone primitive, while block storage uses the XCOPY primitive.
Storage vMotion and hot cloning cannot leverage the Full File Clone primitive;
it can be leveraged only when a cold migration is initiated.

The Reserve Space primitive is implemented in the Nutanix 3.5 software version.
More details in:
http://cormachogan.com/2012/11/08/vaai-comparison-block-versus-nas/
VMware KB 1021976
Troubleshooting:
A VAAI clone will fail if:
1. the VM has a snapshot
2. the VM is powered on
3. the container name has a space or other special characters
4. cloning is done between datastores

esxtop:
First press 'u' to select the device view, then enable fields o and p for the VAAI stats.

VAAI logs:

grep -i VAAI /var/log/vpxa.log and /var/log/vmkernel.log

Possible errors in /var/log/vpxa.log and /var/log/vmkernel.log:
Error:
2012-06-21T01:30:09.165Z [25DAEB90 info 'DiskLib' opID=B2F4E13F-00001118-66]
DISKLIB-LIB   : Failed to create native clone on destination handle :
One of the parameters supplied is invalid (1).


Cause:
  • Cloning to a different datastore
  • Creating a clone from a VM that has a snapshot
Error:


2012-07-18T23:15:49.378Z [4AB51B90 info 'DiskLib' opID=55E1C10D-00000188-3a]
DISKLIB-LIB   : Failed to create native clone on destination handle :
The system cannot find the file specified (25).

Cause:
  • A container name that contains a space
  • A container name that was changed
Error:


2012-07-19T01:59:02.865Z [362ACB90 info 'DiskLib' opID=4A5CFD5E-00006113-5f]
DISKLIB-LIB   : Failed to create native clone on destination handle :
The specified feature is not supported by this version (24).


Cause:
  • Cloning from the local datastore (VMFS)
You can also check vmware.log:

2013-07-18T04:09:50.903Z| vcpu-0| SNAPSHOT: SnapshotDumperOpenFromInfo: Creating checkpoint file /vmfs/volumes/78993998-9e053604/ENCmsOLY1007-SCCM/ENCmsOLY1007-Snapshot168.vmsn
2013-07-18T04:09:51.494Z| vcpu-0| nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
2013-07-18T04:09:51.526Z| vcpu-0| DISKLIB-VMFS  : "/vmfs/volumes/78993998-9e053604/ENCmsOLY1007-SCCM/ENCmsOLY1007-000002-delta.vmdk" : open successful (29) size = 1618106368, hd = 0. Type 8
2013-07-18T04:09:52.255Z| vcpu-0| DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1



    How much detail can you get about a VM from your storage?

    Most storage vendors say that they are VM aware, but it is difficult for centralized storage vendors to identify VMs; even getting host-level statistics is a pain, because you need to map a LUN to a WWNN and then the WWNN to a host.

    With Nutanix it is a breeze, because it is a converged platform and Nutanix is VM aware. In addition to statistics, Nutanix localizes data based on where the VM is accessing it from. This is a quick
    overview; it is in no way a complete list of the stats we can get.

    From the CLI:

    ncli vm list -- lists the running VMs and shows the CPU, memory, and vdisks configured
    ncli vdisk ls vm-name="name of the VM"
    ncli vm ls-stats name="name of the VM"

    Example:
    Snippet of ncli vm ls
       ID                        : 50160a6e-d5c2-041d-7a2d-541530f8c86b
        Name                      : nfs-ubu-stress-Colossus09-1-4
        VM IP Addresses           :
        Hypervisor Host ID        : 3
        Hypervisor Host Name      : 10.3.177.183
        Memory (MB)               : 4096
        Virtual CPUs              : 2
        VDisk Count               : 1
        VDisks                    : NFS:19812

    ncli vm ls-stats name=nfs-ubu-stress-Colossus09-1-20
        Name                      : nfs-ubu-stress-Colossus09-1-20
     VM IP Addresses           : 10.3.58.235
        Hypervisor Host ID        : 746301033
        Memory (MB)               : 4096
        Virtual CPUs              : 2
        Disk Bandwidth (Kbps)     : 25230
        Network Bandwidth (Kbps)  : 0
        Latency (micro secs)      : 2215
        CPU Usage Percent         : 100%
        Memory Usage              : 1.02 GB (1,090,516,000 bytes)


    GUI:




    From REST API:


     nutanix@NTNX-450-A-CVM:10.1.59.66:~$ cat test_resp.py
    #!/usr/bin/python
    import json as json
    import requests

    def main():
      base_url = "https://colossus09-c1.corp.nutanix.com:9440/PrismGateway/services/rest/v1/"
      s = requests.Session()
      s.auth = ('admin', 'admin')
      s.headers.update({'Content-Type': 'application/json; charset=utf-8'})

      print s.get(base_url + 'vms/50169534-35e1-a1de-c23e-1d1135151293', verify=False).json()
    # A GET on 'vms' returns all the VMs; you can then GET a specific VM by its ID
    if __name__ == "__main__":
      main()

    run test_resp.py


    Output for one VM:

    {
    "vmId": "50169534-35e1-a1de-c23e-1d1135151293",
    "powerState": "on",
    "vmName": "nfs-ubu-stress-Colossus09-1-4",
    "guestOperatingSystem": "Ubuntu Linux (64-bit)",
    "ipAddresses": [],
    "hostName": "10.3.177.183",
    "hostId": 3,
    "memoryCapacityInMB": 4096,
    "memoryReservedCapacityInMB": 0,
    "numVCpus": 2,
    "cpuReservedInHz": 0,
    "numNetworkAdapters": 1,
    "nutanixVirtualDisks": [
    "/ctr1/nfs-ubu-stress-Colossus09-1-4/nfs-ubu-stress-Colossus09-1-4.vmdk"
    ],
    "vdiskNames": [
    "NFS:18594"
    ],
    "vdiskFilePaths": [
    "/ctr1/nfs-ubu-stress-Colossus09-1-4/nfs-ubu-stress-Colossus09-1-4-flat.vmdk"
    ],
    "diskCapacityInBytes": 53687091200,
    "timeStampInUsec": 1375472003986000,
    "protectionDomianName": null,
    "consistencyGroupName": null,
    "stats": {
    "hypervisor_memory_usage_ppm": "330000",
    "avg_io_latency_usecs": "218757",
    "write_io_ppm": "1000000",
    "seq_io_ppm": "411998",
    "read_io_ppm": "0",
    "hypervisor_num_transmitted_bytes": "-1",
    "hypervisor_num_received_bytes": "-1",
    "total_transformed_usage_bytes": "0",
    "hypervisor_avg_read_io_latency_usecs": "0",
    "hypervisor_num_write_io": "15760",
    "num_iops": "113",
    "random_io_ppm": "588001",
    "total_untransformed_usage_bytes": "-1",
    "avg_read_io_latency_usecs": "-1",
    "io_bandwidth_kBps": "18807",
    "hypervisor_avg_io_latency_usecs": "6000",
    "hypervisor_num_iops": "788",
    "hypervisor_cpu_usage_ppm": "460000",
    "hypervisor_io_bandwidth_kBps": "31636"
    }
    }
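
    As a sketch, you can pull just a few fields out of that response instead of printing the whole document (assuming the response parses to a single dict as shown above):

    #!/usr/bin/python
    # Sketch: print a few per-VM stats from the vms/<id> response above.
    import requests

    base_url = "https://colossus09-c1.corp.nutanix.com:9440/PrismGateway/services/rest/v1/"
    s = requests.Session()
    s.auth = ('admin', 'admin')

    vm = s.get(base_url + 'vms/50169534-35e1-a1de-c23e-1d1135151293', verify=False).json()
    stats = vm['stats']
    # VM name, IOPS, bandwidth (kBps), and average IO latency (usecs)
    print vm['vmName'], stats['num_iops'], stats['io_bandwidth_kBps'], stats['avg_io_latency_usecs']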









    Unable to enable HA on one ESXi host

    Problem Description:
    Host ABCD (x.y.z.150) is unable to start vSphere HA.  The current state is "vSphere HA Agent Unreachable".  I have tried to start HA twice, but this did not resolve the issue.
    KBs to review:

    Logs to look for in ESXi:  /var/log/vpxa.log and /var/log/fdm.log ( /var/run/log)

    fdm.log snippet:

    2013-08-04T21:01:18.968Z [FFDD3B90 error 'Cluster' opID=SWI-79b9207c] [ClusterDatastore::DoAcquireDatastoreWork] open(/vmfs/volumes/9e9989cf-f687e31c/.vSphere-HA/FDM-F78AC28A-8862-48C5-BC1C-F369CCABE58E-1480-9c9b8fc-ANTHMSASVC5/protectedlist)failed: Device or resource busy
    2013-08-04T21:01:44.224Z [38498B90 error 'Default' opID=SWI-4593d696] SSLStreamImpl::BIOWrite (0d3fe098) Write failed: Broken pipe
    2013-08-04T21:01:44.224Z [38498B90 error 'Default' opID=SWI-4593d696] SSLStreamImpl::DoClientHandshake (0d3fe098) SSL_connect failed with BIO Error
    2013-08-04T21:01:44.224Z [38498B90 error 'Message' opID=SWI-4593d696] [MsgConnectionImpl::FinishSSLConnect] Error N7Vmacore3Ssl12SSLExceptionE(SSL Exception: BIO Error) on handshake

    Workaround:

    - Check if there are high latencies on the storage
    - Run services.sh restart
    - Re-enable/refresh HA in vCenter

    Errors on FDM.log:

    2013-08-04T23:23:33.577Z [FFF18B90 verbose 'Cluster' opID=SWI-5a8f10c4] [ClusterManagerImpl::IsBadIP] x.y.z.199 is bad ip
    2013-08-04T23:23:34.578Z [FFF18B90 verbose 'Cluster' opID=SWI-5a8f10c4] [ClusterManagerImpl::IsBadIP] x.y.z.199 is bad ip

    Workaround:

    On x.y.z.199, review the fdm.log and run services.sh restart. (You could also disconnect and reconnect the host, which restarts the services, but I find services.sh restart fixes more issues.)





    Introducing NEDE ( Nutanix Elastic Dedup Engine)

    The Nutanix Elastic Dedup Engine is:
    • software driven,
    • a scalable, distributed, inline data reduction technology for the flash and cache tiers,
    • fingerprinting on sequential writes,
    • deduplicating in RAM/flash on reads.

    Nutanix software is so modular that the development team built this module and plugged it into the existing modules (cassandra/stargate).
    NEDE uses the NoSQL database for indexing, using the already existing extent group map keyspace; we did not even have to create a new keyspace.

    Next steps:
    NEDE will utilize the scale-out MapReduce technology already existing in NDFS for offline deduplication.

    Nutanix DR will use NEDE to reduce the amount of data transferred across the WAN.

    NEDE eliminates duplicate 4K blocks, so we have more space available in the hot tier and in memory for unique blocks.

    We have seen the need for it as more and more of our users migrate VMs from
    other storage vendors to us (where the VAAI plugin snapshots or Linked Clones cannot be taken advantage of) and want to reduce the usage of the hot tier. CBRC (http://myvirtualcloud.net/?p=3094) is of some help, but a 2 GB cache is not enough.

    Convergence and dedup: the dedup engine is distributed, and the data and indexing are localized
    because it is VM aware, so there is less network utilization than with a centralized storage/SAN/NAS solution.

    Nutanix uses fixed-block dedup because a vmdk is a block device that is formatted by the guest OS with a specific block size. For example, Windows NTFS uses a 4K block size, so dedup uses a fixed 4K block size.


    Content Cache: (new with this feature) refers to the <SHA1 Fingerprint> to <chunkData> cache. This will be on RAM+flash.


    Outline:
    • On sequential writes, compute the SHA1 of each 4K block and store it in the keyspace.
    • On reads, if the SHA1 index already exists, serve the data from the Content Cache; otherwise read from the Extent Store and populate the cache.

    Overhead due to dedup:

    Computing the index adds about 10% CPU overhead. The additional storage for the index in the table is
    less than 1%.
     
    Write path overview:

    Sequential writes are fingerprinted at a 4K chunk size with SHA1.


    Read Path Overview (a small sketch follows this list):

    • Check the extent cache for the data.
    • If it is not there and the extent has fingerprints, check the Content Cache.
    • If it is not there either, read from the Extent Store.
    • If the extent has fingerprints in the egroup map, populate the read into the Content Cache;
    • else populate the read into the extent cache.
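
    As a rough illustration of that lookup order (not stargate code), here is a minimal sketch with plain dicts standing in for the two caches:

    #!/usr/bin/python
    # Conceptual sketch of the dedup read path above; dicts stand in for the
    # extent cache and the SHA1-fingerprint content cache.
    import hashlib

    extent_cache = {}    # non-fingerprinted data, keyed by offset
    content_cache = {}   # SHA1 fingerprint -> 4K chunk data

    def fingerprint(chunk):
      # On sequential writes, 4K chunks are fingerprinted with SHA1.
      return hashlib.sha1(chunk).hexdigest()

    def read_chunk(offset, fp, read_from_extent_store):
      # 1. Check the extent cache for the data.
      if offset in extent_cache:
        return extent_cache[offset]
      # 2. If the chunk has a fingerprint, check the content cache.
      if fp is not None and fp in content_cache:
        return content_cache[fp]
      # 3. Otherwise read from the extent store.
      data = read_from_extent_store(offset)
      # 4. Populate the content cache if fingerprinted, else the extent cache.
      if fp is not None:
        content_cache[fp] = data
      else:
        extent_cache[offset] = data
      return data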

    We have further LRU and hash tables, single touch and multiple  touch LRUs for memory
    and flash chunks. I will explain this more later.

    Gflags configurable:
    • stargate_content_cache_max_flash
    • stargate_content_cache_max_memory
    • stargate_content_cache_single_touch_memory_pct
    • stargate_content_cache_single_touch_flash_pct

    Metrics

    http://<CVM Name or IP>:2009/h/vars?regex=stargate.dedup








    http://<CVM Name or IP>:2009/h/vars?regex=content_cache
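
    A small sketch to pull those counters with the requests module instead of a browser (plain HTTP on port 2009; the CVM IP is an example):

    #!/usr/bin/python
    # Fetch the stargate dedup and content-cache counters from the 2009 page.
    import requests

    cvm_ip = "10.3.101.59"   # example CVM IP, substitute your own
    for regex in ("stargate.dedup", "content_cache"):
      r = requests.get("http://%s:2009/h/vars" % cvm_ip, params={"regex": regex})
      print r.text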


     

    I will expand this blog after VMworld 2013 with the various terminologies used, and also with
    how this helps boot storms and virus scans, and what percentage of hot-tier usage is reduced.

    Config: create a container with the fingerprint option enabled:
        ID                        : 38608
        Name                      : dedup-test-container
         VStore Name(s)            : dedup-test-container
        Random I/O Pri Order      : SSD-PCIe,SSD-SATA,DAS-SATA
        Sequential I/O Pri Order  : SSD-PCIe,SSD-SATA,DAS-SATA
        Oplog Configured          : true
        Oplog Highly Available    : true
        Fingerprint On Write      : on





    What does the node removal process do in the background?

    Note: these two commands should work and have been tested in QA:
    ncli host start-remove id=X
    ncli host get-remove-status
    If ncli host remove-finish id=X says that the host is MARKED_FOR_REMOVAL_BUT_NOT_DETACHABLE,
    look at dynamic_ring_changer.out, dynamic_ring_changer.INFO, and the Cassandra logs for any crashes
    if it does not complete.

    Please note that the following explains how a node removal works. (Using this procedure without Nutanix Support/Engineering will cause data corruption.)

    In older versions of NOS, if you have to manually remove the node, follow this procedure with
    help from Nutanix Support:

    1. nodetool -h localhost ring - find the token
    2. dynamic remove node
    ring_changer -node_ring_token=O8yDYNgicJraBlfZrHAsORTseZq0MQ2ke6uhuzNh8Y6bMLEH046M7yQPz5q5 -do_delete -dynamic_ring_change=true

    (you can add skip_keyspaces="stats")

    ring_changer -node_ring_token="KlbSuLpdEFJDIoUXp1TPVEwmcYrlo9pJlm4yHemOtpmnBowMcyYQAbTcF8Vh" -dynamic_ring_change=true -skip_keyspaces="stats,alerts_keyspace,alerts_index_keyspace,medusa_extentgroupaccessdatamap,pithos"
     

    3. Look at dynamic_ring_changer.out and  dynamic_ring_changer.INFO, Cassandra Logs for any crashes,
    if it does not complete.

    4. compact and remove stats if there are cassandra crashes

    a. Connect to cassandra using 'cassandra-cli'
        cassandra-cli -h localhost -p 9160 -k stats
    b. Remove the desired rows:
        del stats_realtime['PWhm:vdisk_usage'];
        del stats_realtime['nYeQ:vdisk_derived_usage'];
        del stats_realtime['ha4h:vdisk_perf'];
        del stats_realtime['E0Tl:vdisk_frontend_adapter_perf'];

    c. Exit
        quit;
    d.for i in `svmips`; do echo $i;ssh $i "source /etc/profile;nodetool -h localhost compact";done

    5. run the ring_changer again

    6. Verify all the data has replication 2 (even though the node to be removed is powered off):

    for i in `svmips`; do ssh $i "cd data/stargate-storage/disks; find . -name \*.egroup -print|cut -d/ -f6"; done|sort | uniq -c|grep -v "      2 "

    This prints the extent groups with a replication count other than 2.

    7. Now we can verify zeus_config (zeus_config_printer):

    - make sure all the disks are marked for removal and have their data migrated (
      to_remove: true
      data_migrated: true)

    - the node has kOkToBeRemoved and node_removal_ack: 273 (0x111), where
     0x100 - zookeeper ok to be removed
    0x10 - cassandra ok to be removed
    0x1 - curator ok to be removed
    (see the small sketch after step 9)

    8. Now run
    ncli host remove-finish id=X

    9. nodetool -h localhost removetoken O8yDYNgicJraBlfZrHAsORTseZq0MQ2ke6uhuzNh8Y6bMLEH046M7yQPz5q5
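
    As a small illustration of the node_removal_ack bitmask from step 7 (the component names come from the bit values listed above):

    #!/usr/bin/python
    # node_removal_ack bits: 0x100 zookeeper, 0x10 cassandra, 0x1 curator.
    ZK, CASSANDRA, CURATOR = 0x100, 0x10, 0x1

    ack = 273   # value shown in zeus_config (0x111)
    for name, bit in (('zookeeper', ZK), ('cassandra', CASSANDRA), ('curator', CURATOR)):
      print name, 'ok to be removed' if ack & bit else 'NOT acknowledged'
    print 'all components acknowledged:', ack == (ZK | CASSANDRA | CURATOR)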


    Nutanix cluster start - Troubleshooting

    1. Cluster create -
    cluster discover-nodes
    - network connectivity - all the nodes must be in the same broadcast domain
    - /etc/nutanix/factory_config.json has the right config - node location, serial number
    - avahi-browser (/var/log/messages)
    - genesis is running

    2. Cluster start - three main processes need to work, plus svm boot; the rest of the processes - prism, pithos,
    even stargate - will start up if these work.

    a. Genesis  - Logs in data/logs/genesis.out
    b. Zookeeper - data/logs/zookeeper.out
    c. cassandra - data/logs/cassandra/system.log
    d. svm_boot - /usr/local/nutanix/bootstrap.log
     


    3. Genesis issues:
    a. ESXi password
    b. ESXi network / internal vswitch
    c. Genesis was started as root:
    pkill genesis and rm /home/nutanix/data/locks/genesis, chown genesis.out to nutanix
    d. Wrong zookeeper config:
    E0725 13:15:55.644642 14286 configuration_validator.cc:532] Zeus config check failed: invalid management_server_name 192.168.5.1
    F0725 13:15:55.646738 14286 zeus.cc:698] Check failed: validator->config_valid() logical_timestamp: 44



    4. Zookeeper fails to start:
    a. /etc/hosts on all the hosts does not have the same zookeeper nodes (zk1/zk2 and zk3)
    https://nutanix.atlassian.net/browse/ENG-9697
    b. snapshot corruption
    c. edit_zeus was used and there was a misconfiguration or a zeus_config validation bug


    5. Cassandra failures:
    - SSD tier full / SSD tier not accessible
    - cassandra configs did not reset during cluster destroy -
    ERROR [main] 2012-06-19 14:21:03,868 AbstractCassandraDaemon.java (line 154) Fatal exception during initialization
    org.apache.cassandra.config.ConfigurationException: enable_cluster_name_change is true but saved cluster name '1131' does not match the allowed prior name 'Test Cluster' and the configured name '1132'
    - timestamp in the future


    6. CVM boot failures:
    svm_boot handles disk inventory, formatting and mounting disks, and vmx configs (repopulating vmx disk entries).
    Marker: /tmp/svm_boot_succeeded
    Logs: /usr/local/nutanix/bootstrap/log/svm_boot.log and gen2_svm_boot.log
    /usr/local/nutanix/bootstrap/bin/svm_boot_reboot.dat (this prevents svm_boot from being run)
    It also copies ipmitool to ESXi.

    Nutanix DR troubleshooting

    This is my scratch pad that I use for DR troubleshooting.
     

    Basics:
    Cerebro Master (CM): maintains persistent metadata (PDs/VMs/snapshots), acts on the active PD schedule, replicates snapshots, handshakes with the remote CM, and receives remote snapshots; one per cluster.
    It is elected via zookeeper and registers/starts/stops/unregisters VMs by communicating with hyperint.
    It offloads the replication of individual files to the Cerebro slaves.

    Cerebro slaves: run on every node and receive requests from the master to replicate an NFS file; a request
    might contain a reference NFS file that has been replicated earlier, in which case the slave computes and ships diffs from the reference NFS file.

    Information about the remote cluster is maintained in zookeeper - the IP and port info of the remote cerebro,
    whether a proxy is needed, the mapping between container (datastore) names, and the transfer rate
    (within the maximum allowed).

    The Cerebro slave asks stargate to replicate the file, so it leverages stargate for the actual data transfer.


    It is true DDR (Distributed DR) - all the cerebro slaves and stargates work to replicate the data. Log files to look for:

    - data/logs/cerebro.[INFO,ERROR], data/logs/cerebro.out
    - data/logs/stargate.[INFO,ERROR] - grep -i cerebro

    Three components are mostly used for DR - prism_gateway, cerebro (2020), and stargate (2009);
    these interact with ESXi to register the VM when rolling back.

    Quick topology

     ProtectionDomain(VM:xclone) -- Primary Container:LocalCTR---<primary SVM ips> - ports 2020/2009
    ====== ports 2009/2020 --<remoteDR SVMips> ---RemoteCTR_noSSD (.snapshot directory)



       
    Troubleshooting steps:

    - links http://remote-site-cvm:2009 and 2020 (to all remote CVMs and from all local CVMs) -- do it on both sides.

    Traces:

    http://cvm:2020
    http://cvm:2020/h/traces
    http://cvm:2009/h/traces



    Verify the routes, so that replication traffic goes through the non-production path.
    Quick steps:
    1. Note the container that has the VMs that have to be replicated.

    2. Create a container on the remote site (no SSD).

    3. Create remote sites in both locations:
    ncli> rs ls
        Name                      : backup
        Replication Bandwidth     :
        Remote Address(es)        : remote_cvm1_ip,
    remote_cvm2_ip,etc
        Container Map             : LocalCTR:Remote_DR_CTR
        Proxy Enabled
    4. Create the protection domain on the primary site:
    ncli pd create name="Platinum-1h"

    5. Add the VMs in the local CTR defined in (ncli rs ls) to the protection domain:
    pd protect name="Platinum-1h" vm-names="xclone"
    6. Create a one-time snapshot to verify replication is working:
    pd create-one-time-snapshot name="Platinum-1h" remote-sites="backup" retention-time=36000

    (If you see the oob error repeatedly, pkill cerebro and let it restart, as well as restart prism via bootstrapdaemon - call Nutanix Support.)

    6b. pd set-schedule name="Platinum-1h" interval="3600" remote-sites="backup" min-snap-retention-count=10 retention-policy=1:30 - take a snapshot every hour and retain at least 10 snapshots.

    7. ncli> pd list-replication-status

      ID                        : 16204522
        Protection Domain         : Platinum-1h
        Replication Operation     : Sending
        Start Time                : 09/12/2013 09:51:33 PDT
        Remote Site               : backup
        Snapshot Id               : 16204518
        Aborted                   : false
        Paused                    : false
        Bytes Completed           : 17.57 GB (18,861,841,805 bytes)
        Complete Percent          : 6.0113173


    8. List previous snapshots:


    ncli> pd ls-snaps name=Platinum-1h

        ID                        : 16883844
        Protection Domain         : Platinum-1h
        Create Time               : 09/12/2013 13:52:42 PDT
        Expiry Time               : 09/13/2013 19:52:42 PDT
        Virtual Machine(s)        : 4

            VM Name                   : xclone
            Consistency Group         :xclone
            Power state on recovery   : Powered On

        Replicated To Site(s)     : backup

        ID                        : 17258039
        Protection Domain         : Platinum-1h
        Create Time               : 09/12/2013 15:52:43 PDT
        Expiry Time               : 09/13/2013 21:52:43 PDT
        Virtual Machine(s)        : 4

            VM Name                   : xclone
            Consistency Group         : xclone
            Power state on recovery   : Powered On

            Replicated To Site(s)     : backup

        ID                        : 16967424
        Protection Domain         : Platinum-1h
        Create Time               : 09/12/2013 14:52:42 PDT
        Expiry Time               : 09/13/2013 20:52:42 PDT
        Virtual Machine(s)        : 4

            VM Name                   : xclone
            Consistency Group         : xclone
            Power state on recovery   : Powered On

           
      
     

    Additional notes:
    1. If the remote site is in a different subnet, try links http://remote_host:2009 and :2020 (to all remote CVMs); if it does not work,
    enable the iptables rules:

    for i in `svmips`; do ssh $i "sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2009 -j ACCEPT;sudo service iptables save;sudo /etc/init.d/iptables save"; done
    This opens up port 2009 and needs to be done on both sides. Please repeat for port 2020 as well.

    Run this command on one local CVM as well as one remote CVM to verify the connectivity:

    for cvm in `svmips`; do (echo "From the CVM $cvm:"; ssh -q -t $cvm 'for i in ` source /etc/profile;ncli rs ls |grep Remote|cut -d : -f2|sed 's/,//g'`; do echo Checking stargate and cerebro port connection to $i ...; nc -v -w 2 $i -z 2009; nc -v -w 2 $i -z 2020; done');done

    From the CVM 192.168.3.207:

    Checking stargate and cerebro port connection to 192.168.3.191 ...
    Connection to 192.168.3.191 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.191 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.192 ...
    Connection to 192.168.3.192 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.192 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.193 ...
    Connection to 192.168.3.193 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.193 2020 port [tcp/xinupageserver] succeeded!
    From the CVM 192.168.3.208:

    Checking stargate and cerebro port connection to 192.168.3.191 ...
    Connection to 192.168.3.191 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.191 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.192 ...
    Connection to 192.168.3.192 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.192 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.193 ...
    Connection to 192.168.3.193 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.193 2020 port [tcp/xinupageserver] succeeded!
    From the CVM 192.168.3.209:

    Checking stargate and cerebro port connection to 192.168.3.191 ...
    Connection to 192.168.3.191 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.191 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.192 ...
    Connection to 192.168.3.192 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.192 2020 port [tcp/xinupageserver] succeeded!
    Checking stargate and cerebro port connection to 192.168.3.193 ...
    Connection to 192.168.3.193 2009 port [tcp/news] succeeded!
    Connection to 192.168.3.193 2020 port [tcp/xinupageserver] succeeded!



    If it does not work, enable the iptable rules

    Enable 2020 and 2009 on remote site:
     for i in ` source /etc/profile;ncli  rs ls |grep Remote|cut -d : -f2|sed 's/,//g'`; do ssh $i sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2009 -j ACCEPT;sudo service iptables save; done
     for i in ` source /etc/profile;ncli  rs ls |grep Remote|cut -d : -f2|sed 's/,//g'`; do ssh $i sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2020 -j ACCEPT;sudo service iptables save; done

    Enable it locally:
      for i in `svmips`; do ssh $i "sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2009 -j ACCEPT;sudo service iptables save"; done
     for i in `svmips`; do ssh $i "sudo iptables -t filter -A WORLDLIST -p tcp -m tcp --dport 2020 -j ACCEPT;sudo service iptables save"; done
    Quick verification:

    Verify the directory listing on the remote container:
    sudo mount localhost:/Rep_container /mnt
    cd /mnt/.snapshot
    You should see the replicated files there.


    On the primary site Cerebro master 2020 page (you can find it from http://cvmip:2020 - it will point to the master):

    Cerebro Master

    Protection Domains (1)

    Name         Is Active  Total Consistency Groups  Total Entities  Total Snapshots  Total Remotes  Executing Meta Ops  Tx or Rx KB/s
    Platinum-1h  yes        4                         4               3                1              1                   46561.28

    Remote Bandwidth (KB/s)

    Remote   Tx        Rx
    backup   46561.28

    On the PD webpage (?pd=Platinum-1h):
    Start time: 20130910-21:57:41-GMT-0700
    Build Version: release-congo-3.1.2-stable-101e3a8c77c9b8a2b3173a9e040bc41b339cdb5c
    Build Last Commit Date: 2013-08-27 23:43:31 -0700
    Component ID: 105
    Incarnation ID: 4475694
    Local stargate incarnation ID: 4623290
    Highest allocated opid: 18095
    Highest contiguous completed opid: 18094

    Master

    Protection Domain 'Platinum-1h'
    Is Active                 : yes
    Total Consistency Groups  : 4
    Total Entities            : 4
    Total Snapshots           : 3
    Total Remotes             : 1
    Executing Meta Ops        : 1
    Tx KB/s                   : 54947.84

    Consistency Groups (4)

    Name     Total VMs  Total NFS Files  To Remove
    x-clone  1          0                no

    Snapshots (3)

    Handle                              Start Date                  Finish Date                 Expiry Date                 Total Consistency Groups  Total Files  Total GB  Aborted  Replicated To Remotes
    (3103, 1361475834422355, 16204518)  20130912-09:51:31-GMT-0700  20130912-09:51:32-GMT-0700  20130916-13:51:32-GMT-0700  4                         36           1871      no       -
    (3103, 1361475834422355, 16205073)  20130912-09:51:39-GMT-0700  20130912-09:51:40-GMT-0700  20130912-10:51:40-GMT-0700  4                         36           1871      no       -
    (3103, 1361475834422355, 16507948)  20130912-11:00:14-GMT-0700  20130912-11:00:15-GMT-0700  20130912-12:00:15-GMT-0700  4                         36           1871      no       -

    Meta Ops (1)

    Meta Op Id  Creation Date               Is Paused  To Abort
    16219199    20130912-09:53:45-GMT-0700  no         no

    Snapshot Schedule

    Periodic Schedule

    Out Of Band Schedules
    [no out of band schedules]

    Pending Actions

    replication {
      remote_name: "backup"
      snapshot_handle_vec {
        cluster_id: 3103
        cluster_incarnation_id: 1361475834422355
        entity_id: 16507948
      }
    }

    Latest completed top-level meta ops

    Meta opid  Opcode                    Creation time               Duration (secs)  Attributes                                                                                                        Aborted  Abort detail
    15421792   SnapshotProtectionDomain  20130911-15:07:48-GMT-0700  1                snapshot=(3103, 1361475834422355, 15421793)                                                                       No
    15421797   Replicate                 20130911-15:07:49-GMT-0700  36555            remote=backup snapshot=(3103, 1361475834422355, 15421793) replicated_bytes=2009518144619 tx_bytes=1109779340685   No
    16204509   SnapshotGarbageCollector  20130912-01:17:05-GMT-0700  1                snapshot=(3103, 1361475834422355, 15421793)                                                                       No
    16204517   SnapshotProtectionDomain  20130912-09:51:31-GMT-0700  1                snapshot=(3103, 1361475834422355, 16204518)                                                                       No
    16205072   SnapshotProtectionDomain  20130912-09:51:39-GMT-0700  1                snapshot=(3103, 1361475834422355, 16205073)                                                                       No
    16204522   Replicate                 20130912-09:51:33-GMT-0700  132              remote=backup snapshot=(3103, 1361475834422355, 16204518) replicated_bytes=145570717069 tx_bytes=21713968525      Yes      Received RPC to abort
    16507947   SnapshotProtectionDomain  20130912-11:00:14-GMT-0700  1                snapshot=(3103, 1361475834422355, 16507948)                                                                       No


    Advanced Cerebro Commands:
     cerebro_cli query_protection_domain pd_name list_consistency_groups=true

    Remove a VM from PD
     cerebro_cli remove_consistency_group   pd_name vm_name

    Sample Error
    On cerebro Master:
    grep Silver-24h data/logs/cerebro.*


    cerebro.WARNING:E0913 11:00:59.830462  3361 protection_domain.cc:1601] notification=ProtectionDomainReplicationFailure protection_domain_name=Silver-24hremote_name=backup timestamp_usecs=1379095259228275 reason=Attempt to lookup attributes of path /AML_NFS01/.snapshot/48/3103-1361475834422355-18439248/AML-CTX-WIN7-A failed with NFS error 2

    We found that this error was due to multiple VMs in the PD accessing the same vmdk, because of the independent
    non-persistent disk configuration:
    ./AML-CTX-WIN7-A/AML-CTX-WIN7-A.vmx
    scsi0:0.mode = "independent-nonpersistent"
    ./AML-CTX-WIN7-B/AML-CTX-WIN7-B.vmx
    scsi0:0.mode = "independent-nonpersistent"
    ./AML-CTX-WIN7-C/AML-CTX-WIN7-C.vmx
    scsi0:0.mode = "independent-nonpersistent"
    ./AML-CTX-WIN7-D/AML-CTX-WIN7-D.vmx
    scsi0:0.mode = "independent-nonpersistent"
     From NFS master page (/h/traces/ -- nfs adapter --> error page)



    The following screenshot is the Cerebro master webpage:






    vmware: Unidesk VM on NFS taking about a minute to start booting up.

    Here we look at the NFS traces to find a few inefficiencies of the VMware NFS client.
    Unidesk config:

     




    There is a workaround available (from Nutanix/Unidesk support).

    The workaround is to follow these guidelines:
    1. Ensure the root of the datastore has as few directories as possible.
    2. The desktop VM's files need to be two directories below the root of the datastore.

    Assuming a greenfield setup, this is how to set up the directory structure:

    1. Create one directory for the desktop VMs.
    2. Create one directory for all the CPs and the rest of the infrastructure VMs.
    3. If VMs were automatically deployed, unregister the VMs, move them into the appropriate subdirectory, and re-register them.
    4. Use "synchronize infrastructure" in the Unidesk Management Appliance UI to discover the new VM IDs, which change after re-registering.

    Example of a 3 desktop setup:
    /vmfs/volumes/ctr1/DesktopVMs/VM1
    /vmfs/volumes/ctr1/DesktopVMs/VM2
    /vmfs/volumes/ctr1/DesktopVMs/VM3

    /vmfs/volumes/ctr1/InfraVMs/CachePoint1
    /vmfs/volumes/ctr1/InfraVMs/CachePoint2
    /vmfs/volumes/ctr1/InfraVMs/CachePoint3
    /vmfs/volumes/ctr1/InfraVMs/MA
    /vmfs/volumes/ctr1/InfraVMs/MasterCachePoint
    /vmfs/volumes/ctr1/InfraVMs/InstallationMachine1


    More details on the issue:

    How to capture NFS traffic on Nutanix so that we can review it in Wireshark:

    1. Power off all the VMs other than the CVM and the Unidesk VM, so that we get a clean trace.


    sudo tcpdump -i eth1 port 2049 -w nfs.cap

     


    Boot-up time in vCenter:
    ~Unidesk-TESTVM   Completed  Administrator  pax-vCenter
    Start time: 10/21/2013 1:53:19 PM
    Boot-up complete as per vCenter: 10/21/2013 1:54:13 PM (it still needs to boot the OS; it waits for more than
    40 seconds at 95%)
    Four things I can see that are odd:

    1. The NFS client is looking up the entire / directory rather than focusing on the files.
    (A Unidesk KB specifically addresses how to reduce this issue: http://www.unidesk.com/support/kb/unidesk-configuration-considerations-nfs-based-storage-including-nutanix-your-boot-images)

    It is also looking up sre-theos2 and sretheos1, which are not part of the ~Unidesk directory.



    2. It is creating too many vmBigTestFile files (it should be enough to create one).
    3. REDO logs - looking up, then verifying, then creating the files
    (so it takes a while before the creates are sent);

    repeated create and remove of redo logs (not sure why the ESXi NFS client has to do that).


    4. The workers (4 workers) do not work in parallel; they seem to work serially - deduced
    from Wireshark and the vmware.log file. During a pause, I see another worker is active.


    On our system it took about 56 seconds (it is a 2400 system).
    About 37 seconds of that was opening and closing the vmdks (of which 13-15 seconds was ESXi waiting, after a successful open, to close).
    It is the same as on your system.


    /vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM # grep vmfs vmware.log|head -1
    2013-10-21T20:28:03.795Z| vmx| I120: Command line: "/bin/vmx""-ssched.group=host/user""-#""product=4;name=VMware ESX;version=5.1.0;buildnumber=1065491;licensename=VMware ESX Server;licenseversion=5.0;""-@""duplex=3;msgs=ui""/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM.vmx"
    /vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM # grep vmfs vmware.log|tail -1
    2013-10-21T20:28:40.674Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb" (flags 0x8, type vmfsSparse).




    /vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM # grep Worker#3 vmware.log
    2013-10-21T20:28:04.503Z| Worker#3| I120: DISK: OPEN scsi0:3 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_17.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:04.870Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:04.877Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_17-flat.vmdk" : open successful (14) size = 10737418240, hd = 157193863. Type 3
    2013-10-21T20:28:04.877Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_17-flat.vmdk" (0xe)
    2013-10-21T20:28:04.877Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_17.vmdk' (0xe): vmfs, 20971520 sectors / 10 GB.
    2013-10-21T20:28:04.922Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_17.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:05.847Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:06.030Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:06.033Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:06.656Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_17.vmdk-delta.REDO_grMSJ6" : success
    2013-10-21T20:28:06.706Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6" : creation successful.
    2013-10-21T20:28:07.304Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:07.310Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_17.vmdk-delta.REDO_grMSJ6" : open successful (17) size = 24576, hd = 0. Type 8

    --   2 seconds

    2013-10-21T20:28:09.704Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_17.vmdk-delta.REDO_grMSJ6" : closed.
    2013-10-21T20:28:10.088Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:10.100Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_17.vmdk-delta.REDO_grMSJ6" : open successful (8) size = 24576, hd = 152606383. Type 8
    2013-10-21T20:28:10.100Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_17.vmdk-delta.REDO_grMSJ6" (0x8)
    2013-10-21T20:28:10.100Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6' (0x8): vmfsSparse, 20971520 sectors / 10 GB.
    2013-10-21T20:28:10.100Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:10.100Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 152606383, extentType = 0
    2013-10-21T20:28:10.123Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:10.123Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:10.123Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 157193863, extentType = 2
    2013-10-21T20:28:10.124Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 152606383, extentType = 0
    2013-10-21T20:28:10.149Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6' has UUID '60 00 c2 95 e9 99 ac 11-ff da ee ef 7d d0 cc 3d'
    2013-10-21T20:28:10.149Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_17.vmdk.REDO_grMSJ6' Geo (1305/255/63) BIOS Geo (0/0/0)


    2013-10-21T20:28:10.196Z| Worker#3| I120: DISK: OPEN scsi0:8 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_21.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:10.386Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:10.495Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_21-flat.vmdk" : open successful (14) size = 10737418240, hd = 158897845. Type 3
    2013-10-21T20:28:10.495Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_21-flat.vmdk" (0xe)
    2013-10-21T20:28:10.495Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_21.vmdk' (0xe): vmfs, 20971520 sectors / 10 GB.
    2013-10-21T20:28:10.498Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_21.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:11.418Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:11.514Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_21.vmdk.REDO_QnAph7" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:11.523Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:12.114Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_21.vmdk-delta.REDO_QnAph7" : success
    2013-10-21T20:28:12.258Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_21.vmdk.REDO_QnAph7" : creation successful.
    2013-10-21T20:28:13.161Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:13.168Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_21.vmdk-delta.REDO_QnAph7" : open successful (17) size = 24576, hd = 0. Type 8

     2 seconds

    2013-10-21T20:28:15.207Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_21.vmdk-delta.REDO_QnAph7" : closed.
    2013-10-21T20:28:15.775Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:15.819Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_21.vmdk-delta.REDO_QnAph7" : open successful (8) size = 24576, hd = 141858511. Type 8
    2013-10-21T20:28:15.819Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_21.vmdk-delta.REDO_QnAph7" (0x8)
    2013-10-21T20:28:15.819Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_21.vmdk.REDO_QnAph7' (0x8): vmfsSparse, 20971520 sectors / 10 GB.
    2013-10-21T20:28:15.819Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:15.819Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 141858511, extentType = 0
    2013-10-21T20:28:15.837Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_21.vmdk.REDO_QnAph7" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:15.838Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:15.838Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 158897845, extentType = 2
    2013-10-21T20:28:15.838Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 141858511, extentType = 0
    2013-10-21T20:28:15.876Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_21.vmdk.REDO_QnAph7' has UUID '60 00 c2 95 84 24 a5 34-e5 25 bc b7 82 0f f9 35'
    2013-10-21T20:28:15.876Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_21.vmdk.REDO_QnAph7' Geo (1305/255/63) BIOS Geo (0/0/0)



    2013-10-21T20:28:15.957Z| Worker#3| I120: DISK: OPEN scsi0:12 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_25.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:16.235Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:16.255Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_25-flat.vmdk" : open successful (14) size = 10737418240, hd = 154441437. Type 3
    2013-10-21T20:28:16.255Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_25-flat.vmdk" (0xe)
    2013-10-21T20:28:16.255Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_25.vmdk' (0xe): vmfs, 20971520 sectors / 10 GB.
    2013-10-21T20:28:16.380Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_25.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:17.299Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:17.497Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_25.vmdk.REDO_4Isqws" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:17.500Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:18.249Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_25.vmdk-delta.REDO_4Isqws" : success
    2013-10-21T20:28:18.423Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_25.vmdk.REDO_4Isqws" : creation successful.
    2013-10-21T20:28:19.280Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:19.304Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_25.vmdk-delta.REDO_4Isqws" : open successful (17) size = 24576, hd = 0. Type 8

     2 seconds

    2013-10-21T20:28:21.714Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_25.vmdk-delta.REDO_4Isqws" : closed.
    2013-10-21T20:28:22.184Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:22.189Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_25.vmdk-delta.REDO_4Isqws" : open successful (8) size = 24576, hd = 155686649. Type 8
    2013-10-21T20:28:22.189Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_25.vmdk-delta.REDO_4Isqws" (0x8)
    2013-10-21T20:28:22.189Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_25.vmdk.REDO_4Isqws' (0x8): vmfsSparse, 20971520 sectors / 10 GB.
    2013-10-21T20:28:22.189Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:22.189Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 155686649, extentType = 0
    2013-10-21T20:28:22.222Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_25.vmdk.REDO_4Isqws" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:22.222Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:22.222Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 154441437, extentType = 2
    2013-10-21T20:28:22.222Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 155686649, extentType = 0
    2013-10-21T20:28:22.253Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_25.vmdk.REDO_4Isqws' has UUID '60 00 c2 96 68 ce 90 d3-83 68 6b 87 bc a4 a7 f6'
    2013-10-21T20:28:22.253Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_25.vmdk.REDO_4Isqws' Geo (1305/255/63) BIOS Geo (0/0/0)



    2013-10-21T20:28:22.297Z| Worker#3| I120: DISK: OPEN scsi1:2 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_2.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:22.868Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:23.843Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_2-flat.vmdk" : open successful (14) size = 10737418240, hd = 151295754. Type 3
    2013-10-21T20:28:23.843Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_2-flat.vmdk" (0xe)
    2013-10-21T20:28:23.843Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_2.vmdk' (0xe): vmfs, 20971520 sectors / 10 GB.
    2013-10-21T20:28:23.907Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_2.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:24.402Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:24.483Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:24.532Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:25.310Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_2.vmdk-delta.REDO_yXB1Cf" : success
    2013-10-21T20:28:25.352Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf" : creation successful.
    2013-10-21T20:28:26.020Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:26.026Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_2.vmdk-delta.REDO_yXB1Cf" : open successful (17) size = 24576, hd = 0. Type 8

     2 seconds
    2013-10-21T20:28:28.685Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_2.vmdk-delta.REDO_yXB1Cf" : closed.
    2013-10-21T20:28:29.066Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:29.097Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_2.vmdk-delta.REDO_yXB1Cf" : open successful (8) size = 24576, hd = 140744474. Type 8
    2013-10-21T20:28:29.097Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_2.vmdk-delta.REDO_yXB1Cf" (0x8)
    2013-10-21T20:28:29.097Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf' (0x8): vmfsSparse, 20971520 sectors / 10 GB.
    2013-10-21T20:28:29.097Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:29.097Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 140744474, extentType = 0
    2013-10-21T20:28:29.116Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:29.116Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:29.116Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 151295754, extentType = 2
    2013-10-21T20:28:29.116Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 140744474, extentType = 0
    2013-10-21T20:28:29.139Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf' has UUID '60 00 c2 9a 23 51 be 1d-8c 5a 0e c8 72 31 85 d2'
    2013-10-21T20:28:29.139Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_2.vmdk.REDO_yXB1Cf' Geo (1305/255/63) BIOS Geo (0/0/0)
    2013-10-21T20:28:29.235Z| Worker#3| I120: DISK: OPEN scsi1:6 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_6.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:29.644Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:29.651Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_6-flat.vmdk" : open successful (14) size = 10737418240, hd = 199005995. Type 3
    2013-10-21T20:28:29.651Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_6-flat.vmdk" (0xe)
    2013-10-21T20:28:29.651Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_6.vmdk' (0xe): vmfs, 20971520 sectors / 10 GB.
    2013-10-21T20:28:29.656Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_6.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:30.684Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:30.766Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:30.777Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:31.230Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_6.vmdk-delta.REDO_oUoy4b" : success
    2013-10-21T20:28:31.423Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b" : creation successful.
    2013-10-21T20:28:32.308Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:32.341Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_6.vmdk-delta.REDO_oUoy4b" : open successful (17) size = 24576, hd = 0. Type 8

    --- 2 seconds ---
    2013-10-21T20:28:34.457Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_6.vmdk-delta.REDO_oUoy4b" : closed.
    2013-10-21T20:28:34.717Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:34.722Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_6.vmdk-delta.REDO_oUoy4b" : open successful (8) size = 24576, hd = 181901123. Type 8
    2013-10-21T20:28:34.722Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_6.vmdk-delta.REDO_oUoy4b" (0x8)
    2013-10-21T20:28:34.722Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b' (0x8): vmfsSparse, 20971520 sectors / 10 GB.
    2013-10-21T20:28:34.722Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:34.722Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 181901123, extentType = 0
    2013-10-21T20:28:34.769Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:34.769Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:34.769Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 199005995, extentType = 2
    2013-10-21T20:28:34.769Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 181901123, extentType = 0
    2013-10-21T20:28:34.805Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b' has UUID '60 00 c2 9c 9e 2d 8c 87-b7 d5 50 37 2a 7d 39 e7'
    2013-10-21T20:28:34.805Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_6.vmdk.REDO_oUoy4b' Geo (1305/255/63) BIOS Geo (0/0/0)


    2013-10-21T20:28:34.897Z| Worker#3| I120: DISK: OPEN scsi1:14 '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_13.vmdk' independent-nonpersistent R[]
    2013-10-21T20:28:35.334Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:35.348Z| Worker#3| I120: DISKLIB-VMFS  : "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_13-flat.vmdk" : open successful (14) size = 3758096384, hd = 146904906. Type 3
    2013-10-21T20:28:35.348Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_13-flat.vmdk" (0xe)
    2013-10-21T20:28:35.348Z| Worker#3| I120: DISKLIB-LINK  : Opened '/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_13.vmdk' (0xe): vmfs, 7340032 sectors / 3.5 GB.
    2013-10-21T20:28:35.438Z| Worker#3| I120: DISKLIB-LIB   : Opened "/vmfs/volumes/2fbb0409-ee3410bd/~Unidesk-TESTVM/~Unidesk-TESTVM_13.vmdk" (flags 0xe, type vmfs).
    2013-10-21T20:28:35.876Z| Worker#3| I120: DISKLIB-LIB   : DiskLibCreateCreateParam: vmfssparse grain size set to : 1
    2013-10-21T20:28:35.954Z| Worker#3| I120: DISKLIB-LIB   : CREATE CHILD: "./~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb" -- vmfsSparse cowGran=0 allocType=0
    2013-10-21T20:28:36.002Z| Worker#3| I120: CREATE-CHILD: Creating disk backed by 'default'
    2013-10-21T20:28:36.632Z| Worker#3| I120: DISKLIB-VMFS_SPARSE : VmfsSparseExtentCreate: "./~Unidesk-TESTVM_13.vmdk-delta.REDO_6uPnSb" : success
    2013-10-21T20:28:36.780Z| Worker#3| I120: DISKLIB-DSCPTR: "./~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb" : creation successful.
    2013-10-21T20:28:37.618Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:37.655Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_13.vmdk-delta.REDO_6uPnSb" : open successful (17) size = 12288, hd = 0. Type 8

    --- 3 seconds ---

    2013-10-21T20:28:40.221Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_13.vmdk-delta.REDO_6uPnSb" : closed.
    2013-10-21T20:28:40.562Z| Worker#3| I120: nutanix_nfs_plugin: Established VAAI session with NFS server 192.168.5.2
    2013-10-21T20:28:40.594Z| Worker#3| I120: DISKLIB-VMFS  : "./~Unidesk-TESTVM_13.vmdk-delta.REDO_6uPnSb" : open successful (8) size = 12288, hd = 157652962. Type 8
    2013-10-21T20:28:40.594Z| Worker#3| I120: DISKLIB-DSCPTR: Opened [0]: "~Unidesk-TESTVM_13.vmdk-delta.REDO_6uPnSb" (0x8)
    2013-10-21T20:28:40.594Z| Worker#3| I120: DISKLIB-LINK  : Opened './~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb' (0x8): vmfsSparse, 7340032 sectors / 3.5 GB.
    2013-10-21T20:28:40.594Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 1, numSubChains = 1
    2013-10-21T20:28:40.594Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 157652962, extentType = 0
    2013-10-21T20:28:40.674Z| Worker#3| I120: DISKLIB-LIB   : Opened "./~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb" (flags 0x8, type vmfsSparse).
    2013-10-21T20:28:40.674Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain: numLinks = 2, numSubChains = 1
    2013-10-21T20:28:40.674Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(0) fid = 146904906, extentType = 2
    2013-10-21T20:28:40.674Z| Worker#3| I120: DISKLIB-CHAINESX : ChainESXOpenSubChain:(1) fid = 157652962, extentType = 0
    2013-10-21T20:28:40.757Z| Worker#3| I120: DISK: Disk './~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb' has UUID '60 00 c2 96 78 0e 37 cd-a2 10 ae b7 8f 64 f0 68'
    2013-10-21T20:28:40.757Z| Worker#3| I120: DISK: OPEN './~Unidesk-TESTVM_13.vmdk.REDO_6uPnSb' Geo (456/255/63) BIOS Geo (0/0/0)

    No SAN, No Cry

    "No SAN No Cry"

    No, SAN, no cry;
    No, SAN, no cry;
    No, SAN, no cry;
    No, SAN, no cry.

    'Cause - 'cause - 'cause I remember when we used to sit
    In a dark server room in Trenchtown,
    Oba - obaserving the bad pWWN - yeah! -
    Zoned with the good pWWN, yeah!
    Storage Admins we have, oh, SAN Admins we have lost
    Along the way, yeah!
    In this great Nutanix converged future, you can forget your pLOGI;
    So dismantle your Storage Array, I seh. Yeah!

    No, SAN, no cry;
    No, SAN, no cry. Eh, yeah!
    Storage admin', don't shed no tears:
    No, SAN, no cry. Eh!

    Said - said - said I remember when we used to sit
    In the dark server room in Trenchtown, yeah!
    And then SAN Admin would make the Fiber Channel no shut,
    I seh, PRLI crashing all through the nights, yeah!
    Then SAN admin would reassign the LUN, say,
    Of which to a proper server pWWN, yeah!
    Now your compute is your storage
    And so I've got Nutanix Converged Architecture.
    Oh, while SAN is gone,
    Everything's gonna be all right!
    Everything's gonna be all right!
    Everything's gonna be all right, yeah!
    Everything's gonna be all right!
    Everything's gonna be all right-a!
    Everything's gonna be all right!
    Everything's gonna be all right, yeah!
    Everything's gonna be all right!

    So no, SAN, no cry;
    No, SAN, no cry.
    I seh, O Server Admin - O Storage Admin', don't shed no tears;
    No, SAN, no cry, eh.

    No, SAN - no, SAN - no, SAN, no cry;
    No, SAN, no cry.
    One more time I got to say:
    O Server Admin - Storage Admin', please don't shed no tears;
    No, SAN, no cry.

    Enabling, Monitoring and Disabling Dedup

    dedup info:
    http://nutanix.blogspot.com/2013/08/introducing-nede-nutanix-elastic-dedup.html

    Enabling Dedup

    1. Fingerprint-on-write can be enabled per container or per vdisk:
    - Per container: ncli ctr edit name=xyz fingerprint-on-write=on
    - Per vdisk: ncli vdisk edit name=NFS:2389 fingerprint-on-write=on


    2. When you upgrade to 3.5.1, enable dedup right away if you need it. Because fingerprinting is done on write, data written before dedup is enabled will not be fingerprinted and will not benefit from dedup (unless the vdisk manipulator tool is used later).

    Enabling it at upgrade time works well because Curator converts 16 MB extent groups into 4 MB extent groups, which generates a lot of write activity that gets fingerprinted along the way.

    3. Increase the Stargate gflag medusa_extent_group_id_map_cache_size_mb to 2048 to reduce evictions. Make sure the CVM has 24 GB of memory first.

    4. Content cache and extent cache are allocated based on CVM memory (ENG-10798)


     Monitoring Dedup example: the 3.5.2 Prism GUI has dedup stats; on earlier versions you have to look around in the Stargate 2009 page and the Curator master logs.

    #0. Overall container usage and amount of data which is fingerprinted:

    On the Curator master (data/logs/curator.INFO):
    I1203 12:27:53.777602  curator_execute_job_op.cc:2452] ContainerEgroupFileSizeBytes[912] = 4168785526784
    I1203 12:27:53.777606  curator_execute_job_op.cc:2452] ContainerEgroupFileSizeBytes[107907] = 1100420612096
    I1203 12:27:53.777609  curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[912] = 4136075722752
    I1203 12:27:53.777613  curator_execute_job_op.cc:2452] ContainerUntransformedSizeBytes[107907] = 1100301139968
    I1203 12:27:53.777616  curator_execute_job_op.cc:2452] ContainerInternalGarbageBytes[912] = 32709804032
    I1203 12:27:53.777621  curator_execute_job_op.cc:2452] ContainerInternalGarbageBytes[107907] = 119472128
    I1203 12:27:53.777623  curator_execute_job_op.cc:2452] ContainerFingerprintedBytes[912] =  3699631161344
    From the above, the customer has two containers (912 and 107907), and 912 is the one with fingerprint-on-write enabled. Almost 90% of it is fingerprinted (roughly 3.7 TB out of 4.1 TB).
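
    The ratio above can be pulled straight out of the Curator master log. Below is a minimal sketch that does the division for you; it assumes the counter lines look exactly like the sample above and that the log lives at the usual CVM location (adjust the path for your environment).

#!/usr/bin/python
# Minimal sketch: summarize fingerprinted vs. untransformed bytes per container
# from the Curator master log. Assumes the counter lines look like the sample
# above; the log path is the usual CVM location and may need adjusting.
import re
from collections import defaultdict

LOG = "/home/nutanix/data/logs/curator.INFO"
pattern = re.compile(r"Container(UntransformedSize|Fingerprinted)Bytes\[(\d+)\] =\s*(\d+)")

stats = defaultdict(dict)
for line in open(LOG):
    m = pattern.search(line)
    if m:
        kind, ctr, value = m.group(1), m.group(2), int(m.group(3))
        stats[ctr][kind] = value  # keep the most recent value per container

for ctr, s in sorted(stats.items()):
    total = s.get("UntransformedSize", 0)
    fp = s.get("Fingerprinted", 0)
    pct = 100.0 * fp / total if total else 0.0
    print "Container %s: %.1f%% fingerprinted (%d of %d bytes)" % (ctr, pct, fp, total)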
    #1. Read Path Live (30sec) Metrics
    wget -O- 'http://localhost:2009/h/vars?regex=stargate.content&format=text'
    stargate/content_cache_adds 2921
    stargate/content_cache_dedup_ref_count 2.4583     <--- effective RAM/Flash savings right now is ~2.5x
    stargate/content_cache_evictions_flash 9815
    stargate/content_cache_evictions_memory 15969
    stargate/content_cache_flash_page_in_pct 4
    stargate/content_cache_flash_spills 9815
    stargate/content_cache_hits_pct 98                         <---- good
    stargate/content_cache_lookups 261701
    stargate/content_cache_multi_touch_flash_max 21474836480
    stargate/content_cache_multi_touch_flash_usage 21474836480
    stargate/content_cache_multi_touch_memory_max 1899180856
    stargate/content_cache_multi_touch_memory_usage 1900703744
    stargate/content_cache_page_in_from_flash 13062
    stargate/content_cache_single_touch_flash_max 0
    stargate/content_cache_single_touch_flash_usage 0
    stargate/content_cache_single_touch_memory_max 474795208
    stargate/content_cache_single_touch_memory_usage 473272320
    stargate/content_cache_usage_flash_mb 20480               <---- 20G flash
    stargate/content_cache_usage_memory_mb 2264            <---- 2.2G RAM
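
    These counters can also be read programmatically instead of via wget. Here is a minimal sketch using the requests module from the earlier Python post; it assumes it runs on a CVM (so http://localhost:2009 is reachable) and only reads the counters shown above.

#!/usr/bin/python
# Minimal sketch: poll the Stargate 2009 stats page for the content cache /
# dedup counters shown above. Assumes this runs on a CVM where
# http://localhost:2009 is reachable and the requests module is installed.
import requests

resp = requests.get("http://localhost:2009/h/vars",
                    params={"regex": "stargate.content", "format": "text"})
metrics = {}
for line in resp.text.splitlines():
    parts = line.split()
    if len(parts) == 2:
        metrics[parts[0]] = parts[1]

print "cache hit pct       :", metrics.get("stargate/content_cache_hits_pct")
print "dedup ref count     :", metrics.get("stargate/content_cache_dedup_ref_count")
print "cache usage (RAM)   :", metrics.get("stargate/content_cache_usage_memory_mb"), "MB"
print "cache usage (flash) :", metrics.get("stargate/content_cache_usage_flash_mb"), "MB"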

    #2. Medusa ExtentGroupId Map (where SHA1s are kept) hit/miss info:
    wget -O- 'http://localhost:2009/h/vars?regex=medusa.cache.extent_group_id_map&format=text'
    medusa/cache/extent_group_id_map/current_size_bytes 268430856
    medusa/cache/extent_group_id_map/entries 9496
    medusa/cache/extent_group_id_map/evictions 293020 <<< to reduce evictions increase the extent_group_id cache.
    medusa/cache/extent_group_id_map/hits   28665029
    medusa/cache/extent_group_id_map/misses 923660
    medusa/cache/extent_group_id_map/insertions 9417038
    medusa/cache/extent_group_id_map/max_size_bytes 268435456  ---> 256MB cache size for this map
    Medusa Extent ID map hit ratio is 96.8% (28665029/(28665029+923660)). 
    After increasing medusa extent id cache to 2G:

    medusa/cache/extent_group_id_map/current_size_bytes 1644951348
    medusa/cache/extent_group_id_map/entries 75678 (increased)
    medusa/cache/extent_group_id_map/evictions 0 ( zero evictions)
    medusa/cache/extent_group_id_map/hits 71550917
    medusa/cache/extent_group_id_map/insertions 24404990
    medusa/cache/extent_group_id_map/max_size_bytes 2147483648
    medusa/cache/extent_group_id_map/misses 875467    (reduced)
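
    To decide whether the medusa_extent_group_id_map_cache_size_mb bump from step 3 is needed, you can read the same counters programmatically and look at the hit ratio and evictions. A minimal sketch, again assuming it runs on a CVM with requests installed and localhost:2009 reachable:

#!/usr/bin/python
# Minimal sketch: compute the Medusa extent group id map cache hit ratio and
# report evictions, using the same counters shown above. If evictions are
# non-zero, raising medusa_extent_group_id_map_cache_size_mb may help
# (make sure the CVM has enough memory first).
import requests

resp = requests.get("http://localhost:2009/h/vars",
                    params={"regex": "medusa.cache.extent_group_id_map",
                            "format": "text"})
stats = {}
for line in resp.text.splitlines():
    parts = line.split()
    if len(parts) == 2:
        try:
            stats[parts[0].split('/')[-1]] = long(parts[1])
        except ValueError:
            pass

hits, misses = stats.get("hits", 0), stats.get("misses", 0)
ratio = 100.0 * hits / (hits + misses) if hits + misses else 0.0
print "hit ratio : %.1f%%" % ratio
print "evictions : %d" % stats.get("evictions", 0)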
    #3. Write Path Live (30sec) Metrics
    wget -O- 'http://localhost:2009/h/vars?regex=stargate.dedup&format=text'
    stargate/dedup_fingerprint_added_bytes 8355840
    stargate/dedup_fingerprint_cleared_bytes 8290304



    Disabling Dedup:
    ncli ctr edit name=xyz fingerprint-on-write=off

    To also disable dedup on the read path, set the Stargate gflag (in stargate.gflags): stargate_disable_dedup_on_read=true

    Disable video in Nutanix UI login screen

    For customers who manage a Nutanix cluster in a remote datacenter from a home office, loading the login-page video (though cool) can take a while.

    To disable the video, append novideo=true to the login URL:

    https://demo01-c1.nutanix.com:9440/console/#login?novideo=true  (the setting is stored in a cookie in your browser, so you don't have to set novideo again next time)



     To turn it back on:
    https://demo01-c1.eng.nutanix.com:9440/console/#login?novideo=false



    From 4.0 onwards, there is a link on the login page to enable or disable the video.


    Nutanix support delivers world class support - no surprise!


    Omega Management Group recognized 35 companies for “Delivering World-Class Customer Service” in 2013. Nutanix is one of the companies that received the award based on top-notch Net Promoter Score and Overall Customer Satisfaction score. We received a whopping +73 for our NPS for the calendar year 2013, and close to 4.7 on a 5 point scale for Customer Satisfaction.


    As an engineer in the Nutanix Customer Support organization, I believe these are the reasons why Nutanix consistently receives high ratings in such surveys.

    Customer First: Customer focus is *everyone's* top priority, not just the Customer Support team's - it runs very deep in the veins of the entire organization, from the top (the CEO watches customer support cases, scores, and Twitter for feedback) all the way to individual contributors. A customer issue trumps cool feature development - it's not just maintenance - and is accorded the highest priority. The Nutanix Support team works as a true extension of Nutanix Engineering.

    Closing the loop: After resolving a case, the SRE team works closely with Engineering to prevent similar issues in other modules, QA augments the existing test suite (creating new test cases and/or modifying existing ones) to catch similar issues in-house, and the Documentation team fixes any documentation gaps as needed. The Customer Support team actively participates in these activities to continually improve the end-user experience.

    The Sales team works closely with the customer, Support, and Engineering to develop workarounds until a fix is made available, and they make sure the loop is closed (it does not hurt to have a technically savvy sales team either :-)).

    The Product Management team monitors the on-call and support issues and provides feedback to engineering on serviceability aspects of the current product and in-flight features.

    Finally, each member of the Support team has expertise in multiple areas (networking, virtualization, storage, and compute) - if your product is converged, your support team's skills need to be converged as well.


       Quotes from customers:
      http://www.nutanix.com/support-quotes


      Nutanix 4.0 Feature - Fault Domain Awareness

       “A fault domain is a set of hardware components – computers, switches, and more – that share a single point of failure.” 

      Nutanix provides many ways of achieving resiliency and fault tolerance.

      For ease of explanation, let us focus on how block awareness works on an RF=2 cluster. An RF=2 cluster stores two copies of the data at the 4 MB extent group level: one copy on the local node and the other on a different node.

      In NOS 3.x and 4.0, a block is considered a fault domain. This means that if one block fails, data, metadata, and configuration data remain available because the replicas are placed in different blocks.

      In future versions, the fault domain may be customizable by rack location, data center location, or any set of nodes.

      History:
      NOS 3.x is block aware for extent groups (data) and metadata. Configuration (Zookeeper) servers can be configured or migrated to make them block aware. The oplog is not block aware, so if a block fails, some data can be unavailable.


       4.0 - Block Awareness:

      In NOS 4.0, this framework is extended to provide block awareness for oplog as well. 

      If a new cluster is created with three uniform blocks, NOS will place the configuration nodes in different blocks, form a block-aware metadata ring, and place data (oplog/extent groups) in a block-aware fashion. After the cluster is formed, a Curator scan confirms that the cluster is block aware and updates the block-aware status.

      Clusters formed in 3.x will be block aware for metadata, data, and configuration. New oplog writes will be block aware; since the oplog drains quickly, the cluster's oplog becomes block aware within a few hours of the upgrade to 4.x.

      For clusters created in pre-3.x NOS versions, there is a non-disruptive method to make the configuration and metadata ring block aware.

      In NOS 4.0, the Prism UI shows whether the cluster is block and/or node fault tolerant.




       

      Commands to verify the fault tolerance state:
      • ncli cluster get-fault-tolerance-state
      • ncli cluster get-domain-fault-tolerance-status type=rackable_unit
      • ncli cluster get-domain-fault-tolerance-status type=node
      • ncli rack ls (shows how many blocks are in the cluster)




       




      Nutanix 4.0 Feature - Protection against Simultaneous Double points of failure

      Nutanix has provided a replication factor (resiliency factor) of two from day one. In response to customers who wanted a higher level of resiliency, Nutanix supports RF=3 starting in 4.0.

      Pre-4.0 feature: Resiliency Factor of two (RF=2) - Network Mirroring
      • Provides protection against a single point of failure.
      • Note that RF=2 also protects against certain related double failures, such as two disk failures on the same node, or two node failures in the same block when there are multiple blocks (block awareness feature in 4.0).
      • RF=2 can still survive two failures if they are non-simultaneous: the second failure occurs after replication has been restored from the first failure and at least two configuration (Zookeeper) nodes are active (thanks to Curator and metadata auto node detach). Non-disruptive manual Zookeeper migration is available in 4.0: if the first Zookeeper node fails, you can manually migrate Zookeeper to another active node, so a second failure can be tolerated even if it hits a Zookeeper node. In 4.1, Nutanix Engineering is implementing automatic Zookeeper migration.
      • (Ref: Figure 1) The first copy of the data, in 4 MB chunks (extent groups), is stored on node 1 where the VM is running (data locality).
      • The second copy is distributed across the other nodes and across different disks on those nodes.
      • So if a disk or node fails, Curator fixes the under-replication quickly because all remaining nodes and disks participate in the rebuild. The VM can be migrated to other nodes, and Curator will eventually bring the data local again and restore full replication.
      • When placing the second copy, the software is intelligent enough to keep the placement block aware and to keep disk and node usage in balance.
      • So data remains available through a disk failure, multiple disk failures within the same node, or a single node failure.
      • In 4.0, the Nutanix UI provides cluster health information, including replication status and a progress monitor for re-replication.
      • RF=2 is active even with a three-node cluster.
      • Space utilization consideration: to be able to rebuild RF=2 after a node failure on a 5-node cluster, keep datastore usage at or below about 80% of the available space (see the back-of-the-envelope sketch below).
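
      The 80% guideline falls out of simple arithmetic: after losing one node, the remaining N-1 nodes must be able to hold all of the data at RF=2, so usable datastore space is roughly (N-1)/N of the total. A minimal back-of-the-envelope sketch (not an official sizing tool; assumes uniform nodes):

#!/usr/bin/python
# Back-of-the-envelope sketch of the space guideline above: with RF=2, to
# re-protect all data after losing one node, the surviving N-1 nodes must be
# able to hold everything, so keep datastore usage under about (N-1)/N of the
# total. Assumes uniform nodes; this is not an official sizing tool.
def max_safe_usage_pct(num_nodes):
    if num_nodes < 3:
        raise ValueError("RF=2 needs at least a 3-node cluster")
    return 100.0 * (num_nodes - 1) / num_nodes

if __name__ == "__main__":
    for n in (3, 4, 5, 8):
        print "%d-node cluster: keep datastore usage under ~%.0f%%" % (n, max_safe_usage_pct(n))

      For a 5-node cluster this gives 80%, matching the guideline above.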
      Figure 1:








         
         

        The following output provides the container (datastore) configuration.

        nutanix@NTNX-13SM35300010-B-CVM:10.1.60.98:~$ ncli ctr ls

            ID                        : 1504
            Replication Factor        : 2
            Oplog Replication Factor  : 2


        If you need to verify whether the system can tolerate one node failure:
        nutanix@NTNX-13SM35300010-B-CVM:10.1.60.98:~$ ncli cluster get-domain-fault-tolerance-status type=node   (a value of 1 means the cluster can tolerate a single failure)
            Component Type            : EXTENT_GROUPS
            Current Fault Tolerance   : 1
           
            Component Type            : OPLOG
            Current Fault Tolerance   : 1
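
        If you want to script this check, the same output can be parsed and reduced to the weakest component. A minimal sketch, assuming it runs on a CVM where ncli is on the PATH and that the output keeps the Component Type / Current Fault Tolerance layout shown above:

#!/usr/bin/python
# Minimal sketch: run the ncli command shown above and report, per component,
# how many node failures the cluster can currently tolerate, plus the
# cluster-wide minimum. Assumes ncli is on the PATH (run this on a CVM).
import subprocess

proc = subprocess.Popen(
    ["ncli", "cluster", "get-domain-fault-tolerance-status", "type=node"],
    stdout=subprocess.PIPE)
out, _ = proc.communicate()

component, tolerance = None, {}
for line in out.splitlines():
    if ":" not in line:
        continue
    key, value = [part.strip() for part in line.split(":", 1)]
    if key == "Component Type":
        component = value
    elif key == "Current Fault Tolerance" and component:
        tolerance[component] = int(value)
        component = None

for comp in sorted(tolerance):
    print "%-22s : can tolerate %d node failure(s)" % (comp, tolerance[comp])
if tolerance:
    print "Cluster-wide minimum   : %d node failure(s)" % min(tolerance.values())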


        Figure 2:
        The Prism UI provides similar output.




        4.0 Feature: Resiliency Factor = 3 (RF=3):
        • Minimum of 5 nodes required.
        • RF=3 provides high resiliency against two unrelated simultaneous failures, such as two disk failures on different nodes or two node failures in different blocks.
        • The resiliency factor of a cluster needs to be set during the cluster init process, which requires configuring five Zookeeper nodes and five copies of metadata. (Nutanix Engineering is working on allowing this to be changed dynamically.)
        • Within an RF=3 cluster, you can have containers with RF=2 or RF=3, and VMs that need higher resiliency can be created in an RF=3 container/datastore. You can change the RF of a container dynamically.
        • Nutanix keeps three copies of the same extent group and will also provide block awareness if possible.
        • RF=3 and block awareness can be configured in the same cluster (requires 5 blocks in 4.0 and 3 blocks in 4.0.1).
        • RF=3 with dedup still stores only three copies of each deduplicated extent group in the cluster, so in some cases RF=2 without dedup may consume more space than RF=3 with dedup.
        • RF=3 has only a 10-15% performance overhead.

         

        • Steps (the cluster init web page may refer to this feature as an FT=2 cluster):
        • CLI commands:
              cluster -f -s <svmips> --redundancy_factor=3 create
              ncli sp create name=sp add-all-free-disks=true
              ncli ctr create name=red rf=3 sp-name=sp
        • ncli cluster get-domain-fault-tolerance-status type=node

              Domain Type               : NODE
              Component Type            : STATIC_CONFIGURATION
              Current Fault Tolerance   : 2
              Fault Tolerance Details   :
              Last Update Time          : Wed Jun 04 11:52:11 PDT 2014

              Domain Type               : NODE
              Component Type            : ZOOKEEPER
              Current Fault Tolerance   : 2
              Fault Tolerance Details   :
              Last Update Time          : Wed Jun 04 11:49:55 PDT 2014

              Domain Type               : NODE
              Component Type            : METADATA
              Current Fault Tolerance   : 2
              Fault Tolerance Details   :
              Last Update Time          : Wed Jun 04 11:46:55 PDT 2014

              Domain Type               : NODE
              Component Type            : OPLOG
              Current Fault Tolerance   : 2
              Fault Tolerance Details   :
              Last Update Time          : Wed Jun 04 11:51:18 PDT 2014

              Domain Type               : NODE
              Component Type            : EXTENT_GROUPS
              Current Fault Tolerance   : 2
              Fault Tolerance Details   :
              Last Update Time          : Wed Jun 04 11:51:18 PDT 2014

        Why Nutanix is a web-scale architecture?


         

        Recently, there has been a lot of buzz surrounding web-scale IT, and every vendor claims to be web-scale.
        So, what qualifies a piece of software or an architecture as web-scale?
        What qualifies Nutanix as a web-scale IT architecture?

        Here is my list of reasons why Nutanix is a true web-scale technology:
        (Andre's list: http://www.nutanix.com/blog/2014/03/11/understanding-web-scale-properties/ )
        • The architecture should be defined in software, running on standard x86 hardware with no hardware crutches. Nutanix runs on general-purpose Supermicro hardware, which is why it has been able to introduce new hardware models frequently (NX-1000 to NX-7000).
        • No single point of failure or bottleneck for management services. Fault tolerance - the ability to keep functioning in the presence of failures - is key to a stable, scalable distributed system. Every component in the Nutanix software is fail-fast, and there are an NFS leader, a configuration node leader, a Paxos leader for consistency, a UI leader, multiple replicas at the data/metadata layer, QoS rate limiting, and traffic forwarding when a CVM fails. Nutanix has a lot of distributed goodness that lets it tolerate a single point of failure. In addition, DR capabilities provide disaster recovery, and the Nutanix snapshot browser helps recover a VM to a previous snapshot.
        • A web-scale system enables a non-disruptive approach to disruptive tasks, such as rolling or forklift upgrades, expandable clusters, always-on clusters, and workflows that are always done online. Examples of Nutanix scalability include the ability to add and remove nodes dynamically, rolling upgrades one node at a time (different nodes can be at different versions during the rolling upgrade), datastore availability through a disk failure, and many other features. Nutanix Engineering keeps adding features to improve resiliency and simplify workflows.
                          Please refer to: Add node:  http://stevenpoitras.com/2013/07/advanced-nutanix-add-node/ and rolling upgrade in http://craigwaters.org/2014/04/29/nos-4-0-feature-1-click-os-upgrade/
        • Web-scale systems are built from the ground up and should expect and tolerate failures while upholding the promised performance, availability guarantees and service level agreements. Nutanix provides different levels of resiliency and UI provides information on what kind of tolerance is enabled on the cluster.
                         Please refer to: Fault domain awareness - http://nutanix.blogspot.com/2014/06/nutanix-40-feature-increased-resiliency.html and http://nutanix.blogspot.com/2014/05/40-features-fault-domain-awareness.html )
        • Web-scale systems should provide programmatic interfaces to allow complete control and automation via HTTP-based services. Nutanix provides REST APIs and PowerShell cmdlets for programmatically configuring the block.
                         Please refer to: http://prismdevkit.com/?page_id=539 and http://nutanix.blogspot.com/2013/07/nutanix-rest-api-browser-how-to-get.html

        Hyper-Converge DELL server into server-SAN with Nutanix




         
        Nutanix has always built software capable of running on generic, off-the-shelf x86 hardware (as in the figure above). Even though the software is abstracted to run on any hardware, Nutanix recommends choosing from a short list of hardware options rather than publishing a compatibility matrix. Choosing from this shorter list means the customer purchases a solution that has been thoroughly tested by Nutanix QA and by the hardware experts on the Nutanix Engineering team (quality over quantity). It can be compared to the synergy between a horse (hardware) and a jockey (software): the horse runs faster under the guidance of the jockey.



        Problems with providing software only solution with a compatibility matrix:

           
        It is impossible for any software vendor to test all hardware, even if individual components are certified, which makes it very difficult to draft a reliable compatibility matrix. There are knowledge base articles just explaining how to use such matrices. Furthermore, no software vendor should put the burden of validation on the customer, yet many vendors use the matrix to avoid responsibility for failures in their products.
         

        Just because a piece of software is hardware agnostic does not guarantee that the hardware as a whole can run it at an optimal level. Something as small as a shallow queue depth on a controller can cause performance issues that take days to pinpoint, even with all the KBs and blog posts.

        Nutanix Appliance Approach:




        Nutanix has taken a different approach by owning the hardware qualification process even though Nutanix's IP is entirely software. Nutanix ships thoroughly tested, qualified, and validated off-the-shelf hardware appliances, so customers can run their production workloads with ease and have peace of mind about their data. Due to the ease of setup and configuration, most Nutanix customers are able to put their new hardware into production within a day.
         

        In taking this approach, Nutanix has built a world-class hardware engineering team with industry experts, allowing them to take on the additional responsibilities of testing, qualifying, and certifying additional hardware. Nutanix is excited to announce a new partnership with Dell to deliver another exciting hardware option for its customers.  (Refer: http://www.nutanix.com/the-nutanix-solution/tech-specs/#nav). Nutanix will continue to provide the appliances and add additional models as new innovations occur (HDD/SSD/NVRAM-e/CPU/Memory/GPU/etc).



        DELL + Nutanix Partnership:
        Refer to the official DELL and Nutanix blogs for more detailed information on the alliance.

         DELL is a valuable IT partner to many large enterprises. Many of these enterprises have tested and purchased Nutanix appliances by getting a one-time exemption from their purchasing department. The DELL and Nutanix OEM agreement removes this exemption requirement.

        Nutanix software converts powerful standalone DELL R720 servers into hyper-converged compute and software-defined distributed storage servers.

        Quote from DELL (eXCellent):
        "Dell XC Series of Web-scale Converged Appliances
        Dell today also announced plans to deliver the Dell XC Series of Web-scale Converged Appliances powered by Nutanix. With the announcement, Dell extends its software defined storage portfolio with plans to offer customers a new series of appliances, which combine compute, storage and networking into a single offering. These new appliances will benefit customers seeking an integrated IT approach, offering simple deployment, management and scale as needed."

         




