# vSphere Monitoring

#### Method #1: Telegraf + InfluxDB

- [VMware vSphere - Overview | Grafana Labs](https://grafana.com/grafana/dashboards/8159-vmware-vsphere-overview/)
- [Telegraf: VMware vSphere Input Plugin](https://github.com/influxdata/telegraf/tree/release-1.8/plugins/inputs/vsphere)

##### Install Telegraf

Download: [https://portal.influxdata.com/downloads/](https://portal.influxdata.com/downloads/)

```
yum localinstall telegraf-1.18.3-1.x86_64.rpm
```

##### Configure Telegraf

Create a configuration file

```bash
telegraf config > /etc/telegraf/telegraf-vmware.conf
```

Edit the file: `vi /etc/telegraf/telegraf-vmware.conf`

Log file

```
...
[agent]
...
  logfile = "/var/log/telegraf/telegraf-vmware.log"
...
  ## If set to true, do not set the "host" tag in the telegraf agent.
  omit_hostname = true
```

Output for InfluxDB 1.x

```
# Configuration for sending metrics to InfluxDB 1.x
[[outputs.influxdb]]
    urls = ["http://10.10.2.209:8086"]
    database = "vmware"
    timeout = "0s"
    username = "admin"
    password = "dba4mis"
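    ## Name of an existing retention policy to write to (a name, not a duration).
    ## "autogen" is the default policy; its duration is changed to 200d below.
    retention_policy = "autogen"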
```

Output for InfluxDB 2.x

```
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://127.0.0.1:8086"]

  ## Token for authentication.
  token = "Your-Token"

  ## Organization is the name of the organization you wish to write to.
  organization = "Your-Org-Name"

  ## Destination bucket to write into.
  bucket = "Your-Bucket-Name"
  
  ## Timeout for HTTP messages.
  timeout = "5s"
```

Input

Reference example: [Telegraf: VMware vSphere Input Plugin](https://github.com/influxdata/telegraf/tree/release-1.8/plugins/inputs/vsphere)

```
###############################################################################
#                            INPUT PLUGINS                                    #
###############################################################################


## Realtime instance
[[inputs.vsphere]]
  interval = "60s"

  ## List of vCenter URLs to be monitored. These three lines must be uncommented
  ## and edited for the plugin to work.
  vcenters = [ "https://vcenter-server-ip/sdk" ]
  username = "admin@vsphere.local"
  password = "ThisPassword"

  # Exclude all historical metrics
  datastore_metric_exclude = ["*"]
  cluster_metric_exclude = ["*"]
  datacenter_metric_exclude = ["*"]
  resourcepool_metric_exclude = ["*"]

  #max_query_metrics = 256
  #timeout = "60s"
  insecure_skip_verify = true
  force_discover_on_init = true

  collect_concurrency = 5
  discover_concurrency = 5


## Historical instance
[[inputs.vsphere]]
  interval = "300s"

  vcenters = [ "https://vcenter-server-ip/sdk" ]
  username = "admin@vsphere.local"
  password = "ThisPassword"

  host_metric_exclude = ["*"] # Exclude realtime metrics
  vm_metric_exclude = ["*"] # Exclude realtime metrics

  insecure_skip_verify = true
  force_discover_on_init = true
  max_query_metrics = 256
  collect_concurrency = 3

```

Configure systemd

```bash
cp /usr/lib/systemd/system/telegraf.service /usr/lib/systemd/system/telegraf-vmware.service
sed -i 's/telegraf.conf/telegraf-vmware.conf/g' /usr/lib/systemd/system/telegraf-vmware.service
```
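
The substitution can be rehearsed on a scratch copy before touching the real unit file. This is a minimal sketch: the sample `ExecStart` line and the scratch directory are assumptions for illustration, not the exact contents of the packaged unit.

```bash
# Rehearse the sed substitution on a scratch file.
# The unit content below is an assumed sample, not the packaged file.
workdir=$(mktemp -d)
unit="$workdir/telegraf-vmware.service"

cat > "$unit" <<'EOF'
[Service]
ExecStart=/usr/bin/telegraf -config /etc/telegraf/telegraf.conf $TELEGRAF_OPTS
EOF

# Same substitution, with the dot escaped so it matches only a literal "."
sed -i 's/telegraf\.conf/telegraf-vmware.conf/g' "$unit"

# Show the rewritten ExecStart line
exec_line=$(grep '^ExecStart=' "$unit")
echo "$exec_line"

rm -rf "$workdir"
```

If the printed line points at `telegraf-vmware.conf`, the same expression should be safe to apply to the copied unit.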

Start Telegraf

```bash
systemctl daemon-reload
systemctl start telegraf-vmware
systemctl enable telegraf-vmware
```
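
Once the service is running, one way to confirm that the vSphere input is healthy is to scan the log file for error lines. The sketch below assumes the log path set in the `[agent]` section above and relies on Telegraf marking error-level lines with `E!`. The function name is illustrative.

```bash
# Count error-level lines written by the vsphere input in a Telegraf log.
count_vsphere_errors() {
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are none, hence "|| true"
  grep -c 'E! \[inputs\.vsphere\]' "$1" || true
}

log=/var/log/telegraf/telegraf-vmware.log
if [ -r "$log" ]; then
  echo "vsphere errors in $log: $(count_vsphere_errors "$log")"
fi
```

A count that keeps growing after the first few collection intervals suggests a persistent problem, such as the `querySpec[0].endTime` error covered in the FAQ.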

##### Configure InfluxDB

Set the retention policy

```
[root@mm-mon ~]# influx -username admin -password dba4mis
Connected to http://localhost:8086 version 1.8.5
InfluxDB shell version: 1.8.5
> show retention policies on vmware
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true
> alter retention policy "autogen" on "vmware" duration 200d shard duration 1d
> show retention policies on vmware
name    duration  shardGroupDuration replicaN default
----    --------  ------------------ -------- -------
autogen 4800h0m0s 24h0m0s            1        true
```
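
InfluxDB reports durations in hours, which is why the 200-day policy appears as `4800h0m0s` and the 1-day shard group as `24h0m0s`. A small sketch of the conversion (the function name is illustrative):

```bash
# Convert a day count to the hour-based form shown by SHOW RETENTION POLICIES.
days_to_influx_duration() {
  echo "$(( $1 * 24 ))h0m0s"
}

days_to_influx_duration 200   # retention duration
days_to_influx_duration 1     # shard group duration
```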

##### Configure Grafana

1. Add a datasource for InfluxDB 
    - Name: VMware
    - Type: InfluxDB
    - Database: vmware
    - Username: &lt;InfluxDB Credential&gt;
    - Password: &lt;InfluxDB Credential&gt;
2. Import the dashboards 
    1. [https://grafana.com/grafana/dashboards/8159](https://grafana.com/grafana/dashboards/8159)
    2. [https://grafana.com/grafana/dashboards/8165](https://grafana.com/grafana/dashboards/8165)
    3. [https://grafana.com/grafana/dashboards/8168](https://grafana.com/grafana/dashboards/8168)
    4. [https://grafana.com/grafana/dashboards/8162](https://grafana.com/grafana/dashboards/8162)

##### FAQ

Q: VMs added later do not appear in the dashboard.

A: First confirm that InfluxDB has received data for the new VM. If it has, go to Dashboard Settings &gt; Variables &gt; virtualmachine, run Update, and check that the new VM name appears under Preview of values.

Check InfluxDB:

```sql
-- List the VM names seen in the last 10 minutes
select DISTINCT("vmname") from (select "ready_summation","vmname" from "vsphere_vm_cpu" WHERE time > now() - 10m)
```
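
The same check can be run non-interactively. This is a sketch assuming the InfluxDB 1.x `influx` CLI. The function name and the `INFLUX_USER` / `INFLUX_PASS` variables are placeholders, not part of the original setup.

```bash
# Build the InfluxQL check for a given look-back window (default 10m).
vm_names_query() {
  window=${1:-10m}
  printf 'SELECT DISTINCT("vmname") FROM (SELECT "ready_summation","vmname" FROM "vsphere_vm_cpu" WHERE time > now() - %s)\n' "$window"
}

# Run the query only when the influx CLI is installed.
if command -v influx >/dev/null 2>&1; then
  influx -username "${INFLUX_USER:-}" -password "${INFLUX_PASS:-}" -database vmware \
    -execute "$(vm_names_query 1h)" || echo "query failed" >&2
fi
```

Widening the window (for example `1h`) helps when the new VM was powered on only briefly.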

Q: Telegraf reports the following error:

> \[inputs.vsphere\] Error in plugin: while collecting vm: ServerFaultCode: A specified parameter was not correct: querySpec\[0\].endTime

A: Make sure the `[[inputs.vsphere]]` section includes the following parameter:

```
force_discover_on_init = true
```

Q: Issues with the VMware vSphere - Overview dashboard

> The vCenter CPU/RAM panels show no graphs.

A: Edit the panel &gt; Flux language syntax

Replace &lt;vcenter-name&gt; with the actual VM name:

```
from(bucket: v.defaultBucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "vsphere_vm_cpu")
  |> filter(fn: (r) => r["_field"] == "usage_average")
  |> filter(fn: (r) => r["vmname"] == "<vcenter-name>_vCenter")
  |> group(columns: ["vmname"])
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")
```

> The Cluster drop-down does not list cluster names correctly.

A: Edit the dashboard &gt; Variables &gt; clustername &gt; Flux language syntax

```
from(bucket: v.defaultBucket)
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "vsphere_host_cpu")
  |> filter(fn: (r) => r["clustername"] != "")
  |> filter(fn: (r) => r["vcenter"] == "${vcenter}")
  |> keep(columns: ["clustername"])
  |> distinct(column: "clustername")
  |> group()
```

#### Method #2: SexiGraf

- Official: [http://www.sexigraf.fr/quickstart/](http://www.sexigraf.fr/quickstart/)
- Base OS: Ubuntu 16.04.6 LTS

##### Download the OVA appliance

- [http://www.sexigraf.fr/quickstart/](http://www.sexigraf.fr/quickstart/)
- [https://github.com/sexibytes/sexigraf](https://github.com/sexibytes/sexigraf)

##### vCenter/vSphere Credential for Monitoring Only

vCenter Web Client &gt; Menu &gt; Administration &gt; Single Sign On: Users and Groups &gt; Add

- Username: winmon
- Password: xxxx
- Confirm password: xxxx

vCenter Web Client &gt; Menu &gt; Hosts and Clusters &gt; Permissions &gt; Add Permission

- User: vsphere.local, search for winmon
- Role: Read-only
- Propagate to children: checked

##### Deploy the OVA to vCenter/ESXi

Deploying to ESXi 6.5 failed with this error:

> Line 163: Unable to parse 'tools.syncTime' for attribute 'key' on element 'Config'.

Workaround: unpack the OVA file with OVF Tool first, then edit the OVF file:

```
# Before
<vmw:Config ovf:required="true"  vmw:key="tools.syncTime" vmw:value="true"/>

# After
<vmw:Config ovf:required="false"  vmw:key="tools.syncTime" vmw:value="true"/>
```

Save the file, then deploy again.
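
The edit itself can be scripted. The sketch below rehearses it on a scratch `.ovf` file whose name and directory are illustrative. An OVA is a tar archive, so the commented `tar -xf` line is a stand-in for OVF Tool, not the method described above. If the unpacked archive contains a `.mf` manifest, its checksum for the `.ovf` no longer matches after the edit, so the manifest typically has to be removed or regenerated before redeploying.

```bash
workdir=$(mktemp -d)
# tar -xf sexigraf.ova -C "$workdir"   # hypothetical archive name
ovf="$workdir/appliance.ovf"

# Scratch file containing the element named in the error message
cat > "$ovf" <<'EOF'
<vmw:Config ovf:required="true"  vmw:key="tools.syncTime" vmw:value="true"/>
EOF

# Flip ovf:required only on the line that carries the tools.syncTime key
sed -i '/vmw:key="tools.syncTime"/ s/ovf:required="true"/ovf:required="false"/' "$ovf"

fixed_line=$(cat "$ovf")
echo "$fixed_line"

rm -rf "$workdir"
```

Restricting the substitution to the `tools.syncTime` line leaves every other `ovf:required="true"` attribute in the descriptor untouched.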

##### Starting the VM for the First Time

1\. SSH Credential: root / Sex!Gr@f

2\. Configure the IP address manually by editing `/etc/network/interfaces`.
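
A minimal sketch of a static-IP stanza in the ifupdown format used by Ubuntu 16.04. The interface name and every address are placeholders. The snippet only prints the stanza for review and does not modify `/etc/network/interfaces`.

```bash
# Candidate stanza for /etc/network/interfaces.
# eth0 and all addresses are placeholders; adjust them to the local network.
stanza=$(cat <<'EOF'
auto eth0
iface eth0 inet static
    address 192.168.21.50
    netmask 255.255.255.0
    gateway 192.168.21.1
EOF
)

printf '%s\n' "$stanza"
```

The actual interface name on the appliance can be checked with `ip link` before copying the stanza into place.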

3\. Configure the hostname

```
hostnamectl set-hostname esx-mon
```

4\. Configure the timezone and time server

```
timedatectl set-timezone Asia/Taipei
```

Edit the file: `vi /etc/ntp.conf`

```
#pool 0.ubuntu.pool.ntp.org iburst
#pool 1.ubuntu.pool.ntp.org iburst
#pool 2.ubuntu.pool.ntp.org iburst
#pool 3.ubuntu.pool.ntp.org iburst

# Use Ubuntu's ntp server as a fallback.
#pool ntp.ubuntu.com

# Added the local time server
server 192.168.21.86 prefer iburst
```

Restart ntpd

```
systemctl stop ntp
systemctl start ntp

# Check the timeserver
ntpq -p
```
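
In `ntpq -p` output, the peer that ntpd has currently selected as its time source is prefixed with `*`. A small sketch that checks for such a line (the function name is illustrative):

```bash
# Succeeds when ntpq -p style output on stdin contains a selected peer,
# which ntpq marks with a leading "*".
has_selected_peer() {
  grep -q '^\*'
}

if command -v ntpq >/dev/null 2>&1; then
  if ntpq -p 2>/dev/null | has_selected_peer; then
    echo "ntpd has selected a time source"
  else
    echo "no time source selected yet"
  fi
fi
```

Right after a restart it can take a few polling intervals before a peer is selected, so a negative result immediately after `systemctl start ntp` is not necessarily a fault.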

##### First Login to the Grafana Web UI

1. Login: admin / Sex!Gr@f
2. Add the credentials for the vCenter server to be monitored: Search &gt; SexiGraf &gt; SexiGraf Web Admin &gt; Credential Store 
    - vCenter IP: &lt;vCenter/ESXi IP or FQDN&gt;
    - Username: &lt;Username to login to vCenter/ESXi&gt;
    - Password: &lt;Password to login to vCenter/ESXi&gt;