Troubleshooting Tips
Log Files
File directory: /u01/app/oracle/ovm-manager-3/domains/ovm_domain/servers/AdminServer/logs/
- access.log: Used to track HTTP access to the Web interface of the Oracle VM Manager and to the underlying Oracle WebLogic Server HTTP interface. This log can be used to track access and HTTP operations within Oracle VM Manager to help debug access issues and to audit access to the Oracle VM Manager.
- AdminServer.log: Used to track events within the underlying Oracle WebLogic Server framework, including events triggered by Oracle VM Manager. This log can be used to track a variety of issues within Oracle VM Manager, including TLS/SSL certificate issues, server availability issues, and any actions performed within Oracle VM Manager, which are usually identifiable by searching for entries containing the string com.oracle.ovm.mgr. Login failures resulting from locked accounts (as opposed to incorrect credentials) are also recorded in this file.
- AdminServer-diagnostic.log: Used to track exceptions within the underlying Oracle WebLogic Server framework, including particular events triggered by Oracle VM Manager, such as login failures due to incorrect credentials. This log can be used to track Oracle VM Manager behavior that results in an exception, and to find login failures by searching for the string An incorrect username or password was specified.
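The search strings mentioned above can be wrapped in small helpers. This is a minimal sketch, assuming a POSIX shell with grep; the function names count_bad_credentials and show_ovm_manager_events are made up for illustration and are not part of Oracle VM Manager.

```shell
# Hypothetical helpers; only the two search strings come from the notes above.

# Count login failures caused by wrong credentials.
# $1: path to an AdminServer-diagnostic.log style file.
count_bad_credentials() {
    grep -c -F 'An incorrect username or password was specified' "$1"
}

# Print entries produced by Oracle VM Manager actions.
# $1: path to an AdminServer.log style file.
show_ovm_manager_events() {
    grep -F 'com.oracle.ovm.mgr' "$1"
}
```

For example, count_bad_credentials could be pointed at AdminServer-diagnostic.log in the log directory listed above. grep -F treats the pattern as a fixed string, so the dots in com.oracle.ovm.mgr are matched literally.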
Log Parsing Tool: OvmLogTool.py
File directory: /u01/app/oracle/ovm-manager-3/ovm_tools/
Because the contents of AdminServer.log are hard to read, use the following commands to format the log.
cd /u01/app/oracle/ovm-manager-3/ovm_tools/
python OvmLogTool.py -s -o ~/ovm_logs/summary.`date +%y%m%d_%H%M`
The formatted result is saved to ~/ovm_logs/summary.<yymmdd_HHMM>
TIP:
-s shows only error-related log entries; without it, all log entries are shown.
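The timestamped output path can be built by a small helper that also creates the output directory first. This is a sketch under the assumption that OvmLogTool.py does not create the directory itself, which these notes do not confirm; summary_file is a hypothetical name.

```shell
# Hypothetical helper: build a timestamped summary path and make sure
# its directory exists.
# $1: output directory (default: $HOME/ovm_logs).
summary_file() {
    dir=${1:-"$HOME/ovm_logs"}
    mkdir -p "$dir" || return 1
    # Same timestamp format as the command above: yymmdd_HHMM
    printf '%s/summary.%s\n' "$dir" "$(date +%y%m%d_%H%M)"
}

# Usage on the OVMM host:
#   cd /u01/app/oracle/ovm-manager-3/ovm_tools/
#   python OvmLogTool.py -s -o "$(summary_file)"
```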
Q: Manual backup of the OVMM database fails
Running /u01/app/oracle/ovm-manager-3/ovm_tools/bin/BackupDatabase -w produces the following errors:
mysqlbackup: WARNING: The value of 'innodb_checksum_algorithm' option provided to mysqlbackup might be incompatible with server config.
mysqlbackup: ERROR: Page at offset 5242880 in /u01/app/oracle/mysql/data/appfw/APPFW_EVENTS.ibd seems corrupt!
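Solution:
- Check whether the table APPFW_EVENTS is corrupted
- If it is corrupted, try repairing the table
- If it is reported as OK, try rebuilding the table manually
- Run the backup again
Check the table status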
mysqlcheck -uroot -p -S /u01/app/oracle/mysql/data/mysqld.sock --databases appfw
mysql -u appfw -p -S /u01/app/oracle/mysql/data/mysqld.sock appfw
mysql> select count(*) from APPFW_EVENTS;
+----------+
| count(*) |
+----------+
|    18650 |
+----------+
1 row in set (0.01 sec)
Rebuild table APPFW_EVENTS
service ovmm stop
mysqldump -uappfw -p -S /u01/app/oracle/mysql/data/mysqld.sock --databases appfw --tables APPFW_EVENTS > table_dump.appfw_events.sql
mysql -u appfw -p -S /u01/app/oracle/mysql/data/mysqld.sock appfw
mysql> create table APPFW_EVENTS_NEW like APPFW_EVENTS;
mysql> rename table APPFW_EVENTS to APPFW_EVENTS_OLD;
mysql> rename table APPFW_EVENTS_NEW to APPFW_EVENTS;
# Import the table
mysql -uappfw -p -S /u01/app/oracle/mysql/data/mysqld.sock appfw < table_dump.appfw_events.sql
# Drop the old table
mysql -u appfw -p -S /u01/app/oracle/mysql/data/mysqld.sock appfw
mysql> drop table APPFW_EVENTS_OLD;
Q: The MySQL database on the OVMM host has used up all disk space
Check the disk space used by the MySQL tables
# du -chs /u01/app/oracle/mysql/data/ovs/OVM_STATISTIC*
16K /u01/app/oracle/mysql/data/ovs/OVM_STATISTIC.frm
121G /u01/app/oracle/mysql/data/ovs/OVM_STATISTIC.ibd <===
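When it is not yet known which table is responsible, the same du check can be generalized to rank every file in a data directory. This is a minimal sketch, assuming a POSIX shell with du, sort, and head; largest_files is a hypothetical name.

```shell
# Hypothetical helper: list the N largest entries directly under a
# directory, biggest first. Sizes are in KiB.
# $1: directory, $2: number of entries to show (default 5).
largest_files() {
    du -sk "$1"/* 2>/dev/null | sort -rn | head -n "${2:-5}"
}

# Usage on the OVMM host:
#   largest_files /u01/app/oracle/mysql/data/ovs 10
```

Sizes are printed in KiB rather than human-readable units so that sort -rn can order them numerically.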
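Solution:
- First free up some other disk space so that MySQL can run normally.
- Stop the ovmm service to prevent more data from being written.
- Clear the contents of table OVM_STATISTIC.
Stop ovmm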
service ovmm stop
Check the row count of table OVM_STATISTIC
mysql -u ovs -p -S /u01/app/oracle/mysql/data/mysqld.sock ovs
Enter password: <web login password>
mysql> select count(*) from OVM_STATISTIC;
+-----------+
| count(*)  |
+-----------+
| 184795278 |
+-----------+
1 row in set (6 min 35.98 sec)
Clear table OVM_STATISTIC
mysql> truncate table OVM_STATISTIC;
TIP:
TRUNCATE essentially performs a drop followed by a create, so even a table with more than 100 million rows is cleared within a few seconds.
An alternative to truncate
mysql> create table NEW_OVM_STATISTIC like OVM_STATISTIC;
mysql> rename table OVM_STATISTIC to OLD_OVM_STATISTIC, NEW_OVM_STATISTIC to OVM_STATISTIC;
mysql> drop table OLD_OVM_STATISTIC;
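The three statements above follow a fixed pattern, so they can be generated for any table name. This is a sketch only: swap_table_sql is a hypothetical helper that prints the statements and does not run them, and it rejects names containing anything other than letters, digits, and underscores.

```shell
# Hypothetical helper: print the create/rename/drop statements that
# replace a table with an empty copy of itself.
# $1: table name (letters, digits, and underscores only).
swap_table_sql() {
    case "$1" in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid table name: $1" >&2
            return 1 ;;
    esac
    printf 'CREATE TABLE NEW_%s LIKE %s;\n' "$1" "$1"
    printf 'RENAME TABLE %s TO OLD_%s, NEW_%s TO %s;\n' "$1" "$1" "$1" "$1"
    printf 'DROP TABLE OLD_%s;\n' "$1"
}

# Usage: review the output first, then pipe it to the mysql client, e.g.
#   swap_table_sql OVM_STATISTIC | mysql -u ovs -p -S /u01/app/oracle/mysql/data/mysqld.sock ovs
```

Doing both renames in a single RENAME TABLE statement means the table name is never missing in between.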
Q: A VM cannot be stopped, and Kill does not help either
The VM status stays at Stopping, and Kill fails with the error:
tpeoddovm-db01 <1108> is locked. job info: job id(time):1525839275699 name:Stop VM: tpeoddovm-db01 description:Stop VM: tpeoddovm-db01
Solution: try restarting the ovs-agent service on the OVS host where the VM resides
service ovs-agent stop
service ovs-agent start
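Q: Unable to create a Server Pool
After reinstalling the OVS and OVM Manager hosts, the existing LUNs cannot be used to create a Server Pool and Repository.
Solution: SSH into the OVS host and wipe the data on the LUNs
# Find the LUN path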
multipath -ll
dd if=/dev/zero of=/dev/mapper/360a980004434375a385d4747374b5155 bs=1M count=256
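Because dd overwrites whatever device it is given, it can help to build the command through a guard that only accepts multipath device paths and prints the command for review instead of running it. This is a minimal sketch; lun_wipe_command is a hypothetical helper, and the /dev/mapper/ check is an assumption based on the path used above.

```shell
# Hypothetical helper: print (do not run) the dd command that zeroes the
# first 256 MiB of a LUN, refusing anything outside /dev/mapper/.
# $1: multipath device path as found with multipath -ll.
lun_wipe_command() {
    case "$1" in
        /dev/mapper/?*) ;;
        *)
            echo "refusing: $1 is not under /dev/mapper/" >&2
            return 1 ;;
    esac
    printf 'dd if=/dev/zero of=%s bs=1M count=256\n' "$1"
}

# Usage: check the printed command against the multipath -ll output
# before running it by hand.
#   lun_wipe_command /dev/mapper/360a980004434375a385d4747374b5155
```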
Q: [OVM 3.3.x] The management web interface suddenly refuses logins
Error message:
Unexpected error during login (java.lang.NullPointerException)
Solution: either of the following
- Restart the ovmm service
- Reboot the OVMM host
Q: The Master Server of a Server Pool had a hardware failure and shut down unexpectedly; after the VM was migrated to another OVS, starting the VM fails
Error message:
Caught during invoke method: com.oracle.ovm.mgr.api.exception.IllegalOperationException....
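Solution: before migrating the VM, first change its Event Severity status from Critical to Informational. The steps are:
- Identify the cause of the host failure
- Confirm that the second host's services and all Repositories Storage are working normally
- Migrate the VM from the failed host to the other host
- OVMM > Servers and VMs > select the second OVS host > select the VM to start > right-click and choose Display Events
Acknowledge every Critical Event; once done, the VM's Event Severity status should show the normal Informational.
- Try starting the VM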
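Q: Repositories always show an Error icon
Earlier hardware maintenance briefly disrupted the connection to Storage, and the Error icon remains even after the problem was resolved.
Solution: OVMM Admin > Repositories > select the Storage Repository > Perspective: select Events > select the old unhandled events and click Acknowledge.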