Symptoms
You tried to create or delete a VM or volume, and the resource remains stuck in a spinning transitional status, for example Creating. The spinner never clears on its own, and you only see that the operation actually completed after refreshing the browser.
Diagnostic steps
In the beholder log file on the primary node, you notice errors such as "Failed to notify Backend". Example:
# tail -f /var/log/hci/beholder/beholder.log
2024-09-21 13:46:39.203 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 compute.hv3125dev.vstoragedomain compute.instance.update (info)
2024-09-21 13:46:39.204 INFO [-] {"timestamp": "2024-09-21 04:46:39.197865", "event_type": "compute.instance.update", "instance_uuid": "847700ad-514d-43ca-a22b-785db375d6f6", "instance_name": "vztest-428330-1", "state": "building", "host": "hv3125dev.vstoragedomain", "cpus": 24, "tags": []}
2024-09-21 13:46:39.205 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 'updated' event recorded
2024-09-21 13:46:39.274 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 compute.vztest1001.vstoragedomain compute.instance.update (info)
2024-09-21 13:46:39.274 INFO [-] {"timestamp": "2024-09-21 04:46:39.269962", "event_type": "compute.instance.update", "instance_uuid": "847700ad-514d-43ca-a22b-785db375d6f6", "instance_name": "vztest-428330-1", "state": "building", "host": "hv3125dev.vstoragedomain", "cpus": 24, "tags": []}
2024-09-21 13:46:39.275 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 'updated' event recorded
2024-09-21 13:46:39.407 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 compute.hv3119dev.vstoragedomain compute.instance.update (info)
2024-09-21 13:46:39.407 INFO [-] {"timestamp": "2024-09-21 04:46:39.393331", "event_type": "compute.instance.update", "instance_uuid": "847700ad-514d-43ca-a22b-785db375d6f6", "instance_name": "vztest-428330-1", "state": "building", "host": "hv3125dev.vstoragedomain", "cpus": 24, "tags": []}
2024-09-21 13:46:39.408 INFO [-] 847700ad-514d-43ca-a22b-785db375d6f6 'updated' event recorded
2024-09-21 13:46:39.750 INFO [-] network.hv3125dev.vstoragedomain port.create.start (info)
2024-09-21 13:46:40.023 ERROR [-] Failed to notify Backend: 403
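To quickly confirm that the failure is recurring rather than a one-off, you can also count how many times the error appears in the same log (a suggested check, not part of the original diagnostics; it uses only the log path shown above):
# grep -c "Failed to notify Backend" /var/log/hci/beholder/beholder.log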
Cause
The issue is caused by a known bug in version 6.2.1 (51), tracked as Bug ID VSTOR-89213.
Resolution
The bug is expected to be fixed in version 6.3. Until then, the following workaround is available. If you are unsure about any step, consult support before proceeding:
Step 1: Identify the controller nodes
[root@vhinode1 ~]# vinfra service compute node list -c id -c host -c state -c roles | grep controller
| 69e787fa-2d12-43ce-9e2b-321d9628f8b8 | vhinode1.vstoragedomain | healthy | - controller |
| 64a8f673-9a89-445b-9276-202e63a215a0 | vhinode2.vstoragedomain | healthy | - controller |
| 0ccfd274-7a83-4c46-83dc-2fd43573f6d6 | vhinode3.vstoragedomain | healthy | - controller |
Step 2: Copy the two files that need to be synchronized to all controller nodes
/usr/libexec/vstorage-ui-backend/etc/backend.local.cfg
/usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf
Before copying, ensure that the value of "notifier_hmac_key" in keystone-hci.conf is the same as the value in backend.local.cfg:
[root@vhinode1 ~]# cat /usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf
[hci]
notifier_address = 127.0.0.1
notifier_port = 39737
notifier_api_prefix = notifier/api/v1
notifier_hmac_key = '.......'
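A quick way to compare the key in both files on the same node is to grep them together (a suggested check using only the paths already listed above); the two printed values must be identical:
[root@vhinode1 ~]# grep notifier_hmac_key /usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf /usr/libexec/vstorage-ui-backend/etc/backend.local.cfg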
Step 2.1: Copy backend.local.cfg from the first controller node (normally the primary node) to the other controller nodes:
scp -i /etc/kolla/id_rsa /usr/libexec/vstorage-ui-backend/etc/backend.local.cfg vhinode2.vstoragedomain:/usr/libexec/vstorage-ui-backend/etc/backend.local.cfg
Step 2.2: Copy keystone-hci.conf to the other controller nodes:
scp -i /etc/kolla/id_rsa /usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf vhinode2.vstoragedomain:/usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf
Repeat these steps until all controller nodes are covered; a verification check is shown below.
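As a sanity check (not part of the original procedure, and using the example hostnames from Step 1), you can confirm that both files are now identical on every controller node by comparing their checksums; the checksum reported for each file should match across all nodes:
[root@vhinode1 ~]# md5sum /usr/libexec/vstorage-ui-backend/etc/backend.local.cfg /usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf
[root@vhinode1 ~]# for n in vhinode2.vstoragedomain vhinode3.vstoragedomain; do echo "=== $n ==="; ssh -i /etc/kolla/id_rsa $n "md5sum /usr/libexec/vstorage-ui-backend/etc/backend.local.cfg /usr/libexec/vstorage-ui-backend/etc/keystone/keystone-hci.conf"; done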
Step 3: Check which node the keystone public service is running on, and restart the service on that particular node
[root@vhinode1~]# for n in $(vinfra node list -c host -f value | sort -V); do echo "===$n==="; ssh -i /etc/kolla/id_rsa $n "systemctl status vstorage-ui-keystone-public.service | grep running"; done
===vhinode.vstoragedomain===
===vhinode3.vstoragedomain===
===vhinode2.vstoragedomain===
===vhinode1.vstoragedomain===
Active: active (running) since Fri 2024-09-13 07:37:00 JST; 1 week 5 days ago
Step 3.1: On that node, restart the keystone public service:
systemctl reload vstorage-ui-keystone-public.service
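After the reload, it may be worth confirming that the service is still active on that node (an optional check, not from the original article):
systemctl status vstorage-ui-keystone-public.service | grep Active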
Step 4: Redeploy beholder
Note: Ensure you are on the primary node before executing.
[root@vhi-node1 ~]# su - vstoradmin
Last login: Wed Sep 25 03:00:07 +08 2024
[vstoradmin@vhi-node1 ~]$ kolla-ansible deploy-beholder
The command above produces very long output. A successful run ends like the following. If you are unsure, please reach out to support for assistance:
RUNNING HANDLER [beholder : Restart beholder container] *************************************************************************************************************************************************************
changed: [5046e8c9-4881-9c10-6233-e12cf94ccb9a]
changed: [4f3847e9-0de2-d696-f467-3fd471d6f855]
changed: [2785a030-db0c-c506-385d-b6b7ff9270c2]
PLAY RECAP **********************************************************************************************************************************************************************************************************
5046e8c9-4881-9c10-6233-e12cf94ccb9a : ok=11 changed=4 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
4f3847e9-0de2-d696-f467-3fd471d6f855 : ok=9 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
2785a030-db0c-c506-385d-b6b7ff9270c2 : ok=9 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
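Finally, you can verify the fix by tailing the beholder log again (same path as in the diagnostic steps) while creating or deleting a test VM or volume; no new "Failed to notify Backend: 403" lines should appear, and the status in the web panel should update without a browser refresh:
# tail -f /var/log/hci/beholder/beholder.log | grep -i "failed to notify"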