Troubleshooting
This document describes how to troubleshoot issues related to KubeSphere Backup.
After you import a Kubernetes cluster, a namespace named qiming-backend is created on the cluster for installing the backup and recovery components, including the Velero installer and Velero itself.
Kubernetes cluster resource request
For the backup and recovery components to run smoothly, make sure your Kubernetes cluster has enough resources to satisfy their resource requests. The default resource settings are as follows:
The Velero installer is a DaemonSet named restic that controls the transmission of local files to an S3 storage repository. Its resource request is as follows:
name: restic
resources:
  limits:
    cpu: "1"
    memory: 1Gi
  requests:
    cpu: 500m
    memory: 512Mi
NOTE
The default settings are recommended. To change the resource settings, run kubectl edit daemonset restic -n qiming-backend, modify the settings under spec.template.spec.containers.resources, and save your changes.
Velero is a 1-replica Deployment that handles the control logic of Velero, including backup and recovery. Its resource request is as follows:
name: velero
resources:
  limits:
    cpu: "1"
    memory: 1000Mi
  requests:
    cpu: 500m
    memory: 256Mi
NOTE
The default settings are recommended. To change the resource settings, run kubectl edit deploy velero -n qiming-backend, modify the settings under spec.template.spec.containers.resources, and save your changes.
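As a rough planning aid, the defaults above can be added up for a given node count. The sketch below assumes that restic runs one pod on every node (the usual behavior of a DaemonSet without node selectors or taints) and counts only the restic and velero requests listed above. The function name backup_requests is hypothetical, and the velero-installer pod is left out because its settings are not listed here.

```shell
# Estimate the total default resource requests of the backup components.
# Assumption: one restic pod per node, plus a single velero pod.
backup_requests() {
  nodes=$1
  cpu_m=$(( nodes * 500 + 500 ))    # restic: 500m per node; velero: 500m once
  mem_mi=$(( nodes * 512 + 256 ))   # restic: 512Mi per node; velero: 256Mi once
  echo "cpu=${cpu_m}m memory=${mem_mi}Mi"
}
```

For example, backup_requests 3 prints cpu=2000m memory=1792Mi. On a live cluster the node count could be supplied with backup_requests "$(kubectl get nodes --no-headers | wc -l)".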
Obtain the component logs
If the backup and recovery components do not run properly, obtain the logs of the relevant pods for troubleshooting.
On the control plane of your cluster, run the following command to obtain the information of pods running in the qiming-backend namespace.
$ kubectl get pods -n qiming-backend
NAME                                READY   STATUS    RESTARTS   AGE
restic-98lc4                        1/1     Running   0          80m
velero-564c9df5c6-mn2s7             1/1     Running   0          80m
velero-installer-66d65557c4-dz86t   1/1     Running   0          83m
Run the following command to obtain the logs of Velero pods.
kubectl logs <Velero pod ID> -n qiming-backend > /tmp/velero.log
Run the following command to obtain the logs of the restic pods.
kubectl logs <restic pod ID> -n qiming-backend > /tmp/restic.log
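When several pods are involved, the per-pod commands above can be wrapped in a loop. The following is a minimal sketch, not part of the product: the function name collect_backup_logs and the default output directory /tmp/backup-logs are assumptions chosen for illustration.

```shell
# Save the logs of every pod in the backup namespace to one file per pod.
# $1: output directory (assumed default: /tmp/backup-logs)
# $2: namespace (default: qiming-backend)
collect_backup_logs() {
  out_dir=${1:-/tmp/backup-logs}
  ns=${2:-qiming-backend}
  mkdir -p "$out_dir" || return 1
  # "-o name" prints entries such as pod/restic-98lc4, which kubectl logs accepts.
  for pod in $(kubectl get pods -n "$ns" -o name); do
    kubectl logs "$pod" -n "$ns" > "$out_dir/${pod#pod/}.log"
  done
}
```

Calling collect_backup_logs without arguments writes files such as /tmp/backup-logs/restic-98lc4.log, which can then be attached to a support request.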
Obtain backup and recovery information and logs
If any backup job or recovery job is not running properly, please obtain the information and logs of the abnormal job for troubleshooting.
Prerequisites
Make sure your machine can access the cluster on which the error is reported.
Download the installation package for Velero v1.7.0 according to your environment from this page.
Run the following commands to extract the installation package and view the version information.
tar -xvf <installation package filename>
cd <installation package directory>
chmod +x velero
./velero version
Run the following commands to create a configuration file and edit it.
mkdir -p $HOME/.config/velero
vi $HOME/.config/velero/config.json
Enter the following content in the configuration file to link the configuration file to the qiming-backend namespace. Save the changes when you finish.
{ "namespace": "qiming-backend" }
Save the kubeconfig of the cluster for troubleshooting to ~/.kube/config. By default, Velero uses this kubeconfig to connect to the cluster. You can also use velero --kubeconfig=<your-kubeconfig-file> to specify the path of the kubeconfig.
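If you prefer not to edit the configuration file by hand, the same file can be written from the shell. This is a sketch under the assumption that the file only needs the namespace entry shown above; the function name write_velero_config is hypothetical.

```shell
# Write the Velero client configuration file non-interactively.
# $1: configuration directory (default: $HOME/.config/velero)
# $2: namespace to link the client to (default: qiming-backend)
write_velero_config() {
  cfg_dir=${1:-$HOME/.config/velero}
  ns=${2:-qiming-backend}
  mkdir -p "$cfg_dir" || return 1
  # Overwrites any existing config.json in that directory.
  printf '{ "namespace": "%s" }\n' "$ns" > "$cfg_dir/config.json"
}
```

Running write_velero_config with no arguments produces the same content as the manual step. Because the function overwrites the file, back up any existing configuration first.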
Obtain information and logs of a backup job
Run the following command to view the information of backup jobs.
./velero get backups
Run the following command to view the details of backup jobs.
./velero describe backup <backup job name>
Run the following command to view the logs of backup jobs.
./velero backup logs <backup job name>
Obtain information and logs of a recovery job
Run the following command to view the information of recovery jobs.
./velero get restores
Run the following command to view the details of recovery jobs.
./velero describe restore <recovery job name>
Run the following command to view the logs of recovery jobs.
./velero restore logs <recovery job name>
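The describe and logs commands for a backup or recovery job can be bundled so that both outputs are saved in one step. The sketch below is illustrative only: collect_job_info and the VELERO variable are hypothetical names, and the output file names are an arbitrary choice.

```shell
# Save the details and logs of one backup or recovery job to files.
# $1: "backup" or "restore"
# $2: job name
# $3: output directory (default: /tmp)
# VELERO: path to the velero binary (default: ./velero)
collect_job_info() {
  kind=$1
  name=$2
  out_dir=${3:-/tmp}
  velero_bin=${VELERO:-./velero}
  case "$kind" in
    backup|restore) ;;
    *) echo "usage: collect_job_info backup|restore <job name> [output dir]" >&2
       return 2 ;;
  esac
  "$velero_bin" describe "$kind" "$name" > "$out_dir/$kind-$name.describe.txt"
  "$velero_bin" "$kind" logs "$name" > "$out_dir/$kind-$name.log"
}
```

For example, collect_job_info backup <backup job name> writes the details and the logs of that backup job under /tmp.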
Troubleshooting component installation
- A pod cannot start running in the qiming-backend namespace.
Run the following command to view the causes of start failure.
kubectl -n qiming-backend describe pod <pod name>
If the output shows an image pull error, run the following command to check whether the current node or cluster can pull images.
docker pull registry.cn-shanghai.aliyuncs.com/jibudata/velero-installer:xxx
If the output shows insufficient resources in the cluster, please ensure your cluster has sufficient resources.
- restic DaemonSet errors.
Run the following command to view the causes of errors.
kubectl -n qiming-backend describe pod <restic pod name>
The restic DaemonSet accesses the PVC data for backup by mounting the kubelet directory /var/lib/kubelet/pods. If the kubelet path of your Kubernetes cluster is different from that path, run the following command to change the value of spec.template.spec.volumes.hostPath.path to the correct path and save the changes.
kubectl edit daemonset restic -n qiming-backend
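Before editing the DaemonSet, it can help to confirm which host paths restic currently mounts. The following sketch reads them with a jsonpath query and compares them with the path you expect; the function name restic_mounts_path is hypothetical, and the kubelet pods directory of your cluster must be supplied by you.

```shell
# Check whether the restic DaemonSet mounts the expected kubelet pods directory.
# $1: expected host path (default: /var/lib/kubelet/pods)
# Returns 0 if the path is mounted, 1 if not, 2 if the query fails.
restic_mounts_path() {
  expected=${1:-/var/lib/kubelet/pods}
  paths=$(kubectl get daemonset restic -n qiming-backend \
    -o jsonpath='{.spec.template.spec.volumes[*].hostPath.path}') || return 2
  for p in $paths; do
    if [ "$p" = "$expected" ]; then
      return 0
    fi
  done
  echo "restic does not mount $expected (found: $paths)" >&2
  return 1
}
```

For example, restic_mounts_path <your kubelet pods directory> returns a non-zero status and prints the paths it found when the DaemonSet still points at a different directory.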
Backup troubleshooting
Backup job remains in progress
Run the following command to view the information about backup jobs.
./velero get backups
Run the following command to view the details of the backup jobs that remain in progress, and troubleshoot according to the details.
./velero describe backup <backup job name> --details
Backup job failure
Run the following command to view the information about backup jobs.
./velero get backups
Run the following command to view the details of the failed backup jobs, and troubleshoot according to the Errors field in the output.
./velero describe backup <backup job name> --details
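To find the jobs that need attention in a long list, the output of ./velero get backups can be filtered by status. This is a minimal sketch that assumes the first two columns of that output are NAME and STATUS and that a finished job reports Completed; the function name unfinished_backups is hypothetical.

```shell
# Read the table printed by "velero get backups" on standard input and
# print the name and status of every backup that is not Completed.
unfinished_backups() {
  # NR > 1 skips the header row.
  awk 'NR > 1 && $2 != "Completed" { print $1, $2 }'
}
```

A typical call is ./velero get backups | unfinished_backups. Each name it prints can then be passed to ./velero describe backup <backup job name> --details.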
View backup/recovery CR information
View CRs of a backup job
Run the following command to view all the CRs of the backup jobs.
$ kubectl get backups.velero.io -n qiming-backend
NAME                      AGE
test-backup-4lx3w-vfg5q   2d19h
NOTE
The prefix of the backup CR name is the backup job name created by the system. For example, if you create a backup plan named test-backup1, then the corresponding backup job CR name is test-backup1-xxxxx-xxxxx.
Run the following command to view the status of the backup job CR.
kubectl get backups.velero.io -n qiming-backend <backup job CR name> -o yaml
NOTE
Pay attention to status.phase in the output to look for any error or warning information.
View CRs of a recovery job
Run the following command to view all the CRs of the recovery jobs.
kubectl get restores.velero.io -n qiming-backend
Run the following command to view the status of the CR of a recovery job.
kubectl get restores.velero.io -n qiming-backend <recovery job CR name> -o yaml
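Instead of reading the full YAML of each CR, the phase of every backup or recovery CR can be listed in one table. The sketch below uses the custom-columns output format of kubectl; the function name velero_cr_phases is hypothetical.

```shell
# List the name and status.phase of all backup or recovery CRs.
# $1: "backups" (default) or "restores"
velero_cr_phases() {
  kind=${1:-backups}
  kubectl get "$kind.velero.io" -n qiming-backend \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
}
```

Calling velero_cr_phases restores lists the recovery job CRs. Any CR whose phase looks abnormal can then be inspected with the -o yaml commands above.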