Fix code issues for backup retention
parent
684827bcf9
commit
325b5954af
|
|
@ -0,0 +1,208 @@
|
||||||
|
# Repository Initialization Fix
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
The installation completed but the repository initialization failed with:
|
||||||
|
```
|
||||||
|
[INSTALL] WARNING: Repository initialization failed
|
||||||
|
```
|
||||||
|
|
||||||
|
When you try to backup, you get:
|
||||||
|
```
|
||||||
|
ERROR: Unable to access the Restic repository. Aborting backup.
|
||||||
|
```
|
||||||
|
|
||||||
|
## Root Cause
|
||||||
|
|
||||||
|
The `/data` directory exists but either:
|
||||||
|
1. It's not mounted to Shared Storage
|
||||||
|
2. It has permission issues
|
||||||
|
3. Restic cannot create the repository structure
|
||||||
|
|
||||||
|
## Quick Fix - Manual Repository Initialization
|
||||||
|
|
||||||
|
### Step 1: SSH into your environment
|
||||||
|
```bash
|
||||||
|
# Connect to your compute node
|
||||||
|
ssh root@node-<your-node-id>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 2: Check /data mount status
|
||||||
|
```bash
|
||||||
|
# Check if /data is a mount point
|
||||||
|
df -h /data
|
||||||
|
mount | grep /data
|
||||||
|
|
||||||
|
# Expected: Should show it's mounted from shared storage
|
||||||
|
# If it shows just your local filesystem, /data is NOT mounted to shared storage
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 3: Check current repository status
|
||||||
|
```bash
|
||||||
|
# Set the password
|
||||||
|
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
||||||
|
|
||||||
|
# Try to access repository
|
||||||
|
restic -r /data snapshots
|
||||||
|
|
||||||
|
# If it shows "Fatal: unable to open config file" - repository needs initialization
|
||||||
|
# If it shows "wrong password" - password mismatch issue
|
||||||
|
# If it shows list of snapshots - repository is working (shouldn't have error)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 4: Initialize the repository manually
|
||||||
|
```bash
|
||||||
|
# Make sure /data is writable
|
||||||
|
chmod 755 /data
|
||||||
|
chown root:root /data
|
||||||
|
|
||||||
|
# Set password
|
||||||
|
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
||||||
|
|
||||||
|
# Initialize repository
|
||||||
|
restic -r /data init
|
||||||
|
|
||||||
|
# Expected output:
|
||||||
|
# created restic repository 1234abcd at /data
|
||||||
|
# Please note that knowledge of your password is required to access
|
||||||
|
# the repository. Losing your password means that your data is
|
||||||
|
# irrecoverably lost.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 5: Verify repository is accessible
|
||||||
|
```bash
|
||||||
|
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
||||||
|
restic -r /data snapshots
|
||||||
|
|
||||||
|
# Should show empty list (no snapshots yet) with no errors
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 6: Test backup
|
||||||
|
```bash
|
||||||
|
# Try running a manual backup from the addon dashboard
|
||||||
|
# Or via command line:
|
||||||
|
bash /home/litespeed/mb-backups/backup_all.sh manual
|
||||||
|
```
|
||||||
|
|
||||||
|
## If /data is NOT on Shared Storage
|
||||||
|
|
||||||
|
If Step 2 shows `/data` is just a local directory (not mounted), you need to mount shared storage:
|
||||||
|
|
||||||
|
### Option 1: Mount Shared Storage via Dashboard
|
||||||
|
|
||||||
|
1. Go to your environment in the dashboard
|
||||||
|
2. Click on "Settings" → "Endpoints"
|
||||||
|
3. Look for "Shared Storage" node
|
||||||
|
4. Ensure it's mounted to `/data` on your compute nodes
|
||||||
|
|
||||||
|
### Option 2: Manual Mount (temporary)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Find your shared storage node IP
|
||||||
|
# This is shown in your dashboard under "Shared Storage 2.0-10.5 Node ID: 9457"
|
||||||
|
|
||||||
|
# Mount shared storage (replace <storage-ip> with actual IP)
|
||||||
|
mount -t nfs <storage-ip>:/data /data
|
||||||
|
|
||||||
|
# Verify mount
|
||||||
|
df -h /data
|
||||||
|
```
|
||||||
|
|
||||||
|
### Option 3: Create Mount in Manifest
|
||||||
|
|
||||||
|
Add this to your environment topology (not the addon):
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
nodes:
|
||||||
|
- nodeType: storage
|
||||||
|
cloudlets: 8
|
||||||
|
nodeGroup: storage
|
||||||
|
|
||||||
|
# Then mount to compute nodes
|
||||||
|
- nodeGroup: cp
|
||||||
|
volumes:
|
||||||
|
- /data
|
||||||
|
volumeMounts:
|
||||||
|
storage:
|
||||||
|
path: /data
|
||||||
|
sourcePath: /data
|
||||||
|
```
|
||||||
|
|
||||||
|
## Verification Script
|
||||||
|
|
||||||
|
Run this to diagnose the issue:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
#!/bin/bash
|
||||||
|
echo "=== Backup Repository Diagnostics ==="
|
||||||
|
echo ""
|
||||||
|
echo "1. /data Mount Status:"
|
||||||
|
df -h /data
|
||||||
|
echo ""
|
||||||
|
echo "2. /data Permissions:"
|
||||||
|
ls -ld /data
|
||||||
|
echo ""
|
||||||
|
echo "3. /data Contents:"
|
||||||
|
ls -la /data
|
||||||
|
echo ""
|
||||||
|
echo "4. Password File:"
|
||||||
|
ls -l /etc/restic-password /data/.restic-password 2>/dev/null
|
||||||
|
echo ""
|
||||||
|
echo "5. Repository Check:"
|
||||||
|
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
||||||
|
restic -r /data snapshots 2>&1
|
||||||
|
echo ""
|
||||||
|
echo "6. Repository Structure:"
|
||||||
|
ls -la /data/ 2>/dev/null
|
||||||
|
echo ""
|
||||||
|
echo "=== End Diagnostics ==="
|
||||||
|
```
|
||||||
|
|
||||||
|
Save this as `diagnose.sh`, run `chmod +x diagnose.sh && ./diagnose.sh` and share the output.
|
||||||
|
|
||||||
|
## Common Issues & Solutions
|
||||||
|
|
||||||
|
### Issue 1: "Fatal: unable to open config file"
|
||||||
|
**Solution:** Repository not initialized - Run `restic -r /data init`
|
||||||
|
|
||||||
|
### Issue 2: "wrong password"
|
||||||
|
**Solution:** Password mismatch
|
||||||
|
```bash
|
||||||
|
# Check if passwords are different
|
||||||
|
diff /etc/restic-password /data/.restic-password
|
||||||
|
|
||||||
|
# Use the one from shared storage
|
||||||
|
cp /data/.restic-password /etc/restic-password
|
||||||
|
```
|
||||||
|
|
||||||
|
### Issue 3: "permission denied"
|
||||||
|
**Solution:** Fix permissions
|
||||||
|
```bash
|
||||||
|
chown -R root:root /data
|
||||||
|
chmod 755 /data
|
||||||
|
chmod 644 /etc/restic-password
|
||||||
|
```
|
||||||
|
|
||||||
|
### Issue 4: "/data is not on shared storage"
|
||||||
|
**Solution:** Mount shared storage properly (see above)
|
||||||
|
|
||||||
|
## After Fixing
|
||||||
|
|
||||||
|
Once the repository is initialized:
|
||||||
|
|
||||||
|
1. ✅ Test backup: Run "Backup Now" from dashboard
|
||||||
|
2. ✅ Verify: Check "View All Backups" shows your backup
|
||||||
|
3. ✅ The issue should be permanently fixed
|
||||||
|
|
||||||
|
## Prevention for Next Installation
|
||||||
|
|
||||||
|
Before reinstalling the addon:
|
||||||
|
1. ✅ Ensure Shared Storage is mounted to `/data`
|
||||||
|
2. ✅ Verify: `df /data` shows shared storage
|
||||||
|
3. ✅ Test: `touch /data/test && rm /data/test` works
|
||||||
|
4. ✅ Then install the addon
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Need help?** Share the output of the diagnostics script above.
|
||||||
|
|
||||||
|
|
@ -45,10 +45,16 @@ export RESTIC_PASSWORD=$(cat "$PASSWORD_FILE")
|
||||||
|
|
||||||
# Verify repository access
|
# Verify repository access
|
||||||
log_message "Verifying repository access..."
|
log_message "Verifying repository access..."
|
||||||
if ! restic -r "$BACKUP_PATH" snapshots > /dev/null 2>&1; then
|
REPO_CHECK=$(restic -r "$BACKUP_PATH" snapshots 2>&1)
|
||||||
|
REPO_RESULT=$?
|
||||||
|
if [ $REPO_RESULT -ne 0 ]; then
|
||||||
log_message "ERROR: Unable to access the Restic repository. Aborting backup."
|
log_message "ERROR: Unable to access the Restic repository. Aborting backup."
|
||||||
|
log_message "ERROR Details: $REPO_CHECK"
|
||||||
|
log_message "Repository: $BACKUP_PATH"
|
||||||
|
log_message "Password file: $PASSWORD_FILE"
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
log_message "✓ Repository is accessible"
|
||||||
|
|
||||||
# Acquire a global lock to serialize Restic operations
|
# Acquire a global lock to serialize Restic operations
|
||||||
log_message "Acquiring global lock for Restic operations..."
|
log_message "Acquiring global lock for Restic operations..."
|
||||||
|
|
|
||||||
|
|
@ -53,24 +53,64 @@ echo "[INSTALL] Initializing repository..."
|
||||||
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
export RESTIC_PASSWORD=$(cat /etc/restic-password)
|
||||||
export RESTIC_REPOSITORY=/data
|
export RESTIC_REPOSITORY=/data
|
||||||
|
|
||||||
|
# Verify /data exists and check if it's mounted storage
|
||||||
|
echo "[INSTALL] Checking /data directory..."
|
||||||
|
if [ ! -d "/data" ]; then
|
||||||
|
echo "[INSTALL] Creating /data directory..."
|
||||||
|
mkdir -p /data
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Check if /data is on a mount point (shared storage)
|
||||||
|
MOUNT_INFO=$(df /data | tail -1)
|
||||||
|
echo "[INSTALL] Storage info: $MOUNT_INFO"
|
||||||
|
|
||||||
|
# Verify /data has proper permissions
|
||||||
|
echo "[INSTALL] Checking /data permissions..."
|
||||||
|
if [ ! -w "/data" ]; then
|
||||||
|
echo "[INSTALL] /data is not writable, fixing permissions..."
|
||||||
|
chown -R root:root /data
|
||||||
|
chmod 755 /data
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Test write access
|
||||||
|
echo "[INSTALL] Testing write access to /data..."
|
||||||
|
if ! touch /data/.write_test 2>/dev/null; then
|
||||||
|
echo "[INSTALL] ERROR: Cannot write to /data!"
|
||||||
|
echo "[INSTALL] Please ensure /data is mounted to Shared Storage"
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
|
rm -f /data/.write_test
|
||||||
|
echo "[INSTALL] ✓ /data is writable"
|
||||||
|
|
||||||
# Check if repository is accessible with current password
|
# Check if repository is accessible with current password
|
||||||
|
echo "[INSTALL] Checking for existing repository..."
|
||||||
if restic snapshots >/dev/null 2>&1; then
|
if restic snapshots >/dev/null 2>&1; then
|
||||||
echo "[INSTALL] Repository already exists and is accessible"
|
echo "[INSTALL] ✓ Repository already exists and is accessible"
|
||||||
SNAPSHOT_COUNT=$(restic snapshots --json | jq '. | length' 2>/dev/null || echo "0")
|
SNAPSHOT_COUNT=$(restic snapshots --json 2>/dev/null | jq '. | length' 2>/dev/null || echo "0")
|
||||||
echo "[INSTALL] Found $SNAPSHOT_COUNT existing snapshot(s)"
|
echo "[INSTALL] ✓ Found $SNAPSHOT_COUNT existing snapshot(s)"
|
||||||
else
|
else
|
||||||
# Try to initialize - only works on empty/new repositories
|
# Try to initialize - show errors if it fails
|
||||||
echo "[INSTALL] Attempting to initialize repository..."
|
echo "[INSTALL] No existing repository found, initializing new repository..."
|
||||||
if restic init 2>/dev/null; then
|
INIT_OUTPUT=$(restic init 2>&1)
|
||||||
echo "[INSTALL] New repository initialized successfully"
|
INIT_RESULT=$?
|
||||||
|
|
||||||
|
if [ $INIT_RESULT -eq 0 ]; then
|
||||||
|
echo "[INSTALL] ✓ New repository initialized successfully"
|
||||||
|
# Verify it works
|
||||||
|
if restic snapshots >/dev/null 2>&1; then
|
||||||
|
echo "[INSTALL] ✓ Repository verified and accessible"
|
||||||
else
|
else
|
||||||
# Repository might exist but with different password or corrupted
|
echo "[INSTALL] ERROR: Repository created but not accessible!"
|
||||||
echo "[INSTALL] WARNING: Repository initialization failed"
|
exit 1
|
||||||
echo "[INSTALL] This could mean:"
|
fi
|
||||||
echo "[INSTALL] 1. Repository already exists (safe to ignore)"
|
else
|
||||||
echo "[INSTALL] 2. Password mismatch with existing repository"
|
echo "[INSTALL] ERROR: Repository initialization failed!"
|
||||||
echo "[INSTALL] 3. Permission issues"
|
echo "[INSTALL] Error details: $INIT_OUTPUT"
|
||||||
echo "[INSTALL] Please check repository manually if backups are missing"
|
echo "[INSTALL] This usually means:"
|
||||||
|
echo "[INSTALL] - /data is not persistent storage"
|
||||||
|
echo "[INSTALL] - Permission issues"
|
||||||
|
echo "[INSTALL] - Existing repository with different password"
|
||||||
|
exit 1
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
|
|
||||||
Loading…
Reference in New Issue