new
90
howto.md
Normal file
@@ -0,0 +1,90 @@
# **Azure Blob Sync Script - How To Use**

This script syncs a directory to **Azure Blob Storage** using `azcopy` while:
✅ **Avoiding business hours** (default: 9 AM - 5 PM, configurable).
✅ **Preventing duplicate instances** (via a lock file).
✅ **Automatically resuming** unfinished jobs.
✅ **Logging progress & generating reports**.

---

## **📌 1. Script Usage**
Run the script manually:
```bash
./run_azcopy.sh <directory> [log=true|false] [bandwidth_mbps]
```
### **Example Commands**
- **Basic sync with no bandwidth limit:**
```bash
./run_azcopy.sh /opt/AZURE/ false
```
- **Enable logging & limit bandwidth to 10 Mbps:**
```bash
./run_azcopy.sh /opt/AZURE/ true 10
```
- **Run in the background & detach from terminal:**
```bash
nohup ./run_azcopy.sh /opt/AZURE/ true 10 & disown
```

---

## **⏲️ 2. Automating with Cron**
Schedule the script to **run every hour**:
```bash
crontab -e
```
Add this line:
```bash
0 * * * * /path/to/run_azcopy.sh /opt/AZURE/ true 10
```
### **How It Works**
- Runs at **00 minutes past every hour** (e.g., 1:00, 2:00, 3:00).
- If a previous run is still active, the **next cron execution exits** to prevent duplicates.
- If a run is interrupted (e.g., because business hours start), the **next run outside business hours resumes** the job.
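
The duplicate-instance guard described above can be sketched as a small function. This is a minimal sketch, not the script's exact code: the function name `acquire_lock` and the demo lock path are assumptions made for illustration, while the PID-file logic follows the lock-file approach this guide describes.

```bash
#!/bin/bash
# acquire_lock LOCK_FILE
# Returns 0 and writes our PID if the lock was free or stale;
# returns 1 if a live process already holds it.
acquire_lock() {
    local lock_file="$1" pid
    if [[ -f "$lock_file" ]]; then
        pid=$(cat "$lock_file")
        # kill -0 sends no signal; it only tests whether the PID exists
        if [[ -n "$pid" ]] && kill -0 "$pid" 2>/dev/null; then
            return 1
        fi
        rm -f "$lock_file"   # the owner is gone: stale lock
    fi
    echo "$$" > "$lock_file"
}

# Usage with a hypothetical lock path
demo_lock="${TMPDIR:-/tmp}/acquire_lock_demo.$$"
if acquire_lock "$demo_lock"; then
    echo "lock acquired"
fi
rm -f "$demo_lock"
```

A stale lock is one whose recorded PID no longer exists, which is why an interrupted run does not block the next cron execution.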

---

## **🔍 3. Checking If the Script Is Running**
Check if `azcopy` or the script is running:
```bash
pgrep -fl run_azcopy.sh
pgrep -fl azcopy
```
To **stop it manually**:
```bash
pkill -f azcopy
pkill -f run_azcopy.sh
```

---

## **📄 4. Checking Logs & Reports**
- **Sync Log (if enabled)**:
```bash
tail -f azcopy_log_*.txt
```
- **Completion Report**:
```bash
cat completion_report_*.txt
```
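
Because each run writes a new timestamped report, the wildcard above prints every report in the directory. A minimal sketch for picking only the newest one follows. The function name `latest_report` is an assumption for illustration; it relies on the `completion_report_YYYYMMDD_HHMMSS.txt` naming used by `run_azcopy.sh`, where name order matches chronological order.

```bash
#!/bin/bash
# latest_report [DIR]
# Prints the newest completion report in DIR (default: current directory).
# Glob results are sorted by name, and the timestamp in the file name
# makes name order match chronological order.
latest_report() {
    local dir="${1:-.}" f newest=""
    for f in "$dir"/completion_report_*.txt; do
        [[ -e "$f" ]] && newest="$f"
    done
    [[ -n "$newest" ]] || return 1
    printf '%s\n' "$newest"
}

# Usage: show the most recent report, if there is one
if report=$(latest_report .); then
    cat "$report"
else
    echo "no reports found"
fi
```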

---

## **⚙️ 5. Customizing Business Hours**
Edit these variables at the top of `run_azcopy.sh` to change business hours. Hours use the 24-hour clock; the start hour is inclusive and the end hour is exclusive:
```bash
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=17
```
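
To make the window semantics concrete, here is a minimal sketch of the hour check. It mirrors the comparison in `run_azcopy.sh` (start hour inclusive, end hour exclusive), but the function name `in_business_hours` and its arguments are illustrative, not the script's interface. The `10#` prefix forces base 10 so that `date +%H` values such as `08` and `09` are not read as invalid octal numbers.

```bash
#!/bin/bash
# in_business_hours HOUR START END
# Succeeds when START <= HOUR < END. HOUR may be zero-padded ("08").
in_business_hours() {
    local hour=$((10#$1)) start=$2 end=$3
    (( hour >= start && hour < end ))
}

# Usage with the current hour and the 9-17 window shown above
if in_business_hours "$(date +%H)" 9 17; then
    echo "inside business hours"
else
    echo "outside business hours"
fi
```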

---

## **✅ 6. Expected Behavior**
| Scenario | What Happens? |
|----------|--------------|
| **Cron runs script inside business hours** | Script exits immediately. |
| **Script is running when cron starts again** | Second instance exits to prevent duplicates. |
| **Sync job interrupted by business hours** | Next run resumes automatically. |
| **Sync completes normally** | Completion report records the status, file counts, and percentage completed. |
137
run_azcopy.sh
Executable file
@@ -0,0 +1,137 @@
#!/bin/bash

# Configurable variables
BUSINESS_HOURS_START=7
BUSINESS_HOURS_END=15
AZURE_URL="https://<>.core.windows.net/<container>"
AZURE_SAS="<add key here>"

# Arguments
SOURCE_DIR="$1"
LOGGING="${2:-false}"  # Default to no logs
BANDWIDTH_CAP="${3:-0}"  # Default is 0 (no cap)

# Report files
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
FILE_LIST="file_list_$TIMESTAMP.txt"
COMPLETION_REPORT="completion_report_$TIMESTAMP.txt"
LOG_FILE="azcopy_log_$TIMESTAMP.txt"

# Lock file to prevent multiple instances
LOCK_FILE="/tmp/run_azcopy.lock"

# Check if another instance is running
if [[ -f "$LOCK_FILE" ]]; then
    PID=$(cat "$LOCK_FILE")
    if kill -0 "$PID" 2>/dev/null; then
        echo "Another instance (PID $PID) is already running. Exiting..."
        exit 1
    else
        echo "Stale lock file found. Removing..."
        rm -f "$LOCK_FILE"
    fi
fi

# Create lock file with current PID and remove it whenever the script exits,
# including the early exits below (usage error, business hours)
echo $$ > "$LOCK_FILE"
trap 'rm -f "$LOCK_FILE"' EXIT


# Function to check business hours
is_business_hours() {
    # Force base 10: zero-padded hours such as "08" and "09" would
    # otherwise be rejected as invalid octal numbers in the comparison
    HOUR=$((10#$(date +%H)))
    [[ $HOUR -ge $BUSINESS_HOURS_START && $HOUR -lt $BUSINESS_HOURS_END ]]
}

# Ensure source directory is provided
if [[ -z "$SOURCE_DIR" || ! -d "$SOURCE_DIR" ]]; then
    echo "Usage: $0 <directory> [log=true|false] [bandwidth_mbps]"
    exit 1
fi

# Check if already within business hours before starting
if is_business_hours; then
    echo "Business hours detected ($BUSINESS_HOURS_START:00 - $BUSINESS_HOURS_END:00). Exiting..."
    exit 1
fi

# Create initial file list
find "$SOURCE_DIR" -type f > "$FILE_LIST"
TOTAL_FILES=$(wc -l < "$FILE_LIST")

echo "Initial file list created: $TOTAL_FILES files found."

# Check for incomplete azcopy jobs
PENDING_JOB=$(azcopy jobs list --with-status=InProgress | awk '/JobId:/ {print $2; exit}')

if [[ -n "$PENDING_JOB" ]]; then
    echo "Resuming previous job: $PENDING_JOB"
    # The SAS token is not saved with the job, so pass it again on resume
    azcopy jobs resume "$PENDING_JOB" --destination-sas "$AZURE_SAS" &
    AZCOPY_PID=$!
else
    # Run azcopy in the background for a new sync
    if [[ "$LOGGING" == "true" ]]; then
        # Keep the SAS token out of the log file
        echo "Running: azcopy sync \"$SOURCE_DIR\" \"$AZURE_URL?<SAS redacted>\" --recursive --cap-mbps $BANDWIDTH_CAP" | tee -a "$LOG_FILE"
        # Process substitution instead of a pipe, so that $! is the PID of azcopy and not of tee
        azcopy sync "$SOURCE_DIR" "$AZURE_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" > >(tee -a "$LOG_FILE") &
    else
        azcopy sync "$SOURCE_DIR" "$AZURE_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
    fi
    AZCOPY_PID=$!
fi

echo "azcopy started with PID $AZCOPY_PID"

# Monitor the process
INTERRUPTED=false
while kill -0 "$AZCOPY_PID" 2>/dev/null; do
    if is_business_hours; then
        echo -e "\nBusiness hours started! Stopping azcopy..."
        kill "$AZCOPY_PID"
        wait "$AZCOPY_PID" 2>/dev/null  # Ensure the process is fully stopped
        INTERRUPTED=true
        break
    fi
    sleep 30  # Check every 30 seconds
done

# Generate completion report
if [[ "$INTERRUPTED" == "true" ]]; then
    STATUS="Interrupted due to business hours, can be resumed"
else
    STATUS="Completed normally"
fi

# Extract job summary from log (only if logging is enabled)
if [[ "$LOGGING" == "true" && -f "$LOG_FILE" ]]; then
    COMPLETED_FILES=$(grep -oP 'Number of Copy Transfers Completed:\s+\K\d+' "$LOG_FILE" | tail -n1)
    FAILED_FILES=$(grep -oP 'Number of Copy Transfers Failed:\s+\K\d+' "$LOG_FILE" | tail -n1)
    TOTAL_FILES=$(grep -oP 'Total Number of Copy Transfers:\s+\K\d+' "$LOG_FILE" | tail -n1)

    # If values are empty, fallback to 0
    COMPLETED_FILES=${COMPLETED_FILES:-0}
    FAILED_FILES=${FAILED_FILES:-0}
    TOTAL_FILES=${TOTAL_FILES:-$COMPLETED_FILES}

    # Calculate percentage
    if [[ "$TOTAL_FILES" -eq 0 ]]; then
        PERCENT_COMPLETE=0
    else
        PERCENT_COMPLETE=$(( COMPLETED_FILES * 100 / TOTAL_FILES ))
    fi
else
    PERCENT_COMPLETE=0
    COMPLETED_FILES=0
    FAILED_FILES=0
    TOTAL_FILES=0
fi

echo "Sync Status: $STATUS" > "$COMPLETION_REPORT"
echo "Total Files: $TOTAL_FILES" >> "$COMPLETION_REPORT"
echo "Completed Files: $COMPLETED_FILES" >> "$COMPLETION_REPORT"
echo "Failed Files: $FAILED_FILES" >> "$COMPLETION_REPORT"
echo "Percentage Completed: $PERCENT_COMPLETE%" >> "$COMPLETION_REPORT"

echo "Completion report generated: $COMPLETION_REPORT"

# Remove lock file when done
rm -f "$LOCK_FILE"

exit 0
83
tests.md
Normal file
@@ -0,0 +1,83 @@
# 1st test.

Just run.

Detected business hours and stopped.

---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/ true 20
Business hours detected (9:00 - 17:00). Exiting...

# 2nd test.

Changed business hours to 13 to 16, a window outside the time of the test, to see if the script operates as expected.

Started the sync just before 13 to see whether a file gets copied to the blob, whether the bandwidth limit works, and whether the script is stopped before finishing once the clock passes 13.

---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/ true 10 &
[2] 8109
[test@alma-azure-test AZURE]$ Initial file list created: 8 files found.
Running: azcopy sync "/opt/AZURE/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 10 &
azcopy started with PID 8128
Error: 2 arguments source and destination are required for this command. Number of commands passed 3

Issues going to background.
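
The error above reports three arguments, and the echoed command ends in `&`. One plausible reading (an assumption, since the version of the script used in this test is not shown) is that the `&` reached `azcopy` as a literal third argument instead of being interpreted by the shell. The sketch below illustrates the difference with a stand-in function `count_args`; both the function and the URL are hypothetical.

```bash
#!/bin/bash
# count_args stands in for azcopy: it reports how many arguments it received.
count_args() { echo "received $# arguments"; }

# Keep the command in an array so each argument stays a separate word.
cmd=(count_args "/opt/AZURE/" "https://example.invalid/container")

# A quoted "&" is ordinary data: the command sees three arguments.
"${cmd[@]}" "&"

# An unquoted & after the command is shell syntax: the command sees two
# arguments and runs in the background.
"${cmd[@]}" &
wait $!
```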

# 3rd test.

Amended the way the function calls azcopy to better handle background tasks. Moved some files out of the directory being synced in order to do a basic sync test.

---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/BLOB/ true 20
Initial file list created: 1 files found.
Running: azcopy sync "/opt/AZURE/BLOB/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 20
azcopy started with PID 8493
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job 9692ef9e-3872-f64c-5dbb-6f8fd8ad220e has started
Log file is located at: /home/test/.azcopy/9692ef9e-3872-f64c-5dbb-6f8fd8ad220e.log

100.0 %, 1 Done, 0 Failed, 0 Pending, 1 Total, 2-sec Throughput (Mb/s): 9.1727

Job 9692ef9e-3872-f64c-5dbb-6f8fd8ad220e Summary
Files Scanned at Source: 1
Files Scanned at Destination: 2
Elapsed Time (Minutes): 4.4351
Number of Copy Transfers for Files: 1
Number of Copy Transfers for Folder Properties: 0
Total Number of Copy Transfers: 1
Number of Copy Transfers Completed: 1
Number of Copy Transfers Failed: 0
Number of Deletions at Destination: 0
Total Number of Bytes Transferred: 662880628
Total Number of Bytes Enumerated: 662880628
Final Job Status: Completed

Completion report generated: completion_report_20250220_132836.txt

# 4th test.

Changed business hours to test locally a second time, this time running into the restricted window to check whether the script stops the process. The window is 14-15, so it should stop at 14 and then resume at 15.
We will use local execution to test whether it stops, and then use cron to test whether it resumes as it should.

---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/BLOB/ true 20
Initial file list created: 4 files found.
Running: azcopy sync "/opt/AZURE/BLOB/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 20
azcopy started with PID 8654
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support

Job 13c1945e-3c6e-dd42-49c0-2c6722ced7ec has started
Log file is located at: /home/test/.azcopy/13c1945e-3c6e-dd42-49c0-2c6722ced7ec.log

54.7 %, 1 Done, 0 Failed, 2 Pending, 3 Total, 2-sec Throughput (Mb/s): 20.1759Business hours started! Stopping azcopy...
./run_azcopy.sh: line 95: COMPLETED_FILES * 100 / TOTAL_FILES : division by 0 (error token is "TOTAL_FILES ")
Completion report generated: completion_report_20250220_135033.txt

# 5th test.

Fixed the edge case where 0 files were transferred, or the count was incorrectly parsed from the log, causing a division by 0 error. Added a check for whether the script is still running to ensure we do not spawn many processes.
It should start via cron at 15 and then be re-run every hour until 7 AM, when it should detect the restricted business hours window and do nothing.
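
The division-by-zero fix described above can be sketched as a guarded percentage calculation. This is a minimal sketch under the assumption that missing counts should be treated as 0; the function name `percent_complete` is illustrative and does not appear in the script.

```bash
#!/bin/bash
# percent_complete COMPLETED TOTAL
# Prints the integer percentage, or 0 when TOTAL is 0 or missing
# (for example, when azcopy was stopped before it printed a job summary).
percent_complete() {
    local completed=${1:-0} total=${2:-0}
    if (( total == 0 )); then
        echo 0
    else
        echo $(( completed * 100 / total ))
    fi
}

percent_complete 1 3    # one of three transfers done
percent_complete "" ""  # no summary found in the log
```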

---