Compare commits


10 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| | 8ca26e79b6 | Update run_azcopy-rotae-log.md | 2025-09-20 16:15:21 +01:00 |
| Radek | 7ff660ba90 | remove test file. change sync to copy | 2025-03-05 14:58:42 +00:00 |
| | 830dc7cdc3 | Update run_azcopy-multi-dir-multi-container.sh (try remove unicode hidden char) | 2025-02-25 10:20:32 +00:00 |
| Radek | 042f0871d2 | small fix / now tested working | 2025-02-25 10:18:02 +00:00 |
| Radek | c5c55e156f | fix missing | 2025-02-21 15:41:39 +00:00 |
| Radek | 383cd7e975 | new README | 2025-02-21 14:54:03 +00:00 |
| Radek | c48b65a253 | new format avoid WHILE loop READ | 2025-02-21 14:50:28 +00:00 |
| Radek | 397d480141 | howto to readme | 2025-02-21 12:24:33 +00:00 |
| Radek | e7c1154372 | multi dir / multi container version | 2025-02-21 12:23:39 +00:00 |
| Radek | ffd8fb4a34 | multi dir / multi container version | 2025-02-21 12:23:18 +00:00 |
8 changed files with 253 additions and 196 deletions

README.md (new file)

@@ -0,0 +1,41 @@
### **📌 How to Use**
#### **1⃣ Create the Directory List File**
Example: `/opt/AZURE/dir2sync_list`
```plaintext
/opt/AZURE/BLOB > testcontainer115
/opt/AZURE/OTHERBLOB > testcontainer115b
/opt/AZURE/EXTRABLOB > testcontainer_extra
```
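Each mapping line is split on `>` into a source directory and a destination container, with `xargs` trimming stray whitespace, as the script itself does. A minimal sketch of that parsing (the `parse_mapping` helper name is illustrative, not part of the script):

```shell
# Split one "source > container" mapping line the way the script does.
parse_mapping() {
  local line="$1" src dest
  IFS=">" read -r src dest <<< "$line"
  src=$(echo "$src" | xargs)    # trim surrounding whitespace
  dest=$(echo "$dest" | xargs)
  printf '%s|%s\n' "$src" "$dest"
}
```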
---
#### **2⃣ Run the Script**
```bash
./run_azcopy.sh /opt/AZURE/dir2sync_list true 20
```
- `true` → Enables logging.
- `20` → Limits bandwidth to **20 Mbps**.
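The two optional arguments fall back to safe defaults via bash parameter expansion, as the script does with `${2:-false}` and `${3:-0}`. A small sketch (the `show_args` name is illustrative):

```shell
# Demonstrate how omitted optional arguments default.
show_args() {
  local logging="${2:-false}"   # second arg: logging, off by default
  local cap="${3:-0}"           # third arg: bandwidth cap, 0 = unlimited
  echo "$1 logging=$logging cap=$cap"
}
```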
---
#### **3⃣ Cron Job (Runs Every Hour)**
```bash
0 * * * * /opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20
```
---
### **✅ Key Features**
- **Prevents multiple instances** using a lock file (`/tmp/run_azcopy.lock`).
- **Reads a file with directory-to-container mappings.**
- **Validates each source directory before syncing.**
- **Checks if the destination container exists** (test write).
- **Stops immediately on any error** (clean runs only).
- **Monitors `azcopy` every 30 seconds** and stops if business hours begin.
- **Processes all directories sequentially using an array** (fixing the while-read loop issue).
- **Uses one log file for the entire run.**
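The lock-file mechanism records the script's PID and refuses to start while that PID is still alive, clearing a stale lock otherwise. A condensed sketch of the same pattern (the `acquire_lock` helper name is illustrative):

```shell
# Claim a PID lock file; fail if another live instance holds it.
acquire_lock() {
  local lock="$1"
  if [[ -f "$lock" ]] && kill -0 "$(cat "$lock")" 2>/dev/null; then
    return 1          # a live instance already holds the lock
  fi
  echo $$ > "$lock"   # claim the lock (or reclaim a stale one)
}
```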

dir2sync_list (new file)

@@ -0,0 +1 @@
/opt/AZURE/BLOB > testcontainer115

(deleted file)

@@ -1,90 +0,0 @@
# **Azure Blob Sync Script - How To Use**
This script syncs a directory to **Azure Blob Storage** using `azcopy` while:
- **Avoiding business hours** (default: 9 AM - 5 PM, configurable).
- **Preventing duplicate instances** (via a lock file).
- **Automatically resuming** unfinished jobs.
- **Logging progress & generating reports**.
---
## **📌 1. Script Usage**
Run the script manually:
```bash
./run_azcopy.sh <directory> [log=true|false] [bandwidth_mbps]
```
### **Example Commands**
- **Basic sync with no bandwidth limit:**
```bash
./run_azcopy.sh /opt/AZURE/ false
```
- **Enable logging & limit bandwidth to 10 Mbps:**
```bash
./run_azcopy.sh /opt/AZURE/ true 10
```
- **Run in the background & detach from terminal:**
```bash
nohup ./run_azcopy.sh /opt/AZURE/ true 10 & disown
```
---
## **⏲️ 2. Automating with Cron**
Schedule the script to **run every hour**:
```bash
crontab -e
```
Add this line:
```bash
0 * * * * /path/to/run_azcopy.sh /opt/AZURE/ true 10
```
### **How It Works**
- Runs at **00 minutes past every hour** (e.g., 1:00, 2:00, 3:00, etc.).
- If already running, the **next cron execution exits** to prevent duplicates.
- If interrupted (e.g., business hours), the **next run resumes**.
---
## **🔍 3. Checking If the Script Is Running**
Check if `azcopy` or the script is running:
```bash
pgrep -fl run_azcopy.sh
pgrep -fl azcopy
```
To **stop it manually**:
```bash
pkill -f azcopy
pkill -f run_azcopy.sh
```
---
## **📄 4. Checking Logs & Reports**
- **Sync Log (if enabled)**:
```bash
tail -f azcopy_log_*.txt
```
- **Completion Report**:
```bash
cat completion_report_*.txt
```
---
## **⚙️ 5. Customizing Business Hours**
Modify the script to change business hours:
```bash
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=17
```
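Under the hood this is a numeric comparison on the current hour. Note that `date +%H` returns zero-padded values like `09`, which bash arithmetic would reject as invalid octal; `10#` forces base 10. A testable sketch with the hour passed as a parameter (`in_window` is an illustrative name; the script's `is_business_hours` uses a `sed` workaround instead):

```shell
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=17

# Return success if the given zero-padded hour falls inside the window.
in_window() {
  local hour=$((10#$1))   # 10# forces base-10, so "09" does not break
  (( hour >= BUSINESS_HOURS_START && hour < BUSINESS_HOURS_END ))
}
```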
---
## **✅ 6. Expected Behavior**
| Scenario | What Happens? |
|----------|--------------|
| **Cron runs script inside business hours** | Script exits immediately. |
| **Script is running when cron starts again** | Second instance exits to prevent duplicates. |
| **Sync job interrupted by business hours** | Next run resumes automatically. |
| **Sync completes normally** | Report logs all transferred files. |

notes (new file)

@@ -0,0 +1,11 @@
Cron job using the user's crontab:
#0 * * * * /usr/bin/env bash -lc '/opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20'
The above runs every hour using a bash login shell; this is needed so that all variable expansions work correctly.
We run everything as the user, not root.
Checks and error handling were created for some cases, but most likely not all, so the logs should be checked on a regular basis.
Tested on a small set of files.

(new file)

@@ -0,0 +1,137 @@
#!/bin/bash
# Enter the specific 'work directory'. Ensures we collect logs and other files in one place.
cd /opt/AZURE || exit 1
# Configurable variables
# Basic variables needed for this script to operate correctly.
BUSINESS_HOURS_START=16
BUSINESS_HOURS_END=20
AZURE_ACCOUNT=""
AZURE_SAS=""
LOCK_FILE="/tmp/run_azcopy.lock"
# Arguments
# From the command line. The first is mandatory; there is a check if it is not provided.
SOURCE_LIST_FILE="$1"
LOGGING="${2:-false}" # Default to false
BANDWIDTH_CAP="${3:-0}" # Default is 0 (no cap)
# Report files
# A bunch of temp files used to generate logs and a simple completion report.
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
LOG_FILE="azcopy_log_$TIMESTAMP.txt"
COMPLETION_REPORT="completion_report_$TIMESTAMP.txt"
# Ensure source list file is provided
# This is the check for the source list file; there is no need to run if it is not present.
if [[ -z "$SOURCE_LIST_FILE" || ! -f "$SOURCE_LIST_FILE" ]]; then
echo "Usage: $0 <directory_list_file> [log=true|false] [bandwidth_mbps]"
exit 1
fi
# Lock file to prevent multiple instances
# Part of the checks to prevent duplicate running processes.
if [[ -f "$LOCK_FILE" ]]; then
PID=$(cat "$LOCK_FILE")
if kill -0 "$PID" 2>/dev/null; then
echo "Another instance (PID $PID) is already running. Exiting..."
exit 1
else
echo "Stale lock file found. Removing..."
rm -f "$LOCK_FILE"
fi
fi
echo $$ > "$LOCK_FILE"
# Function to check business hours
# This will ensure we do not run at times when the server might need to do other things.
is_business_hours() {
HOUR=$((10#$(date +%H))) # Force base-10 so zero-padded hours (e.g. 09) are not parsed as octal; also safe at midnight (00)
#HOUR=$(date +%H | sed 's/^0*//') # Previous workaround: strips leading zeros, but leaves an empty string at hour 00
#HOUR=$(printf "%d" "$(date +%H)") # Convert to decimal safely / this had an issue for some reason not working
[[ $HOUR -ge $BUSINESS_HOURS_START && $HOUR -lt $BUSINESS_HOURS_END ]]
}
# Stop if running during business hours
# Uses the previous function to stop the process if needed.
if is_business_hours; then
echo "Business hours detected ($BUSINESS_HOURS_START:00 - $BUSINESS_HOURS_END:00). Exiting..."
rm -f "$LOCK_FILE"
exit 1
fi
echo "Starting sync job at $(date)" | tee -a "$LOG_FILE"
# Read the directory list file into an array
# Done this way because cron uses a separate shell environment.
mapfile -t SYNC_JOBS < "$SOURCE_LIST_FILE"
# The actual part of the script that does the job.
# Loops through the array and processes each entry.
# Also stops the transfer if it does not complete before the restriction window applies,
# and checks that a destination container is specified, stopping if it is not.
for LINE in "${SYNC_JOBS[@]}"; do
IFS=">" read -r SOURCE_DIR DEST_CONTAINER <<< "$LINE"
SOURCE_DIR=$(echo "$SOURCE_DIR" | xargs) # Trim spaces
DEST_CONTAINER=$(echo "$DEST_CONTAINER" | xargs) # Trim spaces
if [[ -z "$SOURCE_DIR" || ! -d "$SOURCE_DIR" ]]; then
echo "ERROR: Invalid directory: $SOURCE_DIR. Exiting." | tee -a "$LOG_FILE"
rm -f "$LOCK_FILE"
exit 1
fi
if [[ -z "$DEST_CONTAINER" ]]; then
echo "ERROR: No destination container specified for $SOURCE_DIR. Exiting." | tee -a "$LOG_FILE"
rm -f "$LOCK_FILE"
exit 1
fi
DEST_URL="$AZURE_ACCOUNT/$DEST_CONTAINER"
echo "Syncing $SOURCE_DIR to container: $DEST_CONTAINER" | tee -a "$LOG_FILE"
# Run azcopy in the background (one directory at a time)
if [[ "$LOGGING" == "true" ]]; then
# @05-03-2025 testing copy instead of sync due to issues after archiving files in the blob
# azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" | tee -a "$LOG_FILE" &
azcopy copy "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --overwrite false --recursive --cap-mbps "$BANDWIDTH_CAP" | tee -a "$LOG_FILE" &
else
# @05-03-2025 testing copy instead of sync due to issues after archiving files in the blob
# azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
azcopy copy "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --overwrite false --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
fi
AZCOPY_PID=$!
# Monitor the process every 30 seconds
while kill -0 $AZCOPY_PID 2>/dev/null; do
if is_business_hours; then
echo -e "\nBusiness hours started! Stopping azcopy..." | tee -a "$LOG_FILE"
kill $AZCOPY_PID
wait $AZCOPY_PID 2>/dev/null # Ensure process stops completely
rm -f "$LOCK_FILE"
exit 1
fi
sleep 30 # Check every 30 seconds
done
# Check if the transfer failed. The monitoring loop clobbers $?, so recover the background job's exit status via wait.
wait "$AZCOPY_PID" 2>/dev/null
if [[ $? -ne 0 ]]; then
echo "ERROR: Sync failed for $SOURCE_DIR to $DEST_CONTAINER. Stopping script." | tee -a "$LOG_FILE"
rm -f "$LOCK_FILE"
exit 1
fi
done
echo "All directories synced successfully!" | tee -a "$LOG_FILE"
# Generate completion report
echo "Sync Completed: $(date)" > "$COMPLETION_REPORT"
echo "All directories listed in $SOURCE_LIST_FILE have been synced." >> "$COMPLETION_REPORT"
echo "Completion report generated: $COMPLETION_REPORT"
rm -f "$LOCK_FILE"
exit 0
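One subtlety the script does not handle: with `azcopy … | tee -a "$LOG_FILE" &`, `$!` is the PID of `tee`, not of `azcopy`, so the monitor kills the logger rather than the transfer. A possible refinement (my suggestion, not part of the committed script) is process substitution, which keeps `$!` on the worker itself; here a `sleep` stands in for `azcopy`:

```shell
log=$(mktemp)

# Stand-in for azcopy: with '> >(tee ...)' the command itself is the
# background job, so $! is its PID and kill/wait act on the transfer.
sleep 60 > >(tee -a "$log") &
WORKER=$!

kill "$WORKER"                    # stops the worker directly, not tee
wait "$WORKER" 2>/dev/null || true
rm -f "$log"
```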

run_azcopy-rotae-log.md (new file)

@@ -0,0 +1,55 @@
## Log rotate
```bash
# /etc/logrotate.d/azure_logs
/opt/AZURE/*.txt {
weekly
missingok
rotate 4
compress
delaycompress
notifempty
create 0644 root root
}
# logrotate does not expand '~'; use the absolute home directory path
/home/test/.azcopy/*.log {
weekly
missingok
rotate 4
compress
delaycompress
notifempty
create 0644 root root
}
```
### Explanation:
- **weekly**: Rotate the logs on a weekly basis.
- **missingok**: If the log file is missing, go on to the next one without issuing an error message.
- **rotate 4**: Keep only the last 4 weeks of logs.
- **compress**: Compress the rotated logs using gzip.
- **delaycompress**: Delay compression until the next rotation cycle. This means the current log file will not be compressed immediately after rotation, but the previous log files will be.
- **notifempty**: Do not rotate the log if it is empty.
- **create 0644 root root**: Create new log files with owner `root` and group `root`, and permissions `0644`.
### Steps to Apply the Configuration:
1. Save the above configuration in `/etc/logrotate.d/azure_logs`.
2. Ensure that the `logrotate` service is enabled and running on your system.
3. Test the logrotate configuration with the following command to ensure there are no syntax errors:
```bash
sudo logrotate -d /etc/logrotate.d/azure_logs
```
The `-d` option runs logrotate in debug mode, which will show you what actions would be taken without actually performing them.
4. If everything looks good, you can force a rotation to test it:
```bash
sudo logrotate -f /etc/logrotate.d/azure_logs
```
This will rotate the logs immediately according to the specified configuration.
By following these steps, your logs in `/opt/AZURE` and `~/.azcopy` should be rotated weekly, compressed, and kept for only the last 4 weeks.

run_azcopy.sh → run_azcopy-single-dir.sh (renamed; Executable file → Normal file)

@@ -1,15 +1,18 @@
 #!/bin/bash
+# Enter the specific 'work directory'. Ensures we collect logs and other files in one place.
+cd /opt/AZURE || exit 1
 # Configurable variables
-BUSINESS_HOURS_START=7
+BUSINESS_HOURS_START=9
-BUSINESS_HOURS_END=15
+BUSINESS_HOURS_END=20
-AZURE_URL="https://<>.core.windows.net/<container>"
+AZURE_URL=""
-AZURE_SAS="<add key here>"
+AZURE_SAS=""
 # Arguments
 SOURCE_DIR="$1"
 LOGGING="${2:-false}" # Default to no logs
-BANDWIDTH_CAP="${3:-0}" # Default is 0 (no cap)
+BANDWIDTH_CAP="${3:-0}" # Default is 0 (no bw limit)
 # Report files
 TIMESTAMP=$(date +"%Y%m%d_%H%M%S")

tests.md (deleted)

@@ -1,101 +0,0 @@
# 1st test.
Just run.
Detected business hours and stopped.
---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/ true 20
Business hours detected (9:00 - 17:00). Exiting...
# 2nd test.
Changed business hours to something outside the time of the test (13 to 16) to see if it operates as expected.
Started the sync just before 13:00 to see if a file gets copied to the blob, whether the bandwidth limit works, and whether the script gets stopped before finishing because it runs past 13:00.
---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/ true 10 &
[2] 8109
[test@alma-azure-test AZURE]$ Initial file list created: 8 files found.
Running: azcopy sync "/opt/AZURE/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 10 &
azcopy started with PID 8128
Error: 2 arguments source and destination are required for this command. Number of commands passed 3
Issues with going to the background.
# 3rd test.
Amended the way the function calls azcopy to better handle background tasks. Moved some files out of the directory to sync in order to do a basic sync test.
---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/BLOB/ true 20
Initial file list created: 1 files found.
Running: azcopy sync "/opt/AZURE/BLOB/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 20
azcopy started with PID 8493
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
Job 9692ef9e-3872-f64c-5dbb-6f8fd8ad220e has started
Log file is located at: /home/test/.azcopy/9692ef9e-3872-f64c-5dbb-6f8fd8ad220e.log
100.0 %, 1 Done, 0 Failed, 0 Pending, 1 Total, 2-sec Throughput (Mb/s): 9.1727
Job 9692ef9e-3872-f64c-5dbb-6f8fd8ad220e Summary
Files Scanned at Source: 1
Files Scanned at Destination: 2
Elapsed Time (Minutes): 4.4351
Number of Copy Transfers for Files: 1
Number of Copy Transfers for Folder Properties: 0
Total Number of Copy Transfers: 1
Number of Copy Transfers Completed: 1
Number of Copy Transfers Failed: 0
Number of Deletions at Destination: 0
Total Number of Bytes Transferred: 662880628
Total Number of Bytes Enumerated: 662880628
Final Job Status: Completed
Completion report generated: completion_report_20250220_132836.txt
# 4th test.
Changed business hours to test locally a second time going into the restricted window, to check that it stops the process. The window is 14-15, so it should stop at 14:00 and then resume at 15:00.
We will use local execution to test that it stops, and then test that cron resumes it as it should.
---
[test@alma-azure-test AZURE]$ ./run_azcopy.sh /opt/AZURE/BLOB/ true 20
Initial file list created: 4 files found.
Running: azcopy sync "/opt/AZURE/BLOB/" "https://115.blob.core.windows.net/115?sv=2022-11-02&ss=bfqt&srt=sco&sp=rwlacupitfx&se=2025-02-20T18:39:49Z&st=2025-02-19T10:39:49Z&spr=https&sig=xxxx" --recursive --cap-mbps 20
azcopy started with PID 8654
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
Job 13c1945e-3c6e-dd42-49c0-2c6722ced7ec has started
Log file is located at: /home/test/.azcopy/13c1945e-3c6e-dd42-49c0-2c6722ced7ec.log
54.7 %, 1 Done, 0 Failed, 2 Pending, 3 Total, 2-sec Throughput (Mb/s): 20.1759Business hours started! Stopping azcopy...
./run_azcopy.sh: line 95: COMPLETED_FILES * 100 / TOTAL_FILES : division by 0 (error token is "TOTAL_FILES ")
Completion report generated: completion_report_20250220_135033.txt
# 5th test.
Fixed an edge case where 0 files were transferred, or were incorrectly processed from the log, creating a division-by-zero error. Added a check for whether it is still running, to ensure we do not spawn many processes.
It should start via cron at 15:00 and then be re-run every hour until 7 AM, when it should detect the restricted business-hours window and do nothing.
Failed due to how cron and its environment work.
---
Feb 20 15:25:02 alma-azure-test CROND[8956]: (test) CMD (/usr/bin/env bash -lc /opt/AZURE/run_azcopy.sh /opt/AZURE/BLOB/ true 20)
Feb 20 15:25:02 alma-azure-test CROND[8954]: (test) CMDOUT (Usage: /opt/AZURE/run_azcopy.sh <directory> [log=true|false] [bandwidth_mbps])
Feb 20 15:25:02 alma-azure-test CROND[8954]: (test) CMDEND (/usr/bin/env bash -lc /opt/AZURE/run_azcopy.sh /opt/AZURE/BLOB/ true 20)
# 6th test.
Modified the cron job definition to:
27 * * * * /usr/bin/env bash -lc '/opt/AZURE/run_azcopy.sh /opt/AZURE/BLOB/ true 20'
This should ensure proper execution in the user's environment.
---
That worked and it's running in the background.