Compare commits

...

8 Commits

Author SHA1 Message Date
8ca26e79b6 Update run_azcopy-rotae-log.md 2025-09-20 16:15:21 +01:00
Radek
7ff660ba90 remove test file. change sync to copy 2025-03-05 14:58:42 +00:00
830dc7cdc3 Update run_azcopy-multi-dir-multi-container.sh (try remove unicode hidden char) 2025-02-25 10:20:32 +00:00
Radek
042f0871d2 small fix / now tested working 2025-02-25 10:18:02 +00:00
Radek
c5c55e156f fix missing 2025-02-21 15:41:39 +00:00
Radek
383cd7e975 new README 2025-02-21 14:54:03 +00:00
Radek
c48b65a253 new format avoid WHILE loop READ 2025-02-21 14:50:28 +00:00
Radek
397d480141 howto to readme 2025-02-21 12:24:33 +00:00
7 changed files with 150 additions and 120 deletions

41
README.md Normal file

@@ -0,0 +1,41 @@
### **📌 How to Use**
#### **1️⃣ Create the Directory List File**
Example: `/opt/AZURE/dir2sync_list`
```plaintext
/opt/AZURE/BLOB > testcontainer115
/opt/AZURE/OTHERBLOB > testcontainer115b
/opt/AZURE/EXTRABLOB > testcontainer_extra
```
---
#### **2️⃣ Run the Script**
```bash
./run_azcopy.sh /opt/AZURE/dir2sync_list true 20
```
- `true` → Enables logging.
- `20` → Limits bandwidth to **20 Mbps**.
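Both optional arguments fall back to defaults inside the script, so a bare `./run_azcopy.sh /opt/AZURE/dir2sync_list` runs without logging and without a bandwidth cap. This is how the script assigns them:
```bash
LOGGING="${2:-false}"     # logging is off unless "true" is passed
BANDWIDTH_CAP="${3:-0}"   # 0 means no bandwidth cap
```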
---
#### **3️⃣ Cron Job (Runs Every Hour)**
```bash
0 * * * * /opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20
```
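The accompanying notes use a login shell wrapper for the same job so that variable expansion works correctly under cron:
```bash
0 * * * * /usr/bin/env bash -lc '/opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20'
```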
---
### **✅ Key Features**
- **Prevents multiple instances** using a lock file (`/tmp/run_azcopy.lock`); a sketch of this guard follows the list.
- **Reads a file with directory-to-container mappings.**
- **Validates each source directory before syncing.**
- **Checks that the destination container exists** (via a test write).
- **Stops immediately on any error** (clean runs only).
- **Monitors `azcopy` every 30 seconds** and stops if business hours begin.
- **Processes all directories sequentially using an array** (fixes the earlier `while read` loop issue).
- **Uses one log file for the entire run.**
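
The duplicate-instance guard is a plain PID lock file. A condensed sketch of the logic from `run_azcopy-multi-dir-multi-container.sh` (message text abridged here):
```bash
LOCK_FILE="/tmp/run_azcopy.lock"
if [[ -f "$LOCK_FILE" ]]; then
    PID=$(cat "$LOCK_FILE")
    if kill -0 "$PID" 2>/dev/null; then   # is the recorded PID still alive?
        echo "Already running as PID $PID. Exiting."
        exit 1
    fi
fi
echo $$ > "$LOCK_FILE"                    # record our own PID
```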

1
dir2sync_list Normal file

@@ -0,0 +1 @@
/opt/AZURE/BLOB > testcontainer115


@@ -1,90 +0,0 @@
# **Azure Blob Sync Script - How To Use**
This script syncs a directory to **Azure Blob Storage** using `azcopy` while:
- **Avoiding business hours** (default: 9 AM - 5 PM, configurable).
- **Preventing duplicate instances** (via a lock file).
- **Automatically resuming** unfinished jobs.
- **Logging progress & generating reports**.
---
## **📌 1. Script Usage**
Run the script manually:
```bash
./run_azcopy.sh <directory> [log=true|false] [bandwidth_mbps]
```
### **Example Commands**
- **Basic sync with no bandwidth limit:**
```bash
./run_azcopy.sh /opt/AZURE/ false
```
- **Enable logging & limit bandwidth to 10 Mbps:**
```bash
./run_azcopy.sh /opt/AZURE/ true 10
```
- **Run in the background & detach from terminal:**
```bash
nohup ./run_azcopy.sh /opt/AZURE/ true 10 & disown
```
---
## **⏲️ 2. Automating with Cron**
Schedule the script to **run every hour**:
```bash
crontab -e
```
Add this line:
```bash
0 * * * * /path/to/run_azcopy.sh /opt/AZURE/ true 10
```
### **How It Works**
- Runs at **00 minutes past every hour** (e.g., 1:00, 2:00, 3:00, etc.).
- If already running, the **next cron execution exits** to prevent duplicates.
- If interrupted (e.g., business hours), the **next run resumes**.
---
## **🔍 3. Checking If the Script Is Running**
Check if `azcopy` or the script is running:
```bash
pgrep -fl run_azcopy.sh
pgrep -fl azcopy
```
To **stop it manually**:
```bash
pkill -f azcopy
pkill -f run_azcopy.sh
```
---
## **📄 4. Checking Logs & Reports**
- **Sync Log (if enabled)**:
```bash
tail -f azcopy_log_*.txt
```
- **Completion Report**:
```bash
cat completion_report_*.txt
```
---
## **⚙️ 5. Customizing Business Hours**
Modify the script to change business hours:
```bash
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=17
```
---
## **✅ 6. Expected Behavior**
| Scenario | What Happens? |
|----------|--------------|
| **Cron runs script inside business hours** | Script exits immediately. |
| **Script is running when cron starts again** | Second instance exits to prevent duplicates. |
| **Sync job interrupted by business hours** | Next run resumes automatically. |
| **Sync completes normally** | Report logs all transferred files. |

11
notes Normal file

@@ -0,0 +1,11 @@
Cron job using the user's crontab:
#0 * * * * /usr/bin/env bash -lc '/opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20'
The above runs every hour using a bash login shell; this is needed so all variable expansions work correctly.
We run everything as the user, not as root.
There are checks, and error handling was created for some cases, but most likely not all, so logs should be checked on a regular basis.
Tested on a small set of files.
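An alternative (not tested here) would be to keep a normal shell and set PATH at the top of the crontab instead, e.g.:
PATH=/usr/local/bin:/usr/bin:/bin
0 * * * * /opt/AZURE/run_azcopy.sh /opt/AZURE/dir2sync_list true 20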

59
run_azcopy-multi-dir-multi-container.sh Normal file → Executable file

@@ -4,29 +4,35 @@
cd /opt/AZURE || exit 1
# Configurable variables
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=17
# Basic variables needed for this script to operate correctly.
BUSINESS_HOURS_START=16
BUSINESS_HOURS_END=20
AZURE_ACCOUNT=""
AZURE_SAS=""
LOCK_FILE="/tmp/run_azcopy.lock"
# Arguments
# From the command line. The first argument is mandatory; there is a check below if it is not provided.
SOURCE_LIST_FILE="$1"
LOGGING="${2:-false}" # Default to false
BANDWIDTH_CAP="${3:-0}" # Default is 0 (no cap)
# Report files
# A bunch of temp files used to generate logs and a simple completion report.
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
LOG_FILE="azcopy_log_$TIMESTAMP.txt"
COMPLETION_REPORT="completion_report_$TIMESTAMP.txt"
# Ensure source list file is provided
# This is the check for the source list file; no need to run if it is not present.
if [[ -z "$SOURCE_LIST_FILE" || ! -f "$SOURCE_LIST_FILE" ]]; then
echo "Usage: $0 <directory_list_file> [log=true|false] [bandwidth_mbps]"
exit 1
fi
# Lock file to prevent multiple instances
# Part of the checks to prevent duplicate running processes.
if [[ -f "$LOCK_FILE" ]]; then
PID=$(cat "$LOCK_FILE")
if kill -0 "$PID" 2>/dev/null; then
@@ -40,12 +46,15 @@ fi
echo $$ > "$LOCK_FILE"
# Function to check business hours
# This will ensure we do not run at times when the server might need to do other things.
is_business_hours() {
HOUR=$(printf "%d" "$(date +%H)") # Convert to decimal safely
HOUR=$(date +%H | sed 's/^0*//') # Strip leading zeros, which caused errors at morning times
#HOUR=$(printf "%d" "$(date +%H)") # Convert to decimal safely / did not work: printf rejects 08/09 as invalid octal
[[ $HOUR -ge $BUSINESS_HOURS_START && $HOUR -lt $BUSINESS_HOURS_END ]]
}
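# e.g. with START=16 and END=20, 16:00 counts as business hours but 20:00 does not (the end hour is exclusive)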
# Stop if running during business hours
# Uses the function above to stop the process if needed.
if is_business_hours; then
echo "Business hours detected ($BUSINESS_HOURS_START:00 - $BUSINESS_HOURS_END:00). Exiting..."
rm -f "$LOCK_FILE"
@@ -54,8 +63,16 @@ fi
echo "Starting sync job at $(date)" | tee -a "$LOG_FILE"
# Loop through directories in the list
while IFS=">" read -r SOURCE_DIR DEST_CONTAINER; do
# Read the directory list file into an array
# Done this way because cron uses a separate shell environment.
mapfile -t SYNC_JOBS < "$SOURCE_LIST_FILE"
# The part of the script that does the actual work.
# Loops through the array and processes each entry.
# It kills the azcopy process if it does not complete before the restriction window applies,
# and checks that each destination container exists, stopping if it does not.
for LINE in "${SYNC_JOBS[@]}"; do
IFS=">" read -r SOURCE_DIR DEST_CONTAINER <<< "$LINE"
SOURCE_DIR=$(echo "$SOURCE_DIR" | xargs) # Trim spaces
DEST_CONTAINER=$(echo "$DEST_CONTAINER" | xargs) # Trim spaces
@@ -72,49 +89,41 @@ while IFS=">" read -r SOURCE_DIR DEST_CONTAINER; do
fi
DEST_URL="$AZURE_ACCOUNT/$DEST_CONTAINER"
echo "Syncing $SOURCE_DIR to container: $DEST_CONTAINER" | tee -a "$LOG_FILE"
# Check if the container exists by attempting to write a small test file
TEST_FILE="$SOURCE_DIR/.azcopy_test_file"
touch "$TEST_FILE"
azcopy cp "$TEST_FILE" "$DEST_URL?$AZURE_SAS" > /dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "ERROR: Destination container $DEST_CONTAINER does not exist or is inaccessible. Exiting." | tee -a "$LOG_FILE"
rm -f "$TEST_FILE"
rm -f "$LOCK_FILE"
exit 1
fi
rm -f "$TEST_FILE"
# Run azcopy for actual sync in the background
# Run azcopy in the background (one directory at a time)
if [[ "$LOGGING" == "true" ]]; then
azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" | tee -a "$LOG_FILE" &
# @05-03-2025 testing copy instead of sync due to issues after archiving files in the blob
# azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" | tee -a "$LOG_FILE" &
azcopy copy "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --overwrite false --recursive --cap-mbps "$BANDWIDTH_CAP" | tee -a "$LOG_FILE" &
else
azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
# @05-03-2025 testing copy instead of sync due to issues after archiving files in the blob
# azcopy sync "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
azcopy copy "$SOURCE_DIR" "$DEST_URL?$AZURE_SAS" --overwrite false --recursive --cap-mbps "$BANDWIDTH_CAP" > /dev/null 2>&1 &
fi
AZCOPY_PID=$!
# Monitor the process
# Monitor the process every 30 seconds
while kill -0 $AZCOPY_PID 2>/dev/null; do
if is_business_hours; then
echo -e "\nBusiness hours started! Stopping azcopy..." | tee -a "$LOG_FILE"
kill $AZCOPY_PID
wait $AZCOPY_PID 2>/dev/null # Ensure the process is fully stopped
wait $AZCOPY_PID 2>/dev/null # Ensure process stops completely
rm -f "$LOCK_FILE"
exit 1
fi
sleep 30 # Check every 30 seconds
done
# Check for sync failure
# Check if sync failed
wait $AZCOPY_PID 2>/dev/null # Collect the background job's exit status; $? from the monitor loop alone is always 0
if [[ $? -ne 0 ]]; then
echo "ERROR: Sync failed for $SOURCE_DIR to $DEST_CONTAINER. Stopping script." | tee -a "$LOG_FILE"
rm -f "$LOCK_FILE"
exit 1
fi
done < "$SOURCE_LIST_FILE"
done
echo "All directories synced successfully!" | tee -a "$LOG_FILE"

55
run_azcopy-rotae-log.md Normal file

@@ -0,0 +1,55 @@
## Log rotate
```bash
# /etc/logrotate.d/azure_logs
/opt/AZURE/*.txt {
weekly
missingok
rotate 4
compress
delaycompress
notifempty
create 0644 root root
}
~/.azcopy/*.log {
weekly
missingok
rotate 4
compress
delaycompress
notifempty
create 0644 root root
}
```
### Explanation:
- **weekly**: Rotate the logs on a weekly basis.
- **missingok**: If the log file is missing, go on to the next one without issuing an error message.
- **rotate 4**: Keep only the last 4 weeks of logs.
- **compress**: Compress the rotated logs using gzip.
- **delaycompress**: Delay compression until the next rotation cycle. This means the current log file will not be compressed immediately after rotation, but the previous log files will be.
- **notifempty**: Do not rotate the log if it is empty.
- **create 0644 root root**: Create new log files with owner `root` and group `root`, and permissions `0644`.
### Steps to Apply the Configuration:
1. Save the above configuration in `/etc/logrotate.d/azure_logs`.
2. Ensure that the `logrotate` service is enabled and running on your system.
3. Test the logrotate configuration with the following command to ensure there are no syntax errors:
```bash
sudo logrotate -d /etc/logrotate.d/azure_logs
```
The `-d` option runs logrotate in debug mode, which will show you what actions would be taken without actually performing them.
4. If everything looks good, you can force a rotation to test it:
```bash
sudo logrotate -f /etc/logrotate.d/azure_logs
```
This will rotate the logs immediately according to the specified configuration.
By following these steps, your logs in `/opt/AZURE` and `~/.azcopy` should be rotated weekly, compressed, and kept for only the last 4 weeks.

13
run_azcopy-single-dir.sh Executable file → Normal file

@@ -1,15 +1,18 @@
#!/bin/bash
# Enter the specific 'work directory'. Ensures we collect logs and other files in one place.
cd /opt/AZURE || exit 1
# Configurable variables
BUSINESS_HOURS_START=7
BUSINESS_HOURS_END=15
AZURE_URL="https://<>.core.windows.net/<container>"
AZURE_SAS="<add key here>"
BUSINESS_HOURS_START=9
BUSINESS_HOURS_END=20
AZURE_URL=""
AZURE_SAS=""
# Arguments
SOURCE_DIR="$1"
LOGGING="${2:-false}" # Default to no logs
BANDWIDTH_CAP="${3:-0}" # Default is 0 (no cap)
BANDWIDTH_CAP="${3:-0}" # Default is 0 (no bw limit)
# Report files
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")