Compare commits
1e286536e3...main (11 commits)
| Author | SHA1 | Date |
|---|---|---|
| | b01c162db9 | |
| | b87f733969 | |
| | cbaa07676c | |
| | 1ce128ff43 | |
| | 3973e7a803 | |
| | 4e9b920334 | |
| | d588116f6b | |
| | cdac727a28 | |
| | 222e8c99af | |
| | 5da6f7e95c | |
| | c7a96e29c1 | |
3
.gitignore
vendored
@@ -1,3 +1,6 @@
+# ---> Aii
+.crush
+
 # ---> Python
 # Byte-compiled / optimized / DLL files
 __pycache__/
34
CRUSH.md
Normal file
@@ -0,0 +1,34 @@
+# Fedora Metalink Proxy Codebase Guide
+
+## Build/Lint/Test Commands
+
+- **Build**: `docker-compose up --build` - Builds and runs the Docker container
+- **Run**: `docker-compose up` - Starts the application
+- **Test**: `python -m unittest discover` - Runs all tests (no test files found)
+- **Lint**: `pylint app.py` - Lints the main application file
+
+## Code Style Guidelines
+
+- **Imports**: Grouped by standard library, third-party, local imports
+- **Formatting**: 4-space indentation, spaces around operators
+- **Types**: Use type hints for function signatures
+- **Naming**: snake_case for variables/functions, CamelCase for classes
+- **Error Handling**: Use try/except blocks for network operations
+- **Environment**: Use `python-dotenv` for configuration
+- **Logging**: Use `print()` for debugging (consider structured logging)
+- **Security**: Validate all user inputs and environment variables
+
+## Project Structure
+
+- `app.py`: Main Flask application
+- `templates/`: HTML templates for web interface
+- `data/`: Persistent data directory (mounted in Docker)
+- `Dockerfile`: Container configuration
+- `docker-compose.yml`: Orchestration setup
+
+## Key Components
+
+- Flask web framework with Gunicorn for production
+- BeautifulSoup for XML parsing
+- Requests for HTTP operations
+- Pandas/matplotlib for statistics visualization
20
Dockerfile
Normal file
@@ -0,0 +1,20 @@
+# Use an official Python runtime as a parent image
+FROM python:3.9-slim
+
+# Set the working directory in the container
+WORKDIR /opt/app
+
+# Copy the current directory contents into the container
+COPY . /opt/app/
+
+# Install system dependencies required for lxml and matplotlib
+RUN apt-get update && apt-get install -y libxml2-dev libxslt1-dev
+
+# Install Python packages
+RUN pip install --no-cache-dir Flask requests python-dotenv beautifulsoup4 gunicorn lxml pandas matplotlib
+
+# Make port 8182 available to the world outside this container
+EXPOSE 8182
+
+# Run Gunicorn
+CMD ["gunicorn", "--bind", "0.0.0.0:8182", "app:app"]
60
README.md
@@ -1,3 +1,59 @@
-# fedora-metalink-proxy
+# Fedora Metalink Proxy
 
-Proxy for the Fedora metalink mirror list. Contains a filter for the variables provided in the xml file.
+A Flask-based web application that acts as a proxy for Fedora Metalink files, allowing filtering based on various criteria such as country, protocol, and preference.
+
+## Features
+
+- Proxy for Fedora Metalink files with configurable filtering options
+- Web interface for configuring filter settings
+- Basic statistics and logging for monitoring request patterns
+
+## Prerequisites
+
+- Docker
+- Docker Compose
+
+## Setup
+
+1. Clone the repository with:
+`git clone <repository-url>`
+Then navigate to the project directory:
+`cd <repository-directory>`
+
+2. Configure environment variables by copying the example file:
+`cp env.example .env`
+Then edit the `.env` file to set your desired configuration options.
+
+3. Build and run the Docker container with:
+`docker-compose up --build`
+This command will build the Docker image and start the container.
+
+## Usage
+
+### Access the Application
+Open a web browser and navigate to `http://localhost:8182` to access the application.
+
+### Configure Filter Settings
+Navigate to `/dash/config` to configure the filter settings for the Metalink proxy.
+
+### View Statistics
+Navigate to `/dash/stats` to view basic statistics on request patterns.
+
+### Access Metalink Files
+Use the `/metalink` endpoint with query parameters to fetch and filter Metalink files.
+
+Example URL format:
+`http://localhost:8182/metalink?repo=fedora-42&arch=x86_64`
+
+## Configuration
+
+The application uses environment variables for configuration. Set these variables in the `.env` file:
+
+- `EXCLUDED_COUNTRIES`: List of country codes to exclude
+- `PREFERRED_PROTOCOLS`: List of preferred protocols
+- `PREFERRED_TYPES`: List of preferred types
+- `MIN_PREFERENCE`: Minimum preference value for filtering
+
+## License
+
+MIT
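The example URL in the README's Usage section can also be built programmatically. This is a minimal sketch, assuming the proxy listens on `http://localhost:8182` as the README states; `build_metalink_url` is a hypothetical helper, not part of the repository. It URL-encodes `repo` and `arch` so that unusual values cannot break the query string.

```python
from urllib.parse import urlencode

# Assumption: the proxy is reachable at the address given in the README.
PROXY_BASE = "http://localhost:8182"


def build_metalink_url(repo: str, arch: str, base: str = PROXY_BASE) -> str:
    """Return the /metalink URL for a repo/arch pair (hypothetical helper)."""
    if not repo or not arch:
        # Mirrors the 400 response app.py returns when a parameter is missing.
        raise ValueError("both 'repo' and 'arch' are required")
    query = urlencode({"repo": repo, "arch": arch})
    return f"{base}/metalink?{query}"


if __name__ == "__main__":
    print(build_metalink_url("fedora-42", "x86_64"))
```

The resulting string can be passed to any HTTP client or used as a `metalink=` value in a repository definition.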
155
app.py
Normal file
@@ -0,0 +1,155 @@
+import os
+import requests
+import time
+from collections import defaultdict
+from flask import Flask, render_template, request, redirect, url_for, Response
+from dotenv import load_dotenv
+from bs4 import BeautifulSoup
+import pandas as pd
+import matplotlib.pyplot as plt
+from io import BytesIO
+import base64
+
+# Load environment variables from .env file
+load_dotenv()
+
+app = Flask(__name__, template_folder='templates')
+
+# Data structure to store request statistics
+request_stats = defaultdict(lambda: defaultdict(int))
+
+def get_config():
+    """Retrieve configuration from environment variables."""
+    return {
+        "excluded_countries": os.getenv('EXCLUDED_COUNTRIES', '[]'),
+        "preferred_protocols": os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'),
+        "preferred_types": os.getenv('PREFERRED_TYPES', '["https", "http"]'),
+        "min_preference": os.getenv('MIN_PREFERENCE', '0')
+    }
+
+def log_request(repo, arch, protocol):
+    """Log request statistics."""
+    timestamp = int(time.time())
+    hour = time.strftime('%Y-%m-%d %H', time.localtime(timestamp))
+    day = time.strftime('%Y-%m-%d', time.localtime(timestamp))
+    week = time.strftime('%Y-%U', time.localtime(timestamp))
+    month = time.strftime('%Y-%m', time.localtime(timestamp))
+
+    request_stats[hour][(repo, arch, protocol)] += 1
+    request_stats[day][(repo, arch, protocol)] += 1
+    request_stats[week][(repo, arch, protocol)] += 1
+    request_stats[month][(repo, arch, protocol)] += 1
+
+@app.route('/metalink')
+def get_metalink():
+    # Get query parameters from the request
+    repo = request.args.get('repo')
+    arch = request.args.get('arch')
+
+    # Check if required parameters are provided
+    if not repo or not arch:
+        return "Error: Missing 'repo' or 'arch' parameter", 400
+
+    # Log the request
+    log_request(repo, arch, 'http')  # Assuming HTTP for simplicity
+
+    # Construct the metalink URL using the provided parameters
+    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'
+
+    # Fetch the metalink file from the constructed URL
+    response = requests.get(metalink_url)
+    metalink_content = response.content
+
+    # Get the filtering criteria from environment variables
+    config = get_config()
+    excluded_countries = eval(config['excluded_countries'])
+    preferred_protocols = eval(config['preferred_protocols'])
+    preferred_types = eval(config['preferred_types'])
+    min_preference = int(config['min_preference'])
+
+    # Filter out the URLs based on the criteria
+    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)
+
+    # Return the filtered content as a response
+    return Response(filtered_content, mimetype='application/xml')
+
+def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
+    # Parse the XML content
+    soup = BeautifulSoup(content, 'xml')
+
+    # Find all URL elements
+    urls = soup.find_all('url')
+
+    # Iterate over URLs and remove those that do not meet the criteria
+    for url in urls:
+        location = url.get('location')
+        protocol = url.get('protocol')
+        type_ = url.get('type')
+        preference = int(url.get('preference', 0))
+
+        if (location in excluded_countries or
+            protocol not in preferred_protocols or
+            type_ not in preferred_types or
+            preference < min_preference):
+            url.decompose()
+
+    # Convert the BeautifulSoup object back to a string and clean up
+    filtered_content = str(soup)
+    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
+
+    return filtered_content
+
+@app.route('/dash')
+@app.route('/dash/stats')
+def stats():
+    # Generate some statistics
+    stats_data = {
+        'hourly': dict(request_stats.get(time.strftime('%Y-%m-%d %H', time.localtime()), {})),
+        'daily': dict(request_stats.get(time.strftime('%Y-%m-%d', time.localtime()), {})),
+        'weekly': dict(request_stats.get(time.strftime('%Y-%U', time.localtime()), {})),
+        'monthly': dict(request_stats.get(time.strftime('%Y-%m', time.localtime()), {}))
+    }
+
+    # Convert stats data to a DataFrame for easier manipulation
+    df = pd.DataFrame.from_dict(stats_data, orient='index').fillna(0)
+
+    if df.empty:
+        return render_template('stats.html', plot_url=None, message="No data available.")
+
+    # Generate a simple plot
+    plt.figure(figsize=(10, 6))
+    df.sum(axis=1).plot(kind='bar')
+    plt.title('Request Statistics')
+    plt.ylabel('Number of Requests')
+
+    # Save plot to a BytesIO object
+    img = BytesIO()
+    plt.savefig(img, format='png')
+    plt.close()
+    img.seek(0)
+
+    # Encode the plot to base64 for embedding in HTML
+    plot_url = base64.b64encode(img.getvalue()).decode('utf8')
+
+    return render_template('stats.html', plot_url=plot_url)
+
+@app.route('/dash/config', methods=['GET', 'POST'])
+def config():
+    if request.method == 'POST':
+        # Update environment variables with form data
+        os.environ['EXCLUDED_COUNTRIES'] = request.form.get('excluded_countries', '')
+        os.environ['PREFERRED_PROTOCOLS'] = request.form.get('preferred_protocols', '')
+        os.environ['PREFERRED_TYPES'] = request.form.get('preferred_types', '')
+        os.environ['MIN_PREFERENCE'] = request.form.get('min_preference', '')
+
+        return redirect(url_for('config'))
+
+    config = get_config()
+    return render_template('config.html', **config)
+
+@app.route('/dash/logs')
+def logs():
+    return render_template('logs.html')
+
+if __name__ == '__main__':
+    app.run(host='0.0.0.0', port=8182)
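One detail of `log_request` is worth checking: the weekly key (`%Y-%U`) and the monthly key (`%Y-%m`) have the same shape, so weeks 01 to 12 of a year produce the same string as months 01 to 12, and both counters then land in the same `request_stats` entry. The sketch below reproduces the key formats with the standard library and shows one possible fix, prefixing each key with its period name. `bucket_keys` and its `prefixed` flag are illustrative names, not part of the commit, and the function takes an explicit `datetime` instead of reading the local clock so the result is reproducible.

```python
from datetime import datetime

# Same strftime formats as log_request() in app.py.
FORMATS = {
    "hourly": "%Y-%m-%d %H",
    "daily": "%Y-%m-%d",
    "weekly": "%Y-%U",
    "monthly": "%Y-%m",
}


def bucket_keys(moment: datetime, prefixed: bool = False) -> dict:
    """Return the statistics bucket key for each period (illustrative helper)."""
    keys = {name: moment.strftime(fmt) for name, fmt in FORMATS.items()}
    if prefixed:
        # Assumed fix: namespace each key so weeks and months cannot collide.
        keys = {name: f"{name}:{key}" for name, key in keys.items()}
    return keys


if __name__ == "__main__":
    sample = datetime(2024, 1, 10, 15, 0)
    plain = bucket_keys(sample)
    # For this date the weekly and monthly keys are the same string.
    print(plain["weekly"] == plain["monthly"])
    print(bucket_keys(sample, prefixed=True))
```

Keeping one dictionary per period would work equally well; the prefix is just the smallest change to the existing data structure.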
13
docker-compose.yml
Normal file
@@ -0,0 +1,13 @@
+version: '3.8'
+
+services:
+  metaproxy:
+    build: .
+    ports:
+      - "8182:8182"
+    volumes:
+      - ./data:/opt/app/data
+    environment:
+      - FLASK_ENV=production
+    restart: unless-stopped
+
4
env.example
Normal file
@@ -0,0 +1,4 @@
+EXCLUDED_COUNTRIES=["RU", "CN"]
+PREFERRED_PROTOCOLS=["https", "http", "rsync"]
+PREFERRED_TYPES=["https", "http"]
+MIN_PREFERENCE=50
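To make the effect of these example values concrete, here is a small standard-library sketch of the keep/drop rule that `filter_urls` applies to each `<url>` element. The `keep_mirror` helper and the sample attribute dictionaries are illustrative assumptions; only the four settings come from `env.example`.

```python
import json

# Values copied from env.example; each list is valid JSON.
SETTINGS = {
    "excluded_countries": json.loads('["RU", "CN"]'),
    "preferred_protocols": json.loads('["https", "http", "rsync"]'),
    "preferred_types": json.loads('["https", "http"]'),
    "min_preference": 50,
}


def keep_mirror(attrs: dict, settings: dict = SETTINGS) -> bool:
    """Return True if a <url> element with these attributes survives the filter."""
    preference = int(attrs.get("preference", 0))
    return not (
        attrs.get("location") in settings["excluded_countries"]
        or attrs.get("protocol") not in settings["preferred_protocols"]
        or attrs.get("type") not in settings["preferred_types"]
        or preference < settings["min_preference"]
    )


if __name__ == "__main__":
    # Made-up mirror entries, for illustration only.
    mirrors = [
        {"location": "DE", "protocol": "https", "type": "https", "preference": "100"},
        {"location": "CN", "protocol": "https", "type": "https", "preference": "100"},
        {"location": "US", "protocol": "rsync", "type": "rsync", "preference": "90"},
        {"location": "FR", "protocol": "http", "type": "http", "preference": "10"},
    ]
    for mirror in mirrors:
        print(mirror["location"], keep_mirror(mirror))
```

With these values an `rsync` mirror passes the protocol check but is still dropped, because `rsync` is not listed in `PREFERRED_TYPES`.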
68
metaproxy.py
@@ -1,68 +0,0 @@
-import os
-import requests
-from flask import Flask, Response, request
-from dotenv import load_dotenv
-from bs4 import BeautifulSoup
-
-# Load environment variables from .env file
-load_dotenv()
-
-app = Flask(__name__)
-
-@app.route('/metalink')
-def get_metalink():
-    # Get query parameters from the request
-    repo = request.args.get('repo')
-    arch = request.args.get('arch')
-
-    # Check if required parameters are provided
-    if not repo or not arch:
-        return "Error: Missing 'repo' or 'arch' parameter", 400
-
-    # Construct the metalink URL using the provided parameters
-    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'
-
-    # Fetch the metalink file from the constructed URL
-    response = requests.get(metalink_url)
-    metalink_content = response.content
-
-    # Parse the .env file to get the filtering criteria
-    excluded_countries = eval(os.getenv('EXCLUDED_COUNTRIES', '[]'))
-    preferred_protocols = eval(os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'))
-    preferred_types = eval(os.getenv('PREFERRED_TYPES', '["https", "http"]'))
-    min_preference = int(os.getenv('MIN_PREFERENCE', '0'))
-
-    # Filter out the URLs based on the criteria
-    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)
-
-    # Return the filtered content as a response
-    return Response(filtered_content, mimetype='application/xml')
-
-def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
-    # Parse the XML content
-    soup = BeautifulSoup(content, 'xml')
-
-    # Find all URL elements
-    urls = soup.find_all('url')
-
-    # Iterate over URLs and remove those that do not meet the criteria
-    for url in urls:
-        location = url.get('location')
-        protocol = url.get('protocol')
-        type_ = url.get('type')
-        preference = int(url.get('preference', 0))
-
-        if (location in excluded_countries or
-            protocol not in preferred_protocols or
-            type_ not in preferred_types or
-            preference < min_preference):
-            url.decompose()
-
-    # Convert the BeautifulSoup object back to a string and clean up
-    filtered_content = str(soup)
-    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
-
-    return filtered_content
-
-if __name__ == '__main__':
-    app.run(debug=True)
31
templates/base.html
Normal file
@@ -0,0 +1,31 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>MetaProxy Dashboard</title>
+    <link href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css" rel="stylesheet">
+</head>
+<body>
+    <div class="container">
+        <ul class="nav nav-tabs">
+            <li class="nav-item">
+                <a class="nav-link active" href="/dash/config">Configuration</a>
+            </li>
+            <li class="nav-item">
+                <a class="nav-link" href="/dash/stats">Statistics</a>
+            </li>
+            <li class="nav-item">
+                <a class="nav-link" href="/dash/logs">Error Logs</a>
+            </li>
+        </ul>
+        <div class="tab-content">
+            {% block content %}
+            {% endblock %}
+        </div>
+    </div>
+    <script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
+    <script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.5.4/dist/umd/popper.min.js"></script>
+    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
+</body>
+</html>
26
templates/config.html
Normal file
@@ -0,0 +1,26 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Configuration</h2>
+    <form method="POST" action="/dash/config">
+        <div class="form-group">
+            <label for="excluded_countries">Excluded Countries</label>
+            <input type="text" class="form-control" id="excluded_countries" name="excluded_countries" value="{{ excluded_countries }}">
+        </div>
+        <div class="form-group">
+            <label for="preferred_protocols">Preferred Protocols</label>
+            <input type="text" class="form-control" id="preferred_protocols" name="preferred_protocols" value="{{ preferred_protocols }}">
+        </div>
+        <div class="form-group">
+            <label for="preferred_types">Preferred Types</label>
+            <input type="text" class="form-control" id="preferred_types" name="preferred_types" value="{{ preferred_types }}">
+        </div>
+        <div class="form-group">
+            <label for="min_preference">Minimum Preference</label>
+            <input type="text" class="form-control" id="min_preference" name="min_preference" value="{{ min_preference }}">
+        </div>
+        <button type="submit" class="btn btn-primary">Save</button>
+    </form>
+</div>
+{% endblock %}
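The configuration form posts plain text, which `app.py` stores in `os.environ` and later passes to `eval()`. An empty or malformed field would then raise on the next `/metalink` request, and `eval()` executes whatever expression is submitted. A more defensive approach, sketched here under the assumption that the list settings are written as JSON arrays (as they are in `env.example`), is to parse with `json.loads` and fall back to a default. `parse_list_setting` and `parse_int_setting` are hypothetical helpers, not part of the commit.

```python
import json


def parse_list_setting(raw, default):
    """Parse a JSON array of strings, returning a copy of `default` on bad input."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):
        # json.JSONDecodeError subclasses ValueError; None raises TypeError.
        return list(default)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        return list(default)
    return value


def parse_int_setting(raw, default=0):
    """Parse an integer setting, returning `default` on bad input."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return default


if __name__ == "__main__":
    print(parse_list_setting('["RU", "CN"]', []))
    print(parse_list_setting("", ["https", "http"]))
    print(parse_list_setting("__import__('os').getcwd()", []))
    print(parse_int_setting("50"))
    print(parse_int_setting(""))
```

The same helpers could be applied both when reading the `.env` values and when accepting form input, so that invalid settings are rejected before they reach the filter.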
8
templates/logs.html
Normal file
@@ -0,0 +1,8 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Error Logs</h2>
+    <p>Error logs content goes here.</p>
+</div>
+{% endblock %}
10
templates/stats.html
Normal file
@@ -0,0 +1,10 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Statistics</h2>
+    <div>
+        <img src="data:image/png;base64,{{ plot_url }}" alt="Request Statistics" />
+    </div>
+</div>
+{% endblock %}