Compare commits

11 Commits

Author SHA1 Message Date
ra_ma
b01c162db9 crush cli test 2025-10-16 19:13:53 +01:00
ra_ma
b87f733969 read 2025-06-20 20:24:49 +01:00
ra_ma
cbaa07676c statsu maybe 2025-06-20 19:49:05 +01:00
ra_ma
1ce128ff43 statsu 2025-06-20 19:44:27 +01:00
ra_ma
3973e7a803 import fix 2025-06-20 19:30:06 +01:00
ra_ma
4e9b920334 ui more more 2025-06-20 19:25:56 +01:00
ra_ma
d588116f6b ui more 2025-06-20 19:20:36 +01:00
ra_ma
cdac727a28 ui more 2025-06-20 19:10:08 +01:00
ra_ma
222e8c99af start webui 2025-06-20 18:59:25 +01:00
ra_ma
5da6f7e95c plus compose 2025-06-20 18:40:01 +01:00
ra_ma
c7a96e29c1 plus docker 2025-06-20 18:36:30 +01:00
12 changed files with 362 additions and 70 deletions

3
.gitignore vendored

@@ -1,3 +1,6 @@
# ---> Aii
.crush
# ---> Python
# Byte-compiled / optimized / DLL files
__pycache__/

34
CRUSH.md Normal file

@@ -0,0 +1,34 @@
# Fedora Metalink Proxy Codebase Guide
## Build/Lint/Test Commands
- **Build**: `docker-compose up --build` - Builds and runs the Docker container
- **Run**: `docker-compose up` - Starts the application
- **Test**: `python -m unittest discover` - Runs all tests (no test files found)
- **Lint**: `pylint app.py` - Lints the main application file
## Code Style Guidelines
- **Imports**: Grouped by standard library, third-party, local imports
- **Formatting**: 4-space indentation, spaces around operators
- **Types**: Use type hints for function signatures
- **Naming**: snake_case for variables/functions, CamelCase for classes
- **Error Handling**: Use try/except blocks for network operations
- **Environment**: Use `python-dotenv` for configuration
- **Logging**: Use `print()` for debugging (consider structured logging)
- **Security**: Validate all user inputs and environment variables
## Project Structure
- `app.py`: Main Flask application
- `templates/`: HTML templates for web interface
- `data/`: Persistent data directory (mounted in Docker)
- `Dockerfile`: Container configuration
- `docker-compose.yml`: Orchestration setup
## Key Components
- Flask web framework with Gunicorn for production
- BeautifulSoup for XML parsing
- Requests for HTTP operations
- Pandas/matplotlib for statistics visualization

20
Dockerfile Normal file

@@ -0,0 +1,20 @@
# Use an official Python runtime as a parent image
FROM python:3.9-slim
# Set the working directory in the container
WORKDIR /opt/app
# Copy the current directory contents into the container
COPY . /opt/app/
# Install system dependencies required for lxml and matplotlib
RUN apt-get update && apt-get install -y libxml2-dev libxslt1-dev
# Install Python packages
RUN pip install --no-cache-dir Flask requests python-dotenv beautifulsoup4 gunicorn lxml pandas matplotlib
# Make port 8182 available to the world outside this container
EXPOSE 8182
# Run Gunicorn
CMD ["gunicorn", "--bind", "0.0.0.0:8182", "app:app"]


@@ -1,3 +1,59 @@
# fedora-metalink-proxy
# Fedora Metalink Proxy
Proxy for the Fedora metalink mirror list. Contains a filter for the variables provided in the xml file.
A Flask-based web application that acts as a proxy for Fedora Metalink files, allowing filtering based on various criteria such as country, protocol, and preference.
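As a rough illustration of the filtering described above, the following sketch drops mirror entries by country, protocol, and preference. It uses only the standard library `xml.etree.ElementTree` rather than the project's BeautifulSoup-based code. The sample XML, the mirror URLs, and the names `keep` and `filter_resources` are hypothetical; real Metalink documents are namespaced and nest their `url` elements more deeply.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up mirror list (not a real Metalink document).
SAMPLE = """<resources>
  <url protocol="https" location="DE" preference="100">https://mirror.example/de/repomd.xml</url>
  <url protocol="http" location="RU" preference="90">http://mirror.example/ru/repomd.xml</url>
  <url protocol="rsync" location="US" preference="80">rsync://mirror.example/us/repomd.xml</url>
  <url protocol="https" location="FR" preference="10">https://mirror.example/fr/repomd.xml</url>
</resources>"""


def keep(url, excluded, protocols, min_pref):
    """Return True if a <url> element passes all three filters."""
    return (
        url.get("location") not in excluded
        and url.get("protocol") in protocols
        and int(url.get("preference", 0)) >= min_pref
    )


def filter_resources(xml_text, excluded, protocols, min_pref):
    """Parse the XML and remove every <url> child that fails a filter."""
    root = ET.fromstring(xml_text)
    # Copy the list first so removal does not disturb iteration.
    for url in list(root.findall("url")):
        if not keep(url, excluded, protocols, min_pref):
            root.remove(url)
    return root


kept = filter_resources(SAMPLE, {"RU"}, {"https", "http"}, 50)
locations = [u.get("location") for u in kept.findall("url")]
print(locations)
```

With these sample settings only the German mirror survives: the others are excluded by country, by protocol, or by a preference below the minimum.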
## Features
- Proxy for Fedora Metalink files with configurable filtering options
- Web interface for configuring filter settings
- Basic statistics and logging for monitoring request patterns
## Prerequisites
- Docker
- Docker Compose
## Setup
1. Clone the repository with:
`git clone <repository-url>`
Then navigate to the project directory:
`cd <repository-directory>`
2. Configure environment variables by copying the example file:
`cp env.example .env`
Then edit the `.env` file to set your desired configuration options.
3. Build and run the Docker container with:
`docker-compose up --build`
This command will build the Docker image and start the container.
## Usage
### Access the Application
Open a web browser and navigate to `http://localhost:8182` to access the application.
### Configure Filter Settings
Navigate to `/dash/config` to configure the filter settings for the Metalink proxy.
### View Statistics
Navigate to `/dash/stats` to view basic statistics on request patterns.
### Access Metalink Files
Use the `/metalink` endpoint with query parameters to fetch and filter Metalink files.
Example URL format:
`http://localhost:8182/metalink?repo=fedora-42&arch=x86_64`
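A client could build that URL programmatically instead of concatenating strings by hand. The sketch below is one possible way to do it with the standard library; the helper name `build_metalink_url` is an assumption for illustration and not part of the project.

```python
from urllib.parse import urlencode


def build_metalink_url(base: str, repo: str, arch: str) -> str:
    """Return the proxy's /metalink URL with repo and arch URL-encoded."""
    query = urlencode({"repo": repo, "arch": arch})
    return f"{base.rstrip('/')}/metalink?{query}"


url = build_metalink_url("http://localhost:8182", "fedora-42", "x86_64")
print(url)
```

Using `urlencode` keeps the query string valid even if a parameter value contains characters such as `&` or spaces.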
## Configuration
The application uses environment variables for configuration. Set these variables in the `.env` file:
- `EXCLUDED_COUNTRIES`: List of country codes to exclude
- `PREFERRED_PROTOCOLS`: List of preferred protocols
- `PREFERRED_TYPES`: List of preferred types
- `MIN_PREFERENCE`: Minimum preference value for filtering
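The list-valued settings are written as JSON-style arrays, for example `["RU", "CN"]`. A minimal sketch of reading such a value safely is shown below, assuming the values are valid JSON; the helper name `read_list_setting` is hypothetical and only illustrates the idea.

```python
import json
import os


def read_list_setting(name: str, default: str = "[]") -> list:
    """Read an environment variable and parse it as a JSON list."""
    raw = os.getenv(name, default)
    value = json.loads(raw)  # parses data only, never executes code
    if not isinstance(value, list):
        raise ValueError(f"{name} must be a JSON list, got: {raw!r}")
    return value


# Example values, set here only so the sketch runs on its own.
os.environ["EXCLUDED_COUNTRIES"] = '["RU", "CN"]'
os.environ["MIN_PREFERENCE"] = "50"

excluded = read_list_setting("EXCLUDED_COUNTRIES")
min_preference = int(os.getenv("MIN_PREFERENCE", "0"))
print(excluded, min_preference)
```

Parsing with `json.loads` rejects anything that is not plain data, which matters because these values can also be changed through the web interface.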
## License
MIT

155
app.py Normal file

@@ -0,0 +1,155 @@
import json
import os
import requests
import time
from collections import defaultdict
from flask import Flask, render_template, request, redirect, url_for, Response
from dotenv import load_dotenv
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
from io import BytesIO
import base64

# Load environment variables from .env file
load_dotenv()

app = Flask(__name__, template_folder='templates')

# Data structure to store request statistics
request_stats = defaultdict(lambda: defaultdict(int))


def get_config():
    """Retrieve configuration from environment variables."""
    return {
        "excluded_countries": os.getenv('EXCLUDED_COUNTRIES', '[]'),
        "preferred_protocols": os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'),
        "preferred_types": os.getenv('PREFERRED_TYPES', '["https", "http"]'),
        "min_preference": os.getenv('MIN_PREFERENCE', '0')
    }


def log_request(repo, arch, protocol):
    """Log request statistics."""
    timestamp = int(time.time())
    hour = time.strftime('%Y-%m-%d %H', time.localtime(timestamp))
    day = time.strftime('%Y-%m-%d', time.localtime(timestamp))
    week = time.strftime('%Y-%U', time.localtime(timestamp))
    month = time.strftime('%Y-%m', time.localtime(timestamp))
    request_stats[hour][(repo, arch, protocol)] += 1
    request_stats[day][(repo, arch, protocol)] += 1
    request_stats[week][(repo, arch, protocol)] += 1
    request_stats[month][(repo, arch, protocol)] += 1


@app.route('/metalink')
def get_metalink():
    # Get query parameters from the request
    repo = request.args.get('repo')
    arch = request.args.get('arch')

    # Check if required parameters are provided
    if not repo or not arch:
        return "Error: Missing 'repo' or 'arch' parameter", 400

    # Log the request
    log_request(repo, arch, 'http')  # Assuming HTTP for simplicity

    # Construct the metalink URL using the provided parameters
    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'

    # Fetch the metalink file from the constructed URL.
    # A timeout keeps a stalled upstream from blocking the worker indefinitely.
    response = requests.get(metalink_url, timeout=10)
    metalink_content = response.content

    # Get the filtering criteria from environment variables.
    # The list values are JSON arrays; json.loads parses them without
    # executing arbitrary code, which eval() would do.
    config = get_config()
    excluded_countries = json.loads(config['excluded_countries'])
    preferred_protocols = json.loads(config['preferred_protocols'])
    preferred_types = json.loads(config['preferred_types'])
    min_preference = int(config['min_preference'])

    # Filter out the URLs based on the criteria
    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)

    # Return the filtered content as a response
    return Response(filtered_content, mimetype='application/xml')


def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
    # Parse the XML content
    soup = BeautifulSoup(content, 'xml')

    # Find all URL elements
    urls = soup.find_all('url')

    # Iterate over URLs and remove those that do not meet the criteria
    for url in urls:
        location = url.get('location')
        protocol = url.get('protocol')
        type_ = url.get('type')
        preference = int(url.get('preference', 0))
        if (location in excluded_countries or
                protocol not in preferred_protocols or
                type_ not in preferred_types or
                preference < min_preference):
            url.decompose()

    # Convert the BeautifulSoup object back to a string and clean up
    filtered_content = str(soup)
    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
    return filtered_content


@app.route('/dash')
@app.route('/dash/stats')
def stats():
    # Generate some statistics
    stats_data = {
        'hourly': dict(request_stats.get(time.strftime('%Y-%m-%d %H', time.localtime()), {})),
        'daily': dict(request_stats.get(time.strftime('%Y-%m-%d', time.localtime()), {})),
        'weekly': dict(request_stats.get(time.strftime('%Y-%U', time.localtime()), {})),
        'monthly': dict(request_stats.get(time.strftime('%Y-%m', time.localtime()), {}))
    }

    # Convert stats data to a DataFrame for easier manipulation
    df = pd.DataFrame.from_dict(stats_data, orient='index').fillna(0)
    if df.empty:
        return render_template('stats.html', plot_url=None, message="No data available.")

    # Generate a simple plot
    plt.figure(figsize=(10, 6))
    df.sum(axis=1).plot(kind='bar')
    plt.title('Request Statistics')
    plt.ylabel('Number of Requests')

    # Save plot to a BytesIO object
    img = BytesIO()
    plt.savefig(img, format='png')
    plt.close()
    img.seek(0)

    # Encode the plot to base64 for embedding in HTML
    plot_url = base64.b64encode(img.getvalue()).decode('utf8')
    return render_template('stats.html', plot_url=plot_url)


@app.route('/dash/config', methods=['GET', 'POST'])
def config():
    if request.method == 'POST':
        # Update environment variables with form data
        os.environ['EXCLUDED_COUNTRIES'] = request.form.get('excluded_countries', '')
        os.environ['PREFERRED_PROTOCOLS'] = request.form.get('preferred_protocols', '')
        os.environ['PREFERRED_TYPES'] = request.form.get('preferred_types', '')
        os.environ['MIN_PREFERENCE'] = request.form.get('min_preference', '')
        return redirect(url_for('config'))
    config = get_config()
    return render_template('config.html', **config)


@app.route('/dash/logs')
def logs():
    return render_template('logs.html')


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8182)

13
docker-compose.yml Normal file

@@ -0,0 +1,13 @@
version: '3.8'

services:
  metaproxy:
    build: .
    ports:
      - "8182:8182"
    volumes:
      - ./data:/opt/app/data
    environment:
      - FLASK_ENV=production
    restart: unless-stopped

4
env.example Normal file

@@ -0,0 +1,4 @@
EXCLUDED_COUNTRIES=["RU", "CN"]
PREFERRED_PROTOCOLS=["https", "http", "rsync"]
PREFERRED_TYPES=["https", "http"]
MIN_PREFERENCE=50


@@ -1,68 +0,0 @@
import os
import requests
from flask import Flask, Response, request
from dotenv import load_dotenv
from bs4 import BeautifulSoup

# Load environment variables from .env file
load_dotenv()

app = Flask(__name__)


@app.route('/metalink')
def get_metalink():
    # Get query parameters from the request
    repo = request.args.get('repo')
    arch = request.args.get('arch')

    # Check if required parameters are provided
    if not repo or not arch:
        return "Error: Missing 'repo' or 'arch' parameter", 400

    # Construct the metalink URL using the provided parameters
    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'

    # Fetch the metalink file from the constructed URL
    response = requests.get(metalink_url)
    metalink_content = response.content

    # Parse the .env file to get the filtering criteria
    excluded_countries = eval(os.getenv('EXCLUDED_COUNTRIES', '[]'))
    preferred_protocols = eval(os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'))
    preferred_types = eval(os.getenv('PREFERRED_TYPES', '["https", "http"]'))
    min_preference = int(os.getenv('MIN_PREFERENCE', '0'))

    # Filter out the URLs based on the criteria
    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)

    # Return the filtered content as a response
    return Response(filtered_content, mimetype='application/xml')


def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
    # Parse the XML content
    soup = BeautifulSoup(content, 'xml')

    # Find all URL elements
    urls = soup.find_all('url')

    # Iterate over URLs and remove those that do not meet the criteria
    for url in urls:
        location = url.get('location')
        protocol = url.get('protocol')
        type_ = url.get('type')
        preference = int(url.get('preference', 0))
        if (location in excluded_countries or
                protocol not in preferred_protocols or
                type_ not in preferred_types or
                preference < min_preference):
            url.decompose()

    # Convert the BeautifulSoup object back to a string and clean up
    filtered_content = str(soup)
    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
    return filtered_content


if __name__ == '__main__':
    app.run(debug=True)

31
templates/base.html Normal file

@@ -0,0 +1,31 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>MetaProxy Dashboard</title>
<link href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
<div class="container">
<ul class="nav nav-tabs">
<li class="nav-item">
<a class="nav-link active" href="/dash/config">Configuration</a>
</li>
<li class="nav-item">
<a class="nav-link" href="/dash/stats">Statistics</a>
</li>
<li class="nav-item">
<a class="nav-link" href="/dash/logs">Error Logs</a>
</li>
</ul>
<div class="tab-content">
{% block content %}
{% endblock %}
</div>
</div>
<script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.5.4/dist/umd/popper.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
</body>
</html>

26
templates/config.html Normal file

@@ -0,0 +1,26 @@
{% extends "base.html" %}
{% block content %}
<div class="tab-pane fade show active">
<h2>Configuration</h2>
<form method="POST" action="/dash/config">
<div class="form-group">
<label for="excluded_countries">Excluded Countries</label>
<input type="text" class="form-control" id="excluded_countries" name="excluded_countries" value="{{ excluded_countries }}">
</div>
<div class="form-group">
<label for="preferred_protocols">Preferred Protocols</label>
<input type="text" class="form-control" id="preferred_protocols" name="preferred_protocols" value="{{ preferred_protocols }}">
</div>
<div class="form-group">
<label for="preferred_types">Preferred Types</label>
<input type="text" class="form-control" id="preferred_types" name="preferred_types" value="{{ preferred_types }}">
</div>
<div class="form-group">
<label for="min_preference">Minimum Preference</label>
<input type="text" class="form-control" id="min_preference" name="min_preference" value="{{ min_preference }}">
</div>
<button type="submit" class="btn btn-primary">Save</button>
</form>
</div>
{% endblock %}

8
templates/logs.html Normal file

@@ -0,0 +1,8 @@
{% extends "base.html" %}
{% block content %}
<div class="tab-pane fade show active">
<h2>Error Logs</h2>
<p>Error logs content goes here.</p>
</div>
{% endblock %}

10
templates/stats.html Normal file

@@ -0,0 +1,10 @@
{% extends "base.html" %}
{% block content %}
<div class="tab-pane fade show active">
<h2>Statistics</h2>
<div>
{% if plot_url %}
<img src="data:image/png;base64,{{ plot_url }}" alt="Request Statistics" />
{% else %}
<p>{{ message }}</p>
{% endif %}
</div>
</div>
{% endblock %}