Compare commits
11 Commits
1e286536e3...main
| Author | SHA1 | Date | |
|---|---|---|---|
| | b01c162db9 | | |
| | b87f733969 | | |
| | cbaa07676c | | |
| | 1ce128ff43 | | |
| | 3973e7a803 | | |
| | 4e9b920334 | | |
| | d588116f6b | | |
| | cdac727a28 | | |
| | 222e8c99af | | |
| | 5da6f7e95c | | |
| | c7a96e29c1 | | |
.gitignore (vendored, 3 lines changed)
@@ -1,3 +1,6 @@
+# ---> Aii
+.crush
+
 # ---> Python
 # Byte-compiled / optimized / DLL files
 __pycache__/
CRUSH.md (normal file, 34 lines changed)
@@ -0,0 +1,34 @@
+# Fedora Metalink Proxy Codebase Guide
+
+## Build/Lint/Test Commands
+
+- **Build**: `docker-compose up --build` - Builds and runs the Docker container
+- **Run**: `docker-compose up` - Starts the application
+- **Test**: `python -m unittest discover` - Runs all tests (no test files found)
+- **Lint**: `pylint app.py` - Lints the main application file
+
+## Code Style Guidelines
+
+- **Imports**: Grouped by standard library, third-party, local imports
+- **Formatting**: 4-space indentation, spaces around operators
+- **Types**: Use type hints for function signatures
+- **Naming**: snake_case for variables/functions, CamelCase for classes
+- **Error Handling**: Use try/except blocks for network operations
+- **Environment**: Use `python-dotenv` for configuration
+- **Logging**: Use `print()` for debugging (consider structured logging)
+- **Security**: Validate all user inputs and environment variables
+
+## Project Structure
+
+- `app.py`: Main Flask application
+- `templates/`: HTML templates for web interface
+- `data/`: Persistent data directory (mounted in Docker)
+- `Dockerfile`: Container configuration
+- `docker-compose.yml`: Orchestration setup
+
+## Key Components
+
+- Flask web framework with Gunicorn for production
+- BeautifulSoup for XML parsing
+- Requests for HTTP operations
+- Pandas/matplotlib for statistics visualization
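The logging guideline in CRUSH.md mentions structured logging as something to consider. A minimal sketch of what that could look like with only the standard library follows. The names `JsonFormatter`, `build_logger`, and the `fields` convention for extra data are hypothetical and do not appear anywhere in the repository.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (one per line)."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"fields": {...}}
        # and those key/value pairs are merged into the JSON payload.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name="metaproxy", stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("metalink request",
             extra={"fields": {"repo": "fedora-42", "arch": "x86_64"}})
```

One JSON object per line keeps the output easy to grep and to feed into a log collector, which `print()` calls do not guarantee.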
Dockerfile (normal file, 20 lines changed)
@@ -0,0 +1,20 @@
+# Use an official Python runtime as a parent image
+FROM python:3.9-slim
+
+# Set the working directory in the container
+WORKDIR /opt/app
+
+# Copy the current directory contents into the container
+COPY . /opt/app/
+
+# Install system dependencies required for lxml and matplotlib
+RUN apt-get update && apt-get install -y libxml2-dev libxslt1-dev
+
+# Install Python packages
+RUN pip install --no-cache-dir Flask requests python-dotenv beautifulsoup4 gunicorn lxml pandas matplotlib
+
+# Make port 8182 available to the world outside this container
+EXPOSE 8182
+
+# Run Gunicorn
+CMD ["gunicorn", "--bind", "0.0.0.0:8182", "app:app"]
README.md (60 lines changed)
@@ -1,3 +1,59 @@
-# fedora-metalink-proxy
+# Fedora Metalink Proxy
 
-Proxy for the Fedora metalink mirror list. Contains a filter for the variables provided in the xml file.
+A Flask-based web application that acts as a proxy for Fedora Metalink files, allowing filtering based on various criteria such as country, protocol, and preference.
+
+## Features
+
+- Proxy for Fedora Metalink files with configurable filtering options
+- Web interface for configuring filter settings
+- Basic statistics and logging for monitoring request patterns
+
+## Prerequisites
+
+- Docker
+- Docker Compose
+
+## Setup
+
+1. Clone the repository with:
+   `git clone <repository-url>`
+   Then navigate to the project directory:
+   `cd <repository-directory>`
+
+2. Configure environment variables by copying the example file:
+   `cp env.example .env`
+   Then edit the `.env` file to set your desired configuration options.
+
+3. Build and run the Docker container with:
+   `docker-compose up --build`
+   This command will build the Docker image and start the container.
+
+## Usage
+
+### Access the Application
+Open a web browser and navigate to `http://localhost:8182` to access the application.
+
+### Configure Filter Settings
+Navigate to `/dash/config` to configure the filter settings for the Metalink proxy.
+
+### View Statistics
+Navigate to `/dash/stats` to view basic statistics on request patterns.
+
+### Access Metalink Files
+Use the `/metalink` endpoint with query parameters to fetch and filter Metalink files.
+
+Example URL format:
+`http://localhost:8182/metalink?repo=fedora-42&arch=x86_64`
+
+## Configuration
+
+The application uses environment variables for configuration. Set these variables in the `.env` file:
+
+- `EXCLUDED_COUNTRIES`: List of country codes to exclude
+- `PREFERRED_PROTOCOLS`: List of preferred protocols
+- `PREFERRED_TYPES`: List of preferred types
+- `MIN_PREFERENCE`: Minimum preference value for filtering
+
+## License
+
+MIT
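The README's example URL for the `/metalink` endpoint can also be built programmatically, so that `repo` and `arch` values are always URL-encoded. This is a minimal client-side sketch using only the standard library. The helper names `metalink_url` and `fetch_metalink` are hypothetical, and the base address assumes the proxy is running locally on port 8182 as described in the README.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def metalink_url(repo, arch, base="http://localhost:8182"):
    """Build the proxy URL for one repo/arch pair, URL-encoding both values."""
    query = urlencode({"repo": repo, "arch": arch})
    return f"{base}/metalink?{query}"


def fetch_metalink(repo, arch, base="http://localhost:8182", timeout=10):
    """Fetch the filtered metalink XML as bytes (requires a running proxy)."""
    with urlopen(metalink_url(repo, arch, base), timeout=timeout) as resp:
        return resp.read()


# prints http://localhost:8182/metalink?repo=fedora-42&arch=x86_64
print(metalink_url("fedora-42", "x86_64"))
```

`fetch_metalink` is not called here because it needs the container to be up. The explicit `timeout` keeps a client from hanging if the proxy or the upstream mirror list is slow.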
app.py (normal file, 155 lines changed)
@@ -0,0 +1,155 @@
+import os
+import requests
+import time
+from collections import defaultdict
+from flask import Flask, render_template, request, redirect, url_for, Response
+from dotenv import load_dotenv
+from bs4 import BeautifulSoup
+import pandas as pd
+import matplotlib.pyplot as plt
+from io import BytesIO
+import base64
+
+# Load environment variables from .env file
+load_dotenv()
+
+app = Flask(__name__, template_folder='templates')
+
+# Data structure to store request statistics
+request_stats = defaultdict(lambda: defaultdict(int))
+
+def get_config():
+    """Retrieve configuration from environment variables."""
+    return {
+        "excluded_countries": os.getenv('EXCLUDED_COUNTRIES', '[]'),
+        "preferred_protocols": os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'),
+        "preferred_types": os.getenv('PREFERRED_TYPES', '["https", "http"]'),
+        "min_preference": os.getenv('MIN_PREFERENCE', '0')
+    }
+
+def log_request(repo, arch, protocol):
+    """Log request statistics."""
+    timestamp = int(time.time())
+    hour = time.strftime('%Y-%m-%d %H', time.localtime(timestamp))
+    day = time.strftime('%Y-%m-%d', time.localtime(timestamp))
+    week = time.strftime('%Y-%U', time.localtime(timestamp))
+    month = time.strftime('%Y-%m', time.localtime(timestamp))
+
+    request_stats[hour][(repo, arch, protocol)] += 1
+    request_stats[day][(repo, arch, protocol)] += 1
+    request_stats[week][(repo, arch, protocol)] += 1
+    request_stats[month][(repo, arch, protocol)] += 1
+
+@app.route('/metalink')
+def get_metalink():
+    # Get query parameters from the request
+    repo = request.args.get('repo')
+    arch = request.args.get('arch')
+
+    # Check if required parameters are provided
+    if not repo or not arch:
+        return "Error: Missing 'repo' or 'arch' parameter", 400
+
+    # Log the request
+    log_request(repo, arch, 'http')  # Assuming HTTP for simplicity
+
+    # Construct the metalink URL using the provided parameters
+    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'
+
+    # Fetch the metalink file from the constructed URL
+    response = requests.get(metalink_url)
+    metalink_content = response.content
+
+    # Get the filtering criteria from environment variables
+    config = get_config()
+    excluded_countries = eval(config['excluded_countries'])
+    preferred_protocols = eval(config['preferred_protocols'])
+    preferred_types = eval(config['preferred_types'])
+    min_preference = int(config['min_preference'])
+
+    # Filter out the URLs based on the criteria
+    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)
+
+    # Return the filtered content as a response
+    return Response(filtered_content, mimetype='application/xml')
+
+def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
+    # Parse the XML content
+    soup = BeautifulSoup(content, 'xml')
+
+    # Find all URL elements
+    urls = soup.find_all('url')
+
+    # Iterate over URLs and remove those that do not meet the criteria
+    for url in urls:
+        location = url.get('location')
+        protocol = url.get('protocol')
+        type_ = url.get('type')
+        preference = int(url.get('preference', 0))
+
+        if (location in excluded_countries or
+                protocol not in preferred_protocols or
+                type_ not in preferred_types or
+                preference < min_preference):
+            url.decompose()
+
+    # Convert the BeautifulSoup object back to a string and clean up
+    filtered_content = str(soup)
+    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
+
+    return filtered_content
+
+@app.route('/dash')
+@app.route('/dash/stats')
+def stats():
+    # Generate some statistics
+    stats_data = {
+        'hourly': dict(request_stats.get(time.strftime('%Y-%m-%d %H', time.localtime()), {})),
+        'daily': dict(request_stats.get(time.strftime('%Y-%m-%d', time.localtime()), {})),
+        'weekly': dict(request_stats.get(time.strftime('%Y-%U', time.localtime()), {})),
+        'monthly': dict(request_stats.get(time.strftime('%Y-%m', time.localtime()), {}))
+    }
+
+    # Convert stats data to a DataFrame for easier manipulation
+    df = pd.DataFrame.from_dict(stats_data, orient='index').fillna(0)
+
+    if df.empty:
+        return render_template('stats.html', plot_url=None, message="No data available.")
+
+    # Generate a simple plot
+    plt.figure(figsize=(10, 6))
+    df.sum(axis=1).plot(kind='bar')
+    plt.title('Request Statistics')
+    plt.ylabel('Number of Requests')
+
+    # Save plot to a BytesIO object
+    img = BytesIO()
+    plt.savefig(img, format='png')
+    plt.close()
+    img.seek(0)
+
+    # Encode the plot to base64 for embedding in HTML
+    plot_url = base64.b64encode(img.getvalue()).decode('utf8')
+
+    return render_template('stats.html', plot_url=plot_url)
+
+@app.route('/dash/config', methods=['GET', 'POST'])
+def config():
+    if request.method == 'POST':
+        # Update environment variables with form data
+        os.environ['EXCLUDED_COUNTRIES'] = request.form.get('excluded_countries', '')
+        os.environ['PREFERRED_PROTOCOLS'] = request.form.get('preferred_protocols', '')
+        os.environ['PREFERRED_TYPES'] = request.form.get('preferred_types', '')
+        os.environ['MIN_PREFERENCE'] = request.form.get('min_preference', '')
+
+        return redirect(url_for('config'))
+
+    config = get_config()
+    return render_template('config.html', **config)
+
+@app.route('/dash/logs')
+def logs():
+    return render_template('logs.html')
+
+if __name__ == '__main__':
+    app.run(host='0.0.0.0', port=8182)
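The filtering rule applied by `filter_urls` in app.py (drop a mirror when its country is excluded, its protocol or type is not preferred, or its preference is below the minimum) can be tried in isolation. The sketch below is not the project's implementation, which uses BeautifulSoup. It swaps in the standard library's `xml.etree.ElementTree` so it runs without third-party packages. The function name `filter_mirrors` and the `SAMPLE` document are made up for illustration, and the sample omits the XML namespace that real metalink files carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metalink document (no namespace, three mirrors).
SAMPLE = """<metalink><files><file name="repomd.xml"><resources>
<url protocol="https" type="https" location="US" preference="100">https://a.example/repomd.xml</url>
<url protocol="http" type="http" location="RU" preference="90">http://b.example/repomd.xml</url>
<url protocol="rsync" type="rsync" location="DE" preference="80">rsync://c.example/repomd.xml</url>
</resources></file></files></metalink>"""


def filter_mirrors(xml_text, excluded_countries, preferred_protocols,
                   preferred_types, min_preference):
    """Return the XML with every non-matching <url> element removed."""
    root = ET.fromstring(xml_text)
    # ElementTree elements have no parent pointer, so visit each potential
    # parent and remove its rejected <url> children. Both loops iterate over
    # snapshots because the tree is modified along the way.
    for parent in list(root.iter()):
        for child in list(parent):
            # Strip a "{namespace}" prefix, if any, before comparing the tag.
            if child.tag.rsplit("}", 1)[-1] != "url":
                continue
            preference = int(child.get("preference", 0))
            if (child.get("location") in excluded_countries
                    or child.get("protocol") not in preferred_protocols
                    or child.get("type") not in preferred_types
                    or preference < min_preference):
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(filter_mirrors(SAMPLE, ["RU"], ["https", "http"], ["https", "http"], 50))
```

With these arguments only the `US` mirror remains: the `RU` entry is excluded by country and the `DE` entry by protocol. A function of this shape is also easy to cover with `python -m unittest discover`, since it needs neither Flask nor network access.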
docker-compose.yml (normal file, 13 lines changed)
@@ -0,0 +1,13 @@
+version: '3.8'
+
+services:
+  metaproxy:
+    build: .
+    ports:
+      - "8182:8182"
+    volumes:
+      - ./data:/opt/app/data
+    environment:
+      - FLASK_ENV=production
+    restart: unless-stopped
+
env.example (normal file, 4 lines changed)
@@ -0,0 +1,4 @@
+EXCLUDED_COUNTRIES=["RU", "CN"]
+PREFERRED_PROTOCOLS=["https", "http", "rsync"]
+PREFERRED_TYPES=["https", "http"]
+MIN_PREFERENCE=50
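The list values in env.example are written as JSON-style arrays, and app.py turns them into Python lists with `eval`. Since the same values can also be set through the `/dash/config` form, a parser that cannot execute code is worth considering. The following is a minimal sketch of one such approach using `json.loads`. The helpers `parse_list` and `parse_int` are hypothetical and not part of the repository.

```python
import json


def parse_list(raw, default):
    """Parse a JSON array of strings; return a copy of `default` on bad input."""
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):  # None, empty string, or invalid JSON
        return list(default)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        return list(default)
    return value


def parse_int(raw, default):
    """Parse an integer setting; return `default` on bad input."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return default


if __name__ == "__main__":
    print(parse_list('["RU", "CN"]', []))
    print(parse_int("50", 0))
```

Unlike `eval`, `json.loads` only accepts data literals, so a value such as `__import__("os")` is rejected and the default is used. An empty form field is handled the same way instead of raising an exception.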
metaproxy.py (68 lines changed)
@@ -1,68 +0,0 @@
-import os
-import requests
-from flask import Flask, Response, request
-from dotenv import load_dotenv
-from bs4 import BeautifulSoup
-
-# Load environment variables from .env file
-load_dotenv()
-
-app = Flask(__name__)
-
-@app.route('/metalink')
-def get_metalink():
-    # Get query parameters from the request
-    repo = request.args.get('repo')
-    arch = request.args.get('arch')
-
-    # Check if required parameters are provided
-    if not repo or not arch:
-        return "Error: Missing 'repo' or 'arch' parameter", 400
-
-    # Construct the metalink URL using the provided parameters
-    metalink_url = f'https://mirrors.fedoraproject.org/metalink?repo={repo}&arch={arch}'
-
-    # Fetch the metalink file from the constructed URL
-    response = requests.get(metalink_url)
-    metalink_content = response.content
-
-    # Parse the .env file to get the filtering criteria
-    excluded_countries = eval(os.getenv('EXCLUDED_COUNTRIES', '[]'))
-    preferred_protocols = eval(os.getenv('PREFERRED_PROTOCOLS', '["https", "http"]'))
-    preferred_types = eval(os.getenv('PREFERRED_TYPES', '["https", "http"]'))
-    min_preference = int(os.getenv('MIN_PREFERENCE', '0'))
-
-    # Filter out the URLs based on the criteria
-    filtered_content = filter_urls(metalink_content, excluded_countries, preferred_protocols, preferred_types, min_preference)
-
-    # Return the filtered content as a response
-    return Response(filtered_content, mimetype='application/xml')
-
-def filter_urls(content, excluded_countries, preferred_protocols, preferred_types, min_preference):
-    # Parse the XML content
-    soup = BeautifulSoup(content, 'xml')
-
-    # Find all URL elements
-    urls = soup.find_all('url')
-
-    # Iterate over URLs and remove those that do not meet the criteria
-    for url in urls:
-        location = url.get('location')
-        protocol = url.get('protocol')
-        type_ = url.get('type')
-        preference = int(url.get('preference', 0))
-
-        if (location in excluded_countries or
-                protocol not in preferred_protocols or
-                type_ not in preferred_types or
-                preference < min_preference):
-            url.decompose()
-
-    # Convert the BeautifulSoup object back to a string and clean up
-    filtered_content = str(soup)
-    filtered_content = '\n'.join(line for line in filtered_content.splitlines() if line.strip())
-
-    return filtered_content
-
-if __name__ == '__main__':
-    app.run(debug=True)
templates/base.html (normal file, 31 lines changed)
@@ -0,0 +1,31 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>MetaProxy Dashboard</title>
+    <link href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css" rel="stylesheet">
+</head>
+<body>
+    <div class="container">
+        <ul class="nav nav-tabs">
+            <li class="nav-item">
+                <a class="nav-link active" href="/dash/config">Configuration</a>
+            </li>
+            <li class="nav-item">
+                <a class="nav-link" href="/dash/stats">Statistics</a>
+            </li>
+            <li class="nav-item">
+                <a class="nav-link" href="/dash/logs">Error Logs</a>
+            </li>
+        </ul>
+        <div class="tab-content">
+            {% block content %}
+            {% endblock %}
+        </div>
+    </div>
+    <script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
+    <script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.5.4/dist/umd/popper.min.js"></script>
+    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
+</body>
+</html>
templates/config.html (normal file, 26 lines changed)
@@ -0,0 +1,26 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Configuration</h2>
+    <form method="POST" action="/dash/config">
+        <div class="form-group">
+            <label for="excluded_countries">Excluded Countries</label>
+            <input type="text" class="form-control" id="excluded_countries" name="excluded_countries" value="{{ excluded_countries }}">
+        </div>
+        <div class="form-group">
+            <label for="preferred_protocols">Preferred Protocols</label>
+            <input type="text" class="form-control" id="preferred_protocols" name="preferred_protocols" value="{{ preferred_protocols }}">
+        </div>
+        <div class="form-group">
+            <label for="preferred_types">Preferred Types</label>
+            <input type="text" class="form-control" id="preferred_types" name="preferred_types" value="{{ preferred_types }}">
+        </div>
+        <div class="form-group">
+            <label for="min_preference">Minimum Preference</label>
+            <input type="text" class="form-control" id="min_preference" name="min_preference" value="{{ min_preference }}">
+        </div>
+        <button type="submit" class="btn btn-primary">Save</button>
+    </form>
+</div>
+{% endblock %}
templates/logs.html (normal file, 8 lines changed)
@@ -0,0 +1,8 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Error Logs</h2>
+    <p>Error logs content goes here.</p>
+</div>
+{% endblock %}
templates/stats.html (normal file, 10 lines changed)
@@ -0,0 +1,10 @@
+{% extends "base.html" %}
+
+{% block content %}
+<div class="tab-pane fade show active">
+    <h2>Statistics</h2>
+    <div>
+        <img src="data:image/png;base64,{{ plot_url }}" alt="Request Statistics" />
+    </div>
+</div>
+{% endblock %}