devops-ia/pr-generator

PR generator

Automated Pull Request creation daemon for GitHub and Bitbucket Cloud.

pr-generator runs as a long-lived service that periodically scans your repository branches, matches them against configurable regex patterns, and automatically opens Pull Requests toward the configured destination branches — skipping any PR that already exists.



How it works

┌─────────────────────────────────────────────────────────────┐
│                        Scan cycle                           │
│                                                             │
│  1. Fetch all branches   ──▶  GitHub  /  Bitbucket          │
│  2. For every rule                                          │
│       match branches against regex pattern                  │
│       for each match                                        │
│         skip  if open PR already exists                     │
│         create PR  source ──▶ destination                   │
│  3. Sleep scan_frequency seconds                            │
│  4. Repeat                                                  │
└─────────────────────────────────────────────────────────────┘
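Steps 3 and 4 of the diagram (sleep, repeat) can be sketched as a small loop. This is a minimal illustration, not the project's actual code: `run_scan_loop`, `scan_once`, and the use of a `threading.Event` as the stop signal are assumptions made for the example.

```python
import threading
from typing import Callable


def run_scan_loop(
    scan_once: Callable[[], None],
    scan_frequency: float,
    stop: threading.Event,
) -> int:
    """Run scan cycles until `stop` is set; return how many cycles ran."""
    cycles = 0
    while not stop.is_set():
        scan_once()  # steps 1-2: fetch branches and apply every rule
        cycles += 1
        # Step 3: sleep for scan_frequency seconds, but return early
        # as soon as the stop event is set.
        stop.wait(timeout=scan_frequency)
    return cycles
```

Waiting on the event instead of calling `time.sleep` lets a signal handler (hypothetically, one that calls `stop.set()` on SIGTERM or SIGINT) end the loop without waiting out the full interval.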

Key design points:

  • Concurrent: branches are fetched from all providers in parallel; rule×provider pairs are also processed concurrently (up to 10 workers).
  • Idempotent: an existing open PR for the same source→destination pair is detected and skipped.
  • Dry-run mode: log what would be created without actually calling the API.
  • Graceful shutdown: handles SIGTERM / SIGINT and drains in-progress work.
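
The concurrency, idempotency, and dry-run points above can be sketched together. The snippet below is an illustrative outline under assumptions: the function names, the dictionary-based rule shape, and the in-memory `open_prs` set are hypothetical stand-ins for the real provider calls.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set, Tuple

MAX_WORKERS = 10  # worker cap taken from the design points above

Result = Tuple[str, str, str, str]  # (action, provider, source, destination)


def process_pair(
    pattern: str,
    provider: str,
    destination: str,
    branches: List[str],
    open_prs: Set[Tuple[str, str, str]],
    dry_run: bool,
) -> List[Result]:
    """Decide what to do for every branch of one rule×provider pair."""
    regex = re.compile(pattern)
    results: List[Result] = []
    for branch in branches:
        if branch == destination or not regex.match(branch):
            continue
        if (provider, branch, destination) in open_prs:
            action = "skipped"    # idempotent: an open PR already exists
        elif dry_run:
            action = "simulated"  # dry-run: report only, no API call
        else:
            action = "created"    # a real provider would call its API here
        results.append((action, provider, branch, destination))
    return results


def run_cycle(
    rules: List[dict],
    branches_by_provider: Dict[str, List[str]],
    open_prs: Set[Tuple[str, str, str]],
    dry_run: bool = False,
) -> List[Result]:
    """Process all rule×provider pairs concurrently and collect the results."""
    pairs = [
        (rule["pattern"], provider, destination)
        for rule in rules
        for provider, destination in rule["destinations"].items()
        if provider in branches_by_provider  # ignore inactive providers
    ]
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [
            pool.submit(
                process_pair, pattern, provider, destination,
                branches_by_provider[provider], open_prs, dry_run,
            )
            for pattern, provider, destination in pairs
        ]
        return sorted(item for future in futures for item in future.result())
```

Each pair is independent, so a thread pool is a natural fit for work that is dominated by HTTP latency.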

Quick start

# Install
pip install -e .

# Point to your config file and run
CONFIG_PATH=./config.yaml pr-generator

Or with Docker:

docker run --rm \
  -v "$(pwd)/config.yaml:/etc/pr-generator/config.yaml:ro" \
  ghcr.io/devops-ia/pr-generator:latest

Configuration

YAML file

The default config path is /etc/pr-generator/config.yaml. Override with the CONFIG_PATH environment variable. The application exits with an error at startup if the file is not found.

# config.yaml

# How often (seconds) to scan for new branches.
scan_frequency: 300        # default: 300

# Logging level: DEBUG | INFO | WARNING | ERROR
log_level: INFO            # default: INFO

# Log format: "text" (human-readable) or "json" (structured, for log aggregators)
log_format: text           # default: text

# When true, PRs are logged but never actually created.
dry_run: false             # default: false

# Port for the built-in health server.
health_port: 8080          # default: 8080

providers:
  github:
    enabled: true
    owner: my-org
    repo: my-repo
    app_id: "123456"
    installation_id: "78901234"   # optional — auto-resolved if omitted
    private_key_path: /secrets/github-app.pem   # path to PEM file
    # Alternative: set GITHUB_APP_PRIVATE_KEY env var (plain PEM or base64-encoded)
    timeout: 30            # HTTP timeout in seconds

  bitbucket:
    enabled: true
    workspace: my-workspace
    repo_slug: my-repo
    token_env: BITBUCKET_TOKEN   # name of the env var that holds the token
    close_source_branch: true    # delete source branch after merge (default: true)
    timeout: 30

rules:
  - pattern: "feature/.*"          # Python regex matched against branch names
    destinations:
      github: main
      bitbucket: develop

  - pattern: "release/.*"
    destinations:
      github: main

  - pattern: ".*-hotfix-.*"
    destinations:
      bitbucket: master

Multiple GitHub organisations

Use any name as the provider key and set type: github (or type: bitbucket) to identify the implementation. Rules reference providers by their name.

providers:
  github-acme:
    type: github          # required for non-standard key names
    enabled: true
    owner: acme-org
    repo: backend
    app_id: "111"
    private_key_path: /secrets/acme-app.pem

  github-skunkworks:
    type: github
    enabled: true
    owner: skunkworks-org
    repo: platform
    auth_method: pat
    token_env: SKUNKWORKS_GITHUB_TOKEN

  bitbucket:              # "github" / "bitbucket" keys default type automatically
    enabled: true
    workspace: my-workspace
    repo_slug: my-repo
    token_env: BITBUCKET_TOKEN

rules:
  - pattern: "feature/.*"
    destinations:
      github-acme: main
      github-skunkworks: develop
      bitbucket: develop
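
The type-defaulting rule can be illustrated with a short sketch. `resolve_provider_type` is a hypothetical helper, and the error raised for an unknown type is an assumption rather than documented behaviour.

```python
from typing import Any, Mapping

KNOWN_TYPES = ("github", "bitbucket")


def resolve_provider_type(name: str, provider_cfg: Mapping[str, Any]) -> str:
    """Use the explicit `type` if present, otherwise fall back to the key name."""
    provider_type = provider_cfg.get("type") or name
    if provider_type not in KNOWN_TYPES:
        # Assumption: how the real loader reports this is not documented here.
        raise ValueError(
            f"Provider '{name}' needs type: github or type: bitbucket"
        )
    return provider_type
```

With the config above, `github-acme` resolves to `github` through its explicit `type`, while the `bitbucket` key resolves to `bitbucket` without one.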

Config fields reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| scan_frequency | int | 300 | Seconds between scan cycles |
| log_level | string | "INFO" | Python logging level |
| dry_run | bool | false | Simulate PR creation without API calls |
| health_port | int | 8080 | Port for health HTTP server |
| providers.<name>.type | string | (key name) | Provider implementation: github or bitbucket. Required when the key name is not github or bitbucket |
| providers.<name>.enabled | bool | false | Activate this provider instance. If no providers are enabled, the application starts in idle mode: it logs a warning and keeps running without performing any scans |
| providers.<name>.owner | string | | GitHub organisation or user (GitHub only) |
| providers.<name>.repo | string | | Repository name (GitHub only) |
| providers.<name>.app_id | string | | GitHub App ID (GitHub App auth) |
| providers.<name>.installation_id | string | (auto) | Installation ID; resolved automatically if omitted (GitHub App auth) |
| providers.<name>.private_key_path | string | | Path to GitHub App private key PEM file (GitHub App auth) |
| providers.<name>.auth_method | string | "app" | app (GitHub App) or pat (Personal Access Token) (GitHub only) |
| providers.<name>.token_env | string | "GITHUB_TOKEN" / "BITBUCKET_TOKEN" | Env var name containing the token (PAT / Bitbucket). Must be unique across all enabled providers of the same type; duplicate values raise a ValueError at startup |
| providers.<name>.workspace | string | | Bitbucket workspace slug (Bitbucket only) |
| providers.<name>.repo_slug | string | | Bitbucket repository slug (Bitbucket only) |
| providers.<name>.close_source_branch | bool | true | Delete source branch after PR merges (Bitbucket only) |
| providers.<name>.timeout | float | 30 | HTTP timeout (seconds) |
| rules[].pattern | string | | Python regex applied to branch names |
| rules[].destinations | map | | provider_name: destination_branch pairs |
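
The uniqueness constraint on `token_env` can be sketched as a startup check. This is an illustration only: `check_unique_token_env` is a hypothetical name, and skipping GitHub App providers (which do not read a token env var) is an assumption.

```python
from typing import Any, Dict, Mapping, Tuple

DEFAULT_TOKEN_ENV = {"github": "GITHUB_TOKEN", "bitbucket": "BITBUCKET_TOKEN"}


def check_unique_token_env(providers: Mapping[str, Mapping[str, Any]]) -> None:
    """Raise ValueError if two enabled providers of one type share token_env."""
    seen: Dict[Tuple[str, str], str] = {}
    for name, cfg in providers.items():
        if not cfg.get("enabled", False):
            continue
        provider_type = cfg.get("type") or name
        # Assumption: GitHub App auth does not use a token env var at all.
        if provider_type == "github" and cfg.get("auth_method", "app") != "pat":
            continue
        token_env = cfg.get("token_env") or DEFAULT_TOKEN_ENV.get(provider_type, "")
        key = (provider_type, token_env)
        if key in seen:
            raise ValueError(
                f"Providers '{seen[key]}' and '{name}' both use "
                f"token_env '{token_env}'."
            )
        seen[key] = name
```

Keying on `(type, token_env)` rather than on `token_env` alone mirrors the "same type" wording in the table.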

Environment variables

| Variable | Description |
| --- | --- |
| CONFIG_PATH | Path to the YAML config file. Default: /etc/pr-generator/config.yaml |
| GITHUB_APP_PRIVATE_KEY | GitHub App PEM key (plain text or base64-encoded). Used only when private_key_path is absent or empty in config. If private_key_path is set but the file does not exist, the application raises FileNotFoundError without falling back to this variable |
| GITHUB_TOKEN | Default token env var for GitHub PAT providers (token_env: GITHUB_TOKEN) |
| BITBUCKET_TOKEN | Default token env var for Bitbucket providers (token_env: BITBUCKET_TOKEN) |
| any name | Custom env var referenced by token_env in provider config |
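
The precedence between `private_key_path` and `GITHUB_APP_PRIVATE_KEY` can be sketched as follows. `load_private_key` is a hypothetical helper, and detecting a plain PEM by its `-----BEGIN` prefix is an assumption about how the two encodings might be told apart.

```python
import base64
import os
from pathlib import Path
from typing import Mapping, Optional

PEM_PREFIX = "-----BEGIN"


def load_private_key(
    private_key_path: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the PEM text from the configured file or from the env var."""
    if private_key_path:
        path = Path(private_key_path)
        if not path.is_file():
            # A configured path never falls back to the env var.
            raise FileNotFoundError(
                f"private_key_path '{private_key_path}' does not exist."
            )
        return path.read_text()

    raw = env.get("GITHUB_APP_PRIVATE_KEY", "").strip()
    if not raw:
        raise ValueError("No GitHub App private key configured")
    if raw.startswith(PEM_PREFIX):
        return raw  # already plain PEM text
    return base64.b64decode(raw).decode("utf-8")  # base64-encoded PEM
```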

Providers

GitHub

Two authentication modes are available:

GitHub App (recommended) — the provider:

  1. Signs a short-lived JWT with the App's RSA private key.
  2. Exchanges it for an installation access token (cached up to ~55 minutes).
  3. Uses the installation token for all API calls.
  4. Caches per-cycle PR-existence and branch-existence lookups to reduce API usage.

Personal Access Token (PAT) — set auth_method: pat and point token_env at an env var holding the PAT.

Required GitHub App permissions: Contents (read), Pull requests (read & write).
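
The installation-token caching from step 2 can be sketched with an injectable clock. `InstallationTokenCache` and `fetch_token` are hypothetical names; the real provider's internals may differ. GitHub installation tokens expire after one hour, so a 55-minute lifetime leaves a safety margin.

```python
import time
from typing import Callable, Optional

TOKEN_TTL_SECONDS = 55 * 60  # refresh a little before the one-hour expiry


class InstallationTokenCache:
    """Cache a token and refresh it once the TTL has elapsed."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token  # e.g. the JWT-for-token exchange
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + TOKEN_TTL_SECONDS
        return self._token
```

Passing the clock in as a parameter keeps the expiry logic testable without waiting in real time.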

Bitbucket Cloud

Authentication uses a project/repository Bearer token (HTTP access token).

The provider fetches default reviewers at PR creation time and automatically includes them in the payload.

Required Bitbucket permissions: Repositories (read), Pull requests (read & write).
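
As a rough sketch of how default reviewers might end up in the request, the helper below builds a pull-request body in the shape used by the Bitbucket Cloud REST API 2.0. `build_pr_payload` and the title format are hypothetical; how pr-generator actually builds its payload is not shown in this README.

```python
from typing import Any, Dict, List


def build_pr_payload(
    source: str,
    destination: str,
    reviewer_uuids: List[str],
    close_source_branch: bool = True,
) -> Dict[str, Any]:
    """Build a pull-request creation body that includes default reviewers."""
    return {
        "title": f"{source} -> {destination}",  # hypothetical title format
        "source": {"branch": {"name": source}},
        "destination": {"branch": {"name": destination}},
        "close_source_branch": close_source_branch,
        # Reviewers are referenced by their account UUID.
        "reviewers": [{"uuid": uuid} for uuid in reviewer_uuids],
    }
```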


Rules

Each rule has:

  • pattern — a Python regex (re.compile) matched against branch names using re.match (anchored at the start). The destination branch is excluded from matching.
  • destinations — a map of provider_name → destination_branch. Only providers that are both listed here and active in providers are processed.

rules:
  - pattern: "feature/.*"
    destinations:
      github: main          # create PRs toward "main" on GitHub
      bitbucket: develop    # create PRs toward "develop" on Bitbucket

Multiple rules are supported.
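
The matching semantics described above (start-anchored `re.match`, destination branch excluded, inactive providers ignored) can be sketched as follows. `match_rule` and its argument shapes are assumptions made for the example.

```python
import re
from typing import Dict, List, Tuple


def match_rule(
    pattern: str,
    destinations: Dict[str, str],
    branches_by_provider: Dict[str, List[str]],
) -> List[Tuple[str, str, str]]:
    """Return (provider, source, destination) triples selected by one rule."""
    regex = re.compile(pattern)
    selected = []
    for provider, destination in destinations.items():
        if provider not in branches_by_provider:
            continue  # provider listed in the rule but not active
        for branch in branches_by_provider[provider]:
            if branch == destination:
                continue  # never open a PR from a branch onto itself
            if regex.match(branch):  # anchored at the start, not the end
                selected.append((provider, branch, destination))
    return selected
```

Because `re.match` only anchors at the start, `feature/.*` matches `feature/x` but not `hotfix/feature/x`; a pattern such as `.*feature/.*` would be needed for the latter.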


ArgoCD Image Updater integration

pr-generator pairs naturally with Argo CD Image Updater. Image Updater creates branches named argocd-image-updater-set-<app>-<env>-<image>. Configure rules to catch those branches and open PRs toward the appropriate target branch per environment.

scan_frequency: 120

providers:
  github:
    enabled: true
    owner: my-org
    repo: gitops-repo
    auth_method: app
    app_id: "123456"
    private_key_path: /secrets/github-app.pem

rules:
  - pattern: "argocd-image-updater-.*-dev-.*"
    destinations:
      github: develop

  - pattern: "argocd-image-updater-.*-staging-.*"
    destinations:
      github: staging

  - pattern: "argocd-image-updater-.*-pro-.*"
    destinations:
      github: main

Annotation-based discovery

Instead of a central rules list, each ArgoCD Application CR can carry annotations that define its own PR rules. pr-generator reads these annotations on every scan cycle — no restart or config change required.

Modes

| Mode | Behaviour |
| --- | --- |
| config_only | Static rules from config.yaml only. No Kubernetes API access. Default. |
| annotations_only | Rules come exclusively from annotated ArgoCD Applications. rules: is ignored at runtime. |
| hybrid | Both sources active. Annotation destinations win on same pattern+provider collision. |

Annotation schema

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  annotations:
    pr-generator.io/enabled: "true"
    pr-generator.io/pattern: "^image-updater/.*"
    pr-generator.io/destination.github: "main"       # provider key → base branch
    pr-generator.io/destination.bitbucket: "develop"
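
Turning these annotations into a rule might look like the sketch below. `rule_from_annotations` is a hypothetical helper, and returning `None` for disabled or incomplete annotations is an assumption.

```python
from typing import Dict, Mapping, Optional


def rule_from_annotations(
    annotations: Mapping[str, str],
    prefix: str = "pr-generator.io",
) -> Optional[Dict[str, object]]:
    """Build a rule dict from Application annotations, or None if not usable."""
    if annotations.get(f"{prefix}/enabled", "").lower() != "true":
        return None
    pattern = annotations.get(f"{prefix}/pattern")
    if not pattern:
        return None
    dest_prefix = f"{prefix}/destination."
    # Everything after "destination." is the provider key.
    destinations = {
        key[len(dest_prefix):]: branch
        for key, branch in annotations.items()
        if key.startswith(dest_prefix)
    }
    if not destinations:
        return None
    return {"pattern": pattern, "destinations": destinations}
```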

config.yaml

annotation_discovery:
  mode: hybrid                    # config_only | annotations_only | hybrid
  annotation_prefix: pr-generator.io   # default

# rules: required when mode is config_only or hybrid; optional for annotations_only
rules:
  - pattern: "^hotfix/.*"
    destinations:
      github: main
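
One way to picture the three modes, including the hybrid collision rule, is the sketch below. `merge_rules` is hypothetical, and merging per pattern with a dictionary update is an assumption about how "annotation destinations win" could be implemented.

```python
from typing import Dict, List


def merge_rules(
    static_rules: List[dict],
    annotation_rules: List[dict],
    mode: str = "hybrid",
) -> List[dict]:
    """Combine rule sources according to the annotation discovery mode."""
    if mode == "config_only":
        sources = [static_rules]
    elif mode == "annotations_only":
        sources = [annotation_rules]
    elif mode == "hybrid":
        # Annotation rules are applied last, so they win on collisions.
        sources = [static_rules, annotation_rules]
    else:
        raise ValueError(f"Unknown annotation discovery mode: {mode}")

    merged: Dict[str, Dict[str, str]] = {}
    for rules in sources:
        for rule in rules:
            merged.setdefault(rule["pattern"], {}).update(rule["destinations"])
    return [
        {"pattern": pattern, "destinations": destinations}
        for pattern, destinations in merged.items()
    ]
```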

RBAC requirement

Annotation discovery reads applications.argoproj.io cluster-wide. The Helm chart creates a ClusterRole and ClusterRoleBinding automatically when annotationDiscovery.enabled: true. For deployments that do not use the Helm chart, the identity pr-generator runs as (for example, the pod's ServiceAccount) needs a ClusterRole with:

rules:
  - apiGroups: ["argoproj.io"]
    resources: ["applications"]
    verbs: ["get", "list"]

Health endpoints

A lightweight HTTP server starts on health_port (default 8080):

| Endpoint | Behaviour |
| --- | --- |
| GET /livez | 200 live while running; 503 shutting down during shutdown |
| GET /healthz | Same as /livez (alias) |
| GET /readyz | 200 ready after the first scan cycle completes; 503 not ready before that |
| GET /metrics | Prometheus text exposition (see Prometheus metrics) |

Suitable for Kubernetes liveness, readiness, and startup probes:

livenessProbe:
  httpGet:
    path: /livez
    port: 8080
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
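
The status logic behind the liveness and readiness endpoints can be summarised in a small pure function. This is a sketch only: `health_response` is a hypothetical name, /metrics is left out, and the 404 fallback is an assumption.

```python
from typing import Tuple


def health_response(path: str, ready: bool, shutting_down: bool) -> Tuple[int, str]:
    """Map a request path and the current state to (status code, body)."""
    if path in ("/livez", "/healthz"):  # /healthz is an alias of /livez
        return (503, "shutting down") if shutting_down else (200, "live")
    if path == "/readyz":
        # Ready only after the first scan cycle has completed.
        return (200, "ready") if ready else (503, "not ready")
    return (404, "not found")  # assumption: unknown paths are not documented
```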

Prometheus metrics

pr-generator exposes Prometheus metrics at GET /metrics on the health port (default 8080).

Metrics reference

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| pr_generator_scan_cycles_total | Counter | | Scan cycles completed |
| pr_generator_scan_duration_seconds | Histogram | | Duration per cycle (buckets: .1, .5, 1, 5, 10, 30, 60 s) |
| pr_generator_last_scan_timestamp_seconds | Gauge | | Unix timestamp of last completed cycle |
| pr_generator_prs_created_total | Counter | provider | PRs opened |
| pr_generator_prs_skipped_total | Counter | provider | PRs skipped (already open) |
| pr_generator_prs_simulated_total | Counter | provider | PRs simulated (dry_run: true) |
| pr_generator_scan_errors_total | Counter | provider | Errors during branch fetch or PR creation |
| pr_generator_rules_active | Gauge | | Rules active in the current cycle |
| pr_generator_annotation_rules_discovered | Gauge | | Rules discovered from ArgoCD annotations in last cycle |

The provider label value is the key name from config.providers (e.g. github, my-bitbucket).

Scraping

curl http://localhost:8080/metrics
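
The same scrape can be done from Python using only the standard library. The sketch below is illustrative: `parse_samples` and `scrape` are hypothetical helpers, and the parser assumes samples carry no trailing timestamp.

```python
import urllib.request
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition into {sample_with_labels: value}."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: no timestamp follows the value, so the value is last.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def scrape(url: str = "http://localhost:8080/metrics", timeout: float = 5.0) -> Dict[str, float]:
    """Fetch the metrics endpoint and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_samples(response.read().decode("utf-8"))
```

Calling `scrape()` against a running instance returns a dictionary keyed by sample name including its labels, for example `pr_generator_prs_created_total{provider="github"}`.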

Helm chart — Prometheus Operator

metrics:
  enabled: true
  serviceMonitor:
    enabled: true          # creates ServiceMonitor CRD
    interval: 30s
    labels:
      release: kube-prometheus-stack   # match your Operator's serviceMonitorSelector

Programmatic API

from prometheus_client import CollectorRegistry
from pr_generator.metrics import PrGeneratorMetrics

# Isolated registry (useful in tests)
m = PrGeneratorMetrics(registry=CollectorRegistry())
m.record_annotation_rules(3)
print(m.generate_latest().decode())

Docker

The image is built from a two-stage Dockerfile:

  • Stage 1 – installs Python dependencies into /install.
  • Stage 2 – minimal python:3.14-slim runtime; runs as a non-root user (prgen).

# Build
docker build -t pr-generator .

# Run with YAML config
docker run --rm \
  -v "$(pwd)/config.yaml:/etc/pr-generator/config.yaml:ro" \
  -v "$(pwd)/github-app.pem:/secrets/github-app.pem:ro" \
  -e BITBUCKET_TOKEN=<token> \
  -p 8080:8080 \
  pr-generator

Development

Prerequisites: Python ≥ 3.11

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install the package in editable mode, then the test runner
pip install -e .
pip install pytest

# Run tests
pytest

# Run with a local config
CONFIG_PATH=./config.yaml python -m pr_generator

Project layout

src/pr_generator/
├── __main__.py              # Entry point: startup, provider init, scan loop
├── config.py                # Config loader (YAML → AppConfig)
├── models.py                # Dataclasses: AppConfig, ProviderConfig, ScanRule, …
├── scanner.py               # Concurrent scan cycle orchestrator
├── health.py                # HTTP health + metrics server (/livez, /healthz, /readyz, /metrics)
├── http_client.py           # Shared HTTP client with retry/backoff
├── annotation_discovery.py  # Kubernetes annotation-based rule discovery
├── logging_config.py        # Logging setup (plain text or structured JSON)
├── metrics.py               # Prometheus metrics (PrGeneratorMetrics)
└── providers/
    ├── base.py          # ProviderInterface Protocol
    ├── github.py        # GitHub App provider
    └── bitbucket.py     # Bitbucket Cloud provider

tests/
├── conftest.py                  # Shared pytest fixtures
├── test_annotation_discovery.py # Annotation discovery tests
├── test_config.py               # Config loading tests
├── test_health.py               # Health server tests
├── test_metrics.py              # Prometheus metrics tests
├── test_models.py               # Model tests
└── test_scanner.py              # Scan cycle tests

Troubleshooting

Application exits with FileNotFoundError

FileNotFoundError: [Core] private_key_path '/secrets/github-app.pem' does not exist.

private_key_path is set in config.yaml but the file is not present at that path. Either mount the PEM file at the configured path, or remove private_key_path from the config and set the GITHUB_APP_PRIVATE_KEY environment variable instead.

ValueError: duplicate tokenEnv

ValueError: [Core] Providers 'bb-eu' and 'bb-us' both use tokenEnv 'BITBUCKET_TOKEN'.

Two enabled providers of the same type share the same token_env value. Assign a unique env var name to each provider and export the corresponding variable in your runtime environment.

/readyz returns 503

This is expected during startup. The endpoint returns 503 not ready until the first full scan cycle completes. If it never flips to 200, check the application logs for errors in the scan cycle (API auth failures, missing config fields, network issues).

No PRs are created (dry_run is false, branches exist)

  1. Regex anchoring — rules use re.match, which is anchored at the start of the string. A pattern feature/.* will not match hotfix/feature/x. Enable log_level: DEBUG to see per-branch matching decisions.
  2. Provider name mismatch — the name in rules[].destinations must exactly match the provider key under providers:.
  3. Destination branch excluded — pr-generator skips branches whose name equals the destination branch to avoid self-targeting PRs.

GitHub App: RuntimeError: Could not resolve installation id

Set installation_id explicitly in the provider config (find it in your GitHub App settings under Installations), or ensure the GitHub App is installed on the target repository.