gitlab_sync.py

Sync a generated file to an external GitLab repository via merge request, with automatic merge wait, pipeline retry, and preemption support.

Purpose

Sync a file from a public source URL to a target GitLab repository via merge request. The script always waits for the MR to be merged and for the post-merge pipeline to complete before returning. It is invoked by infrastructure CI jobs to keep external repos (pyxis-repo-configs, konflux-release-data) in sync with generated files from the source repos.

Usage

Run gitlab_sync.py --help for full usage information.

usage: gitlab_sync.py [-h] --source-url SOURCE_URL
                      --target-project TARGET_PROJECT
                      --target-file TARGET_FILE --sync-branch SYNC_BRANCH
                      --mr-title MR_TITLE --gitlab-url GITLAB_URL
                      [--merge-timeout MERGE_TIMEOUT]
                      [--post-merge-timeout POST_MERGE_TIMEOUT]
                      [--mr-description MR_DESCRIPTION] [--squash-on-merge]
                      [--dry-run]

Authentication:
  GITLAB_TOKEN    Environment variable with GitLab API token (required)

Monitoring:
  SENTRY_DSN      Sentry DSN for alerting on failures (optional)

How It Works

  1. Fetch source from the public --source-url (with HTTP retries)
  2. Determine comparison ref: use the sync branch if it exists, otherwise the project’s default branch
  3. Compare source content against the comparison ref (stripped whitespace)
  4. Early exit if content matches and the comparison ref is the default branch (nothing to deploy)
  5. Commit the source to the sync branch using the GitLab Commits API with force: true (creates a single-commit branch on top of the default branch). The script also auto-squashes the branch if it has accumulated multiple commits.
  6. Create or update MR targeting the default branch
  7. Self-approve the MR (best-effort; continues if already approved)
  8. Post takeover comment with CI_JOB_ID for preemption tracking
  9. Wait for MR merge (polling every 30s, up to --merge-timeout)
  10. Wait for post-merge pipeline to succeed (up to --post-merge-timeout)
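
Steps 2 to 4 above amount to a small decision about whether there is anything to deploy. The following is a minimal sketch of that decision, not the script's actual code: the function name needs_sync and its parameters are hypothetical, and "stripped whitespace" is assumed to mean leading and trailing whitespace as removed by str.strip().

```python
from typing import Optional


def needs_sync(
    source: str,
    target: Optional[str],
    comparing_default_branch: bool,
) -> bool:
    """Return True when the sync should go on to commit and open/update an MR.

    Hypothetical helper. `target` is the file content at the comparison ref,
    or None when the file does not exist there (an assumption of this sketch).
    """
    # Step 3: compare the two contents with surrounding whitespace stripped.
    same = target is not None and source.strip() == target.strip()
    # Step 4: exit early only when the match is against the default branch.
    # A match against an existing sync branch still has an MR to see through.
    return not (same and comparing_default_branch)
```

In this sketch, a match against the sync branch does not short-circuit, because the open MR still has to be merged and its post-merge pipeline awaited.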

Exit Codes

Code  Meaning
0     Success (MR merged) or no changes needed
1     Failure (timeout, pipeline failure, unrecoverable error)
42    Preempted by a newer CI job managing the same MR

Preemption

When multiple CI jobs target the same sync branch, the script uses MR comments to coordinate. Each job posts a comment containing its CI_JOB_ID (monotonically increasing within a GitLab instance). Before each merge attempt, the script checks for comments with a higher job ID. If found, it yields by exiting with code 42.

The caller should treat exit code 42 as a signal to stop processing (a newer job will handle all remaining syncs). When running outside CI (CI_JOB_ID not set), preemption is disabled.
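
The preemption check can be sketched as a scan over MR comment bodies. This is an illustration under stated assumptions: the comment marker format (TAKEOVER_RE) and the function name is_preempted are hypothetical, since the actual comment text is not documented here.

```python
import re
from typing import Iterable, Optional

# Hypothetical marker format; the real takeover comment text may differ.
TAKEOVER_RE = re.compile(r"gitlab_sync takeover: job (\d+)")


def is_preempted(own_job_id: Optional[int], comment_bodies: Iterable[str]) -> bool:
    """Return True if any takeover comment carries a higher job ID than ours."""
    if own_job_id is None:
        # Outside CI (CI_JOB_ID not set) preemption is disabled.
        return False
    for body in comment_bodies:
        match = TAKEOVER_RE.search(body)
        if match and int(match.group(1)) > own_job_id:
            return True
    return False
```

A caller along these lines would read CI_JOB_ID from the environment, run the check before each merge attempt, and exit with code 42 when it returns True.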

Error Recovery

  • Pipeline failure: Retries the MR pipeline with increasing backoff delays (60s, 300s, 900s). Sends a Sentry alert if all retries are exhausted.
  • Merge conflict: Rebases the MR and re-approves (rebase resets approvals in GitLab). Up to 3 attempts before failing with a Sentry alert.
  • Post-merge pipeline failure: Retries once, then returns exit code 1. This blocks downstream sync steps (e.g., Pyxis must succeed before RPA).
  • Draft MR: Keeps polling without attempting to merge (allows manual intervention).
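
The pipeline retry schedule can be illustrated with a small helper. This is a sketch rather than the script's implementation: the name retry_with_backoff, the injectable sleep parameter, and the choice to wait before each retry are assumptions made for the example.

```python
import time
from typing import Callable, Sequence

# Delays between retries, taken from the schedule described above.
BACKOFF_SECONDS = (60, 300, 900)


def retry_with_backoff(
    attempt: Callable[[], bool],
    delays: Sequence[float] = BACKOFF_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `attempt` once per delay, sleeping before each retry.

    Returns True on the first successful retry and False once all
    retries are exhausted (the point where an alert would be sent).
    """
    for delay in delays:
        sleep(delay)
        if attempt():
            return True
    return False
```

Passing sleep as a parameter keeps the helper testable without real waits, for example by passing a list's append method to record the delays.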

Examples

# Dry run: show diff without modifying anything
GITLAB_TOKEN=$TOKEN gitlab_sync.py \
  --source-url "https://gitlab.com/redhat/hummingbird/containers/-/raw/main/releng/pyxis-hummingbird.yaml" \
  --target-project "releng/pyxis-repo-configs" \
  --target-file "products/hummingbird/hummingbird.yaml" \
  --sync-branch "hummingbird/sync-containers-pyxis" \
  --mr-title "chore: Update hummingbird Pyxis config" \
  --gitlab-url "https://gitlab.cee.redhat.com" \
  --dry-run

# Sync Pyxis config (requires squash merge)
gitlab_sync.py \
  --gitlab-url "https://gitlab.cee.redhat.com" \
  --source-url "https://gitlab.com/redhat/hummingbird/containers/-/raw/main/releng/pyxis-hummingbird.yaml" \
  --target-project "releng/pyxis-repo-configs" \
  --target-file "products/hummingbird/hummingbird.yaml" \
  --sync-branch "hummingbird/sync-containers-pyxis" \
  --mr-title "chore: Update hummingbird Pyxis config" \
  --squash-on-merge

# Sync RPM RPA to konflux-release-data
gitlab_sync.py \
  --gitlab-url "https://gitlab.cee.redhat.com" \
  --source-url "https://gitlab.com/redhat/hummingbird/rpms/-/raw/main/releng/hummingbird-rpms-staging.yaml" \
  --target-project "releng/konflux-release-data" \
  --target-file "config/kflux-prd-rh03.nnv1.p1/product/ReleasePlanAdmission/hummingbird/hummingbird-rpms-staging.yaml" \
  --sync-branch "hummingbird/sync-rpms-rpa" \
  --mr-title "Update hummingbird RPM ReleasePlanAdmission"

Preemption wrapper for sequential syncs in a CI job:

gitlab_sync.py --gitlab-url ... --source-url ... --squash-on-merge
rc=$?
if [ "$rc" -eq 42 ]; then
  echo "Preempted by newer job, stopping"
  exit 0
elif [ "$rc" -ne 0 ]; then
  exit "$rc"
fi
# Proceed to next sync only on success
gitlab_sync.py --gitlab-url ... --source-url ...
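
For callers that orchestrate syncs from Python rather than shell, the same exit-code handling might look like the sketch below. The function run_syncs and its run parameter are hypothetical; the argument lists passed in would be the full gitlab_sync.py command lines, which are elided here as in the shell wrapper.

```python
import subprocess
from typing import Callable, Sequence

EXIT_PREEMPTED = 42  # exit code documented under "Exit Codes"


def run_syncs(
    commands: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run sync commands in order, stopping at the first non-zero exit.

    Preemption (42) is treated as a clean stop and mapped to 0, because
    a newer job will handle the remaining syncs. Any other failure is
    returned unchanged.
    """
    for argv in commands:
        rc = run(argv)
        if rc == EXIT_PREEMPTED:
            print("Preempted by newer job, stopping")
            return 0
        if rc != 0:
            return rc
    return 0
```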

Environment Variables

Variable      Required  Description
GITLAB_TOKEN  Yes       API token with Developer+ access on the target project
CI_JOB_ID     No        Set automatically in CI; enables preemption detection
SENTRY_DSN    No        Sentry DSN for failure alerting

Development

The script lives at ci/images/gitlab-ci/gitlab_sync.py and is baked into the gitlab-ci container image via the Containerfile.

# Run unit tests
python -m unittest discover -s ci/images/gitlab-ci/tests -p 'test_*.py'

# Lint
ruff check ci/images/gitlab-ci/gitlab_sync.py

# Type check
mypy ci/images/gitlab-ci/gitlab_sync.py

Tests use unittest with the responses library to mock HTTP responses at the requests level. See ci/images/gitlab-ci/tests/test_gitlab_sync.py.