devops-execution-engine

Professional DevOps expertise for your Clawdbot instance

Version: 1.0.0
License: Apache 2.0
Author: Clawdbot Community
Platform: Clawdbot

Overview

The DevOps Execution Engine is a skill package that extends Clawdbot with professional-grade DevOps capabilities. It pairs a structured Plan → Approve → Execute workflow for safe infrastructure operations with a library of 11 production-ready DevOps skills.

It is designed specifically for Clawdbot and builds on Clawdbot's ability to execute commands under human oversight rather than only suggest them.

Features

✅ Safe Execution

Plan → Approve → Execute workflow
Risk classification (LOW/MEDIUM/HIGH/CRITICAL)
Pre-flight validation
Rollback procedures
Audit logging

📚 Comprehensive Skills Library

Kubernetes: Debug, deploy, manage
AWS: Operations, cost optimization
Infrastructure: Terraform, ArgoCD, GitOps
Containers: Docker operations
Observability: Log analysis, system health
Incident Response: Structured playbooks

🔒 Safety Guarantees

No execution without approval (auto-approval of LOW-risk actions is opt-in and off in the sample configuration)
Every action logged
Rollback plans included
Risk assessment for all operations

Installation

# Clone the repository
git clone https://github.com/clawdbot/devops-execution-engine

# Install as Clawdbot skill
clawdbot skills:install ./devops-execution-engine

# Verify installation
clawdbot skills:list | grep devops-execution-engine

Quick Start

1. Diagnose an Issue

You: Check the cluster health

Clawd: [uses system-health and k8s-debug skills]
- Nodes: 3/3 Ready
- Pods: 2 in CrashLoopBackOff
- Disk: worker-1 at 85% (warning)

2. Generate Execution Plan

You: Fix the crashloop pods

Clawd: 📋 EXECUTION PLAN: plan-20260126-003

Title: Fix CrashLoopBackOff in api-service
Risk: MEDIUM
Time: ~5min

Steps:
1. Increase memory limit 256Mi → 512Mi
2. Wait for rollout (5min timeout)
3. Verify all pods running

Rollback: kubectl rollout undo deployment/api-service

Approve? (yes/no/explain)

3. Approve & Execute

You: approve

Clawd: ✅ Executing plan-20260126-003...

Step 1/3: Patching deployment... ✓
Step 2/3: Waiting for rollout... ✓ (2m 34s)
Step 3/3: Verifying pods... ✓

✅ Plan completed successfully
All pods running, no restarts
Logged to: memory/actions-log.jsonl
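
The run above ends by writing to memory/actions-log.jsonl. The record schema is not documented in this README, so the following is only a minimal Python sketch of how a JSON Lines audit entry could be appended and read back. The field names (timestamp, plan_id, step, status) and both helper functions are assumptions made for illustration, not the engine's actual format or API.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_audit_record(log_path, plan_id, step, status):
    """Append one JSON object per line (JSON Lines) and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "plan_id": plan_id,  # hypothetical field name
        "step": step,        # hypothetical field name
        "status": status,    # hypothetical field name
    }
    path = Path(log_path)
    # Create the memory/ directory on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_audit_records(log_path):
    """Return every record in the log, skipping blank lines."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each line is an independent JSON object, the log can be appended to safely after every step and filtered later by plan ID without parsing the whole file as one document.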

Included Skills

Kubernetes

k8s-debug - Troubleshoot pods, deployments, nodes
k8s-deploy - Safe deployment workflows with rollback
argocd-gitops - GitOps workflows with ArgoCD

Cloud

aws-ops - AWS resource management and queries
cost-optimization - Cloud cost analysis and recommendations

Infrastructure

terraform-workflow - IaC workflows and best practices
docker-ops - Container operations and debugging

Operations

incident-response - Structured incident response playbooks
log-analysis - Cross-platform log analysis patterns
system-health - Quick health checks (disk, memory, CPU)
git-workflow - Git workflows and DevOps practices

Usage Examples

Kubernetes Debugging

"Debug the pods in production namespace"
"Why is api-service crashing?"
"Check resource usage across the cluster"

Incident Response

"We have a SEV1 - API is down"
"Run incident response for high error rates"
"Check recent deployments"

Cost Optimization

"Analyze AWS costs this month"
"Find underutilized resources"
"Suggest cost optimizations"

Deployments

"Deploy api-service v2.1.0 to production"
"Rollback the last deployment"
"Check ArgoCD sync status"

Execution Plan Format

Plans are generated as YAML in memory/execution-plans/:

plan:
  id: plan-20260126-001
  title: "Fix CrashLoopBackOff in api-service"
  risk: MEDIUM
  estimated_time: 5min
  
  rollback:
    method: "Rollback deployment to previous revision"
    commands: ["kubectl rollout undo deployment/api-service"]
  
steps:
  - action: kubectl_patch
    command: "kubectl patch deployment api-service..."
    risk: MEDIUM
    reversible: true
    
  - action: wait_for_rollout
    timeout: 5m
    success_criteria: "all pods running"
    
approval:
  required: true
  status: pending
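
A plan in this shape can be checked before it is presented for approval. The sketch below mirrors the YAML above as a Python dict (to avoid depending on a YAML parser) and reports missing pieces. Which fields are mandatory is an assumption here, and validate_plan is a hypothetical helper, not part of the skill package.

```python
RISK_LEVELS = ("LOW", "MEDIUM", "HIGH", "CRITICAL")


def validate_plan(doc):
    """Return a list of problems; an empty list means the plan passed."""
    problems = []
    plan = doc.get("plan") or {}
    for key in ("id", "title", "risk"):
        if not plan.get(key):
            problems.append(f"plan.{key} is missing")
    if plan.get("risk") and plan["risk"] not in RISK_LEVELS:
        problems.append(f"plan.risk must be one of {RISK_LEVELS}")
    # Assumption: every plan must carry at least one rollback command.
    if not (plan.get("rollback") or {}).get("commands"):
        problems.append("plan.rollback.commands is empty")
    steps = doc.get("steps") or []
    if not steps:
        problems.append("steps is empty")
    for i, step in enumerate(steps, start=1):
        if not step.get("action"):
            problems.append(f"step {i} has no action")
    if "approval" not in doc:
        problems.append("approval block is missing")
    return problems


# The example plan from above, with the elided kubectl command left out.
example = {
    "plan": {
        "id": "plan-20260126-001",
        "title": "Fix CrashLoopBackOff in api-service",
        "risk": "MEDIUM",
        "rollback": {"commands": ["kubectl rollout undo deployment/api-service"]},
    },
    "steps": [
        {"action": "kubectl_patch", "risk": "MEDIUM", "reversible": True},
        {"action": "wait_for_rollout", "timeout": "5m"},
    ],
    "approval": {"required": True, "status": "pending"},
}

print(validate_plan(example))  # prints []
```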

Safety Model

Risk Levels

🟢 LOW

Read-only operations
No impact on running services
Auto-executable only if auto_approve_low_risk is enabled (see Configuration)

🟡 MEDIUM

Resource changes (memory, CPU limits)
Scaling operations
Non-production deployments
Requires approval

🔴 HIGH

Production deployments
Service restarts
Configuration changes
Requires approval + impact analysis

⛔ CRITICAL

Data operations
Security/RBAC changes
Namespace/resource deletion
Blocked by default; requires an explicit override (see allow_critical in Configuration)
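
The four levels above amount to a small decision table. The following Python sketch shows one way to express it. The function name, the returned labels, and the behavior of a CRITICAL action once the override is set are assumptions; the README only states that such actions are blocked by default.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def approval_policy(risk, auto_approve_low_risk=False, allow_critical=False):
    """Map a risk level to 'auto', 'approval', 'approval+impact', or 'blocked'."""
    if risk is Risk.LOW:
        # LOW runs unattended only when explicitly configured.
        return "auto" if auto_approve_low_risk else "approval"
    if risk is Risk.MEDIUM:
        return "approval"
    if risk is Risk.HIGH:
        return "approval+impact"
    # CRITICAL: blocked unless overridden. Treating an overridden CRITICAL
    # action like HIGH is an assumption made for this sketch.
    return "approval+impact" if allow_critical else "blocked"


def highest_risk(step_risks):
    """Assumption: a plan is as risky as its riskiest step."""
    return max(step_risks, key=lambda r: r.value)
```

With the defaults, approval_policy(Risk.LOW) still returns "approval", which matches a configuration where auto_approve_low_risk is false.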

Approval Process

1. Generate - Clawd creates execution plan
2. Present - Shows summary with risk assessment
3. Review - You examine the plan
4. Approve - Explicit "yes", "approve", or "execute"
5. Execute - Clawd runs steps sequentially
6. Verify - Post-execution validation
7. Log - Record to audit trail
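
The approve, execute, and log stages of this process can be sketched as a small gate in Python. This is an illustration under stated assumptions, not the engine's implementation: run_plan, is_approved, and the event dictionaries are hypothetical, execute_step and log are caller-supplied callables, and post-execution verification and rollback are left out for brevity.

```python
APPROVAL_WORDS = {"yes", "approve", "execute"}


def is_approved(reply):
    """Only an explicit approval word counts; anything else is a refusal."""
    return reply.strip().lower() in APPROVAL_WORDS


def run_plan(steps, reply, execute_step, log):
    """Run steps in order after approval, stopping at the first failure.

    execute_step(step) must return True on success; log(event) receives a dict.
    Returns 'rejected', 'failed', or 'completed'.
    """
    if not is_approved(reply):
        log({"event": "rejected"})
        return "rejected"
    for index, step in enumerate(steps, start=1):
        ok = execute_step(step)
        log({"event": "step", "index": index, "step": step, "ok": ok})
        if not ok:
            # Later steps are never attempted after a failure.
            return "failed"
    log({"event": "completed"})
    return "completed"
```

Treating every reply outside the approval set as a refusal keeps the gate fail-closed: a typo or an ambiguous answer never triggers execution.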

Configuration

Create ~/.clawdbot/skills/devops-execution-engine/config.yaml:

# Execution engine config
execution:
  auto_approve_low_risk: false    # Auto-approve LOW risk actions
  pause_between_steps: true       # Pause after each step
  timeout_default: 300            # Default timeout (seconds)
  
# Logging
audit:
  log_path: "memory/actions-log.jsonl"
  log_level: "info"
  
# Safety
safety:
  require_approval: true
  allow_critical: false           # Block CRITICAL actions
  dry_run_by_default: false
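
A partial config.yaml is easier to handle if it is merged over a full set of defaults. The Python sketch below assumes the values shown above are the defaults and that unknown keys should be rejected; neither is confirmed by this README, and merge_config is a hypothetical helper. The overrides are passed as an already-parsed dict so the example needs no YAML library.

```python
import copy

# Assumption: the sample values from config.yaml double as defaults.
DEFAULTS = {
    "execution": {
        "auto_approve_low_risk": False,
        "pause_between_steps": True,
        "timeout_default": 300,
    },
    "audit": {"log_path": "memory/actions-log.jsonl", "log_level": "info"},
    "safety": {
        "require_approval": True,
        "allow_critical": False,
        "dry_run_by_default": False,
    },
}


def merge_config(overrides, defaults=DEFAULTS):
    """Return a copy of defaults with user overrides applied per section."""
    merged = copy.deepcopy(defaults)
    for section, values in overrides.items():
        if section not in merged:
            raise KeyError(f"unknown config section: {section}")
        unknown = set(values) - set(merged[section])
        if unknown:
            # Rejecting typos avoids silently ignoring a safety setting.
            raise KeyError(f"unknown keys in {section}: {sorted(unknown)}")
        merged[section].update(values)
    return merged
```

For example, merge_config({"execution": {"timeout_default": 600}}) changes only the timeout and leaves every safety setting at its default.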

Documentation

INSTALLATION.md - Installation guide
SKILLS.md - Skills reference
SAFETY.md - Safety guarantees
EXAMPLES.md - Usage examples
API.md - API reference

Contributing

Contributions welcome! See CONTRIBUTING.md

Adding Custom Skills

1. Create skill directory in skills/
2. Add SKILL.md with documentation
3. Include example execution plans
4. Submit PR

Reporting Issues

GitHub Issues: Bug reports and feature requests
Discussions: Questions and general discussion

License

Apache 2.0 - See LICENSE

Support

GitHub: https://github.com/clawdbot/devops-execution-engine
Discord: https://discord.com/invite/clawd
Docs: https://docs.clawd.bot

Built with ❤️ by the Clawdbot community
