Root Causes Identified:
- Double slash in API URLs (https://api.getforge.com//deployed_sites.json)
- No timeout configuration for fetch requests
- No retry mechanism for transient network errors
- Insufficient error handling and logging
- Problem: FORGE_API env var with trailing slash + code adding another slash = //
- Solution: Added URL normalization to remove trailing slashes before constructing API URLs
- Files Modified: deleter.js, src/middlewares/common.js
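The normalization described above can be sketched roughly as follows. The helper name `buildApiUrl` is hypothetical and not taken from the project; the actual code in deleter.js and common.js may be structured differently.

```javascript
// Hypothetical helper (name assumed for illustration, not from the project).
// Joins a base URL and a path with exactly one slash between them.
function buildApiUrl(base, path) {
  // Drop any trailing slashes from the base URL.
  const cleanBase = String(base).replace(/\/+$/, "");
  // Drop any leading slashes from the path.
  const cleanPath = String(path).replace(/^\/+/, "");
  return `${cleanBase}/${cleanPath}`;
}

// Both forms of FORGE_API yield the same URL:
console.log(buildApiUrl("https://api.getforge.com/", "deployed_sites.json"));
console.log(buildApiUrl("https://api.getforge.com", "/deployed_sites.json"));
// Both print: https://api.getforge.com/deployed_sites.json
```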
- Added: 10-second timeout for all API requests
- Implementation: Using AbortController for proper timeout handling
- Files Modified: deleter.js, src/middlewares/common.js
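A minimal sketch of an AbortController-based timeout, assuming a Node.js version with a global `fetch` (18 or later). The wrapper name `fetchWithTimeout` and the injectable `fetchImpl` parameter are assumptions made for illustration, not the project's actual code.

```javascript
const API_TIMEOUT = 10000; // 10 seconds, matching the documented default

// Hypothetical wrapper: aborts the request if it takes longer than timeoutMs.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchWithTimeout(url, options = {}, timeoutMs = API_TIMEOUT, fetchImpl = globalThis.fetch) {
  const controller = new AbortController();
  // When the timer fires, the signal aborts and fetch rejects with an AbortError.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    // Always clear the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}
```

Clearing the timer in `finally` matters in a long-running process: otherwise every completed request would leave a pending timer behind until it fires.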
- Added: Exponential backoff retry logic for network errors and HTTP 5xx errors
- Configuration:
  - deleter.js: 3 retries with 1s base delay
  - common.js: 2 retries with 500ms base delay
- Retry Conditions: ETIMEDOUT, ECONNRESET, ENOTFOUND, AbortError, HTTP 5xx errors (502, 503, 504, etc.)
- Note: 502 Bad Gateway and other 5xx errors are now retried automatically
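The retry behaviour listed above might look roughly like the sketch below. The function names, the doubling formula (`base * 2^attempt`), and the check of `err.cause.code` are assumptions; the source only states that the backoff is exponential and which conditions trigger a retry.

```javascript
// Error codes the document lists as retryable.
const RETRYABLE_CODES = new Set(["ETIMEDOUT", "ECONNRESET", "ENOTFOUND"]);

// Hypothetical check. Assumption: the network error code may sit on the
// error itself or on err.cause, depending on the HTTP client in use.
function isRetryableError(err) {
  return Boolean(
    err.name === "AbortError" ||
    RETRYABLE_CODES.has(err.code) ||
    (err.cause && RETRYABLE_CODES.has(err.cause.code))
  );
}

// Assumed formula: the delay doubles on each attempt (attempt starts at 0).
function backoffDelay(attempt, baseMs) {
  return baseMs * 2 ** attempt;
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries doRequest on retryable errors and HTTP 5xx responses.
// After maxRetries retries, the last response is returned or the last error rethrown.
async function requestWithRetry(doRequest, { maxRetries = 3, baseDelayMs = 1000, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      const response = await doRequest();
      if (response.status < 500 || attempt >= maxRetries) return response;
    } catch (err) {
      if (!isRetryableError(err) || attempt >= maxRetries) throw err;
    }
    const delay = backoffDelay(attempt, baseDelayMs);
    console.log(`🔄 Retrying API request in ${delay}ms`);
    await sleep(delay);
  }
}
```

Under these assumptions, the deleter.js settings (3 retries, 1s base) would wait 1s, 2s, and 4s between attempts, and the common.js settings (2 retries, 500ms base) would wait 500ms and 1s.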
- Added: Better error categorization and logging
- Added: Specific handling for timeout vs. other network errors
- Added: Validation for missing FORGE_API environment variable
- Added: Request timeout protection (30s) to prevent hanging requests
- Added: Automatic retry for HTTP 5xx errors (502, 503, 504)
- Purpose: Base URL for Forge API endpoints
- Expected Value: https://api.getforge.com (without trailing slash)
- Current Issue: If set with trailing slash, ensure it's https://api.getforge.com/
- Note: Code now handles both with and without trailing slash
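As an illustration of how the FORGE_API value could be validated and normalized at startup, here is a small sketch. The function name `resolveForgeApiBase` and the choice to return `null` when the variable is missing are assumptions; only the warning text comes from the log messages documented here.

```javascript
// Hypothetical helper: reads FORGE_API, warns if it is missing, and returns
// the base URL without trailing slashes (or null when it is not set).
function resolveForgeApiBase(env = process.env) {
  const raw = (env.FORGE_API || "").trim();
  if (!raw) {
    console.warn("⚠️ FORGE_API environment variable not set");
    return null;
  }
  // Accept the value with or without a trailing slash.
  return raw.replace(/\/+$/, "");
}

console.log(resolveForgeApiBase({ FORGE_API: "https://api.getforge.com/" }));
// prints: https://api.getforge.com
```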
Critical: Disk space issues can cause 504 Gateway Timeout errors. If disk usage exceeds 90%, the server may be unable to write cache files or logs.
df -h /  # Check disk usage
du -sh /tmp/cache/  # Check cache directory size
- Run the disk cleanup script: sudo /app/scripts/disk-space-cleanup.sh
- Manually clean old cache: find /tmp/cache -type f -mtime +1 -delete
- Check for large log files: find /var/log -type f -size +100M -ls
See README_DISK_SPACE.md for detailed disk space management documentation.
Success Indicators:
- 🔄 Retrying API request in Xms - Retry mechanism working
- Normal site cleaning messages without errors
Error Indicators:
- 💣 BOOM! Connection timeout to Forge API after retries - Network connectivity issues
- ⚠️ FORGE_API environment variable not set - Configuration issue
- 💣 BOOM! Request timeout to Forge API after retries - Server response issues
- 💣 BOOM! API returned 502 error after retries - Upstream API issues (now retried automatically)
- ⏱️ Timeout loading site meta for X after 30000ms - Request timeout protection triggered
- If ETIMEDOUT errors persist: Check network connectivity from server to api.getforge.com
- If retries are frequent: Consider increasing the timeout values, or check the health of the Forge API
- If "API unavailable" errors: Verify FORGE_API environment variable is set correctly
// In deleter.js
const API_TIMEOUT = 10000 // 10 seconds
const MAX_RETRIES = 3
const RETRY_DELAY_BASE = 1000 // 1 second
// In src/middlewares/common.js
const API_TIMEOUT = 10000 // 10 seconds
const MAX_RETRIES = 2
const RETRY_DELAY_BASE = 500 // 500ms

Modify the constants at the top of each file if 10 seconds proves insufficient.