ca-grow-ops-manager/.agent/workflows/debug-dns-routing.md
fullsizemalt 28d8e9e4a2
docs: Add agent-optimized debugging workflows for DNS/Routing
2025-12-09 08:54:51 -08:00


---
description: Debug and fix Traefik routing issues where the wrong app (e.g., Alertmanager) is served, indicating an upstream DNS/Cloudflare Wildcard conflict.
---
# Debugging DNS & Routing Conflicts (The Wildcard Trap)
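If a subdomain (e.g., `777wolfpack.runfoo.run`) serves the wrong application (such as Alertmanager) and the Traefik logs show NO activity for that domain, you are likely hitting a **Cloudflare Wildcard Fallback**: the subdomain has no DNS record of its own, so a wildcard (`*`) record answers for it and sends the traffic to a different server.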
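To make the fallback concrete, here is a minimal sketch of how a zone answers a query: an exact record wins, otherwise the wildcard at the parent answers. The function name `resolve_record`, the "name ip" input format, and the example IPs (documentation ranges) are all assumptions for illustration, and the matching is deliberately simplified to a single wildcard level.

```bash
#!/bin/bash
# resolve_record.sh: illustrative only, not how Cloudflare is implemented.
# Reads zone data from stdin as "name ip" pairs (hypothetical format).

# resolve_record NAME: print the IP that would answer for NAME.
resolve_record() {
  local name="$1" rec ip exact_ip="" wild_ip=""
  # 777wolfpack.runfoo.run -> *.runfoo.run (one wildcard level only)
  local wildcard="*.${name#*.}"
  while read -r rec ip || [ -n "$rec" ]; do
    if [ "$rec" = "$name" ]; then exact_ip="$ip"; fi
    if [ "$rec" = "$wildcard" ]; then wild_ip="$ip"; fi
  done
  if [ -n "$exact_ip" ]; then
    echo "$exact_ip (exact record)"
  elif [ -n "$wild_ip" ]; then
    echo "$wild_ip (wildcard fallback)"
  else
    echo "NXDOMAIN"
  fi
}
```

Fed a zone that contains only `*.runfoo.run`, a query for `777wolfpack.runfoo.run` gets the wildcard's IP; once an exact record for the subdomain is present, that record answers instead.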
## Diagnosis Steps
1. **Check Traefik Logs**:
`docker logs traefik --tail 50`
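If you see requests for the domain, traffic is reaching this server and the problem is in the local Traefik config.
If you see **ZERO requests**, traffic is not reaching this server. (Requests only appear here if Traefik's access log is enabled; it is off by default.)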
2. **Verify DNS**:
`host 777wolfpack.runfoo.run`
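Compare the returned IP with the server's public IP.
- **Match**: DNS is correct and the routing issue is local.
- **Mismatch**: You are hitting a Wildcard (`*`) record pointing to a different server.
This comparison only works for DNS-only (grey-cloud) records; a Cloudflare-proxied record returns Cloudflare edge IPs regardless of where the origin is.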
3. **Run the Server Matrix**:
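Use this script to audit what is actually listening and routing on the server: which process owns ports 80/443, which containers set `VIRTUAL_HOST`, and which Traefik router rules are defined.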
```bash
#!/bin/bash
# map_server.sh
echo "=== OPERATIONAL MATRIX ==="
echo "[1] NATIVE PORTS (Who owns 80/443?)"
# Match exactly :80 or :443 (a bare ':80' pattern would also match :8080, :8000, ...)
sudo ss -tulpn | grep -E ':(80|443)[[:space:]]'
echo ""
echo "[2] VIRTUAL_HOST (Nginx Proxy Check)"
docker ps -q | xargs docker inspect --format '{{.Name}} {{range $e := .Config.Env}}{{if ge (len $e) 12}}{{if eq (slice $e 0 12) "VIRTUAL_HOST"}} {{$e}} {{end}}{{end}}{{end}}'
echo ""
echo "[3] TRAEFIK ROUTERS"
# The router names below are specific to this deployment; edit them to match your own labels.
docker ps -q | xargs docker inspect --format '{{.Name}} {{range $k, $v := .Config.Labels}}{{if or (eq $k "traefik.http.routers.wolfpack-frontend.rule") (eq $k "traefik.http.routers.aspirant-dashboard.rule")}}{{$k}}={{$v}}{{end}}{{end}}'
```
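Steps 1 and 2 can be reduced to two small checks, sketched below. The helper names `count_domain_hits` and `classify_dns` are hypothetical, and the sketch assumes Traefik's access log is enabled and the DNS record is not proxied through Cloudflare; treat it as a starting point rather than a finished tool.

```bash
#!/bin/bash
# diagnose_route.sh: hypothetical helpers, not part of the repository.

# count_domain_hits DOMAIN: count log lines on stdin that mention DOMAIN.
count_domain_hits() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -cF -- "$1" || true
}

# classify_dns RESOLVED_IP SERVER_IP: report whether DNS points at this server.
classify_dns() {
  if [ "$1" = "$2" ]; then
    echo "match: DNS points at this server, check the local Traefik config"
  else
    echo "mismatch: DNS points elsewhere, suspect a wildcard record"
  fi
}

# Example wiring (commented out: needs Docker, network access, and a
# SERVER_IP variable that you set to this server's public IP):
# domain=777wolfpack.runfoo.run
# docker logs traefik --tail 500 2>&1 | count_domain_hits "$domain"
# classify_dns "$(dig +short "$domain" A | tail -n1)" "$SERVER_IP"
```

A hit count of 0 combined with a mismatch is the signature of the wildcard trap; a non-zero count with a match points back at the local router rules.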
## The Fix
1. Go to **Cloudflare DNS**.
2. Add a specific **A Record** for the missing subdomain. An exact record takes precedence over the wildcard.
3. Point it to the **Correct Server IP**.
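4. Wait about 1 minute for the change to propagate, then re-run the `host` check from the diagnosis steps to confirm the new IP.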
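Rather than waiting a fixed minute, the new record can be polled until it resolves as expected. This is a sketch under assumptions: `wait_for_dns` and the `RESOLVER` override are hypothetical names, the default lookup uses `dig +short`, and it only makes sense for a DNS-only record (a proxied record would return Cloudflare edge IPs).

```bash
#!/bin/bash
# wait_for_dns.sh: hypothetical helper, not part of the repository.

# wait_for_dns NAME EXPECTED_IP [TRIES] [DELAY_SECONDS]
# Polls until NAME resolves to EXPECTED_IP; returns 1 on timeout.
# RESOLVER may be set to another lookup command (e.g. for testing).
wait_for_dns() {
  local name="$1" expected="$2" tries="${3:-12}" delay="${4:-5}" i=1
  while [ "$i" -le "$tries" ]; do
    # -x: the whole output line must equal the expected IP.
    if ${RESOLVER:-dig +short} "$name" | grep -qxF -- "$expected"; then
      echo "ok: $name -> $expected (attempt $i)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timeout: $name still does not resolve to $expected" >&2
  return 1
}

# Example (commented out: needs network access; the IP is a placeholder):
# wait_for_dns 777wolfpack.runfoo.run 203.0.113.10
```

With the defaults this gives up after 12 attempts spaced 5 seconds apart, which roughly matches the one-minute wait in the steps above.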