claude-plugins/infraestrutura/skills/easypanel-troubleshoot/SKILL.md
ealmeida faef9b47dc fix(project-manager): remove Dify KB from descriptions, add TODO note
Dify was removed on 06-03-2026. The brainstorm/discover skills still reference it
in their body. Bump to v1.2 + top-of-file note. Workflow rewrite left for the next session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 04:52:03 +01:00


| name | description |
| --- | --- |
| easypanel-troubleshoot | Automated diagnosis of EasyPanel deploy problems: analysis of container logs, health endpoints, Traefik routing, and resources to identify root causes. |

EasyPanel Troubleshoot

Automated, intelligent diagnosis of deploy problems on EasyPanel.

When to Use

  • Deploy failed with 502/503 errors
  • Container in a crash loop (frequent restarts)
  • Health endpoint does not respond
  • Traefik routing problems
  • Investigating the cause of a failed deploy
  • Post-deploy validation when something looks wrong

When NOT to Use

  • For initial deploys (use /easypanel-init)
  • For version rollbacks (use /easypanel-rollback)
  • For validation after a successful deploy (use /easypanel-validate)
  • When the problem is known and the solution is already identified

Syntax

/easypanel-troubleshoot <service-name> [--verbose]

Examples

# Basic diagnosis
/easypanel-troubleshoot dashboard-api

# Detailed diagnosis (full logs)
/easypanel-troubleshoot dashboard-api --verbose

Automated Workflow

1. Check Service Status via API

# Get the API token
TOKEN=$(cat /etc/easypanel/.api-token)

# Inspect the service (full state)
curl -s "http://localhost:3000/api/trpc/services.app.inspectService?input=$(echo -n '{"json":{"projectName":"PROJECT","serviceName":"SERVICE"}}' | jq -sRr @uri)" \
  -H "Authorization: Bearer $TOKEN" | jq '.result.data.json'

# Fetch logs via the API (namespace logs.*, NOT services.app.*)
curl -s "http://localhost:3000/api/trpc/logs.getServiceLogs?input=$(echo -n '{"json":{"projectName":"PROJECT","serviceName":"SERVICE","tail":100}}' | jq -sRr @uri)" \
  -H "Authorization: Bearer $TOKEN"
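
The two calls above repeat the same URL-encoding step. A minimal sketch of a helper that builds the tRPC URL in plain POSIX shell is shown here; `urlencode` and `trpc_url` are hypothetical names, the base URL is the one used above, and the encoder assumes ASCII input.

```sh
# Percent-encode a string (ASCII input assumed) without needing jq.
urlencode() {
  _rest=$1
  _out=''
  while [ -n "$_rest" ]; do
    _c=${_rest%"${_rest#?}"}   # first character of the remaining input
    _rest=${_rest#?}
    case $_c in
      [a-zA-Z0-9.~_-]) _out="$_out$_c" ;;
      *) _out="$_out$(printf '%%%02X' "'$_c")" ;;
    esac
  done
  printf '%s\n' "$_out"
}

# Build the GET URL for a tRPC procedure and its JSON input.
# $1 = procedure name, $2 = JSON payload
trpc_url() {
  printf 'http://localhost:3000/api/trpc/%s?input=%s\n' "$1" "$(urlencode "$2")"
}

# Usage (requires the server and a valid $TOKEN):
#   curl -s "$(trpc_url logs.getServiceLogs '{"json":{"projectName":"PROJECT","serviceName":"SERVICE","tail":100}}')" \
#     -H "Authorization: Bearer $TOKEN"
```

Keeping the encoding in one function avoids quoting mistakes when the project or service name changes.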

2. Fetch Logs (SSH alternative)

docker logs <container> --tail 100

Parse the logs for common errors:

  • Port already in use (EADDRINUSE)
  • Cannot find module (dependencies)
  • Connection refused (database)
  • Out of memory (OOM)
  • Unhandled promise rejection
  • Listen errors
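
The error list above could be scanned with a single case-insensitive grep. This is a sketch, not the skill's actual parser: `scan_log_errors` is a hypothetical name and the regular expression is one possible reading of the listed signatures.

```sh
# Print log lines (with line numbers) that match a known error signature.
# Reads the log from stdin; exits non-zero when nothing matches.
scan_log_errors() {
  grep -n -i -E 'EADDRINUSE|Cannot find module|MODULE_NOT_FOUND|ECONNREFUSED|Connection refused|out of memory|OOM[- ]?kill|Unhandled ?promise ?rejection|error: listen'
}

# Usage (docker logs writes the container's stderr to stderr, hence 2>&1):
#   docker logs <container> --tail 100 2>&1 | scan_log_errors
```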

3. Check Health Endpoint

curl https://<domain>/health
  • Status code: 200, 404, 502, 503?
  • Response time
  • Response body validation
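
As a sketch of how the status code could be turned into a hint, the function below maps the codes listed above to a short diagnosis. `classify_health_status` is a hypothetical name, and the readings of 404 and 503 are assumptions based on how Traefik typically answers when no router or no backend is available.

```sh
# Map an HTTP status code from the health endpoint to a diagnosis hint.
classify_health_status() {
  case $1 in
    200) echo 'OK: app is up and routed' ;;
    404) echo 'WARNING: no /health route in the app, or no Traefik router matches the domain' ;;
    502) echo 'CRITICAL: Traefik cannot reach the app (crash or port mismatch)' ;;
    503) echo 'CRITICAL: no backend server available for this domain' ;;
    000) echo 'CRITICAL: no HTTP response (DNS, TLS, or timeout)' ;;
    *)   echo "INFO: unexpected status $1" ;;
  esac
}

# Usage (needs network access; <domain> is a placeholder):
#   result=$(curl -s -o /dev/null --max-time 30 -w '%{http_code} %{time_total}' "https://<domain>/health")
#   classify_health_status "${result%% *}"
#   echo "response time: ${result#* }s"
```

curl reports `000` as the status code when it receives no HTTP response at all, which is why that case is handled separately.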

4. Check Traefik Routing

  • Traefik labels correct?
    • router.rule with the domain
    • service.loadbalancer.server.port
    • entrypoints=websecure
    • certresolver=letsencrypt
  • Does the domain's DNS point to the server?
  • SSL certificate valid?
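
Assuming the service is routed through Docker labels as the list above implies, the presence of the four labels could be checked as sketched here. `missing_traefik_labels` is a hypothetical name; the label keys follow the Traefik v2 Docker-provider naming, and only presence is checked, not the values.

```sh
# Read "key=value" label lines from stdin and report required Traefik labels
# that are absent. Exits non-zero when at least one is missing.
missing_traefik_labels() {
  _labels=$(cat)
  _rc=0
  while IFS='|' read -r _name _re; do
    if ! printf '%s\n' "$_labels" | grep -Eq "$_re"; then
      echo "missing: $_name"
      _rc=1
    fi
  done <<'EOF'
router.rule|^traefik\.http\.routers\.[^.]+\.rule=
service.loadbalancer.server.port|^traefik\.http\.services\.[^.]+\.loadbalancer\.server\.port=
entrypoints|^traefik\.http\.routers\.[^.]+\.entrypoints=
certresolver|^traefik\.http\.routers\.[^.]+\.tls\.certresolver=
EOF
  return "$_rc"
}

# Usage:
#   docker inspect --format '{{range $k, $v := .Config.Labels}}{{printf "%s=%s\n" $k $v}}{{end}}' <container> \
#     | missing_traefik_labels
```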

5. Check Port Mismatch

  • Dockerfile EXPOSE vs app actual port
  • docker-compose ports vs Traefik port
  • App listening port (from logs)
  • Common mismatch: 3000 vs 8080
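
A minimal sketch of the comparison: extract the port the app reports in its logs and compare it with the port Traefik forwards to. Both function names are hypothetical, and the log phrase "listening on port N" is an assumption about the app's startup message.

```sh
# Print the last port announced in the logs as "listening on port N" (stdin).
detect_listen_port() {
  sed -n 's/.*[Ll]istening on port \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Compare the app port ($1) with the Traefik port ($2).
check_port_match() {
  if [ "$1" = "$2" ]; then
    echo "OK: app and Traefik both use port $1"
  else
    echo "MISMATCH: app listens on $1, Traefik forwards to $2"
    return 1
  fi
}

# Usage:
#   app_port=$(docker logs <container> --tail 100 2>&1 | detect_listen_port)
#   check_port_match "$app_port" 3000
```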

6. Check Environment Variables

  • Required vars set? (compare .env.example)
  • Sensitive vars (DATABASE_URL, API_KEY)
  • Missing vars causing app crash?
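
The comparison against .env.example could look like the sketch below. `env_keys` and `missing_env_vars` are hypothetical names; the second file is assumed to hold the container's environment as KEY=value lines. Only variable names are printed, never values.

```sh
# Print the sorted variable names found in KEY=value lines of a file.
env_keys() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1" | sort -u
}

# List variables named in the example file ($1) that are absent from the
# actual environment file ($2).
missing_env_vars() {
  _want=$(mktemp)
  _have=$(mktemp)
  env_keys "$1" > "$_want"
  env_keys "$2" > "$_have"
  comm -23 "$_want" "$_have"
  rm -f "$_want" "$_have"
}

# Usage:
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' <container> > /tmp/actual.env
#   missing_env_vars .env.example /tmp/actual.env
```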

7. Check Dependencies

  • Database reachable? (if applicable)
  • Redis reachable? (if applicable)
  • External APIs responding?
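
For the database and Redis checks, the host and port usually have to be pulled out of a connection URL first. The sketch below does that with shell parameter expansion; `url_host_port` is a hypothetical name, and it assumes any special characters in the password are percent-encoded.

```sh
# Print "host port" for a connection URL. $2 is an optional default port
# used when the URL does not specify one. The password is never printed.
url_host_port() {
  _u=${1#*://}     # drop the scheme
  _u=${_u%%/*}     # drop the path
  _u=${_u%%\?*}    # drop a query string attached directly to the host
  _u=${_u##*@}     # drop user:password@
  case $_u in
    *:*) echo "${_u%:*} ${_u##*:}" ;;
    *)   echo "$_u ${2:-}" ;;
  esac
}

# Usage (assumes nc is installed on the server):
#   set -- $(url_host_port "$DATABASE_URL" 5432)
#   nc -z -w 5 "$1" "$2" && echo reachable || echo unreachable
```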

8. Check Resources

  • CPU throttling?
  • Memory limit reached (OOM)?
  • Disk space?
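
One way to turn raw usage figures into a severity is sketched below. `usage_level` is a hypothetical name and the 80% and 95% thresholds are assumptions, not values from this skill.

```sh
# Classify a percentage such as "93.5%" as OK, WARNING, or CRITICAL.
# $2 and $3 optionally override the warning and critical thresholds.
usage_level() {
  _v=$(printf '%s' "$1" | tr -d '%')
  awk -v v="$_v" -v warn="${2:-80}" -v crit="${3:-95}" 'BEGIN {
    if (v + 0 >= crit + 0) print "CRITICAL"
    else if (v + 0 >= warn + 0) print "WARNING"
    else print "OK"
  }'
}

# Usage:
#   usage_level "$(docker stats --no-stream --format '{{.MemPerc}}' <container>)"
#   usage_level "$(df -P / | awk 'NR==2 {print $5}')"
```

The CPU percentage reported by docker stats can exceed 100% on multi-core hosts, so a fixed threshold is most meaningful for memory and disk.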

9. Generate Report

  • Issues found (prioritized: CRITICAL, WARNING, INFO)
  • Root cause (if detected)
  • Recommended fixes (step-by-step)
  • Commands to run
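
The prioritization step can be sketched as a stable reordering of findings by severity. `sort_issues` is a hypothetical name, and the "SEVERITY: message" line format is an assumption about how findings are collected.

```sh
# Reorder "SEVERITY: message" lines from stdin so that CRITICAL comes first,
# then WARNING, then INFO. Order within each severity is preserved.
sort_issues() {
  _in=$(cat)
  for _sev in CRITICAL WARNING INFO; do
    printf '%s\n' "$_in" | grep "^$_sev:" || true
  done
}

# Usage:
#   printf '%s\n' 'INFO: 2 restarts today' 'CRITICAL: port mismatch' | sort_issues
```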

Error Pattern Detection

Port Mismatch

Pattern: "listening on port X" + Traefik port Y
Fix: Update Dockerfile EXPOSE or Traefik label

Missing Environment Variable

Pattern: "is not defined" / "undefined"
Fix: Add variable to EasyPanel environment

Crash Loop

Pattern: Restart count > 3 in 10min
Fix: Check logs for root cause

Database Connection

Pattern: "ECONNREFUSED" / "Connection refused"
Fix: Verify DATABASE_URL and network connectivity

Memory Issues

Pattern: "out of memory" / "OOM killed"
Fix: Increase memory limit in docker-compose

Output Format

🔍 EasyPanel Troubleshooting: <service-name>

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 SERVICE STATUS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Container: <name>
Status: Running | Stopped | Restarting
Uptime: <time>
Restarts: <count> (last 10min)
Memory: X / Y MB
CPU: X%

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔴 ISSUES FOUND (X)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. 🔴 CRITICAL: <Issue Title>
   Detected: <details>
   Impact: <what breaks>

   Fix:
   <step-by-step solution>

2. ⚠️  WARNING: <Issue Title>
   ...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📋 LOGS (Last 20 lines)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[timestamp] INFO: ...
[timestamp] ❌ ERROR: ...

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🛠️  RECOMMENDED ACTIONS (Prioritized)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

1. <Action 1>
2. <Action 2>
3. Redeploy: git commit + push

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔗 HELPFUL RESOURCES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

- Research: /media/ealmeida/Dados/Dev/Docs/EasyPanel-Deploy-Research/
- Checklist: CHECKLIST_EasyPanel_Deploy.md
- Templates: TEMPLATE_Dockerfile_NodeJS_MultiStage

API Endpoints Used

See the /easypanel-api skill for full documentation.

| Action | Endpoint | Verified |
| --- | --- | --- |
| Service status | GET services.app.inspectService | Yes |
| Service logs | GET logs.getServiceLogs | Yes |
| List projects | GET projects.listProjects | Yes |
| System stats | GET monitor.getSystemStats | Yes |
| Docker stats | GET monitor.getDockerTaskStats | Yes |

Endpoints that do NOT exist: services.app.getServiceLogs (use logs.getServiceLogs), monitor.getStats (use monitor.getSystemStats).

Required MCPs

  • ssh-unified - Server access for the API and as a fallback

Required Tools

# Via SSH
docker ps
docker logs <container> --tail 100
docker inspect <container>
curl https://<domain>/health

Execution Checklist

  • Connect via SSH to the EasyPanel server
  • Get the container status (docker ps)
  • Fetch logs (last 100 lines)
  • Parse errors automatically (known patterns)
  • Check the health endpoint (curl)
  • Verify Traefik labels (docker inspect)
  • Detect port mismatch
  • Verify env vars (if possible)
  • Generate a structured report
  • Recommend prioritized fixes

Error Messages Database

Common Patterns

| Pattern | Root Cause | Fix |
| --- | --- | --- |
| EADDRINUSE :::PORT | Port already in use | Kill the process or use a different port |
| Cannot find module 'X' | Dependency missing | Run npm install in the build |
| ECONNREFUSED | Database/service down | Check the network and DATABASE_URL |
| 502 Bad Gateway | App crash or port mismatch | Check logs + ports |
| MODULE_NOT_FOUND | Incorrect build | Check the Dockerfile COPY steps |
| OOM killed | Insufficient memory | Increase the RAM limit |
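
The table above can be expressed as a small lookup, sketched here under the assumption that each log line is matched on its own. `lookup_fix` is a hypothetical name and the first matching pattern wins.

```sh
# Print the root cause and suggested fix for a single log line ($1).
# Returns non-zero when the line matches no known pattern.
lookup_fix() {
  case $1 in
    *EADDRINUSE*)           echo 'Port already in use: kill the process or use a different port' ;;
    *"Cannot find module"*) echo 'Dependency missing: run npm install in the build' ;;
    *MODULE_NOT_FOUND*)     echo 'Incorrect build: check the Dockerfile COPY steps' ;;
    *ECONNREFUSED*)         echo 'Database/service down: check the network and DATABASE_URL' ;;
    *"502 Bad Gateway"*)    echo 'App crash or port mismatch: check logs and ports' ;;
    *"OOM killed"*)         echo 'Insufficient memory: increase the RAM limit' ;;
    *) return 1 ;;
  esac
}

# Usage:
#   docker logs <container> --tail 100 2>&1 | while IFS= read -r line; do
#     lookup_fix "$line"
#   done | sort -u
```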

Security

  • NEVER expose credentials in logs
  • Sanitize env vars in output
  • Limit SSH access to what is strictly necessary

Performance

  • Limit logs to 100 lines (avoid overflow)
  • Cache known patterns
  • 30s timeout on curl health checks

Version: 1.0.0 | Author: Descomplicar® | Date: 2026-02-04

Metadata (Desk CRM Task #65)

Task: SKL: /easypanel-troubleshoot - Automated Diagnostics
Milestone: 294 (Skills Claude Code)
Tags: skill(79), stackworkflow(75), claude-code(81), activo(116)
Assignees: Emanuel(1), AikTop(25)
Status: 4 (In progress) → 5 (Completed)

Healing Log

Record of known errors and how to avoid them. Read automatically before execution.

{"date":"","issue":"","fix":"","source":"user|auto"}

Add a new line after each error that gets fixed.
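
Appending an entry in the format above could be sketched as follows. `json_escape` and `append_healing_entry` are hypothetical names, the target file is passed explicitly, and the YYYY-MM-DD date format is an assumption. The escaping covers backslashes and double quotes only, so multi-line text is not supported.

```sh
# Escape backslashes and double quotes for use inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Append one healing-log line.
# $1 = file, $2 = issue, $3 = fix, $4 = source (user or auto)
append_healing_entry() {
  printf '{"date":"%s","issue":"%s","fix":"%s","source":"%s"}\n' \
    "$(date +%Y-%m-%d)" "$(json_escape "$2")" "$(json_escape "$3")" "$4" >> "$1"
}

# Usage:
#   append_healing_entry SKILL.md 'Traefik port mismatch' 'Set the service port to 8080' auto
```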