Emanuel Almeida 6b3a6f2698 feat: refactor 30+ skills to Anthropic progressive disclosure pattern
- All SKILL.md files now <500 lines (avg reduction 69%)
- Detailed content extracted to references/ subdirectories
- Frontmatter standardised: only name + description (Anthropic standard)
- New skills: brand-guidelines, spec-coauthor, report-templates, skill-creator
- Design skills: anti-slop guidelines, premium-proposals reference
- Removed non-standard frontmatter fields (triggers, version, author, category)

Plugins affected: infraestrutura, marketing, dev-tools, crm-ops, gestao,
core-tools, negocio, perfex-dev, wordpress, design-media

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:05:03 +00:00


| name | description | context |
|------|-------------|---------|
| gateway-check | Quick health check of the MCPs on gateway.descomplicar.pt: service state (systemd+pm2), ports, memory/CPU, recent errors. Outputs a summary table. | fork |

/gateway-check v1.0

Quick health check of the MCPs on the gateway server (mcp-hub.descomplicar.pt).

Reference: PROC-MCP-Desenvolvimento.md | Memory: mcp-gateway.md, infra.md


Gateway MCP Inventory

pm2 (Node.js, /opt/mcp-gateway/)

| pm2 ID | Name | Port | Priority |
|--------|------|------|----------|
| 0 | mcp-desk-crm | 3150 | P1 |
| 1 | mcp-memory | 3151 | P2 |
| 2 | mcp-wikijs | 3152 | P3 |
| 4 | mcp-moloni | 3158 | P2 |

systemd (24 services)

| Service | Port | Type | Priority |
|---------|------|------|----------|
| mcp-time | 3163 | Node | P1 |
| google-workspace-mcp | 3164 | Python/FastMCP | P1 |
| n8n-mcp | 3157 | Node | P2 |
| gitea-mcp | 3162 | Go | P2 |
| gsc-mcp | 3153 | Python/FastMCP | P2 |
| google-analytics-mcp | 3156 | Python/FastMCP | P2 |
| imap-enterprise | 3155 | Node | P2 |
| context7-mcp | 3159 | Node | P3 |
| cwp-mcp | 3183 | Node/supergateway | P3 |
| cloudflare-dns-mcp | 3171 | Node/supergateway | P3 |
| mcp-youtube | 3187 | Python/FastMCP | P3 |
| youtube-research | 3184 | Node | P3 |
| magic-mcp | 3172 | Node/supergateway | P3 |
| mcp-echarts-mcp | 3173 | Node/supergateway | P3 |
| mcp-mermaid-mcp | 3174 | Node/supergateway | P3 |
| metabase-mcp | 3175 | Node/supergateway | P3 |
| pixabay-mcp | 3176 | Node/supergateway | P3 |
| replicate-mcp | 3177 | Node/supergateway | P3 |
| outline-api-mcp | 3178 | Node/supergateway | P3 |
| pexels-mcp | 3179 | Node/supergateway | P3 |
| penpot-mcp | 3180 | Node/supergateway | P3 |
| vimeo-mcp | 3181 | Node/supergateway | P3 |
| presenton-mcp | 3182 | Node/supergateway | P3 |
| mcp-reonic | 3160 | Node | P3 |

Priorities: P1 = critical (blocks work) | P2 = important (degrades workflow) | P3 = useful


Execution Protocol

1. Service state

# Run via mcp__ssh-unified__ssh_execute(server="gateway")

# pm2: status, CPU, memory, restarts and uptime per process
pm2 jlist 2>/dev/null | python3 -c "
import json, sys, time
now_ms = time.time() * 1000  # pm_uptime is the process start time in epoch milliseconds
for p in json.load(sys.stdin):
    env, mon = p['pm2_env'], p['monit']
    print(f\"{p['name']:20s} {env['status']:10s} cpu={mon['cpu']}% mem={mon['memory']//1024//1024}MB restarts={env['restart_time']} uptime={round((now_ms-env['pm_uptime'])/3600000,1)}h\")
"

# systemd: running and failed units
# grep matches the unit name or its description, so imap-enterprise and
# youtube-research only show up if their description mentions MCP
systemctl list-units --type=service --state=running,failed --no-pager | grep -i mcp
# systemd: failed units only
systemctl list-units --type=service --state=failed --no-pager | grep -i mcp

2. Check active ports

# Confirm that every expected port is listening
for port in 3150 3151 3152 3153 3155 3156 3157 3158 3159 3160 3162 3163 3164 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3187; do
  if ss -tln | grep -q ":${port} "; then
    echo "OK   :${port}"
  else
    echo "DOWN :${port}"
  fi
done
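The loop above reports bare port numbers. A minimal sketch of how each line could also carry the service name and priority from the inventory follows. The names MCP_INVENTORY and report_ports are hypothetical, and only four of the 28 entries are shown.

```shell
# Inventory subset as "port name priority" lines, copied from the tables above.
# MCP_INVENTORY and report_ports are illustrative names, not part of the skill.
MCP_INVENTORY='3150 mcp-desk-crm P1
3151 mcp-memory P2
3163 mcp-time P1
3164 google-workspace-mcp P1'

# report_ports LISTENING_PORT...
# Prints one line per inventory entry: state, port, service name, priority.
report_ports() {
  listening=" $* "
  printf '%s\n' "$MCP_INVENTORY" | while read -r port name prio; do
    case "$listening" in
      *" $port "*) echo "OK   :$port $name ($prio)" ;;
      *)           echo "DOWN :$port $name ($prio)" ;;
    esac
  done
}

# On the gateway, the listening ports could be collected once with ss:
#   report_ports $(ss -tlnH | awk '{n = split($4, a, ":"); print a[n]}')
```

Collecting the ports once and matching against the inventory avoids calling ss for every port, and a DOWN line then shows straight away whether a P1 service is affected.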

3. Memory and CPU per MCP

# Top memory consumers
ps aux --sort=-%mem | head -20 | grep -E 'node|python|supergateway|mcp'

# Total memory used by MCPs (RSS, excluding the grep process itself)
ps aux | grep -E 'mcp|supergateway' | grep -v grep | awk '{sum+=$6} END {printf "Total MCP RAM: %.0f MB\n", sum/1024}'

# Server load
uptime
free -h
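The status criteria further down flag any MCP using more than 250MB. As a rough sketch, the per-process check could look like this. The function name heavy_processes is an assumption, and it expects "RSS_KB COMMAND" lines on stdin.

```shell
# heavy_processes [LIMIT_MB]
# Reads "RSS_KB COMMAND ..." lines on stdin and prints every process whose
# resident memory exceeds LIMIT_MB (default 250, the WARN threshold).
# heavy_processes is an illustrative name, not part of the skill.
heavy_processes() {
  awk -v limit="${1:-250}" \
    '$1 / 1024 > limit { printf "WARN %d MB %s\n", $1 / 1024, $2 }'
}

# On the gateway this could be fed from ps:
#   ps -eo rss=,args= | grep -E 'mcp|supergateway' | grep -v grep | heavy_processes
```

Reading from stdin keeps the threshold logic separate from ps, so it can be tried against saved output.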

4. Recent errors (last 30 min)

# pm2 logs with errors
pm2 logs --err --lines 5 --nostream 2>/dev/null

# systemd services with recent errors
for svc in $(systemctl list-units --type=service --state=running --no-pager --no-legend --plain | grep -i mcp | awk '{print $1}'); do
  errs=$(journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | wc -l)
  if [ "$errs" -gt 0 ]; then
    echo "=== $svc ($errs errors) ==="
    journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | tail -3
  fi
done

5. Gateway nginx health

# Check that nginx is active
systemctl is-active nginx

# Test the health endpoint (if it exists)
# curl exits 0 on a 404, so the fallback message only prints when the connection itself fails
curl -s -o /dev/null -w "%{http_code}" http://localhost/health 2>/dev/null || echo "no-health-endpoint"
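Because curl prints a status code even when the endpoint is missing, the code still has to be interpreted. One possible mapping is sketched below. The function name health_label and the labels are assumptions, not part of the skill.

```shell
# health_label HTTP_CODE
# Maps the code printed by curl -w '%{http_code}' to a short label.
# curl prints 000 when it received no HTTP response at all.
# health_label and its labels are illustrative, not part of the skill.
health_label() {
  case "$1" in
    2??) echo "healthy" ;;
    404) echo "no-health-endpoint" ;;
    000) echo "unreachable" ;;
    *)   echo "unhealthy ($1)" ;;
  esac
}

# On the gateway:
#   health_label "$(curl -s -o /dev/null -w '%{http_code}' http://localhost/health)"
```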

Practical Execution

Run the 5 steps via mcp__ssh-unified__ssh_execute(server="gateway"). Group commands to minimise SSH calls (2-3 calls at most). The sections are joined with ; rather than &&: grep exits non-zero when nothing matches (the normal case for FAILED), which would otherwise skip every section after it.

Call 1, overall state:

echo "=== PM2 ==="; pm2 list 2>/dev/null; echo "=== SYSTEMD ==="; systemctl list-units --type=service --state=running --no-pager | grep -i mcp; echo "=== FAILED ==="; systemctl list-units --type=service --state=failed --no-pager | grep -i mcp; echo "=== LOAD ==="; uptime; free -h

Call 2, ports + memory + errors:

echo "=== PORTS ==="; for port in 3150 3151 3152 3153 3155 3156 3157 3158 3159 3160 3162 3163 3164 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3187; do if ss -tln | grep -q ":${port} "; then echo "OK   :${port}"; else echo "DOWN :${port}"; fi; done; echo "=== MCP RAM ==="; ps aux | grep -E 'mcp|supergateway' | grep -v grep | awk '{sum+=$6} END {printf "Total: %.0f MB\n", sum/1024}'; echo "=== PM2 ERRORS ==="; pm2 logs --err --lines 3 --nostream 2>/dev/null; echo "=== SYSTEMD ERRORS (30min) ==="; for svc in $(systemctl list-units --type=service --state=running --no-pager --no-legend --plain | grep -i mcp | awk '{print $1}'); do errs=$(journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | wc -l); if [ "$errs" -gt 0 ]; then echo "--- $svc ($errs errors) ---"; journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | tail -2; fi; done

Output

Present the result as a summary table:

## Gateway Health Check - [date via mcp-time]

**Server:** mcp-hub.descomplicar.pt | **Load:** X.XX | **RAM:** X.XG/XG | **MCPs RAM:** XXXMB

### MCP Status (X/28 operational)

| # | MCP | Port | Manager | Status | RAM | Notes |
|---|-----|------|---------|--------|-----|-------|
| 1 | mcp-desk-crm | 3150 | pm2 | OK/DOWN/WARN | XX MB | restarts, errors |
| ... | ... | ... | ... | ... | ... | ... |

### Alerts
- [P1] MCP X is DOWN: suggested action
- [WARN] MCP Y has had N restarts in the last X h
- [WARN] Total MCP RAM > 2GB (recommended limit)

### Recent Errors
[list of errors, if any, grouped by MCP]

Status criteria:

  • OK: service running, port listening, and no recent errors
  • WARN: running, but with recent errors OR more than 5 restarts OR memory above 250MB
  • DOWN: service stopped OR port not listening
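The three criteria can be written as one small decision function. This is a sketch only: the name classify_status and its argument order are assumptions, and the caller is expected to have gathered the five inputs already.

```shell
# classify_status ACTIVE LISTENING RECENT_ERRORS RESTARTS MEM_MB
#   ACTIVE, LISTENING: "yes" or "no"
#   RECENT_ERRORS, RESTARTS, MEM_MB: non-negative integers
# classify_status is an illustrative name, not part of the skill.
classify_status() {
  active=$1 listening=$2 errors=$3 restarts=$4 mem_mb=$5
  if [ "$active" != "yes" ] || [ "$listening" != "yes" ]; then
    echo "DOWN"   # service stopped OR port not listening
  elif [ "$errors" -gt 0 ] || [ "$restarts" -gt 5 ] || [ "$mem_mb" -gt 250 ]; then
    echo "WARN"   # running, but one of the warning conditions holds
  else
    echo "OK"
  fi
}

# Example: a running, listening service with 7 restarts
#   classify_status yes yes 0 7 120
```

DOWN is tested first so that a stopped service is never reported as a mere WARN because of its error count.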

Automatic Troubleshooting

If an MCP is DOWN:
  1. Check the service: systemctl status <name>
  2. View logs: journalctl -u <name> --since "1h ago" --no-pager | tail -20
  3. If supergateway: check the catch-errors.mjs preload (mcp-gateway.md)
  4. Try a restart: systemctl restart <name>
  5. Re-check the port

If total RAM > 2GB:
  1. Identify the top consumers
  2. Check for orphan processes: ps aux | grep -c '[s]upergateway'
  3. If more than 28 processes: clean up with pkill and do a staggered restart (infra.md)

If pm2 shows many restarts:
  1. pm2 logs <name> --err --lines 20
  2. Check whether it is the known supergateway bug (mcp-gateway.md)
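The DOWN sequence for a systemd service could be scripted roughly as below. This is a sketch, not the skill's own tooling: recover_service, run and DRY_RUN are hypothetical names, the 3-second wait is an arbitrary choice, and the supergateway preload check stays a manual step.

```shell
# run CMD...: print the command when DRY_RUN is 1 (the default), else execute it.
# run, recover_service and DRY_RUN are illustrative names, not part of the skill.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# recover_service NAME PORT
# Inspect first, restart second, then confirm the port is listening again.
recover_service() {
  name=$1 port=$2
  run systemctl status "$name" --no-pager
  run journalctl -u "$name" --since "1 hour ago" --no-pager -n 20
  run systemctl restart "$name"
  run sleep 3
  run ss -tln "sport = :$port"
}

# Preview the commands:   recover_service mcp-time 3163
# Actually run them:      DRY_RUN=0 recover_service mcp-time 3163
```

Dry-run is the default so that a restart only happens after someone has read the status and log output.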

Anti-Patterns

  • Never do a mass restart without checking first (it can cause downtime)
  • Never ignore a P1 MCP in DOWN state
  • Always report the state, even when everything is OK (it confirms that the check ran)
  • Always include a timestamp from mcp-time in the output

Integration

  • /today can invoke /gateway-check as part of the daily checkup
  • /infra-check runs a broader check (including expenses); /gateway-check focuses only on the gateway MCPs
  • The result can be published in discussion #31 (Logs) of project #65

Skill v1.0.0 | 12-03-2026 | Descomplicar