claude-plugins/infraestrutura/skills/gateway-check/SKILL.md

name: gateway-check
description: Full management of the MCPs on gateway.descomplicar.pt: health check, restart, troubleshooting, port map, adding MCPs. Use when an MCP fails, for a health check, or for gateway management.
context: fork

/gateway-check v2.0

Management and health check of the MCPs on the gateway server (VM 103, gateway.descomplicar.pt).

Reference: Memory mcp-gateway.md | PROC-MCP-Desenvolvimento.md


Access

  • VM: 103 on Proxmox (QEMU)
  • IP: 5.9.90.69
  • SSH: mcp__ssh-unified__ssh_execute(server="gateway")
  • HTTPS: https://gateway.descomplicar.pt/v1/<name>/mcp
  • Nginx whitelist: 188.251.199.30 (fixed NOS IP). On a 403, check the current IP with curl -4 ifconfig.me
  • Nginx config: /etc/nginx/sites-enabled/ on the gateway
  • Do NOT confuse with: server (VM 100, 5.9.90.105), easy (VM 101, 5.9.90.70), dev (LXC 102)
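The whitelist check mentioned above can be sketched as a small shell helper. This is only an illustration: `ip_is_whitelisted` is a hypothetical name, and the whitelisted address is the one quoted in the list above.

```shell
# Whitelisted address taken from the Access notes above.
WHITELISTED_IP="188.251.199.30"

# ip_is_whitelisted (hypothetical helper): compares the IPv4 address in $1
# with the whitelisted one and reports the result.
ip_is_whitelisted() {
  local current_ip="$1"
  if [ "$current_ip" = "$WHITELISTED_IP" ]; then
    echo "OK: ${current_ip} is whitelisted"
  else
    echo "MISMATCH: ${current_ip} is not ${WHITELISTED_IP}, expect 403"
    return 1
  fi
}

# Typical use (performs a network call):
#   ip_is_whitelisted "$(curl -4 -s ifconfig.me)"
```

Keeping the comparison separate from the `curl` call lets the helper be checked without network access.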

MCP map (30 services, updated 28-03-2026)

pm2 (Node.js, /opt/mcp-gateway/)

| pm2 ID | Name | Port | nginx path | Priority |
|---|---|---|---|---|
| 0 | mcp-desk-crm | 3150 | /v1/desk-crm/mcp | P1 |
| 1 | mcp-memory | 3151 | /v1/memory/mcp | P1 |
| 2 | mcp-wikijs | 3152 | /v1/wikijs/mcp | P3 |
| 5 | mcp-moloni | 3158 | /v1/moloni/mcp | P2 |
| 6 | mcp-youtube-research | 3157 | /v1/youtube-research/mcp | P3 |
| 7 | mcp-youtube | 3187 | /v1/youtube/mcp | P3 |

systemd (24 services)

| Service | Port | nginx path | Type | Priority |
|---|---|---|---|---|
| google-workspace-mcp | 3164 | /v1/google-workspace/mcp | Python/FastMCP | P1 |
| mcp-time | 3163 | /v1/mcp-time/mcp | Node | P1 |
| imap-enterprise | 3160 | /v1/imap/mcp | Node | P2 |
| gitea-mcp | 3162 | /v1/gitea/mcp | Go | P2 |
| n8n-mcp | 3161 | /v1/n8n/mcp | Node | P2 |
| gsc-mcp | 3153 | /v1/gsc/mcp | Python/FastMCP | P2 |
| google-analytics-mcp | 3156 | /v1/google-analytics/mcp | Python/FastMCP | P2 |
| context7-mcp | 3159 | /v1/context7/mcp | Node | P2 |
| mcp-reonic | 3170 | /v1/reonic/mcp | Node | P3 |
| cloudflare-dns-mcp | 3171 | /v1/cloudflare-dns/mcp | supergateway | P3 |
| magic-mcp | 3172 | /v1/magic/mcp | supergateway | P3 |
| mcp-echarts-mcp | 3173 | /v1/mcp-echarts/mcp | supergateway | P3 |
| mcp-mermaid-mcp | 3174 | /v1/mcp-mermaid/mcp | supergateway | P3 |
| metabase-mcp | 3175 | /v1/metabase/mcp | supergateway | P3 |
| pixabay-mcp | 3176 | /v1/pixabay/mcp | supergateway | P3 |
| replicate-mcp | 3177 | /v1/replicate/mcp | supergateway | P3 |
| outline-api-mcp | 3178 | /v1/outline-api/mcp | supergateway | P3 |
| pexels-mcp | 3179 | /v1/pexels/mcp | supergateway | P3 |
| penpot-mcp | 3180 | /v1/penpot/mcp | supergateway | P3 |
| vimeo-mcp | 3181 | /v1/vimeo/mcp | supergateway | P3 |
| presenton-mcp | 3182 | /v1/presenton/mcp | supergateway | P3 |
| cwp-mcp | 3183 | /v1/cwp/mcp | supergateway | P3 |
| design-engine-mcp | 3184 | /v1/design-engine/mcp | supergateway | P3 |

Priorities: P1=critical (blocks work) | P2=important (degrades workflow) | P3=useful

Next free port: 3188
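As a cross-check of the "next free port" value, here is a minimal sketch that assumes "next free" means one above the highest port in use. `next_free_port` is a hypothetical helper, and the arguments are the 29 ports listed in the two tables above. It does not look for unused gaps lower in the range.

```shell
# next_free_port (hypothetical helper): prints the highest port among its
# arguments plus one. Gaps lower in the range are not considered.
next_free_port() {
  local max port
  max=0
  for port in "$@"; do
    if [ "$port" -gt "$max" ]; then
      max="$port"
    fi
  done
  echo $((max + 1))
}

# Ports copied from the pm2 and systemd tables above.
next_free_port 3150 3151 3152 3153 3156 3157 3158 3159 3160 3161 3162 3163 \
  3164 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 \
  3184 3187
# prints 3188
```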


Health Check Protocol

Run via mcp__ssh-unified__ssh_execute(server="gateway") in 2 calls:

Call 1: overall state

# Commands are separated with ";" rather than "&&": grep exits 1 when nothing
# matches (e.g. no failed units), which would otherwise skip the LOAD section.
echo "=== PM2 ==="; pm2 list 2>/dev/null
echo "=== SYSTEMD ==="; systemctl list-units --type=service --state=running --no-pager | grep -i mcp
echo "=== FAILED ==="; systemctl list-units --type=service --state=failed --no-pager | grep -i mcp
echo "=== LOAD ==="; uptime; free -h

Call 2: ports and errors

# Check that every mapped port is listening
echo "=== PORTS ==="
for port in 3150 3151 3152 3153 3156 3157 3158 3159 3160 3161 3162 3163 3164 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3187; do
  if ss -tln | grep -q ":${port} "; then echo "OK   :${port}"; else echo "DOWN :${port}"; fi
done
# Sum the resident memory (RSS, column 6 of ps aux, in KB) of MCP processes
echo "=== MCP RAM ==="
ps aux | grep -E 'mcp|supergateway' | grep -v grep | awk '{sum+=$6} END {printf "Total: %.0f MB\n", sum/1024}'
# Show the last error-level journal entries of each running MCP unit
echo "=== ERRORS (30min) ==="
for svc in $(systemctl list-units --type=service --state=running --no-pager | grep -i mcp | awk '{print $1}'); do
  errs=$(journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | wc -l)
  if [ "$errs" -gt 0 ]; then
    echo "--- $svc ($errs) ---"
    journalctl -u "$svc" --since "30 min ago" -p err --no-pager -q 2>/dev/null | tail -2
  fi
done
echo "=== PM2 ERRORS ==="
pm2 logs --err --lines 3 --nostream 2>/dev/null

Expected output

Present as a summary table, with the date obtained via mcp-time:

## Gateway Health - [date]
Server: gateway.descomplicar.pt | Load: X.XX | RAM: X.XG/XG | MCPs RAM: XXXMB
X/30 operational | Alerts: N

Criteria: OK=running and port listening | WARN=running with errors or >5 restarts | DOWN=stopped or port closed
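The criteria above can be written as a small decision function. This is a sketch only: `classify_mcp` and its argument order are hypothetical, and it assumes the caller has already collected the four inputs from the health check output.

```shell
# classify_mcp (hypothetical helper) applies the OK/WARN/DOWN criteria.
# Arguments: running (yes/no), port listening (yes/no),
#            recent error count, restart count.
classify_mcp() {
  local running="$1" listening="$2" errors="$3" restarts="$4"
  if [ "$running" != "yes" ] || [ "$listening" != "yes" ]; then
    echo "DOWN"   # stopped, or port closed
  elif [ "$errors" -gt 0 ] || [ "$restarts" -gt 5 ]; then
    echo "WARN"   # running, but with errors or more than 5 restarts
  else
    echo "OK"     # running and port listening
  fi
}

# Example: classify_mcp yes yes 0 6   prints WARN
```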


Operations

Restart an MCP

# pm2
pm2 restart <name>

# systemd
systemctl restart <name>.service

View the logs of an MCP

# pm2
pm2 logs <name> --lines 30 --nostream

# systemd
journalctl -u <name>.service --since "1h ago" --no-pager | tail -30

Test a specific endpoint

# Internally, on the gateway
curl -s http://127.0.0.1:<port>/mcp -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0.1"}}}'

# Externally
curl -s https://gateway.descomplicar.pt/v1/<name>/mcp -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0.1"}}}'
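Since both curl commands send the same initialize request, they can be wrapped in one function. A minimal sketch: `mcp_probe` and `MCP_INIT_PAYLOAD` are hypothetical names, the payload is copied from the commands above, and the 10-second timeout is an assumption added so a streaming response cannot hang the check.

```shell
# Payload copied verbatim from the curl commands above.
MCP_INIT_PAYLOAD='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"0.1"}}}'

# mcp_probe (hypothetical wrapper): POSTs the initialize request to the URL
# in $1 and prints only the HTTP status code.
mcp_probe() {
  curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 "$1" -X POST \
    -H "Content-Type: application/json" \
    -H "Accept: application/json, text/event-stream" \
    -d "$MCP_INIT_PAYLOAD"
}

# Typical use (performs network calls):
#   mcp_probe http://127.0.0.1:3151/mcp
#   mcp_probe https://gateway.descomplicar.pt/v1/memory/mcp
```

Printing only the status code makes the internal and external results easy to compare. A 403 on the external URL alone points at the nginx whitelist rather than at the MCP itself.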

Troubleshooting

MCP DOWN

  1. Check the service: systemctl status <name> or pm2 show <name>
  2. View the logs: journalctl -u <name> --since "1h ago" --no-pager | tail -30
  3. If it is a supergateway service: check the catch-errors.mjs preload (see below)
  4. Try a restart: systemctl restart <name>
  5. Re-check the port: ss -tln | grep :<port>
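For a systemd-managed MCP, steps 1, 2, 4 and 5 above could be chained roughly as follows. This is a sketch under assumptions: `run` and `triage_systemd_mcp` are hypothetical helpers, `DRY_RUN=1` is an invented switch that prints the commands instead of executing them, and step 3 (the supergateway preload) is left as a manual check.

```shell
# run (hypothetical wrapper): with DRY_RUN=1 it prints the command instead
# of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# triage_systemd_mcp (hypothetical helper): $1 is the unit name, $2 its port.
triage_systemd_mcp() {
  local name="$1" port="$2"
  # Step 1: service state
  run systemctl status "$name" --no-pager
  # Step 2: last 30 log lines from the past hour
  run journalctl -u "$name" --since "1h ago" --no-pager -n 30
  # Step 4: restart
  run systemctl restart "$name"
  # Step 5: is the port listening again?
  run ss -tln "sport = :${port}"
}

# Preview without touching anything:
#   DRY_RUN=1 triage_systemd_mcp gsc-mcp 3153
```

Status and logs are printed before the restart so the state is recorded first, in line with the rule against restarting without checking.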

Supergateway crash (known bug)

  • Error: No connection established for request ID: 0
  • Fix: preload script at /opt/mcp-gateway/supergateway-catch-errors.mjs
  • Activation: Environment="NODE_OPTIONS=--import /opt/mcp-gateway/supergateway-catch-errors.mjs" in the unit file
  • 15 services patched: all the supergateway services in the table above
  • When adding a new supergateway service: adding this line to the unit file is MANDATORY
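To confirm that a unit file carries the preload line, a check along these lines may help. It is a sketch: `unit_has_preload` is a hypothetical helper, and the unit file location in the usage comment is an assumption, since this skill does not state where the unit files live.

```shell
# Preload path taken from the notes above.
PRELOAD="/opt/mcp-gateway/supergateway-catch-errors.mjs"

# unit_has_preload (hypothetical helper): succeeds if the unit file at $1
# contains the NODE_OPTIONS preload setting.
unit_has_preload() {
  grep -qF "NODE_OPTIONS=--import ${PRELOAD}" "$1"
}

# Typical use (the unit file path is an assumption):
#   unit_has_preload /etc/systemd/system/pexels-mcp.service || echo "preload missing"
```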

FastMCP Python + nginx (DNS rebinding)

FastMCP 1.26+ blocks external Host headers. In nginx:

proxy_set_header Host "127.0.0.1:<PORT>";  # CORRECT
# proxy_set_header Host $host;             # WRONG: FastMCP blocks it

Detection: if the MCP returns Invalid Host header over HTTPS but works with curl against localhost, this is the problem.
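The detection rule can be expressed as a comparison of the two response bodies. A rough sketch: `is_host_header_block` is a hypothetical helper, and "works with curl against localhost" is approximated as "the local body does not contain the error string".

```shell
# is_host_header_block (hypothetical helper): $1 is the response body from
# the local curl, $2 the body from the HTTPS curl. Succeeds only when the
# HTTPS body carries the error and the local body does not.
is_host_header_block() {
  case "$2" in
    *"Invalid Host header"*) ;;
    *) return 1 ;;
  esac
  case "$1" in
    *"Invalid Host header"*) return 1 ;;
  esac
  return 0
}

# Typical use, with bodies captured from the curl commands in this skill:
#   is_host_header_block "$local_body" "$https_body" && echo "fix the nginx Host header"
```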

403 Forbidden

The IP is not in the nginx whitelist. Check with: curl -4 ifconfig.me. Authorized IP: 188.251.199.30. Update it in /etc/nginx/sites-enabled/ if it has changed.

Total RAM > 2GB

  1. Identify the top consumers: ps aux --sort=-%mem | head -20 | grep -E 'node|python|supergateway'
  2. Count supergateway processes to spot orphans: ps aux | grep -c '[s]upergateway' (the bracket keeps grep from counting itself)
  3. If the count is > 24: pkill -f supergateway, then do a staggered restart
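A staggered restart, as mentioned in the last step, might look like the sketch below. `staggered_restart` is a hypothetical helper, the delay is chosen by the caller, and `DRY_RUN=1` is an invented switch that only prints the commands.

```shell
# staggered_restart (hypothetical helper): restarts each named unit in turn,
# pausing $1 seconds after each one. With DRY_RUN=1 it only prints.
staggered_restart() {
  local delay="$1" svc
  shift
  for svc in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "+ systemctl restart ${svc}.service"
    else
      systemctl restart "${svc}.service"
    fi
    sleep "$delay"
  done
}

# Preview for two of the supergateway services in the table:
#   DRY_RUN=1 staggered_restart 0 magic-mcp pexels-mcp
```

Restarting one unit at a time avoids all the supergateway processes starting up at once.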

Add a new MCP to the gateway

  1. Install in /opt/mcp-gateway/<name>/ (Node) or /opt/mcp-<name>/ (Python)
  2. Port: the next free one (currently 3188)
  3. Create the systemd unit file (if supergateway: include the catch-errors preload)
  4. Create the nginx block (if FastMCP Python: the Host header fix is mandatory)
  5. systemctl daemon-reload && systemctl enable --now <name>.service
  6. nginx -t && systemctl reload nginx
  7. Test internally and externally (see the curl commands above)
  8. Add to ~/.claude.json: {"type":"http","url":"https://gateway.descomplicar.pt/v1/<name>/mcp"}
  9. Update this skill (port map + next free port)
  10. Update memory mcp-gateway.md
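The ~/.claude.json entry from step 8 follows a fixed pattern, so it can be generated from the MCP name. A small sketch: `claude_mcp_entry` is a hypothetical helper, and where exactly the entry goes inside ~/.claude.json is not stated here, so the function only prints it.

```shell
# claude_mcp_entry (hypothetical helper): prints the client entry for the
# MCP named $1, using the URL pattern from step 8.
claude_mcp_entry() {
  printf '{"type":"http","url":"https://gateway.descomplicar.pt/v1/%s/mcp"}\n' "$1"
}

claude_mcp_entry pexels
# prints {"type":"http","url":"https://gateway.descomplicar.pt/v1/pexels/mcp"}
```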

Anti-Patterns

  • Never do a mass restart without checking first
  • Never ignore a P1 MCP in DOWN state
  • Never confuse gateway (VM 103) with dev/server/easy
  • Always report the state, even when everything is OK
  • Always test the endpoint after a restart
  • Always update the port map when adding/removing MCPs

Skill v2.0.0 | 28-03-2026 | Descomplicar


Healing Log

Record of known errors and how to avoid them. Read automatically before execution.

{"date":"","issue":"","fix":"","source":"user|auto"}

Add a new line after each error that is fixed.
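Appending an entry in the format above could be scripted as follows. This is a sketch under assumptions: `healing_log_append` and its target file argument are hypothetical, the date uses the DD-MM-YYYY form seen elsewhere in this skill, and the issue and fix texts are assumed to contain no double quotes or backslashes, since no JSON escaping is done.

```shell
# healing_log_append (hypothetical helper): appends one JSON line to the
# file in $1. Arguments: file, issue, fix, source (user or auto).
# No JSON escaping is performed on the values.
healing_log_append() {
  local file="$1" issue="$2" fix="$3" src="$4"
  printf '{"date":"%s","issue":"%s","fix":"%s","source":"%s"}\n' \
    "$(date +%d-%m-%Y)" "$issue" "$fix" "$src" >> "$file"
}

# Example (the file name is an assumption):
#   healing_log_append healing.log "403 after IP change" "updated nginx whitelist" user
```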