Observability
0) Scope
Observability mapping based only on what is versioned in the repository:
- log sources;
- existing healthchecks;
- operational metrics that can be extracted;
- where to query logs in each environment.
0.1 Evidence sources
- `docker-compose.yml`
- `docker/nginx.conf`
- `helm/values.yaml`
- `helm/templates/deployment.yaml`
- `helm/templates/wp-cron-deployment.yaml`
- `.github/workflows/ci-cd-pipeline.yml`
- `we-dhedalos/functions/rest/system_logs.php`
- `we-dhedalos/functions/post_types/dynamic_logs.php`
- `we-dhedalos/functions/post_types/user_action_log.php`
- `we-dhedalos/functions/post_types/presence_log.php`
- `we-dhedalos/functions/post_types/cancellation_logs.php`
- `we-dhedalos/functions/3rd/log.php`
- `we-dhedalos/functions/3rd/simplybook.php`
- `we-dhedalos/functions/utils/user_patch_log.php`
- `we-dhedalos/functions/utils/log_user.php`
- `we-dhedalos/functions/utils/cache.php`
- `we-dhedalos/functions.php`
- `we-dhedalos/functions/utils/*_cron.php`
1) Log patterns
1.1 Infrastructure
Local runtime containers:
- `nginx`
- `wordpress`
- `mariadb`
- `redis`
Local collection:
docker compose logs -f nginx wordpress mariadb redis
Evidence:
- `docker-compose.yml:2`
1.2 Structured domain logs (WordPress)
Source A: CPT `dynamic_logs`
Useful fields:
- `post_type`
- `post_id`
- `post_title`
- `user_id`
- `user_name`
- `user_email`
- `user_ip`
- `action`
- `details`
- `timestamp`
Evidence:
- `we-dhedalos/functions/post_types/dynamic_logs.php:320`
Source B: table `${prefix}simplybook_api_requests_log`
Useful fields:
- `endpoint`
- `http_method`
- `request_params`
- `response`
- `timestamp`
- `user_info` (ID of the logged-in user, when there is one)
- `origin` (`cache` or `api`)
- `ip_address`
- `user_agent`
Evidence:
- `we-dhedalos/functions/3rd/log.php:38`
- `we-dhedalos/functions/3rd/simplybook.php:53`
Source C: table `${prefix}user_patch_log`
Useful fields:
- `user_id`
- `received`
- `created_at`
Evidence:
- `we-dhedalos/functions/utils/user_patch_log.php:21`
Source D: table `${prefix}user_logs`
Useful fields:
- `message`
- `created_at`
Evidence:
- `we-dhedalos/functions/utils/log_user.php:13`
1.3 Request-id and user-id
Current state observed:
- user-id: present in multiple sources (`dynamic_logs.user_id`, `simplybook_api_requests_log.user_info`, etc.).
- request-id/trace-id: no correlation pattern is implemented in the repo.
Evidence:
- `we-dhedalos/functions/post_types/dynamic_logs.php:324`
- `docker/nginx.conf:1`
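Since user-id is the only key shared across sources, a per-user timeline is the closest available substitute for request correlation. The sketch below merges rows from two sources by user. It assumes the rows were already exported as dictionaries keyed by the field names listed in section 1.2, and that timestamps are `YYYY-MM-DD HH:MM:SS` strings (so they sort lexicographically); neither assumption was verified against the code, and `user_timeline` is a hypothetical helper, not something that exists in the repo.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str, str]  # (timestamp, source, summary)


def user_timeline(
    user_id: int,
    dynamic_logs: Iterable[Dict],
    simplybook_logs: Iterable[Dict],
) -> List[Event]:
    """Merge events of one user from two log sources, oldest first."""
    wanted = str(user_id)
    events: List[Event] = []
    for row in dynamic_logs:
        # dynamic_logs stores the user in `user_id`.
        if str(row["user_id"]) == wanted:
            events.append((row["timestamp"], "dynamic_logs", row["action"]))
    for row in simplybook_logs:
        # simplybook_api_requests_log stores the user in `user_info`.
        if str(row["user_info"]) == wanted:
            events.append(
                (row["timestamp"], "simplybook_api_requests_log", row["endpoint"])
            )
    # Assumed timestamp format sorts correctly as plain strings.
    return sorted(events)
```

This only narrows the search to one user; two concurrent requests from the same user remain indistinguishable without a request-id.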
2) Healthchecks
2.1 Kubernetes probes
Healthchecks configured in the chart:
- `livenessProbe`: `GET /`
- `readinessProbe`: `GET /`
Evidence:
- `helm/values.yaml:66`
- `helm/values.yaml:70`
- `helm/templates/deployment.yaml:94`
2.2 Application smoke endpoints
No dedicated `/health` or `/healthz` endpoint was identified in the code.
Utility endpoints usable for smoke tests:
- `GET /api/dhedalos/v1/theme_settings` (public)
- `GET /api/dhedalos/v1/maintenance_mode?slug=hub|cadastro` (public)
Evidence:
- `we-dhedalos/functions/rest/theme_settings.php:35`
- `we-dhedalos/functions/rest/maintenance_mode.php:30`
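A smoke check over these public endpoints could look like the sketch below. It assumes a healthy instance answers `200` on each path (not confirmed by the repo) and that `slug=hub|cadastro` means the two values `hub` and `cadastro`. The base URL is supplied by the caller, and `smoke_check` and `failing` are hypothetical names.

```python
from typing import Callable, Dict, List
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

SMOKE_PATHS = [
    "/api/dhedalos/v1/theme_settings",
    "/api/dhedalos/v1/maintenance_mode?slug=hub",
    "/api/dhedalos/v1/maintenance_mode?slug=cadastro",
]


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, or 0 when the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as exc:  # 4xx/5xx still carry a status code
        return exc.code
    except (URLError, OSError):  # DNS failure, refused connection, timeout
        return 0


def smoke_check(
    base_url: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, int]:
    """Map each smoke path to the status code it returned."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) for path in SMOKE_PATHS}


def failing(results: Dict[str, int]) -> List[str]:
    """Paths whose status differs from the assumed healthy value (200)."""
    return [path for path, status in results.items() if status != 200]
```

The `fetch` parameter exists so the check can be exercised without a network, for example with a stub that returns fixed status codes.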
3) Key metrics (extractable from what exists)
There is no versioned metrics stack (Prometheus/Grafana/OTel) in this repository. The metrics below are operational and derived from existing logs, endpoints, and tables.
3.1 Latency
Critical points:
- SimplyBook calls (timeout configured at `30s`);
- cleanup of external submissions (timeout of `20s`).
Evidence:
- `we-dhedalos/functions/3rd/simplybook.php:127`
- `we-dhedalos/functions.php:104`
Practical signal:
- an increase in timeout errors in the `wordpress` logs (`error_log`).
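One way to turn that signal into a number is to count timeout-looking lines in a window of container logs. The sketch below assumes timeout errors contain the text `timed out` or `timeout`; the exact messages written by the theme were not inspected, so the pattern is a guess to be tuned. `count_timeouts` is a hypothetical helper.

```python
import re
from typing import Iterable

# Assumed wording of timeout errors; adjust after looking at real log lines.
TIMEOUT_PATTERN = re.compile(r"timed out|timeout", re.IGNORECASE)


def count_timeouts(lines: Iterable[str]) -> int:
    """Count log lines that look like timeout errors."""
    return sum(1 for line in lines if TIMEOUT_PATTERN.search(line))
```

The input could be the output of `docker compose logs --no-color --since 1h wordpress`, read line by line; comparing the count across equal windows shows whether timeouts are rising.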
3.2 Errors
Critical points:
- 4xx/5xx errors from the API;
- failures in the Novu/SimplyBook integrations;
- failures in daily jobs.
Evidence:
- audit endpoint: `we-dhedalos/functions/rest/system_logs.php:33`
- Novu error: `we-dhedalos/functions/utils/notify_course_start_date_cron.php:98`
- cancellation cron error: `we-dhedalos/functions/utils/auto_cancel_enrollments_cron.php:152`
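For the 4xx/5xx signal, the share of error responses can be estimated from the Nginx access log. The sketch below assumes the default `combined` log format, where the status code follows the quoted request line; the format actually configured in `docker/nginx.conf` was not checked, so the regular expression may need adjusting. The function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# In the combined format the status follows the closing quote of the request.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def status_classes(lines: Iterable[str]) -> Counter:
    """Count responses per status class (2xx, 3xx, 4xx, 5xx)."""
    counts: Counter = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)[0] + "xx"] += 1
    return counts


def error_ratio(counts: Counter) -> float:
    """Fraction of parsed responses that were 4xx or 5xx."""
    total = sum(counts.values())
    errors = counts["4xx"] + counts["5xx"]
    return errors / total if total else 0.0
```

Lines that do not match the assumed format are ignored rather than counted as errors, so a format mismatch shows up as an empty counter instead of a misleading ratio.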
3.3 Asynchronous work (cron instead of a queue)
Current state:
- asynchronous processing runs through WP-Cron;
- there is no broker-based queue (RabbitMQ/SQS/Kafka);
- there is a dedicated `wp-cron` deployment on Kubernetes that cyclically runs due events.
Evidence:
- `we-dhedalos/functions.php:45`
- `we-dhedalos/functions/utils/auto_cancel_enrollments_cron.php:31`
- `helm/templates/wp-cron-deployment.yaml:4`
- `helm/templates/wp-cron-deployment.yaml:89`
Useful metrics:
- number of overdue events;
- delay of the daily hooks;
- failure rate per cron hook.
Commands:
docker compose exec wordpress wp cron event list
docker compose exec wordpress wp cron event run --due-now
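The first two metrics can be derived from the event list when it is exported as JSON. The sketch below assumes the output of `wp cron event list --format=json --fields=hook,next_run_gmt`, with `next_run_gmt` formatted as `YYYY-MM-DD HH:MM:SS` in UTC; confirm both against the WP-CLI version in use. The hook names in any sample data are illustrative only.

```python
import json
from datetime import datetime, timezone
from typing import List, Tuple


def overdue_events(json_text: str, now: datetime) -> List[Tuple[str, int]]:
    """Return (hook, seconds overdue) for events past their schedule,
    most delayed first. `now` must be timezone-aware."""
    overdue = []
    for event in json.loads(json_text):
        next_run = datetime.strptime(
            event["next_run_gmt"], "%Y-%m-%d %H:%M:%S"
        ).replace(tzinfo=timezone.utc)
        lag = (now - next_run).total_seconds()
        if lag > 0:
            overdue.append((event["hook"], int(lag)))
    return sorted(overdue, key=lambda item: item[1], reverse=True)
```

The length of the returned list is the overdue count, and the first entry gives the worst delay. Failure rate per hook cannot be derived from this listing and would still have to come from the error logs.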
3.4 Database and cache
Useful signals:
- growth of `dynamic_logs`, `user_log_action`, `sub_log_action`, `cancellation_logs`;
- growth of `simplybook_api_requests_log`;
- cache flush/invalidation behavior.
Evidence:
- `dynamic_logs` cleanup: `we-dhedalos/functions/post_types/dynamic_logs.php:1016`
- `user_log_action` cleanup: `we-dhedalos/functions/post_types/user_action_log.php:105`
- `sub_log_action` cleanup: `we-dhedalos/functions/post_types/presence_log.php:133`
- `cancellation_logs` cleanup: `we-dhedalos/functions/post_types/cancellation_logs.php:117`
- global flush: `we-dhedalos/functions/utils/cache.php:28`
4) Where to look at logs
4.1 Local
docker compose logs -f nginx wordpress mariadb redis
4.2 Kubernetes
Environments mapped in the pipeline:
- `piloto-dhedalos-ecosystem`
- `dhedalos-ecosystem`
- `essencia-ecosystem`
- `dev-dhedalos-wp`
Evidence:
- `.github/workflows/ci-cd-pipeline.yml:56`
- `.github/workflows/ci-cd-pipeline.yml:88`
- `.github/workflows/ci-cd-pipeline.yml:120`
- `.github/workflows/ci-cd-pipeline.yml:152`
Base commands:
kubectl -n <namespace> get deploy
kubectl -n <namespace> logs deploy/<deployment-name> -c dhedalos-app-backend-wp-phpfpm --tail=200 -f
kubectl -n <namespace> logs deploy/<deployment-name> -c dhedalos-app-backend-wp-nginx --tail=200 -f
kubectl -n <namespace> logs deploy/<deployment-name>-wp-cron -c wp-cron --tail=200 -f
Evidence for the container names:
- `helm/templates/deployment.yaml:49`
- `helm/templates/deployment.yaml:85`
- `helm/templates/wp-cron-deployment.yaml:54`
4.3 Internal audit API
Endpoints:
- `GET /api/dhedalos/v1/system-logs`
- `GET /api/dhedalos/v1/system-logs/stats`
Evidence:
- `we-dhedalos/functions/rest/system_logs.php:33`
- `we-dhedalos/functions/rest/system_logs.php:45`
4.4 Pipeline logs
For build/deploy failures, check the GitHub Actions logs of the CI/CD Pipeline workflow.
Evidence:
- `.github/workflows/ci-cd-pipeline.yml:1`
5) Support queries (operations)
Note:
- replace `<prefix>` with the real table prefix (`WORDPRESS_TABLE_PREFIX`, local default `wp_`).
5.1 Check retention and volume of the log tables
docker compose exec wordpress wp db query "SELECT COUNT(*) AS total FROM <prefix>user_patch_log;"
docker compose exec wordpress wp db query "SELECT COUNT(*) AS total FROM <prefix>simplybook_api_requests_log;"
docker compose exec wordpress wp db query "SELECT COUNT(*) AS total FROM <prefix>user_logs;"
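A single count says little on its own; growth comes from comparing two samples. The sketch below computes average rows per day between two `COUNT(*)` readings. It assumes the operator records each count together with the time it was taken; `rows_per_day` is a hypothetical helper.

```python
from datetime import datetime


def rows_per_day(
    count_before: int,
    taken_before: datetime,
    count_after: int,
    taken_after: datetime,
) -> float:
    """Average daily row growth between two COUNT(*) samples.
    A negative result means a cleanup removed more rows than were added."""
    elapsed_days = (taken_after - taken_before).total_seconds() / 86400
    if elapsed_days <= 0:
        raise ValueError("the second sample must be taken after the first")
    return (count_after - count_before) / elapsed_days
```

For example, 1000 rows on one day and 1700 rows seven days later gives an average of 100 rows per day.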
5.2 Check the latest SimplyBook events
docker compose exec wordpress wp db query "SELECT endpoint,http_method,origin,timestamp FROM <prefix>simplybook_api_requests_log ORDER BY id DESC LIMIT 20;"
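Because `origin` records whether a response came from `cache` or `api`, the same rows give a rough cache hit ratio for the SimplyBook integration. The sketch below assumes the `origin` values were already extracted from the query output into a list of strings, and it ignores any value other than those two. `cache_hit_ratio` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable


def cache_hit_ratio(origins: Iterable[str]) -> float:
    """Share of requests served from cache among cache + api requests."""
    counts = Counter(value.strip().lower() for value in origins)
    total = counts["cache"] + counts["api"]
    return counts["cache"] / total if total else 0.0
```

A ratio that drops sharply right after a global cache flush is expected; one that stays low may indicate excessive invalidation.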
5.3 Check the cron backlog
docker compose exec wordpress wp cron event list
6) Observed log retention

| Source | Observed policy |
|---|---|
| `dynamic_logs` | removes entries from before the current year |
| `user_log_action` | removes entries older than 90 days |
| `sub_log_action` | removes entries older than 90 days |
| `cancellation_logs` | removes entries older than 90 days |
| `simplybook_api_requests_log` | `delete_old_logs()` exists with a 1-week window, but no schedule was identified in the repo |

Evidence:
- `we-dhedalos/functions/post_types/dynamic_logs.php:1016`
- `we-dhedalos/functions/post_types/user_action_log.php:105`
- `we-dhedalos/functions/post_types/presence_log.php:133`
- `we-dhedalos/functions/post_types/cancellation_logs.php:117`
- `we-dhedalos/functions/3rd/log.php:105`
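To check whether a table still holds rows that should already be gone, the policies above can be turned into cutoff timestamps. The sketch below is an approximation: it assumes "before the current year" means before January 1 at 00:00 and that the day windows are measured from the current time. The exact boundaries and timezone used in the PHP code were not verified.

```python
from datetime import datetime, timedelta
from typing import Dict


def retention_cutoffs(now: datetime) -> Dict[str, datetime]:
    """Oldest timestamp each source should still contain after cleanup."""
    ninety_days_ago = now - timedelta(days=90)
    return {
        "dynamic_logs": datetime(now.year, 1, 1),
        "user_log_action": ninety_days_ago,
        "sub_log_action": ninety_days_ago,
        "cancellation_logs": ninety_days_ago,
        # Cleanup exists in code but no schedule was found, so rows older
        # than this cutoff are expected until the routine is triggered.
        "simplybook_api_requests_log": now - timedelta(weeks=1),
    }
```

Rows older than the cutoff in the first four sources suggest the cleanup is not running; for `simplybook_api_requests_log` they are the expected result of the missing schedule.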
7) Open items
- No explicitly versioned external observability backend was identified (e.g., CloudWatch, ELK, Datadog, Dokku log drain), nor documentation on how to query such providers.
- There is no implemented `request-id`/`trace-id` correlation pattern across Nginx, WordPress, and external integrations.
- There is no dedicated application healthcheck endpoint (`/health`), only HTTP probes on `/` in the chart.
- The cleanup routine for `simplybook_api_requests_log` exists in code, but no versioned hook/event that runs it automatically was found.