WIP: Set up Prometheus and Alertmanager #632

Draft
greg wants to merge 15 commits from feature/prometheus_cookbook into master
Owner
* [x] Add recipe for setting up [Node Exporter](https://prometheus.io/docs/guides/node-exporter/) on all machines (via role included in base role)
* [x] Add all `node_exporter` nodes as targets in `prometheus.yml` (via node/role search)
* [ ] Write custom Alertmanager config
  * [ ] Add Prometheus webhooks to `hubot-incoming-webhook`
  * [ ] Configure Alertmanager to send to email and hubot/hal by default
* [ ] Configure alerts for some basic problems like low disk space, low memory/swap, and CPUs running hot
* [ ] Find all (interesting) existing Prometheus metrics endpoints for services we run and add them as targets
  * [x] Garage
  * [ ] PostgreSQL
  * [ ] Redis
  * [ ] Nginx (OpenResty)
  * [ ] HAProxy
  * [ ] ...
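
For reviewers unfamiliar with the "targets via node/role search" item, here is a minimal sketch of the idea, not the code in this branch. It assumes each search result exposes an `ipaddress` value and that Node Exporter listens on its default port 9100. The helper name `node_exporter_scrape_config`, the sample hosts, and the role name in the comment are all hypothetical.

```ruby
require "yaml"

# Default port the Prometheus Node Exporter listens on.
NODE_EXPORTER_PORT = 9100

# Builds one scrape_configs entry for node_exporter from a list of node
# records. Each record is a plain Hash standing in for a Chef search result;
# the "ipaddress" key is an assumption made for this sketch.
def node_exporter_scrape_config(nodes, port: NODE_EXPORTER_PORT)
  targets = nodes
    .map { |n| n["ipaddress"] }   # pick the address of each node
    .compact                      # skip nodes without a known address
    .uniq
    .sort                         # stable order keeps the rendered file diff-friendly
    .map { |ip| "#{ip}:#{port}" }

  {
    "job_name" => "node",
    "static_configs" => [{ "targets" => targets }]
  }
end

# Hypothetical sample data. In a recipe this list would come from a search
# such as search(:node, "roles:node_exporter"), where the role name is assumed.
sample_nodes = [
  { "name" => "host-b", "ipaddress" => "10.0.0.12" },
  { "name" => "host-a", "ipaddress" => "10.0.0.11" },
  { "name" => "host-c", "ipaddress" => nil }
]

config = { "scrape_configs" => [node_exporter_scrape_config(sample_nodes)] }
puts config.to_yaml
```

Sorting and de-duplicating the targets means the rendered `prometheus.yml` only changes when the set of nodes changes, which should avoid needless Prometheus reloads on each Chef run.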
greg added 4 commits 2026-07-03 16:00:58 +00:00
Owner

Would be nice to get #263 solved for looking up the targets' private IPs.

raucao self-assigned this 2026-07-03 17:54:39 +00:00
greg was assigned by raucao 2026-07-03 17:54:39 +00:00
raucao added the feature label 2026-07-03 17:57:33 +00:00
raucao added 5 commits 2026-07-04 13:27:48 +00:00
raucao added 1 commit 2026-07-04 13:45:37 +00:00
raucao added 2 commits 2026-07-04 13:47:00 +00:00
raucao added 1 commit 2026-07-04 13:51:48 +00:00
raucao added 1 commit 2026-07-04 14:14:50 +00:00
raucao changed title from WIP: Set up Prometheus to WIP: Set up Prometheus and AlertManager 2026-07-04 14:24:38 +00:00
raucao changed title from WIP: Set up Prometheus and AlertManager to WIP: Set up Prometheus and Alertmanager 2026-07-04 14:24:55 +00:00
raucao added this to the 2026 project 2026-07-04 14:25:32 +00:00
raucao moved this to In Progress in 2026 on 2026-07-04 14:25:46 +00:00
raucao added 1 commit 2026-07-04 14:30:34 +00:00
This pull request is marked as a work in progress.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin feature/prometheus_cookbook:feature/prometheus_cookbook
git checkout feature/prometheus_cookbook
No Reviewers
2 Participants
Reference: kosmos/chef#632