Project Health Monitoring
Running Services (PM2)
| Service | Purpose | Expected Behavior | Health Check |
|---|---|---|---|
| tg-collector | Collect Telegram messages | Should be online 24/7, no restarts | ✅ Running |
| forwarder | Forward picks | Should be online 24/7, no restarts | ✅ Running |
| router | Route messages | Should be online 24/7, no restarts | ✅ Running |
| discord-forwarder-python | Forward to Discord | Should be online 24/7, no restarts | ✅ Running |
| discord-forwarder-leaks | Leaks forwarding | Should be online 24/7, no restarts | ✅ Running |
| hiddenbag-bot | HiddenBag Telegram bot | 7 restarts - MONITOR | ⚠️ Restarts |
| hiddenbag-web | HiddenBag Next.js site | 18 restarts - MONITOR | ⚠️ Restarts |
| ai-tools-scheduler | AI Tools HQ scheduler | Should be online, no restarts | ✅ Running |
| dailyai-picks | Daily AI picks | STOPPED - intentional? | ⏸️ Stopped |
Last Health Check
- Date: 2026-01-28 11:33 EST
- Overall Status: ✅ All stable
Diagnostic Results (from logs)
hiddenbag-bot (7 restarts earlier)
- Cause: Supabase error: TypeError: fetch failed (intermittent network)
- Current: ✅ STABLE - 900+ minutes uptime, working fine
- This is normal behavior - bot retries and PM2 restarts on crash
hiddenbag-web (18 restarts earlier)
- Cause: Port 3000 conflict (process 14632 is using it), so the site runs on 3001
- Cause: Next.js dev mode recompiles (not crashes)
- Current: ✅ WORKING - serving pages fine
dailyai-picks (stopped)
- Status: ✅ BY DESIGN
- Has PM2 cron: */30 * * * * (every 30 min)
- It runs, processes, stops. This is correct.
What To Check
Every 4 Hours (Cron: pm2-health-check)
- All core services online (tg-collector, forwarder, router, discord-forwarder-*)
- Restart count - alert if >3 in 24h
- Check logs for errors
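
The checks above could be automated roughly as follows. This is a minimal sketch, not the actual pm2-health-check job: it assumes `pm2 jlist` is available on the PATH and that each entry carries `name`, `pm2_env.status`, and `pm2_env.restart_time`. PM2's restart counter is cumulative, so the "in 24h" window is approximated by comparing against a baseline recorded at the previous check. The names `evaluate`, `read_pm2_processes`, and `previous_restarts` are hypothetical.

```python
import json
import subprocess

# Service names taken from the table above; adjust if the set changes.
CORE_SERVICES = {
    "tg-collector",
    "forwarder",
    "router",
    "discord-forwarder-python",
    "discord-forwarder-leaks",
}
# Cron-driven services that are expected to be stopped between runs.
CRON_SERVICES = {"dailyai-picks"}
RESTART_THRESHOLD = 3  # alert if more than this many restarts since the baseline


def evaluate(processes, previous_restarts=None):
    """Return a list of alert strings for a parsed `pm2 jlist` payload.

    previous_restarts is an assumed mapping of service name to the restart
    count recorded at the last baseline (for example, 24 hours ago).
    """
    previous_restarts = previous_restarts or {}
    alerts = []
    seen = set()
    for proc in processes:
        name = proc.get("name", "?")
        env = proc.get("pm2_env", {})
        status = env.get("status", "unknown")
        restarts = env.get("restart_time", 0)
        seen.add(name)
        if status != "online" and name not in CRON_SERVICES:
            alerts.append(f"{name}: status is {status}")
        delta = restarts - previous_restarts.get(name, 0)
        if delta > RESTART_THRESHOLD:
            alerts.append(f"{name}: {delta} restarts since last baseline")
    for missing in sorted(CORE_SERVICES - seen):
        alerts.append(f"{missing}: not registered in PM2")
    return alerts


def read_pm2_processes():
    """Run `pm2 jlist` and parse its JSON output."""
    result = subprocess.run(
        ["pm2", "jlist"], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)
```

A scheduled run would call `evaluate(read_pm2_processes(), baseline)` and forward any returned lines to whatever alert channel is in use. Keeping `evaluate` free of subprocess calls makes it easy to exercise with saved PM2 output.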
Daily
- Verify messages flowing (check last forward timestamp)
- Check disk space
- Review error logs
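
For the disk space item, a small helper along these lines might be enough. It is a sketch only: the 90% threshold and the function names are assumptions, not values taken from the current setup.

```python
import shutil


def disk_used_percent(path="/"):
    """Return the percentage of disk space used on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total * 100


def disk_alert(path="/", threshold_percent=90.0):
    """Return an alert string if usage is at or above the threshold, else None."""
    used = disk_used_percent(path)
    if used >= threshold_percent:
        return f"{path}: {used:.1f}% used (threshold {threshold_percent:.0f}%)"
    return None
```

Calling `disk_alert("/")` during the daily check returns `None` while usage is below the threshold, so the result can be appended to the same alert list as the PM2 checks.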
Weekly (Sunday)
- Full project review with Matt
- Update this health status
- Clear old logs
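
For the log cleanup step, one cautious approach is to list stale files first and review them before deleting anything. The sketch below assumes logs live in PM2's default directory (`~/.pm2/logs`) and that "old" means untouched for 7 days; both are assumptions, and `find_old_logs` is a hypothetical helper.

```python
import time
from pathlib import Path


def find_old_logs(log_dir, max_age_days=7, now=None):
    """Return *.log files in log_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p
        for p in Path(log_dir).glob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

For example, `find_old_logs(Path.home() / ".pm2" / "logs")` returns candidates for removal. Logs of running services are still being written to, so emptying them with `pm2 flush` is likely safer than deleting the files directly.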
Improvement Tracking
Things That Can Be Better
| Area | Issue | Proposed Fix | Status |
|---|---|---|---|
| HiddenBag | Too many restarts | Check logs, fix crash cause | TODO |
| dailyai-picks | Stopped | Ask Matt if intentional | TODO |
| Context loss | Compaction loses context | Better memory logging | In Progress |
Lessons Learned
- Always write to memory files immediately
- Don't rely on session context - it can compact anytime
- Log Matt's exact words for reference
Notes
Updated by Damian automatically during health checks.