StackNova

OpsHub — Real-Time Operations Command Center

SaaS / Operations

Product overview

OpsHub is a real-time operations dashboard and workflow engine that aggregates status from ticketing, monitoring, and internal tools into one view. Teams define health rules, runbooks, and escalations so that anomalies and incidents are detected early and handled consistently.

Problem statement

Operations leads were juggling 10+ tools and spreadsheets to understand system health and who was doing what. Status updates were manual and often outdated. Escalation paths and runbooks lived in wikis and Slack, leading to inconsistent response and duplicated effort. New hires took months to become effective.

Product vision

One screen that shows 'what's up' and 'what we're doing about it' in real time. Alerts should be actionable, runbooks one click away, and handoffs traceable. The product should scale from a single team to a global ops org without losing clarity.

Key features

  • Live KPI boards with configurable widgets and drill-down
  • Alert rules with thresholds, dependencies, and severity
  • Runbook library with step-by-step procedures and checklists
  • Incident timeline and assignment with audit trail
  • Integrations: PagerDuty, Datadog, Jira, ServiceNow, Slack
  • Role-based views and customizable dashboards per team
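To make the alert-rule feature concrete, here is a minimal sketch of how thresholds, dependencies, and severity could fit together. The type names, field names, and the suppression behavior (a rule stays quiet while any rule it depends on is breaching) are illustrative assumptions, not OpsHub's actual rule schema.

```typescript
// Hypothetical rule shape; the case study does not document the real schema.
type Severity = "info" | "warning" | "critical";

interface AlertRule {
  id: string;
  metric: string;      // name of the metric this rule watches
  threshold: number;   // the rule breaches when the value is above this
  severity: Severity;
  dependsOn: string[]; // ids of upstream rules that suppress this one
}

interface FiredAlert {
  ruleId: string;
  severity: Severity;
  value: number;
}

// Evaluate all rules against the latest metric values.
// Assumption: a rule is suppressed while any rule it depends on is breaching,
// so one root cause does not raise several alerts.
function evaluateRules(
  rules: AlertRule[],
  metrics: Record<string, number>,
): FiredAlert[] {
  // Pass 1: record which rules breach their threshold.
  const breached: Record<string, boolean> = {};
  for (const rule of rules) {
    const value = metrics[rule.metric];
    breached[rule.id] = value !== undefined && value > rule.threshold;
  }

  // Pass 2: keep breaching rules whose dependencies are all healthy.
  const fired: FiredAlert[] = [];
  for (const rule of rules) {
    const value = metrics[rule.metric];
    if (value === undefined || !breached[rule.id]) continue;
    const suppressed = rule.dependsOn.some((dep) => breached[dep] === true);
    if (!suppressed) {
      fired.push({ ruleId: rule.id, severity: rule.severity, value });
    }
  }
  return fired;
}
```

With a "database errors" rule and an "API latency" rule that depends on it, a database outage would surface only the database alert; the latency alert appears on its own only when the database is healthy.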

UX / product design approach

We used job stories and a few shadowing sessions with the client’s ops lead to capture how the team works during incidents versus steady state. We designed a clear hierarchy: summary → team view → incident detail. The layout is scannable, relying on color, icons, and one-click actions (acknowledge, assign, run runbook). Filters and time ranges are encoded in the URL, so any view can be shared as a link.
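One way to make filters and time ranges shareable is to serialize them into the query string and parse them back on load. The sketch below uses the standard URLSearchParams API; the filter shape and the parameter names (team, severity, from, to) are assumptions for illustration, not the product's real URL scheme.

```typescript
// Hypothetical filter state; parameter names are illustrative only.
interface DashboardFilters {
  team?: string;
  severity?: string[];
  from: string; // ISO 8601 start of the time range
  to: string;   // ISO 8601 end of the time range
}

// Serialize filter state into a query string that can be appended to a URL.
function filtersToQuery(f: DashboardFilters): string {
  const params = new URLSearchParams();
  if (f.team) params.set("team", f.team);
  // Repeated keys keep multi-value filters readable: severity=a&severity=b
  for (const s of f.severity ?? []) params.append("severity", s);
  params.set("from", f.from);
  params.set("to", f.to);
  return params.toString();
}

// Parse a query string back into filter state.
// Returns null when the time range is missing, since the view cannot be
// reproduced without it.
function queryToFilters(query: string): DashboardFilters | null {
  const params = new URLSearchParams(query);
  const from = params.get("from");
  const to = params.get("to");
  if (!from || !to) return null;

  const filters: DashboardFilters = { from, to };
  const team = params.get("team");
  if (team) filters.team = team;
  const severity = params.getAll("severity");
  if (severity.length > 0) filters.severity = severity;
  return filters;
}
```

Keeping the time range as absolute ISO timestamps means a link opened an hour later shows the same window the sender was looking at, which matters when the link is pasted into an incident channel.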

Technical architecture

The frontend is a React SPA with a real-time layer (WebSockets and SSE) for live metrics and events. The backend is a Node.js API plus event workers that normalize incoming webhooks and evaluate rules. PostgreSQL holds config, state, and audit data; TimescaleDB holds time-series metrics. Integration adapters run as separate services with retries and backoff. Read-heavy dashboard payloads are cached.
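The "retries and backoff" behavior of an integration adapter can be sketched as capped exponential backoff with optional jitter. This is a generic illustration: the option names, the doubling factor, and the jitter strategy are assumptions, and the case study does not state the actual retry policy.

```typescript
// Hypothetical retry settings; real values are not given in the case study.
interface BackoffOptions {
  baseMs: number;      // delay before the first retry
  maxMs: number;       // upper bound on any single delay
  maxAttempts: number; // total number of calls, including the first
}

// Delay to wait after failed attempt number `attempt` (1-based):
// baseMs * 2^(attempt - 1), capped at maxMs, then scaled by `random()`.
// The default `random` returns 1, which gives the plain, jitter-free schedule.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = () => 1,
): number {
  const capped = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt - 1));
  return Math.floor(capped * random());
}

// Call `call` until it succeeds or maxAttempts is reached, sleeping between
// attempts. Math.random spreads the delays so that many adapters failing at
// once do not all retry at the same instant.
async function withRetry<T>(
  call: () => Promise<T>,
  opts: BackoffOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        await sleep(backoffDelayMs(attempt, opts, Math.random));
      }
    }
  }
  throw lastError; // every attempt failed; surface the last error
}
```

An adapter would wrap each outbound call, for example `withRetry(() => postToTicketing(payload), { baseMs: 200, maxMs: 5000, maxAttempts: 5 })`, where `postToTicketing` is a placeholder for whatever client function the adapter uses.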

Technology stack

  • React, TypeScript, Vite
  • Node.js (Express), Bull queues
  • PostgreSQL, TimescaleDB
  • Redis, Socket.io
  • REST and webhook integrations to third-party tools

Challenges solved

  • Unifying heterogeneous alert formats and deduplicating related events
  • Keeping dashboard load under 2s with hundreds of live metrics
  • Designing runbook execution that works with human steps and external APIs
  • Keeping WebSocket connections efficient so the dashboard stays responsive as usage grows
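The first challenge, unifying alert formats and deduplicating related events, can be illustrated with a small sketch: each adapter maps its payload into one common event shape, and events with the same fingerprint inside a time window are collapsed. The two payload formats below are invented for the example (real vendor webhooks differ), and the fingerprint fields and window logic are assumptions rather than the shipped implementation.

```typescript
// Common shape every adapter maps into. Field names are hypothetical.
interface NormalizedEvent {
  source: string;
  service: string;
  check: string;
  severity: "info" | "warning" | "critical";
  occurredAt: number; // epoch milliseconds
}

// Invented payloads for two imaginary tools, for illustration only.
interface ToolAPayload { svc: string; alert_name: string; level: number; ts: string }
interface ToolBPayload {
  service: { name: string };
  check: string;
  priority: "P1" | "P2" | "P3";
  epochMs: number;
}

function fromToolA(p: ToolAPayload): NormalizedEvent {
  return {
    source: "tool-a",
    service: p.svc,
    check: p.alert_name,
    severity: p.level >= 3 ? "critical" : p.level === 2 ? "warning" : "info",
    occurredAt: Date.parse(p.ts),
  };
}

function fromToolB(p: ToolBPayload): NormalizedEvent {
  return {
    source: "tool-b",
    service: p.service.name,
    check: p.check,
    severity: p.priority === "P1" ? "critical" : p.priority === "P2" ? "warning" : "info",
    occurredAt: p.epochMs,
  };
}

// Events with the same fingerprint are treated as the same underlying problem,
// regardless of which tool reported them.
function fingerprint(e: NormalizedEvent): string {
  return [e.service, e.check].join("|").toLowerCase();
}

// Keep an event only if no event with the same fingerprint was seen within
// the previous windowMs. The window slides, so a steady stream of repeats
// stays collapsed into the first event.
function dedupe(events: NormalizedEvent[], windowMs: number): NormalizedEvent[] {
  const lastSeen: Record<string, number> = {};
  const kept: NormalizedEvent[] = [];
  const sorted = events.slice().sort((a, b) => a.occurredAt - b.occurredAt);
  for (const e of sorted) {
    const key = fingerprint(e);
    const prev = lastSeen[key];
    if (prev === undefined || e.occurredAt - prev > windowMs) kept.push(e);
    lastSeen[key] = e.occurredAt;
  }
  return kept;
}
```

Choosing service plus check as the fingerprint is a deliberate simplification: it merges reports of the same symptom from different tools, at the cost of hiding severity changes within the window. A production version would likely need to handle that case explicitly.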

Business impact

The client’s ops team uses OpsHub as their main view for incidents and runbooks. They’ve seen faster triage and less time on manual status updates. The product is built to support more teams as they grow.

Visual elements

Suggested UI highlights for this product.

  • Command center: grid of KPI cards with live sparklines and status indicators
  • Incident view: timeline, assignees, and runbook progress
  • Runbook editor: step list with timers and integration triggers
  • Integrations config: connection status and last sync time per system

Outcome

Dashboard and runbook system shipped; the client’s ops team uses it daily. They report quicker incident triage and less time chasing status in spreadsheets.

Services

  • SaaS
  • Custom Development
  • Design
  • Integrations