First Steps
Create your first service, escalation policy, on-call schedule, and test incident
This guide walks you through setting up OpsKnight from scratch. By the end, you'll have a complete incident management workflow configured and tested.
What We'll Build
In this guide, you'll create:
- An Admin User — Your first login account
- A Team — To organize your responders
- A Service — Representing a system you monitor
- An Escalation Policy — Defining who gets notified
- An On-Call Schedule — Rotating on-call duties
- A Test Incident — Verifying everything works
Each step builds on the previous, so follow them in order.
Step 1: Create Admin User (First-Time Setup)
When you first access OpsKnight, you'll be automatically redirected to the setup page at /setup.
On the Setup Page:
- Enter your Name (e.g., "Jane Admin")
- Enter your Email (e.g., [email protected])
- Click Create Admin Account
Important: A secure password will be generated for you and displayed only once. Copy it immediately and store it safely — you won't be able to see it again.
After Setup:
- You'll be redirected to the login page
- Log in with your email and the generated password
- Once logged in, you can change your password in Profile settings
Note: The /setup page is only accessible when no users exist in the system. After the first admin is created, this page becomes unavailable.
Step 2: Invite Team Members
Before setting up schedules, you need team members who can be placed on call.
Navigate to Users
- Click Settings in the sidebar
- Select Users
Invite a User
- Click Invite User
- Enter their email address
- Select a role:
  - User: Can view and acknowledge incidents
  - Responder: Can handle incidents and be on-call
  - Admin: Full system access
- Click Send Invitation
The user receives an email with a link to set their password and activate their account.
Tip: For testing, invite yourself with a different email to simulate a team member.
Step 3: Create a Team
Teams organize users and can be targeted by escalation policies.
Navigate to Teams
- Click Teams in the sidebar
Create Your Team
- Click Create Team
- Enter team details:
  - Name: Platform Engineering
  - Description: Responsible for core infrastructure
- Click Create
Add Members
- Open your new team
- Click Add Members
- Select users and assign roles:
  - Owner: Can manage the team
  - Admin: Can add/remove members
  - Member: Standard team member
Why Teams Matter:
- Escalation policies can notify entire teams
- Incidents can be assigned to teams
- Dashboards can filter by team
Step 4: Create a Service
Services represent the systems you monitor — APIs, databases, applications, or infrastructure components.
Navigate to Services
- Click Services in the sidebar
Create Your Service
- Click Create Service
- Fill in the details:
  - Name: Payment API
  - Description: Handles all payment processing
  - Team: Select Platform Engineering
- Leave Escalation Policy empty for now (we'll create one next)
- Click Create
Why Services Matter:
- Alerts route through services to reach the right people
- Services connect to escalation policies
- Analytics track metrics per service
- Status pages display service health
Step 5: Create an Escalation Policy
Escalation policies define who gets notified when an incident occurs and how notifications escalate if no one responds.
Navigate to Escalation Policies
- Click Policies in the sidebar
Create Your Policy
- Click Create Policy
- Enter basic info:
  - Name: Payment API Escalation
  - Description: Primary → Secondary → Team Lead
Add Escalation Steps
Step 1 — Primary On-Call:
- Click Add Step
- Configure:
  - Target Type: Schedule (we'll create this next)
  - Delay: 0 minutes (notify immediately)
Step 2 — Secondary Backup:
- Click Add Step
- Configure:
  - Target Type: User
  - Target: Select a backup person
  - Delay: 5 minutes (if no ack after 5 min)
Step 3 — Team Escalation:
- Click Add Step
- Configure:
  - Target Type: Team
  - Target: Platform Engineering
  - Delay: 10 minutes
- Enable Repeat to loop back to Step 1 if no one acknowledges
- Click Create
How Escalation Works
Incident Created
↓
Step 1: Notify Primary On-Call (from schedule)
↓ (wait 5 min if not acknowledged)
Step 2: Notify Backup User
↓ (wait 10 min if still not acknowledged)
Step 3: Notify Entire Team
↓ (if repeat enabled, go back to Step 1)
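To make the timing concrete, here is a minimal Python sketch of the flow above. It assumes each step's delay is counted from the previous step (as the diagram suggests) rather than from incident creation, and it does not model the Repeat option. The `EscalationStep` class and `notification_times` function are illustrative names, not part of OpsKnight.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class EscalationStep:
    target: str
    delay_minutes: int  # assumed: minutes to wait after the previous step


def notification_times(
    steps: List[EscalationStep], ack_at_minute: Optional[int] = None
) -> List[Tuple[int, str]]:
    """Return (minute, target) pairs measured from incident creation.

    Escalation stops at the first step that would fire at or after the
    acknowledgement time, if one is given.
    """
    timeline = []
    elapsed = 0
    for step in steps:
        elapsed += step.delay_minutes
        if ack_at_minute is not None and elapsed >= ack_at_minute:
            break
        timeline.append((elapsed, step.target))
    return timeline


# The three steps configured in this guide.
policy = [
    EscalationStep("Primary On-Call (schedule)", 0),
    EscalationStep("Backup User", 5),
    EscalationStep("Platform Engineering (team)", 10),
]

for minute, target in notification_times(policy):
    print(f"t+{minute} min: notify {target}")
```

Under these assumptions the three notifications land at 0, 5, and 15 minutes after the incident is created. Passing `ack_at_minute=7` would stop the chain before the team is notified.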
Step 6: Create an On-Call Schedule
On-call schedules define who is responsible for responding during specific time periods.
Navigate to Schedules
- Click Schedules in the sidebar
Create Your Schedule
- Click Create Schedule
- Enter details:
  - Name: Payment API Primary On-Call
  - Timezone: Select your team's timezone (e.g., America/New_York)
- Click Create
Add a Layer
Layers allow multiple rotation patterns (e.g., weekday vs. weekend coverage).
- Click Add Layer
- Configure:
  - Name: Primary Rotation
  - Rotation Length: 168 hours (1 week per person)
  - Start Time: Select when the rotation begins
Add Users to the Layer
- Click Add User
- Select team members in rotation order
- Drag to reorder if needed
Schedule Concepts:
- Layers — Different rotation patterns (primary, secondary, holiday)
- Rotation Length — How long each person is on-call (24h, 168h, etc.)
- Overrides — Temporary changes (vacation coverage, swaps)
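The rotation arithmetic behind a single layer can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (one layer, no overrides, times in UTC), not OpsKnight's actual scheduling code. The `on_call_at` function and every user name other than Jane Admin are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def on_call_at(users, rotation_start, rotation_hours, when):
    """Return the user on call at `when` for one fixed-length rotation layer."""
    if not users:
        raise ValueError("the layer needs at least one user")
    if when < rotation_start:
        raise ValueError("the rotation has not started yet")
    # Number of complete shifts that have elapsed since the rotation began.
    shifts_elapsed = (when - rotation_start) // timedelta(hours=rotation_hours)
    # Wrap around so the rotation repeats after the last user.
    return users[shifts_elapsed % len(users)]


layer_users = ["Jane Admin", "Sam Responder", "Alex Backup"]
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)

# Check who is on call one minute into each of the first four weeks.
for week in range(4):
    moment = start + timedelta(hours=168 * week, minutes=1)
    print(moment.isoformat(), "->", on_call_at(layer_users, start, 168, moment))
```

With a 168-hour rotation and three users, each person is on call for one week and the order repeats every three weeks.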
Step 7: Connect Policy to Service
Now link everything together.
Update Your Service
- Go to Services → Payment API
- Click Edit
- Set Escalation Policy: Payment API Escalation
- Click Save
The Chain is Complete:
Payment API (Service)
↓
Payment API Escalation (Policy)
↓
Step 1: Payment API Primary On-Call (Schedule)
↓
Currently: Jane Admin (based on rotation)
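The chain above is a series of lookups, which the following Python sketch illustrates. The dictionaries are hypothetical in-memory stand-ins for the objects created in this guide, and "Backup User" is a placeholder name. OpsKnight's real data model may differ.

```python
# Hypothetical records mirroring the service, policy, and schedule from this guide.
services = {"Payment API": {"escalation_policy": "Payment API Escalation"}}

policies = {
    "Payment API Escalation": [
        {"target_type": "schedule", "target": "Payment API Primary On-Call"},
        {"target_type": "user", "target": "Backup User"},
        {"target_type": "team", "target": "Platform Engineering"},
    ]
}

schedules = {"Payment API Primary On-Call": {"current_on_call": "Jane Admin"}}


def first_responder(service_name):
    """Follow service -> policy -> first step -> schedule to find who is notified first."""
    policy_name = services[service_name]["escalation_policy"]
    if policy_name is None:
        raise LookupError(f"{service_name} has no escalation policy attached")
    first_step = policies[policy_name][0]
    if first_step["target_type"] == "schedule":
        # A schedule target resolves to whoever is on call right now.
        return schedules[first_step["target"]]["current_on_call"]
    # User and team targets are returned as-is in this sketch.
    return first_step["target"]


print(first_responder("Payment API"))
```

If the service had no policy attached (as it did before this step), the lookup would have nowhere to go, which is why the policy must be linked before testing.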
Step 8: Configure Notifications (Optional)
Before testing, set up at least one notification channel so the test incident in Step 9 has somewhere to deliver alerts.
Email (Simplest)
- Go to Settings → Notifications
- Configure SMTP or a provider (SendGrid, Resend, etc.)
- Test with Send Test Email
Slack (Recommended)
- Go to Settings → Integrations → Slack
- Click Connect to Slack
- Authorize OpsKnight in your workspace
- Select a default channel for notifications
See the Notifications Guide for detailed setup.
Step 9: Create a Test Incident
Let's verify everything works by creating a manual incident.
Create the Incident
- Click Incidents in the sidebar
- Click Create Incident
- Fill in:
  - Title: Test: Payment API High Latency
  - Description: This is a test incident to verify the workflow
  - Service: Payment API
  - Urgency: High
- Click Create
What Should Happen
- Incident Created: Appears in the incident list with status OPEN
- Notification Sent: Current on-call person receives alert via configured channels
- Timeline Updated: Shows "Incident triggered" event
Verify the Flow
- Open the incident to see the timeline
- Check that notifications were sent (look for delivery status)
- Click Acknowledge to stop escalation
- Click Resolve to close the incident
Step 10: Test via API (Optional)
For a more realistic test, send an alert via the Events API:
curl -X POST http://localhost:3000/api/events \
  -H "Content-Type: application/json" \
  -d '{
    "routing_key": "payment-api",
    "event_action": "trigger",
    "dedup_key": "test-alert-001",
    "payload": {
      "summary": "High CPU usage on payment-api-prod-1",
      "severity": "critical",
      "source": "monitoring-system",
      "custom_details": {
        "cpu_percent": 95,
        "host": "payment-api-prod-1"
      }
    }
  }'
Note: The routing_key should match an integration key configured for your service.
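If you would rather send the test event from a script, the same request can be sketched in Python using only the standard library. The endpoint, field names, and values are copied from the curl example. The helpers `build_trigger_event` and `send_event` are illustrative, and the sketch assumes, as the curl command does, that no extra authentication header is required.

```python
import json
import urllib.request

# Same endpoint as the curl example; adjust for your deployment.
EVENTS_URL = "http://localhost:3000/api/events"


def build_trigger_event(routing_key, dedup_key, summary, severity, source, custom_details=None):
    """Build the JSON body for a trigger event, matching the curl example's shape."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {
            "summary": summary,
            "severity": severity,
            "source": source,
            "custom_details": custom_details or {},
        },
    }


def send_event(event, url=EVENTS_URL, timeout=10):
    """POST the event as JSON and return (HTTP status, response body)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


event = build_trigger_event(
    routing_key="payment-api",
    dedup_key="test-alert-001",
    summary="High CPU usage on payment-api-prod-1",
    severity="critical",
    source="monitoring-system",
    custom_details={"cpu_percent": 95, "host": "payment-api-prod-1"},
)

# Uncomment once an OpsKnight instance is running at EVENTS_URL:
# print(send_event(event))
```

Building the payload separately from sending it lets you inspect or log the event before anything goes over the network.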
Congratulations!
You've successfully set up OpsKnight with:
- A team of responders
- A monitored service
- An escalation policy with multiple steps
- An on-call rotation schedule
- A tested incident workflow
What's Next?
Now that the basics are working, explore these areas:
- Connect Real Monitoring Tools: Route alerts from your actual infrastructure
- Set Up More Notification Channels: Reach responders through multiple channels
- Configure SLAs: Track response time commitments
- Create a Status Page: Communicate with customers
- Understand Core Concepts: Deepen your knowledge
Quick Reference
| What | Where |
|---|---|
| Create/manage users | Settings → Users |
| Create/manage teams | Teams |
| Create/manage services | Services |
| Create escalation policies | Policies |
| Create on-call schedules | Schedules |
| View/manage incidents | Incidents |
| Configure notifications | Settings → Notifications |
| Connect integrations | Settings → Integrations |
Last updated for v1