Drift Detection Troubleshooting
Resolve configuration drift detection and monitoring issues
Resolve issues with configuration drift detection including monitoring failures, baseline problems, and alert delivery.
Monitoring Not Running
Schedule Not Active
Symptoms:
- No new drift events detected
- Last scan was days/weeks ago
- Schedule shows inactive
Solutions:
- Enable monitoring schedule
  1. Go to Drift Detection → Schedules
  2. Find your schedule
  3. Toggle to Active
  4. Save changes
- Check schedule configuration
  - Verify frequency is set
  - Check if schedule is paused
  - Review start/end times if configured
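A quick way to confirm that a schedule has stalled is to compare the last scan time against the configured frequency. The sketch below is illustrative only: `is_scan_overdue`, its parameters, and the 1.5x grace factor are assumptions for this example, not part of the product.

```python
from datetime import datetime, timedelta

def is_scan_overdue(last_scan: datetime, frequency: timedelta,
                    now: datetime, grace: float = 1.5) -> bool:
    """Return True when no scan has run within `grace` times the expected interval."""
    return now - last_scan > frequency * grace

now = datetime(2024, 1, 10, 12, 0)
daily = timedelta(hours=24)

print(is_scan_overdue(datetime(2024, 1, 10, 6, 0), daily, now))  # False: scanned 6 hours ago
print(is_scan_overdue(datetime(2024, 1, 3, 6, 0), daily, now))   # True: last scan was a week ago
```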
No Baselines Configured
Symptoms:
- Monitoring runs but no drift detected
- "No baselines to compare" message
Solutions:
- Create baselines
  - Go to Drift Detection → Baselines
  - Create a baseline for the resources to monitor
  - Activate the baselines
- Verify baseline scope
  - Check which resources are included
  - Ensure the relevant resource types are selected
Connection Issues
Symptoms:
- "Unable to fetch configuration" errors
- Monitoring fails repeatedly
Solutions:
- Test M365 connection
  - Go to Settings → Integrations → Test
  - Fix connection issues first
- Check permissions
  - Drift detection needs the same permissions as assessments
  - Verify all permissions are granted
Drift Events Not Appearing
False Silence
Symptoms:
- You know a configuration changed
- No drift event created
- History shows no changes
Causes:
- Change not in monitored scope
- Baseline doesn't include that setting
- Comparison rule too lenient
Solutions:
- Check baseline scope
  - Review what's included in the baseline
  - Add missing resource types
  - Expand monitoring scope
- Review comparison rules
  - Rules may be too lenient
  - Check whether the rule uses exact match or contains
  - Adjust thresholds
- Check resource type coverage
  - Not all M365 settings are monitored
  - Review supported resource types
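The scope problem can be sketched as a comparison that only looks at settings present in the baseline. This is a simplified model, not the product's actual comparison logic; `detect_drift` and the setting names are hypothetical.

```python
def detect_drift(baseline: dict, current: dict) -> dict:
    """Compare only settings listed in the baseline; return {setting: (expected, actual)}."""
    return {
        key: (expected, current.get(key))
        for key, expected in baseline.items()
        if current.get(key) != expected
    }

baseline = {"mfa_required": True}
# external_sharing changed in the tenant, but the baseline does not track it
current = {"mfa_required": True, "external_sharing": "anyone"}

print(detect_drift(baseline, current))  # {}

# Once the setting is added to the baseline, the same change is reported
baseline["external_sharing"] = "disabled"
print(detect_drift(baseline, current))  # {'external_sharing': ('disabled', 'anyone')}
```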
Delayed Detection
Symptoms:
- Changes detected hours/days later
- Expecting immediate detection
Causes:
- Scheduled monitoring (not real-time)
- Schedule frequency too low
Solutions:
- Understand schedule frequency
  - Drift detection runs on a schedule
  - It is not real-time monitoring
  - Increase frequency if needed
- Run manual scan
  - Go to Drift Detection → Run Now
  - This starts an immediate scan for changes
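As a rough rule of thumb for scheduled scanning in general, a change made just after one scan has to wait for the next scan to start and finish. The helper below is a back-of-the-envelope estimate under that assumption; the function name and the sample durations are made up for illustration.

```python
from datetime import timedelta

def worst_case_detection_delay(scan_interval: timedelta,
                               scan_duration: timedelta) -> timedelta:
    """Estimate the longest wait between a change and its detection."""
    # Wait one full interval for the next scan, then for that scan to complete.
    return scan_interval + scan_duration

print(worst_case_detection_delay(timedelta(hours=24), timedelta(minutes=30)))  # 1 day, 0:30:00
print(worst_case_detection_delay(timedelta(hours=1), timedelta(minutes=30)))   # 1:30:00
```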
Too Many Drift Events
Alert Fatigue
Symptoms:
- Hundreds of drift events
- Minor changes creating noise
- Can't find important changes
Causes:
- Baseline too strict
- Monitoring scope too broad
- Normal changes flagged
Solutions:
- Adjust baseline thresholds
  - Allow minor variations
  - Use regex patterns
  - Exclude volatile settings
- Narrow monitoring scope
  - Focus on critical settings
  - Exclude low-risk resources
  - Prioritize by severity
- Use severity filtering
  - Focus on high/critical
  - Acknowledge low-priority events
  - Configure alert thresholds
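The combined effect of severity filtering and excluding volatile settings can be sketched as a simple filter over events. The event shape, the four severity levels, and the setting names here are assumptions for illustration, not the product's data model.

```python
# Assumed severity scale, lowest to highest
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def filter_events(events, min_severity="high", excluded_settings=()):
    """Keep events at or above min_severity whose setting is not excluded."""
    threshold = SEVERITY_ORDER[min_severity]
    return [
        event for event in events
        if SEVERITY_ORDER[event["severity"]] >= threshold
        and event["setting"] not in excluded_settings
    ]

events = [
    {"setting": "mfa_required", "severity": "critical"},
    {"setting": "last_sync_time", "severity": "high"},  # volatile, changes constantly
    {"setting": "display_name", "severity": "low"},
]

print(filter_events(events, excluded_settings={"last_sync_time"}))
# [{'setting': 'mfa_required', 'severity': 'critical'}]
```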
Expected Changes Flagged
Symptoms:
- Planned changes create drift events
- Maintenance creates alerts
- Known-good changes flagged
Solutions:
- Update baseline after changes
  - Reflect the new expected state
  - Go to Drift Detection → Update Baseline
  - Re-snapshot current state
- Acknowledge events
  - Mark as acknowledged
  - Add a note explaining the change
  - Acknowledged events won't alert again
- Use maintenance windows
  - Pause monitoring during changes
  - Update baseline before resuming
  - Resume after updates
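If you triage events outside the product (for example in a script that reads exported events), a maintenance-window check might look like the following. This is a minimal sketch; `in_maintenance_window` and the window format are hypothetical.

```python
from datetime import datetime

def in_maintenance_window(event_time, windows):
    """Return True if event_time falls inside any (start, end) window; end is exclusive."""
    return any(start <= event_time < end for start, end in windows)

# One planned window: Saturday 22:00 to Sunday 02:00
windows = [(datetime(2024, 5, 4, 22, 0), datetime(2024, 5, 5, 2, 0))]

print(in_maintenance_window(datetime(2024, 5, 4, 23, 30), windows))  # True
print(in_maintenance_window(datetime(2024, 5, 5, 9, 0), windows))    # False
```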
Baseline Issues
Baseline Creation Fails
Symptoms:
- Can't create new baseline
- "Failed to snapshot" error
Solutions:
- Check M365 connection
  - Test connection first
  - Fix any connection issues
- Verify permissions
  - Read access to the resource types is needed
  - Check required permissions
- Try smaller scope
  - Start with a single resource type
  - Expand after successful creation
Baseline Out of Date
Symptoms:
- Baseline represents old state
- Many drift events from outdated comparison
Solutions:
- Update baseline
  1. Go to Drift Detection → Baselines
  2. Select baseline
  3. Click Update from Current
  4. Confirm update
- Review before updating
  - Check current drift events
  - Decide which changes to accept
  - Update to new known-good state
Comparison Rule Problems
Symptoms:
- Wrong settings flagged
- Expected matches not matching
- Regex not working
Solutions:
- Test comparison rules
  - Preview before saving
  - Test with sample data
  - Adjust patterns
- Use simpler rules
  - Start with exact match
  - Add flexibility as needed
  - Document rule logic
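Testing a rule against sample data before saving it can be sketched offline as below. The three modes mirror the exact, contains, and regex options mentioned in this guide, but the rule format and the `matches` helper are assumptions. Whether the product anchors a regex to the whole value is not stated here, so this example uses `re.fullmatch` and says so in a comment.

```python
import re

def matches(rule: dict, actual: str) -> bool:
    """Evaluate one comparison rule against an observed value."""
    mode, expected = rule["mode"], rule["value"]
    if mode == "exact":
        return actual == expected
    if mode == "contains":
        return expected in actual
    if mode == "regex":
        # fullmatch requires the pattern to cover the entire value
        return re.fullmatch(expected, actual) is not None
    raise ValueError(f"unknown mode: {mode}")

rule = {"mode": "regex", "value": r"TLS 1\.[23]"}
for sample in ["TLS 1.2", "TLS 1.3", "TLS 1.0"]:
    print(sample, matches(rule, sample))
# TLS 1.2 True
# TLS 1.3 True
# TLS 1.0 False
```

A common cause of "regex not working" is an unescaped dot or a pattern that matches only part of the value, so try both matching and non-matching samples.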
Alert Delivery Issues
Not Receiving Alerts
Symptoms:
- Drift events exist but no alerts
- Email alerts not arriving
- Webhook not triggering
Solutions:
- Check alert configuration
  1. Go to Drift Detection → Alerts
  2. Verify alerts are enabled
  3. Check severity thresholds
  4. Verify delivery method
- Email delivery issues
  - Check spam/junk folder
  - Verify the email address is correct
  - Add securtea.io to allowlist
- Webhook issues
  - Test webhook URL
  - Check authentication
  - Review webhook logs
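To check that a webhook endpoint is reachable and accepts authenticated requests, you can send it a test POST yourself. The sketch below uses only the Python standard library. The payload shape and the Bearer token scheme are assumptions; match them to whatever your receiving endpoint expects.

```python
import json
import urllib.request
from typing import Optional

def build_test_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request carrying a small, made-up test payload."""
    payload = json.dumps({"type": "drift.test", "message": "webhook test"}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")

def send_test_webhook(url: str, token: Optional[str] = None, timeout: float = 10.0) -> int:
    """Send the test request and return the HTTP status code."""
    with urllib.request.urlopen(build_test_request(url, token), timeout=timeout) as response:
        return response.status
```

`send_test_webhook` returns the status code on success. `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, which usually points to a wrong URL or failed authentication.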
Too Many Alerts
Symptoms:
- Alert overload
- Every small change triggers alert
- Can't keep up with notifications
Solutions:
- Adjust severity threshold
  - Only alert on high/critical
  - Lower-severity events then produce no alert
- Configure digest alerts
  - Send a daily/weekly summary
  - Use it instead of per-event alerts
- Reduce monitoring scope
  - Focus on critical resources
  - Less scope = fewer alerts
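The idea behind a digest is to replace one notification per event with one summary per period. A minimal sketch of a daily rollup, assuming events carry an ISO 8601 `detected_at` timestamp and a `severity` field (both field names are hypothetical):

```python
from collections import Counter, defaultdict

def build_daily_digest(events):
    """Group events by calendar day and count them per severity."""
    digest = defaultdict(Counter)
    for event in events:
        day = event["detected_at"][:10]  # ISO 8601 timestamp -> YYYY-MM-DD
        digest[day][event["severity"]] += 1
    return {day: dict(counts) for day, counts in sorted(digest.items())}

events = [
    {"detected_at": "2024-03-01T08:15:00Z", "severity": "low"},
    {"detected_at": "2024-03-01T09:40:00Z", "severity": "low"},
    {"detected_at": "2024-03-01T11:05:00Z", "severity": "high"},
    {"detected_at": "2024-03-02T07:30:00Z", "severity": "critical"},
]

print(build_daily_digest(events))
# {'2024-03-01': {'low': 2, 'high': 1}, '2024-03-02': {'critical': 1}}
```

Four per-event alerts become two summaries here, one per day.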
Alert Delays
Symptoms:
- Alerts arrive late
- Real-time expected but delayed
Solutions:
- Understand alert timing
  - Alerts are sent after a scan completes
  - They are not real-time
  - Timing depends on schedule frequency
- Check delivery method
  - Webhooks are immediate
  - Email may have delays
  - Check email queue status
History and Snapshots
History Missing
Symptoms:
- Can't view historical configurations
- Snapshots not being saved
Causes:
- Retention period exceeded
- Snapshots disabled
- Storage issues
Solutions:
- Check retention settings
  - History has retention limits
  - Older snapshots may be purged
- Enable snapshot storage
  - Go to Settings → Drift Detection
  - Enable historical snapshots
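To reason about which snapshots a retention limit would remove, compare each snapshot's timestamp against a cutoff. The 90-day value and the function name below are illustrative assumptions; check your own retention setting for the real limit.

```python
from datetime import datetime, timedelta

def split_by_retention(snapshot_times, retention_days, now):
    """Return (kept, purged) snapshot timestamps for a given retention period."""
    cutoff = now - timedelta(days=retention_days)
    kept = [t for t in snapshot_times if t >= cutoff]
    purged = [t for t in snapshot_times if t < cutoff]
    return kept, purged

snapshots = [datetime(2024, 1, 15), datetime(2024, 5, 1), datetime(2024, 6, 29)]
kept, purged = split_by_retention(snapshots, retention_days=90, now=datetime(2024, 6, 30))

print(len(kept), len(purged))  # 2 1
print(purged[0].date())        # 2024-01-15
```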
Can't Compare Versions
Symptoms:
- Compare feature unavailable
- Only one version exists
Solutions:
- Need multiple snapshots
  - Comparison requires at least two snapshots
  - Run monitoring multiple times
  - Wait for more history
- Check date range
  - Select appropriate dates
  - Both versions must exist
Performance Issues
Scans Taking Too Long
Symptoms:
- Monitoring timeout
- Scans run for hours
- Progress stuck
Causes:
- Large tenant
- Too many resource types
- API throttling
Solutions:
- Reduce scope
  - Monitor fewer resource types
  - Focus on critical resources
  - Split into multiple baselines
- Adjust frequency
  - Run scans less often so each one has time to finish
  - Accept the trade-off between coverage and speed
- Schedule off-peak
  - Run during low-activity periods
  - Avoid busy times
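Splitting a large scope into several smaller baselines can be planned with a simple chunking helper. The resource type names below are hypothetical placeholders, and the group size is something to tune for your tenant.

```python
def split_scope(resource_types, max_per_baseline):
    """Split a list of resource types into groups of at most max_per_baseline."""
    if max_per_baseline < 1:
        raise ValueError("max_per_baseline must be at least 1")
    return [
        resource_types[i:i + max_per_baseline]
        for i in range(0, len(resource_types), max_per_baseline)
    ]

resource_types = [
    "conditional_access", "sharing_policies", "mail_flow_rules",
    "device_compliance", "app_registrations",
]

for group in split_scope(resource_types, 2):
    print(group)
# ['conditional_access', 'sharing_policies']
# ['mail_flow_rules', 'device_compliance']
# ['app_registrations']
```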
Error Codes
Common Drift Detection Errors
| Code | Meaning | Solution |
|---|---|---|
| DRIFT001 | No connection | Set up or verify the M365 connection |
| DRIFT002 | Permission denied | Grant required permissions |
| DRIFT003 | Baseline not found | Create or select baseline |
| DRIFT004 | Snapshot failed | Check connection, retry |
| DRIFT005 | Comparison error | Review comparison rules |
| DRIFT006 | Alert delivery failed | Check delivery settings |
| DRIFT007 | Schedule error | Review schedule config |
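If you triage errors from logs or exported reports, the table above can be turned into a small lookup. The codes, meanings, and solutions come from the table; the log line format and the `explain_codes` helper are assumptions for illustration.

```python
import re

DRIFT_ERRORS = {
    "DRIFT001": ("No connection", "Set up or verify the M365 connection"),
    "DRIFT002": ("Permission denied", "Grant required permissions"),
    "DRIFT003": ("Baseline not found", "Create or select baseline"),
    "DRIFT004": ("Snapshot failed", "Check connection, retry"),
    "DRIFT005": ("Comparison error", "Review comparison rules"),
    "DRIFT006": ("Alert delivery failed", "Check delivery settings"),
    "DRIFT007": ("Schedule error", "Review schedule config"),
}

def explain_codes(text):
    """Find DRIFT codes in free text and return (code, meaning, solution) tuples."""
    results = []
    for code in re.findall(r"DRIFT\d{3}", text):
        meaning, solution = DRIFT_ERRORS.get(code, ("Unknown code", "Not listed in the table"))
        results.append((code, meaning, solution))
    return results

# Hypothetical log line
for code, meaning, solution in explain_codes("scan aborted: DRIFT004 after DRIFT001"):
    print(f"{code}: {meaning} -> {solution}")
# DRIFT004: Snapshot failed -> Check connection, retry
# DRIFT001: No connection -> Set up or verify the M365 connection
```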
Best Practices
Baseline Management
- Start narrow - Small scope initially
- Expand gradually - Add resources as needed
- Update regularly - After planned changes
- Document changes - Note why baseline updated
Alert Configuration
- Severity-based - Different actions per severity
- Team routing - Right alerts to right people
- Digest option - Consider daily summaries
- Test alerts - Verify delivery working
Regular Maintenance
- Review open events - Don't let them pile up
- Acknowledge or resolve - Keep queue manageable
- Update baselines - Keep current
- Tune thresholds - Reduce noise over time
What's Next?
- Troubleshooting Overview - All issues
- Drift Detection Guide - Feature overview
- Creating Baselines - Baseline setup