testing-ransomware-recovery-procedures
$ npx mdskill add mukul975/Anthropic-Cybersecurity-Skills/testing-ransomware-recovery-procedures
Validate ransomware recovery plans against RTO and RPO targets.
- Confirms backup restore operations meet data integrity and completeness standards.
- Requires isolated lab environments and documented recovery targets per tier.
- Executes simulated encryption scenarios to measure actual recovery performance.
- Delivers pass/fail metrics on RTO and RPO against business requirements.
---
name: testing-ransomware-recovery-procedures
description: >-
  Test and validate ransomware recovery procedures including backup restore operations,
  RTO/RPO target verification, recovery sequencing, and clean restore validation to ensure
  organizational resilience against destructive ransomware attacks.
domain: cybersecurity
subdomain: incident-response
tags: [incident-response, ransomware, disaster-recovery, backup, rto, rpo, resilience]
version: "1.0"
author: mahipal
license: Apache-2.0
---
# Testing Ransomware Recovery Procedures
## When to Use
Use this skill when:
- Validating that ransomware recovery plans actually work under realistic conditions
- Measuring RTO (Recovery Time Objective) and RPO (Recovery Point Objective) against business requirements
- Testing backup restore operations to confirm data integrity and completeness after simulated encryption
- Conducting tabletop exercises or live recovery drills for ransomware scenarios
- Auditing disaster recovery readiness as part of compliance or cyber insurance requirements
**Do not use** for active incident response during a live ransomware attack. Use dedicated IR playbooks instead.
## Prerequisites
- Isolated recovery test environment (air-gapped or network-segmented lab)
- Access to backup infrastructure (Veeam, Commvault, Rubrik, AWS Backup, Azure Backup)
- Documented RTO/RPO targets per application tier from business impact analysis
- Backup copies available for restore testing (production replicas or test snapshots)
- Recovery runbooks with step-by-step procedures for each critical system
## Workflow
### Step 1: Define Recovery Test Scope
Identify critical systems and their tiered recovery targets:
| Tier | System Type | RTO Target | RPO Target | Example |
|------|------------|------------|------------|---------|
| Tier 1 | Mission-critical | < 1 hour | < 15 min | Active Directory, core database |
| Tier 2 | Business-critical | < 4 hours | < 1 hour | ERP, email, CRM |
| Tier 3 | Business-operational | < 24 hours | < 4 hours | File shares, internal apps |
| Tier 4 | Non-critical | < 72 hours | < 24 hours | Dev/test, analytics |
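The tier table can be encoded as a small lookup so that later measurements are compared against the same numbers. This is a minimal sketch, assuming targets are expressed in minutes; the function names `tier_rto_minutes` and `tier_rpo_minutes` are hypothetical and not part of any backup tool.

```bash
# Hypothetical helpers: map a tier number (1-4) to its target in minutes,
# using the values from the tier table.

tier_rto_minutes() {
  case "$1" in
    1) echo 60 ;;     # < 1 hour
    2) echo 240 ;;    # < 4 hours
    3) echo 1440 ;;   # < 24 hours
    4) echo 4320 ;;   # < 72 hours
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

tier_rpo_minutes() {
  case "$1" in
    1) echo 15 ;;     # < 15 min
    2) echo 60 ;;     # < 1 hour
    3) echo 240 ;;    # < 4 hours
    4) echo 1440 ;;   # < 24 hours
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```

Because the table states each target as a strict upper bound, a measured value equal to the returned number should be treated as a miss.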
### Step 2: Prepare Test Environment
```bash
# Verify the isolated recovery network is segmented.
# List every route that is NOT the recovery VLAN (192.168.100.0/24).
# Any output is a route that may reach production and must be investigated.
ip route show | grep -v "192.168.100.0/24"
# Verify backup catalog is accessible
restic snapshots --repo s3:s3.amazonaws.com/backup-bucket --password-file /etc/restic/pw
# Or for Veeam:
# Get-VBRBackup | Where-Object {$_.JobType -eq "Backup"} | Select Name, LastPointCreationTime
```
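The route check can be turned into a pass/fail gate that runs before any restore starts. This is a sketch under the assumption that the recovery VLAN is the `192.168.100.0/24` range used in the example; `unexpected_routes` and `is_isolated` are hypothetical helper names.

```bash
# Assumption: the recovery VLAN matches the range used in the Step 2 example.
RECOVERY_CIDR="192.168.100.0/24"

# Read `ip route show` output on stdin and print every route that does not
# mention the recovery VLAN. Empty output means no other routes exist.
unexpected_routes() {
  grep -v -F "$RECOVERY_CIDR" || true   # grep exits 1 when it prints nothing
}

# Succeed only when the route table on stdin has no unexpected routes.
is_isolated() {
  [ -z "$(unexpected_routes)" ]
}
```

A typical call would be `ip route show | is_isolated || echo "non-recovery routes present"`. Note that a default route also counts as unexpected, which is the intended behavior for an isolated lab.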
### Step 3: Execute Restore and Measure RTO
For each tiered system, measure the full recovery timeline:
1. **Detection to Decision** - Time from simulated alert to restore decision
2. **Backup Locate** - Time to identify and select the correct clean restore point
3. **Restore Execution** - Time to restore data/VM/application from backup
4. **Validation** - Time to verify data integrity and application functionality
5. **Service Restoration** - Time until the system is fully operational
```
Recovery Timeline Measurement:
T0: Incident declared (simulated ransomware detection)
T1: Recovery team assembled and backup identified
T2: Restore initiated from clean backup
T3: Restore completed, integrity checks passed
T4: Application validated and service restored
Actual RTO = T4 - T0
Actual RPO = T0 - backup_timestamp
```
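The two formulas in the timeline translate directly into shell arithmetic. This is a minimal sketch, assuming each milestone is recorded as Unix epoch seconds (for example with `date +%s`); the function names are hypothetical. Results are rounded up to whole minutes so a partial minute never hides an overrun.

```bash
# Hypothetical helpers: all arguments are Unix epoch seconds.

# Actual RTO = T4 - T0, rounded up to whole minutes.
actual_rto_minutes() {
  local t0="$1" t4="$2"
  echo $(( (t4 - t0 + 59) / 60 ))
}

# Actual RPO = T0 - backup_timestamp, rounded up to whole minutes.
actual_rpo_minutes() {
  local t0="$1" backup_ts="$2"
  echo $(( (t0 - backup_ts + 59) / 60 ))
}
```

During a drill, capturing `T0=$(date +%s)` when the incident is declared and `T4=$(date +%s)` when service is restored gives the inputs for `actual_rto_minutes "$T0" "$T4"`.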
### Step 4: Validate Data Integrity Post-Restore
```bash
# Count restored files; the total should match the pre-backup manifest
find /restored/data -type f | wc -l
# Confirm the restored PostgreSQL instance accepts connections
pg_isready -h localhost -p 5432
# Spot-check row counts against values recorded before the backup
psql -c "SELECT count(*) FROM critical_table;" -d restored_db
# Hash a critical file; the digest should match the known-good hash in the backup manifest
sha256sum /restored/data/critical_config.xml
```
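Hashing one file at a time does not scale to a full restore, so the comparison can be driven by a manifest. This is a sketch assuming GNU coreutils (`sha256sum`, `sort -z`, `xargs -r`) and a manifest stored outside the directory it describes; `write_manifest` and `verify_manifest` are hypothetical helper names.

```bash
# Write a manifest of SHA-256 hashes for every file under a directory.
# Paths are recorded relative to that directory so the manifest can be
# checked against a restore in a different location.
write_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Check a restored directory against a manifest. Fails if any listed file
# is missing or has a different hash; only mismatches are printed.
verify_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && sha256sum --check --quiet ) < "$manifest"
}
```

The manifest would be written at backup time, for example `write_manifest /data /var/backups/data.sha256`, and checked after the restore with `verify_manifest /restored/data /var/backups/data.sha256`; both paths here are illustrative. The check does not flag extra files in the restored tree, so the file count comparison is still needed.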
### Step 5: Test Credential Rotation and Security Hardening
After restore, validate that security controls are re-established:
1. Rotate all service account passwords and API keys
2. Verify MFA is enabled on all administrative accounts
3. Confirm EDR/AV agents are running and reporting to management console
4. Validate firewall rules block known C2 indicators
5. Check that restored systems have latest security patches
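The five checks above feed a single PASS/FAIL line in the test report. One way to roll them up is sketched below, assuming each check is recorded by hand or by site-specific tooling as a `control=RESULT` pair; the function name and the control labels are hypothetical.

```bash
# Hypothetical helper: takes "control=RESULT" pairs and prints PASS only if
# every control passed. Failed controls are listed on stderr. With no
# arguments the result is FAIL, so an empty checklist never passes.
security_controls_status() {
  local failed=0 pair
  if [ "$#" -eq 0 ]; then
    echo "FAIL"
    return 0
  fi
  for pair in "$@"; do
    if [ "${pair#*=}" != "PASS" ]; then
      echo "not restored: ${pair%%=*}" >&2
      failed=1
    fi
  done
  if [ "$failed" -eq 0 ]; then echo "PASS"; else echo "FAIL"; fi
}
```

For example, `security_controls_status credentials=PASS mfa=PASS edr=FAIL firewall=PASS patches=PASS` prints `FAIL` and names `edr` on stderr.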
### Step 6: Document Results and Calculate Gap
```
Recovery Test Report:
System: [Name]
Tier: [1-4]
RTO Target: [target] Actual RTO: [measured] Gap: [delta]
RPO Target: [target] Actual RPO: [measured] Gap: [delta]
Data Integrity: [PASS/FAIL]
Application Validation: [PASS/FAIL]
Security Controls Restored: [PASS/FAIL]
Status: [MEETS TARGET / EXCEEDS TARGET (recovered faster than required) / FAILS TARGET]
Remediation Required: [description if FAILS]
```
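The gap and status fields can be computed rather than filled in by hand. This is a minimal sketch, assuming targets and measurements are both in minutes; the helper names are hypothetical, and the status logic only distinguishes meeting from failing the target.

```bash
# Hypothetical helpers: targets and measurements are in minutes.

# Gap = actual - target. A negative gap means recovery beat the target.
recovery_gap() {
  local target="$1" actual="$2"
  echo $(( actual - target ))
}

# Targets in Step 1 are strict upper bounds (for example "< 1 hour"), so an
# actual value equal to the target counts as a failure.
recovery_status() {
  local target="$1" actual="$2"
  if [ "$actual" -lt "$target" ]; then
    echo "MEETS TARGET"
  else
    echo "FAILS TARGET"
  fi
}

# Print one RTO or RPO line in the layout of the report template.
report_line() {
  local name="$1" target="$2" actual="$3"
  printf '%s Target: %s min  Actual %s: %s min  Gap: %s min\n' \
    "$name" "$target" "$name" "$actual" "$(recovery_gap "$target" "$actual")"
}
```

A system should be reported as failing overall if either its RTO or its RPO line fails, since the two targets are independent business requirements.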
## Key Concepts
| Term | Definition |
|------|-----------|
| **RTO** | Recovery Time Objective: maximum acceptable downtime for a system after a disaster |
| **RPO** | Recovery Point Objective: maximum acceptable data loss measured in time |
| **WRT** | Work Recovery Time: time needed after the restore completes to verify system and data integrity before returning to service |
| **MTD** | Maximum Tolerable Downtime: absolute limit before unacceptable business impact |
| **Clean Restore Point** | A backup verified to be free of ransomware artifacts or encryption |
| **Recovery Sequencing** | The order in which interdependent systems must be restored |
| **Air-Gapped Backup** | Backup stored on media physically disconnected from the network |
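Recovery sequencing is a dependency-ordering problem, which the standard `tsort` utility can solve. The sketch below assumes dependencies are written as "dependency dependent" pairs, one per line; the wrapper name `restore_order` and the example dependencies between systems are illustrative assumptions.

```bash
# Hypothetical helper: read "dependency dependent" pairs on stdin, one pair
# per line, and print a restore order in which every dependency comes
# before the systems that need it. tsort exits non-zero on a cycle.
restore_order() {
  tsort
}

# Illustrative input: the database needs Active Directory, ERP needs the database.
restore_order <<'EOF'
active-directory core-database
core-database erp
EOF
```

A non-zero exit status signals a circular dependency in the runbook, which is worth resolving on paper before a real recovery.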
## Tools & Systems
| Tool | Purpose |
|------|---------|
| Veeam Backup & Replication | VM and physical server backup and restore |
| Commvault | Enterprise data protection and recovery orchestration |
| Rubrik | Cloud-native backup with ransomware recovery SLA |
| AWS Backup | Centralized backup for AWS services |
| Azure Backup | Microsoft cloud backup with immutable vault |
| Restic | Open-source encrypted backup tool |
| Velero | Kubernetes cluster backup and restore |
## Common Pitfalls
- **Not testing restores regularly**: Backups that are never tested often fail when needed. Test quarterly at minimum.
- **Ignoring recovery sequencing**: Restoring an application before its database dependency causes cascading failures.
- **Skipping credential rotation**: Restored systems may contain compromised credentials that allow re-infection.
- **Using production network for testing**: Recovery tests on production networks risk spreading simulated or real infections.
- **Measuring RTO without WRT**: Restore completion is not recovery completion. Include validation and hardening time.
- **No immutable backups**: If ransomware can encrypt or delete the backups, recovery from backup is impossible. Use air-gapped or immutable storage.
## References
- NIST SP 800-184: Guide for Cybersecurity Event Recovery
- CISA Ransomware Guide: https://www.cisa.gov/stopransomware
- Veeam RTO/RPO Best Practices: https://www.veeam.com/blog/recovery-time-recovery-point-objectives.html
- NIST CSF 2.0 RC.RP (Recovery Planning)