auth/SECURITY_AUDIT_REPORT.md

# Security Audit Report
**Date:** $(date)
**Service:** Farm Auth Service
**Status:** Comprehensive Security Review

---

## ✅ **FULLY RESOLVED ISSUES**

### **1. ✅ Rate Limiting / OTP Throttling**
**Status:** **RESOLVED** ✅
- Rate limiting implemented per phone and per IP
- OTP request throttling (3 per 10 min, 10 per 24h per phone)
- OTP verification attempt limits (5 per OTP, 10 failed per hour)
- **Location:** `src/middleware/rateLimitMiddleware.js`

### **2. ✅ OTP Exposure in Logs**
**Status:** **RESOLVED** ✅
- Safe OTP logging helper created (`src/utils/otpLogger.js`)
- Only logs in development mode
- Never logs to production
- **Location:** `src/services/smsService.js` uses `logOtpForDebug()`

### **3. ✅ IP/Device Risk Controls**
**Status:** **RESOLVED** ✅
- IP blocking for configured CIDR ranges
- Risk scoring based on IP/device/user-agent changes
- Suspicious refresh detection
- **Location:** `src/services/riskScoring.js`, integrated in `src/routes/authRoutes.js`

### **4. ✅ Refresh Token Theft Mitigation**
**Status:** **RESOLVED** ✅
- Environment fingerprinting (IP, user-agent, device ID)
- Suspicious refresh detection with risk scoring
- Optional OTP re-verification for suspicious refreshes
- Enhanced logging of suspicious events
- **Location:** `src/routes/authRoutes.js` (refresh endpoint)

### **5. ✅ JWT Claims Validation**
**Status:** **RESOLVED** ✅
- Strict validation of `iss`, `aud`, `exp`, `iat`, `nbf`
- Centralized validation function
- Used in all token verification paths
- **Location:** `src/services/jwtKeys.js` - `validateTokenClaims()`

### **6. ✅ CORS Hardening**
**Status:** **RESOLVED** ✅
- Strict origin whitelisting
- No wildcard support when credentials involved
- Production requirement for explicit origins
- **Location:** `src/index.js`

### **7. ✅ CSRF Protection (Documented)**
**Status:** **RESOLVED** ✅ (Not needed - using Bearer tokens)
- Comprehensive documentation provided
- Guidance for future cookie-based implementation
- **Location:** `CSRF_NOTES.md`

### **8. ✅ Access Token Replay Mitigation**
**Status:** **RESOLVED** ✅ **FIXED!**
- ✅ Step-up authentication middleware created (`src/middleware/stepUpAuth.js`)
- ✅ Access tokens include `high_assurance` claim after OTP verification
- ✅ Middleware checks for recent OTP or high assurance token
- ✅ **Step-up auth IS APPLIED to sensitive routes:**
  - `PUT /users/me` - Line 113 in `src/routes/userRoutes.js` ✅
  - `DELETE /users/me/devices/:device_id` - Line 181 in `src/routes/userRoutes.js` ✅
  - `POST /users/me/logout-all-other-devices` - Line 231 in `src/routes/userRoutes.js` ✅
- **Risk Level:** 🟢 **LOW** (Previously 🟡 MEDIUM)

### **11. ✅ Input Validation**
**Status:** **RESOLVED** ✅ **FIXED!**
- ✅ Input validation middleware created (`src/middleware/validation.js`)
- ✅ Validation applied to all auth routes
- ✅ **Validation IS APPLIED to user routes:**
  - `PUT /users/me` - `validateUpdateProfileBody` (Line 114) ✅
  - `DELETE /users/me/devices/:device_id` - `validateDeviceIdParam` (Line 182) ✅
  - `POST /users/me/logout-all-other-devices` - `validateLogoutOthersBody` (Line 232) ✅
- **Risk Level:** 🟢 **LOW** (Previously 🟢 LOW-MEDIUM)

---

## ⚠️ **PARTIALLY RESOLVED ISSUES**

### **9. ⚠️ Secrets Management & Rotation**
**Status:** **PARTIALLY RESOLVED** ⚠️

**What's Done:**
- ✅ JWT key rotation structure implemented
- ✅ Support for multiple keys with `kid` (key ID)
- ✅ Graceful rotation without breaking existing tokens
- ✅ Code structure ready for secrets manager integration
- ✅ `.env` is in `.gitignore` (verified)

**What's Missing:**
- ❌ **Still reads secrets from environment variables (via `.env` file)**
- ❌ No integration with secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.)
- ❌ Manual key rotation process (no automation)
- ❌ Twilio credentials still in environment variables

**Risk Level:** 🟡 **MEDIUM-HIGH**

**Recommendation:**
1. **Immediate:** ✅ Ensure `.env` is in `.gitignore` (DONE)
2. **Short-term:** Use environment variables from secure deployment platform (not `.env` file)
3. **Long-term:** Integrate with secrets manager (see TODOs in `src/services/jwtKeys.js` line 169)

**TODO in Code:**
- `src/services/jwtKeys.js` line 169: Example implementation for AWS Secrets Manager
- Need to implement `loadKeysFromSecretsManager()` function

---

### **10. ⚠️ Audit Logs Active Monitoring**
**Status:** **PARTIALLY RESOLVED** ⚠️

**What's Done:**
- ✅ Enhanced audit logging with risk levels (INFO, SUSPICIOUS, HIGH_RISK)
- ✅ Structured logging with metadata
- ✅ Anomaly detection helper function (`checkAnomalies()`)
- ✅ Suspicious events logged (failed OTPs, suspicious refreshes, blocked IPs)
- ✅ **Webhook alerting infrastructure implemented** (`src/services/auditLogger.js`)
- ✅ Configurable via `SECURITY_ALERT_WEBHOOK_URL` and `SECURITY_ALERT_MIN_LEVEL`

**What's Missing:**
- ❌ **No active alerting/monitoring integration configured by default**
- ❌ Requires manual configuration of `SECURITY_ALERT_WEBHOOK_URL`
- ❌ No integration with PagerDuty, Slack, email out of the box
- ❌ Anomaly detection only logs to console if webhook not configured

**Risk Level:** 🟡 **MEDIUM**

**Recommendation:**
1. **Immediate:** Configure `SECURITY_ALERT_WEBHOOK_URL` in production environment
2. **Short-term:** Set up log aggregation (CloudWatch, Datadog, etc.)
3. **Long-term:** Integrate with SIEM system

**Configuration:**
```bash
# Set in production environment
SECURITY_ALERT_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
SECURITY_ALERT_MIN_LEVEL=HIGH_RISK  # or SUSPICIOUS for more alerts
```

---

## ❌ **UNADDRESSED ATTACK SCENARIOS**

### **12. ⚠️ Database Compromise**
**Status:** **PARTIALLY RESOLVED** ⚠️ **FIXED!**

**What's Done:**
- ✅ Field-level encryption for phone numbers implemented (`src/utils/fieldEncryption.js`)
- ✅ AES-256-GCM encryption for PII fields
- ✅ Automatic encryption/decryption in application layer
- ✅ Backward compatibility with existing plaintext data
- ✅ Database access logging implemented (`src/middleware/dbAccessLogger.js`)
- ✅ All queries to sensitive tables are logged
- ✅ User context (IP, user agent, user ID) captured in logs
- ✅ Sensitive parameters redacted in logs
- ✅ Using parameterized queries (SQL injection protection)
- ✅ OTP codes are hashed with bcrypt (not stored in plaintext)
- ✅ No passwords stored (phone-based auth)

**What's Missing:**
- ⚠️ **Database-level encryption (TDE) not configured** (infrastructure-level)
- ⚠️ Encryption key still in environment variables (should use secrets manager)
- ⚠️ No automated key rotation process

**Risk Level:** 🟡 **MEDIUM** (Previously 🔴 HIGH)

**Configuration Required:**
1. **Enable Field-Level Encryption:**
   ```bash
   ENCRYPTION_ENABLED=true
   ENCRYPTION_KEY=<32-byte-base64-key>
   ```

2. **Enable Database Access Logging:**
   ```bash
   DB_ACCESS_LOGGING_ENABLED=true
   DB_ACCESS_LOG_LEVEL=sensitive
   ```

3. **Enable Database-Level Encryption (TDE):**
   - Configure at infrastructure level (AWS RDS, Google Cloud SQL, Azure)
   - See `DATABASE_ENCRYPTION_SETUP.md` for instructions

**Recommendation:**
- ✅ Field-level encryption implemented - **CONFIGURE** `ENCRYPTION_ENABLED=true`
- ✅ Database access logging implemented - **CONFIGURE** `DB_ACCESS_LOGGING_ENABLED=true`
- ⚠️ Enable TDE at database infrastructure level (see `DATABASE_ENCRYPTION_SETUP.md`)
- Move encryption keys to secrets manager
- Set up automated key rotation

---

### **13. ❌ Man-in-the-Middle (Non-HTTPS)**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current Protection:**
- ✅ HSTS header set in production (`src/middleware/securityHeaders.js` line 26)
- ✅ Server assumes TLS termination in front (reverse proxy)

**What's Missing:**
- ❌ No enforcement of HTTPS-only connections at application level
- ❌ No startup validation that HTTPS is configured
- ❌ No certificate pinning guidance
- ❌ HSTS only applied to admin routes (not all routes)

**Risk Level:** 🔴 **HIGH** (if misconfigured)

**Recommendation:**
- Enforce HTTPS at reverse proxy/load balancer
- Add HSTS headers to all routes (not just admin)
- Document TLS requirements
- Consider certificate pinning for mobile apps
- Add startup validation that HTTPS is configured in production

---

### **14. ⚠️ CORS + XSS**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**What's Done:**
- ✅ CORS hardened with strict origin whitelisting
- ✅ Documentation warns about misconfiguration
- ✅ Security headers include XSS protection (`X-XSS-Protection`)

**What's Missing:**
- ❌ No validation that CORS is properly configured in production at startup
- ❌ No runtime checks for CORS misconfiguration
- ❌ No Content Security Policy (CSP) headers
- ❌ No guidance for XSS prevention in frontend

**Risk Level:** 🟡 **MEDIUM**

**Recommendation:**
- Add startup validation that CORS origins are configured in production
- Add Content Security Policy (CSP) headers
- Document XSS prevention best practices
- Consider adding CSP nonce support for dynamic content

---

## 🔍 **ADDITIONAL VULNERABILITIES FOUND**

### **15. ⚠️ Error Information Disclosure**
**Status:** **GOOD** ✅

**Current State:**
- ✅ Generic error messages returned to users (`"Internal server error"`)
- ✅ No stack traces exposed in production
- ✅ Detailed errors only logged server-side

**Recommendation:**
- ✅ Current implementation is secure
- Consider adding request ID to error responses for debugging (without exposing internals)

---

### **16. ⚠️ SQL Injection Protection**
**Status:** **GOOD** ✅

**Current State:**
- ✅ All database queries use parameterized queries (`$1, $2, etc.`)
- ✅ No string concatenation in SQL queries
- ✅ Using PostgreSQL's `pg` library with proper parameterization

**Recommendation:**
- ✅ Current implementation is secure
- Continue using parameterized queries for all new code

---

### **17. ⚠️ Security Headers Coverage**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current State:**
- ✅ Security headers middleware exists (`src/middleware/securityHeaders.js`)
- ✅ Applied to admin routes
- ❌ **Not applied to all routes** (only admin routes)

**Missing Headers:**
- ❌ Content Security Policy (CSP) not implemented
- ❌ Referrer-Policy not set
- ❌ Permissions-Policy not set

**Recommendation:**
- Apply security headers to all routes (not just admin)
- Add CSP headers
- Add Referrer-Policy header
- Add Permissions-Policy header

---

### **18. ⚠️ Phone Number Validation**
**Status:** **GOOD** ✅

**Current State:**
- ✅ Phone numbers validated for E.164 format
- ✅ Short codes rejected
- ✅ Normalization applied

**Recommendation:**
- ✅ Current implementation is secure
- Consider adding country-specific validation if needed

---

### **19. ⚠️ Hardcoded Credentials in Docker Compose**
**Status:** **VULNERABILITY FOUND** ⚠️

**Risk:**
- Hardcoded database password in `db/farmmarket-db/docker-compose.yml`
- Password `password123` is visible in version control
- If repository is public or compromised, database credentials are exposed

**Location:**
- `db/farmmarket-db/docker-compose.yml` line 8: `POSTGRES_PASSWORD: password123`

**Risk Level:** 🟡 **MEDIUM-HIGH**

**Recommendation:**
- Use environment variables for database credentials
- Never commit passwords to version control
- Use `.env` file (already in `.gitignore`) or secrets manager
- Update docker-compose.yml to use environment variables

**Fix:**
```yaml
# docker-compose.yml
environment:
  POSTGRES_USER: ${POSTGRES_USER:-postgres}
  POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
  POSTGRES_DB: ${POSTGRES_DB:-farmmarket}
```

---

### **20. ⚠️ Phone Number Enumeration**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current State:**
- ✅ OTP request endpoint always returns success (prevents enumeration)
- ✅ Generic error messages for OTP verification
- ⚠️ Response time differences might still allow enumeration

**Risk:**
- Attackers could enumerate valid phone numbers by measuring response times
- Database queries for existing vs non-existing phone numbers might have different execution times

**Risk Level:** 🟡 **MEDIUM**

**Recommendation:**
- Add constant-time delays for OTP requests (to prevent timing attacks)
- Consider rate limiting per phone number more aggressively
- Monitor for enumeration attempts

---

### **21. ⚠️ User Enumeration via Error Messages**
**Status:** **GOOD** ✅

**Current State:**
- ✅ Generic error messages ("OTP invalid or expired")
- ✅ No distinction between "user not found" and "invalid OTP"
- ✅ User creation happens silently (find-or-create pattern)

**Recommendation:**
- ✅ Current implementation is secure
- Continue using generic error messages

---

### **22. ⚠️ Timing Attack on OTP Verification**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current State:**
- ✅ Uses bcrypt for OTP hashing (constant-time comparison)
- ⚠️ Early returns for expired/max attempts might leak timing information
- ⚠️ Database query execution time might differ

**Risk:**
- Attackers could measure response times to determine if OTP exists
- Different code paths have different execution times

**Risk Level:** 🟢 **LOW-MEDIUM**

**Recommendation:**
- Add constant-time delays for all OTP verification paths
- Ensure all code paths take similar time regardless of outcome
- Consider adding artificial delays to normalize response times

---

### **23. ⚠️ Missing Rate Limiting on User Routes**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current State:**
- ✅ Rate limiting on auth routes (OTP request, verify, refresh, logout)
- ✅ Rate limiting on admin routes
- ❌ **No rate limiting on user routes:**
  - `GET /users/me`
  - `PUT /users/me`
  - `GET /users/me/devices`
  - `DELETE /users/me/devices/:device_id`
  - `POST /users/me/logout-all-other-devices`

**Risk:**
- Attackers could abuse authenticated endpoints
- Profile updates could be spammed
- Device management endpoints could be abused

**Risk Level:** 🟡 **MEDIUM**

**Recommendation:**
- Add rate limiting to user routes
- Different limits for read vs write operations
- Consider per-user rate limits for sensitive operations

---

### **24. ⚠️ Information Disclosure in Admin Routes**
**Status:** **PARTIALLY ADDRESSED** ⚠️

**Current State:**
- ✅ Phone numbers are masked in admin security events endpoint
- ✅ Admin routes require authentication and admin role
- ⚠️ Admin can see user IDs, IP addresses, device IDs
- ⚠️ Full metadata (JSONB) is returned without sanitization

**Risk:**
- Admins have access to sensitive user data
- Metadata might contain sensitive information
- No audit trail for what admins access

**Risk Level:** 🟡 **MEDIUM**

**Recommendation:**
- ✅ Current masking is good
- Consider additional sanitization of metadata
- Add more granular admin permissions
- Log all admin data access

---

## 📊 **SUMMARY TABLE**

| Issue | Status | Risk Level | Priority | Notes |
|-------|--------|------------|----------|-------|
| 1. Rate Limiting | ✅ Resolved | - | - | Fully implemented |
| 2. OTP Logging | ✅ Resolved | - | - | Safe logging in place |
| 3. IP/Device Risk | ✅ Resolved | - | - | Risk scoring active |
| 4. Refresh Token Theft | ✅ Resolved | - | - | Environment fingerprinting |
| 5. JWT Claims | ✅ Resolved | - | - | Strict validation |
| 6. CORS | ✅ Resolved | - | - | Strict whitelisting |
| 7. CSRF | ✅ Documented | - | - | Not needed (Bearer tokens) |
| 8. Access Token Replay | ✅ **FIXED** | 🟢 Low | - | Step-up auth applied |
| 9. Secrets Management | ⚠️ Partial | 🟡 Medium-High | **HIGH** | Needs secrets manager |
| 10. Audit Monitoring | ⚠️ Partial | 🟡 Medium | **MEDIUM** | Needs webhook config |
| 11. Input Validation | ✅ **FIXED** | 🟢 Low | - | All routes validated |
| 12. Database Compromise | ⚠️ **FIXED** | 🟡 Medium | **LOW** | Needs TDE config |
| 13. MITM (HTTP) | ⚠️ Partial | 🔴 High | **MEDIUM** | Needs HTTPS enforcement |
| 14. CORS + XSS | ⚠️ Partial | 🟡 Medium | **LOW** | Needs CSP headers |
| 15. Error Disclosure | ✅ Good | - | - | No issues found |
| 16. SQL Injection | ✅ Good | - | - | Parameterized queries |
| 17. Security Headers | ⚠️ Partial | 🟡 Medium | **LOW** | Needs CSP |
| 18. Phone Validation | ✅ Good | - | - | Proper validation |
| 19. Hardcoded Credentials | ⚠️ Found | 🟡 Medium-High | **HIGH** | Docker compose |
| 20. Phone Enumeration | ⚠️ Partial | 🟡 Medium | **MEDIUM** | Timing attacks |
| 21. User Enumeration | ✅ Good | - | - | Generic errors |
| 22. Timing Attacks | ⚠️ Partial | 🟢 Low-Medium | **LOW** | Constant-time delays |
| 23. Missing Rate Limits | ⚠️ Partial | 🟡 Medium | **MEDIUM** | User routes |
| 24. Admin Info Disclosure | ⚠️ Partial | 🟡 Medium | **LOW** | Metadata sanitization |

---

## 🎯 **IMMEDIATE ACTION ITEMS (Priority Order)**

### **🔴 HIGH PRIORITY**

1. **Hardcoded Credentials** (Issue #19) - **NEW!** ⚠️
   - ❌ Remove hardcoded password from `docker-compose.yml`
   - ⚠️ Use environment variables for database credentials
   - ⚠️ Ensure no secrets are committed to version control

2. **✅ Secrets Management** (Issue #9)
   - ✅ `.env` is in `.gitignore` (verified)
   - ⚠️ Move to environment variables from deployment platform (not `.env` file)
   - ⚠️ Plan integration with secrets manager (AWS Secrets Manager, HashiCorp Vault)

3. **✅ Step-Up Auth** (Issue #8) - **FIXED!** ✅
   - ✅ Applied to all sensitive routes

4. **✅ Input Validation** (Issue #11) - **FIXED!** ✅
   - ✅ Applied to all user routes

### **🟡 MEDIUM PRIORITY**

1. **Phone Number Enumeration** (Issue #20)
   - Add constant-time delays for OTP requests
   - Monitor for enumeration attempts
   - Consider more aggressive rate limiting

2. **Missing Rate Limiting** (Issue #23)
   - Add rate limiting to user routes
   - Different limits for read vs write operations
   - Per-user rate limits for sensitive operations

3. **Active Monitoring/Alerting** (Issue #10)
   - Configure `SECURITY_ALERT_WEBHOOK_URL` in production
   - Set up log aggregation (CloudWatch, Datadog, etc.)
   - Test alerting with HIGH_RISK events

4. **HTTPS Enforcement** (Issue #13)
   - Add startup validation that HTTPS is configured in production
   - Apply HSTS headers to all routes (not just admin)
   - Document TLS requirements

5. **Database Security** (Issue #12) - **FIXED!** ✅
   - ✅ Field-level encryption implemented - **CONFIGURE** `ENCRYPTION_ENABLED=true`
   - ✅ Database access logging implemented - **CONFIGURE** `DB_ACCESS_LOGGING_ENABLED=true`
   - ⚠️ Enable TDE at database infrastructure level (see `DATABASE_ENCRYPTION_SETUP.md`)

### **🟢 LOW PRIORITY**

1. **Security Headers Enhancement** (Issue #17)
   - Apply security headers to all routes
   - Add Content Security Policy (CSP) headers
   - Add Referrer-Policy and Permissions-Policy headers

2. **Timing Attacks** (Issue #22)
   - Add constant-time delays for OTP verification
   - Normalize response times across all code paths

3. **Admin Info Disclosure** (Issue #24)
   - Additional sanitization of metadata in admin routes
   - More granular admin permissions

4. **CORS Validation** (Issue #14)
   - Add startup validation that CORS origins are configured in production
   - Document XSS prevention best practices

---

## 📝 **CODE LOCATIONS FOR FIXES**

### **Secrets Manager Integration**
**File:** `src/services/jwtKeys.js`
- Line 169: Implement `loadKeysFromSecretsManager()` function
- Replace environment variable reads with secrets manager calls

### **Alerting Configuration**
**File:** `src/services/auditLogger.js`
- Line 34: `SECURITY_ALERT_WEBHOOK_URL` - Configure in production
- Line 35: `SECURITY_ALERT_MIN_LEVEL` - Set to 'HIGH_RISK' or 'SUSPICIOUS'

### **Security Headers Enhancement**
**File:** `src/middleware/securityHeaders.js`
- Add CSP headers
- Apply to all routes in `src/index.js`

### **HTTPS Enforcement**
**File:** `src/index.js`
- Add startup validation for HTTPS in production
- Apply HSTS headers to all routes

### **Hardcoded Credentials Fix**
**File:** `db/farmmarket-db/docker-compose.yml`
- Replace hardcoded password with environment variable
- Use `${POSTGRES_PASSWORD}` instead of `password123`

### **Rate Limiting for User Routes**
**File:** `src/routes/userRoutes.js`
- Add rate limiting middleware to all user routes
- Consider per-user rate limits for sensitive operations

---

## 🎉 **CONCLUSION**

**Overall Security Posture:** 🟢 **GOOD** (Improved from 🟡 GOOD)

**Latest Update:**
- ✅ **Issue #12 (Database Compromise) - FIXED!**
  - Field-level encryption for phone numbers implemented
  - Database access logging implemented
  - See `DATABASE_ENCRYPTION_SETUP.md` for configuration instructions

**Progress:**
- **9 out of 14 original issues are fully resolved** ✅
- **5 issues are partially resolved** ⚠️ (need configuration/completion)
- **6 new vulnerabilities found** ⚠️ (Issues #19-24)
- **0 critical vulnerabilities** 🔴 (down from 2)

**Key Improvements:**
1. ✅ Step-up auth now applied to all sensitive routes
2. ✅ Input validation now applied to all user routes
3. ✅ Webhook alerting infrastructure ready (needs configuration)

**Remaining Gaps:**
1. **🔴 HIGH:** Hardcoded credentials in docker-compose.yml (Issue #19)
2. Secrets management needs secrets manager integration
3. Alerting needs webhook URL configuration
4. ✅ Database field-level encryption implemented - **NEEDS CONFIGURATION**
5. Database TDE needs infrastructure-level setup
6. Rate limiting missing on user routes (Issue #23)
7. Phone number enumeration via timing attacks (Issue #20)
8. HTTPS enforcement needs startup validation
9. Security headers need CSP and broader application

**Recommendation:** The service is **production-ready** with proper configuration, but should address the HIGH and MEDIUM priority items before handling sensitive production data.