# Farm Auth Service - Architecture Documentation

## 1. High-Level Overview

The Farm Auth Service is a Node.js + Express authentication and security service that provides phone-based authentication using OTP (One-Time Password) via SMS, JWT-based access and refresh tokens, comprehensive rate limiting, security hardening, and audit logging. The service is designed for a mobile application ecosystem where users authenticate using their phone numbers.

**Core Functionality:**
- Phone number-based authentication with OTP verification via SMS (Twilio)
- JWT access tokens (short-lived) and refresh tokens (long-lived) with rotation
- Device tracking and multi-device session management
- Comprehensive rate limiting at multiple levels (phone, IP, user)
- Security hardening: CORS validation, security headers, field-level encryption, timing attack protection, enumeration detection
- Audit logging with risk scoring and webhook alerting
- Admin dashboard for security event monitoring

**External Systems:**
- **PostgreSQL Database**: Stores users, OTP codes, refresh tokens, devices, and audit logs
- **Redis** (optional): Used for rate limiting counters and OTP tracking (falls back to in-memory store)
- **Twilio**: SMS provider for OTP delivery (optional - service works without it for development)
- **Webhook Endpoints**: For security alerts (Slack, Discord, or custom webhooks)

---

## 2. Architecture & Components

### 2.1 HTTP/API Layer

**Files:**
- `src/index.js` - Express server setup and middleware configuration
- `src/routes/authRoutes.js` - Authentication endpoints
- `src/routes/userRoutes.js` - User profile and device management endpoints
- `src/routes/adminRoutes.js` - Admin security dashboard endpoints

**Responsibilities:**
- Request routing and middleware orchestration
- Input validation and sanitization
- Response formatting
- Error handling

**Middleware Order (Critical):**
1. Trust proxy configuration (if behind reverse proxy)
2. CORS validation (startup and runtime)
3. JSON body parser
4. Security headers (global)
5. Route-specific middleware (validation, rate limiting, auth)

**Key Configuration:**
- `TRUST_PROXY`: Set to `'true'` if behind reverse proxy (nginx, load balancer)
- `CORS_ALLOWED_ORIGINS`: Comma-separated list of allowed origins (required in production)
- `ENABLE_ADMIN_DASHBOARD`: Set to `'true'` to enable admin routes

### 2.2 Authentication Core

**Files:**
- `src/services/otpService.js` - OTP generation, hashing (bcrypt), storage, and verification
- `src/services/tokenService.js` - JWT access/refresh token issuance, rotation, and validation
- `src/services/jwtKeys.js` - JWT key management with rotation support
- `src/middleware/authMiddleware.js` - JWT access token validation
- `src/middleware/stepUpAuth.js` - Step-up authentication for sensitive operations

**Responsibilities:**
- OTP generation (6-digit random codes)
- OTP hashing with bcrypt (10 rounds)
- OTP storage in database with expiry and attempt tracking
- JWT token signing with key rotation support
- Refresh token rotation and reuse detection
- Device fingerprinting and tracking

**Key Features:**
- **OTP Security**: Hashed with bcrypt, constant-time verification to prevent timing attacks
- **Token Rotation**: Refresh tokens rotate on each use, old tokens are revoked
- **Reuse Detection**: Detects if a refresh token is reused (theft indicator)
- **Step-Up Auth**: Requires recent OTP verification for sensitive operations

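The OTP handling described above can be sketched as follows. This is a minimal illustration, not the code in `otpService.js`: the function names are hypothetical, and because bcrypt is not part of Node's standard library, `crypto.scryptSync` stands in for the bcrypt hash so the sketch runs without dependencies. The dummy-hash path mirrors the "always compare, even when no OTP is found" behaviour.

```javascript
const crypto = require('crypto');

// Hypothetical helpers. The service itself hashes with bcrypt (10 rounds);
// scrypt is used here only so the sketch has no external dependencies.
function generateOtp() {
  // crypto.randomInt is cryptographically secure; the upper bound is exclusive.
  return crypto.randomInt(0, 1000000).toString().padStart(6, '0');
}

function hashOtp(code, salt = crypto.randomBytes(16)) {
  const digest = crypto.scryptSync(code, salt, 32);
  return `${salt.toString('hex')}:${digest.toString('hex')}`;
}

// Fixed dummy hash so a lookup miss costs the same as a real comparison.
const DUMMY_HASH = hashOtp('000000');

function verifyOtp(code, storedHash) {
  const found = typeof storedHash === 'string';
  const [saltHex, digestHex] = (found ? storedHash : DUMMY_HASH).split(':');
  const expected = Buffer.from(digestHex, 'hex');
  const actual = crypto.scryptSync(String(code), Buffer.from(saltHex, 'hex'), 32);
  // timingSafeEqual requires equal-length buffers; both are 32 bytes here.
  const match = crypto.timingSafeEqual(actual, expected);
  return found && match;
}
```

A caller would store `hashOtp(code)` alongside the expiry and attempt counter, and pass `undefined` as `storedHash` when no OTP row exists for the phone number.
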
### 2.3 Security Layer

**Files:**
- `src/middleware/rateLimitMiddleware.js` - OTP request/verification rate limiting
- `src/middleware/userRateLimit.js` - User route rate limiting (read/write/sensitive)
- `src/middleware/adminRateLimit.js` - Admin route rate limiting
- `src/middleware/securityHeaders.js` - Security headers (CSP, HSTS, X-Frame-Options, etc.)
- `src/utils/corsValidator.js` - CORS configuration validation
- `src/utils/timingProtection.js` - Timing attack protection for OTP flows
- `src/utils/enumerationDetection.js` - Phone number enumeration detection
- `src/services/riskScoring.js` - Risk scoring for login/refresh attempts
- `src/middleware/validation.js` - Input validation middleware

**Responsibilities:**
- Rate limiting at multiple levels (phone, IP, user, admin)
- Security headers enforcement
- CORS origin validation (startup and runtime)
- Timing attack mitigation (constant-time OTP verification)
- Enumeration detection and IP blocking
- Risk scoring based on IP/device changes
- Input validation and sanitization

**Key Features:**
- **Multi-Level Rate Limiting**: Phone-based, IP-based, and user-based limits
- **Enumeration Protection**: Detects and blocks IPs attempting phone number enumeration
- **Timing Attack Protection**: All OTP operations use constant-time execution
- **Risk Scoring**: Calculates risk scores for suspicious login/refresh attempts

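Enumeration detection, as described above, comes down to counting distinct phone numbers per IP inside a time window. The sketch below is an assumption about how that could look, not the contents of `enumerationDetection.js`: the class name is hypothetical, state is kept in process memory rather than Redis, and treating "more than 5 unique phones in 10 minutes" as suspicious is one reading of the documented default.

```javascript
// Hypothetical in-memory tracker; the service keeps such counters in Redis when available.
class EnumerationTracker {
  constructor({ maxPhones = 5, windowMs = 10 * 60 * 1000 } = {}) {
    this.maxPhones = maxPhones;
    this.windowMs = windowMs;
    this.seen = new Map(); // ip -> Map(phone -> lastSeenMs)
  }

  // Records the request and reports whether the IP exceeded the unique-phone limit.
  record(ip, phone, now = Date.now()) {
    const phones = this.seen.get(ip) ?? new Map();
    phones.set(phone, now);
    // Drop phones last seen outside the window.
    for (const [knownPhone, lastSeen] of phones) {
      if (now - lastSeen >= this.windowMs) phones.delete(knownPhone);
    }
    this.seen.set(ip, phones);
    return { uniquePhones: phones.size, suspicious: phones.size > this.maxPhones };
  }
}
```

A middleware could call `record(req.ip, normalizedPhone)` on every OTP request and switch to stricter limits, or block the IP, once `suspicious` is true.
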
### 2.4 Persistence Layer

**Files:**
- `src/db.js` - PostgreSQL connection pool and query wrapper
- `src/middleware/dbAccessLogger.js` - Optional database access logging
- `src/utils/fieldEncryption.js` - Field-level encryption for PII (phone numbers)
- `src/utils/encryptedPhoneSearch.js` - Phone number search with encryption support

**Database Tables:**
- `users` - User accounts (phone number, name, role, user_type)
- `otp_codes` - OTP codes (hashed, with expiry and attempt tracking)
- `refresh_tokens` - Refresh tokens (hashed, with rotation tracking)
- `user_devices` - Device tracking (platform, model, OS, app version)
- `auth_audit` - Security audit logs (all authentication events)

**Responsibilities:**
- Database connection management
- Query execution with optional logging
- Field-level encryption for sensitive data (phone numbers)
- Database schema management (auto-creates tables if missing)

**Key Features:**
- **Field-Level Encryption**: Phone numbers encrypted at rest (AES-256-GCM)
- **Database Access Logging**: Optional logging of all DB queries (for security auditing)
- **Backward Compatibility**: Handles both encrypted and plaintext phone numbers during migration

### 2.5 Integration Layer

**Files:**
- `src/services/smsService.js` - Twilio SMS integration
- `src/services/auditLogger.js` - Audit logging with webhook alerting
- `src/services/redisClient.js` - Redis client with graceful fallback

**Responsibilities:**
- SMS delivery via Twilio (with fallback logging)
- Security event logging to database
- Webhook alerting for high-risk events
- Redis connection management (optional, falls back to in-memory)

**Key Features:**
- **Twilio Integration**: Sends OTP via SMS (optional - works without for development)
- **Webhook Alerting**: Sends alerts to Slack/Discord/custom webhooks for SUSPICIOUS/HIGH_RISK events
- **Redis Fallback**: Gracefully falls back to in-memory store if Redis unavailable

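The Redis fallback above can be pictured as a small per-process store with TTL support that is used whenever no ready Redis client is available. The sketch is illustrative only: `MemoryStore` and `selectStore` are hypothetical names, and the `isReady` check assumes a client object that exposes such a flag. Because the fallback lives in one process, counters are not shared between instances.

```javascript
// Hypothetical per-process fallback store with TTL support.
class MemoryStore {
  constructor() {
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlSeconds, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + ttlSeconds * 1000 });
  }

  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // lazily evict expired entries
      return null;
    }
    return entry.value;
  }
}

// Picks Redis when a ready client is supplied, otherwise the in-memory store.
// `isReady` is an assumed property of the client object.
function selectStore(redisClient) {
  if (redisClient && redisClient.isReady) return { kind: 'redis', client: redisClient };
  return { kind: 'memory', client: new MemoryStore() };
}
```
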
---

## 3. Request Flows

### 3.1 OTP Login Flow

**Step-by-Step:**

1. **Client requests OTP** (`POST /auth/request-otp`)
   - Input validation (phone number format)
   - Check for active OTP (2-minute no-resend rule)
   - Rate limit by phone number (3 per 10 min, 10 per day)
   - Rate limit by IP address (20 per 10 min, 100 per day)
   - Check if IP is blocked (enumeration or CIDR ranges)
   - Enumeration detection (if suspicious, apply stricter limits)
   - Timing protection wrapper (constant-time execution)
   - Normalize phone number (E.164 format)
   - Generate 6-digit OTP code
   - Hash OTP with bcrypt (10 rounds)
   - Encrypt phone number (if encryption enabled)
   - Store OTP in database (delete old OTPs for same phone)
   - Mark OTP as active in Redis/memory (2-minute TTL)
   - Send SMS via Twilio (or log to console if not configured)
   - Log audit event (otp_request, INFO risk level)
   - Return success (even if SMS fails - OTP is generated)

2. **Client verifies OTP** (`POST /auth/verify-otp`)
   - Input validation (phone number, 6-digit code, device_id, device_info)
   - Rate limit failed verifications (10 per hour per phone)
   - Check if IP is blocked
   - Timing protection wrapper (constant-time execution)
   - Normalize phone number
   - Encrypt phone number for search
   - Query OTP from database (with constant-time dummy hash if not found)
   - Check expiry and max attempts, and verify the code (`bcrypt.compare` always runs, so every outcome takes comparable time)
   - If invalid: increment attempt count, log suspicious event, return generic error
   - If valid: delete OTP, find or create user, decrypt phone number
   - Update user last_login_at
   - Upsert device record (track platform, model, OS, app version)
   - Calculate risk score (IP change, device change, user agent change)
   - Log audit event (login, risk level based on score)
   - Check for anomalies (multiple failed attempts, high-risk IPs)
   - Issue access token (with high_assurance flag) and refresh token
   - Return user data, tokens, and device info

**Mermaid Sequence Diagram:**

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant RateLimiter
    participant OTPService
    participant DB
    participant Twilio
    participant AuditLogger

    Client->>API: POST /auth/request-otp<br/>{phone_number}
    API->>API: Validate input
    API->>RateLimiter: Check active OTP (2-min rule)
    RateLimiter-->>API: No active OTP
    API->>RateLimiter: Rate limit by phone (3/10min)
    RateLimiter-->>API: Allowed
    API->>RateLimiter: Rate limit by IP (20/10min)
    RateLimiter-->>API: Allowed
    API->>API: Check IP blocking
    API->>OTPService: Generate OTP
    OTPService->>DB: Store hashed OTP
    OTPService->>RateLimiter: Mark active (2-min TTL)
    API->>Twilio: Send SMS
    Twilio-->>API: SMS sent (or error)
    API->>AuditLogger: Log otp_request event
    API-->>Client: {ok: true}

    Client->>API: POST /auth/verify-otp<br/>{phone_number, code, device_id}
    API->>API: Validate input
    API->>RateLimiter: Check failed attempts (10/hour)
    RateLimiter-->>API: Allowed
    API->>OTPService: Verify OTP (constant-time)
    OTPService->>DB: Query OTP (with dummy hash if not found)
    OTPService->>OTPService: bcrypt.compare (constant-time)
    alt OTP Valid
        OTPService->>DB: Delete OTP
        API->>DB: Find or create user
        API->>DB: Upsert device
        API->>API: Calculate risk score
        API->>AuditLogger: Log login (with risk level)
        API->>API: Issue access + refresh tokens
        API-->>Client: {user, access_token, refresh_token}
    else OTP Invalid
        OTPService->>DB: Increment attempt count
        API->>AuditLogger: Log suspicious attempt
        API-->>Client: {error: "OTP invalid or expired"}
    end
```

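The per-phone limits in the request step (3 per 10 minutes, 10 per day) can be illustrated with a fixed-window counter. This is a sketch under assumptions: the service's limiter in `rateLimitMiddleware.js` may use a different windowing scheme, the store here is a plain in-process `Map` rather than Redis, and `checkPhoneRateLimit` is a hypothetical name.

```javascript
// Hypothetical fixed-window counters; the limits mirror the documented defaults.
const LIMITS = [
  { name: '10min', windowMs: 10 * 60 * 1000, max: 3 },
  { name: 'day', windowMs: 24 * 60 * 60 * 1000, max: 10 },
];

const counters = new Map(); // `${phone}:${window}` -> { count, resetAt }

function checkPhoneRateLimit(phone, now = Date.now()) {
  // First pass: reject without counting if any active window is already full.
  for (const { name, max } of LIMITS) {
    const entry = counters.get(`${phone}:${name}`);
    if (entry && entry.resetAt > now && entry.count >= max) {
      return { allowed: false, window: name, retryAfterMs: entry.resetAt - now };
    }
  }
  // Second pass: count the request in every window, starting a new one if expired.
  for (const { name, windowMs } of LIMITS) {
    const key = `${phone}:${name}`;
    const entry = counters.get(key);
    if (!entry || entry.resetAt <= now) counters.set(key, { count: 1, resetAt: now + windowMs });
    else entry.count += 1;
  }
  return { allowed: true };
}
```

Checking all windows before incrementing any of them keeps a rejected request from consuming quota in the longer window.
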
### 3.2 Token Refresh Flow

**Step-by-Step:**

1. **Client requests token refresh** (`POST /auth/refresh`)
   - Input validation (refresh_token)
   - Check if IP is blocked
   - Decode refresh token to get key ID
   - Verify refresh token signature (try all keys if key ID not found)
   - Validate JWT claims (iss, aud, exp, iat)
   - Query refresh token from database (by token_id)
   - Verify token hash matches (bcrypt.compare)
   - Check if token is revoked or expired
   - Check refresh token idle timeout (max idle minutes)
   - Calculate risk score (IP change, device change, user agent change)
   - If suspicious: log suspicious refresh event
   - If suspicious and REQUIRE_OTP_ON_SUSPICIOUS_REFRESH: return step_up_required error
   - Update token last_used_at
   - Revoke old refresh token
   - Issue new access token and new refresh token (rotation)
   - Update device last_seen_at
   - Log audit event (token_refresh, risk level based on score)
   - Return new tokens

**Mermaid Sequence Diagram:**

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant TokenService
    participant JWTKeys
    participant DB
    participant RiskScoring
    participant AuditLogger

    Client->>API: POST /auth/refresh<br/>{refresh_token}
    API->>API: Validate input
    API->>API: Check IP blocking
    API->>TokenService: Verify refresh token
    TokenService->>JWTKeys: Get key secret (by key ID)
    JWTKeys-->>TokenService: Key secret
    TokenService->>TokenService: Verify JWT signature
    TokenService->>TokenService: Validate claims (iss, aud, exp)
    TokenService->>DB: Query refresh token (by token_id)
    DB-->>TokenService: Token record
    TokenService->>TokenService: Verify token hash (bcrypt)
    alt Token Valid
        TokenService->>TokenService: Check expiry & idle timeout
        API->>RiskScoring: Calculate risk score
        RiskScoring->>DB: Get previous auth info
        RiskScoring-->>API: Risk score & reasons
        alt Suspicious Refresh
            API->>AuditLogger: Log suspicious refresh
            alt Require OTP
                API-->>Client: {error: "step_up_required"}
            else Allow with Risk
                API->>TokenService: Rotate refresh token
                TokenService->>DB: Revoke old token
                TokenService->>DB: Store new token
                API->>AuditLogger: Log refresh (SUSPICIOUS/HIGH_RISK)
                API-->>Client: {access_token, refresh_token}
            end
        else Normal Refresh
            API->>TokenService: Rotate refresh token
            TokenService->>DB: Revoke old token
            TokenService->>DB: Store new token
            API->>DB: Update device last_seen_at
            API->>AuditLogger: Log refresh (INFO)
            API-->>Client: {access_token, refresh_token}
        end
    else Token Invalid
        API-->>Client: {error: "Invalid refresh token"}
    end
```

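Rotation with reuse detection, the core of the refresh flow, can be sketched as below. Several things here are simplifications and not the service's implementation: the token is an opaque `id.secret` string rather than a JWT, SHA-256 stands in for the bcrypt hash stored in `refresh_tokens`, the table is an in-memory `Map`, and expiry, idle-timeout, and risk checks are left out. All function names are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory stand-in for the refresh_tokens table.
const tokens = new Map(); // tokenId -> { hash, userId, deviceId, revoked }

const sha256 = (value) => crypto.createHash('sha256').update(value).digest('hex');

function issueRefreshToken(userId, deviceId) {
  const tokenId = crypto.randomUUID();
  const secret = crypto.randomBytes(32).toString('hex');
  // Only the hash of the secret is stored, never the secret itself.
  tokens.set(tokenId, { hash: sha256(secret), userId, deviceId, revoked: false });
  return `${tokenId}.${secret}`;
}

function rotateRefreshToken(presented) {
  const [tokenId, secret = ''] = presented.split('.');
  const record = tokens.get(tokenId);
  if (!record) return { ok: false, reason: 'invalid' };
  const hashMatches = crypto.timingSafeEqual(
    Buffer.from(sha256(secret), 'hex'),
    Buffer.from(record.hash, 'hex'),
  );
  if (!hashMatches) return { ok: false, reason: 'invalid' };
  if (record.revoked) {
    // A rotated token was presented again: revoke every token for this user + device.
    for (const other of tokens.values()) {
      if (other.userId === record.userId && other.deviceId === record.deviceId) other.revoked = true;
    }
    return { ok: false, reason: 'reuse_detected' };
  }
  record.revoked = true; // the old token is revoked on every successful use
  return { ok: true, refreshToken: issueRefreshToken(record.userId, record.deviceId) };
}
```

Once reuse is detected, the legitimate client's newest token is revoked as well, so the device has to log in again with a fresh OTP.
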
### 3.3 Logout Flow

**Step-by-Step:**

1. **Single-device logout** (`POST /auth/logout`)
   - Input validation (refresh_token)
   - Verify refresh token (same as refresh flow)
   - If token invalid/already revoked: return success (idempotent)
   - Revoke all refresh tokens for user + device
   - Log audit event (logout, INFO)
   - Return success

2. **Logout all other devices** (`POST /users/me/logout-all-other-devices`)
   - Requires authentication (access token)
   - Requires step-up auth (recent OTP or high_assurance token)
   - Rate limited (10 per hour per user)
   - Get current device_id from header or body
   - Mark all other devices as inactive
   - Revoke refresh tokens for all other devices
   - Log audit event (logout_all_other_devices, INFO)
   - Return count of revoked devices

3. **Logout from all devices** (`POST /users/me/logout-all-devices`)
   - Requires authentication (access token)
   - Requires step-up auth (recent OTP or high_assurance token)
   - Rate limited (10 per hour per user)
   - Revoke all refresh tokens for the user (all devices)
   - Mark all devices as inactive
   - Increment user's `token_version` to invalidate all existing access tokens
   - Log audit event (logout_all_devices, HIGH_RISK) - triggers security alert
   - Return success with revoked tokens count
   - **Security Note**: This is a critical security operation used when account compromise is suspected. All existing access tokens become invalid immediately, even if they haven't expired yet.

4. **Revoke specific device** (`DELETE /users/me/devices/:device_id`)
   - Requires authentication (access token)
   - Requires step-up auth (recent OTP or high_assurance token)
   - Rate limited (10 per hour per user)
   - Validate device_id parameter
   - Mark device as inactive
   - Revoke refresh tokens for device
   - Log audit event (device_revoked, INFO)
   - Return success

**Mermaid Sequence Diagram:**

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant TokenService
    participant DB
    participant AuditLogger

    Note over Client,AuditLogger: Single Device Logout
    Client->>API: POST /auth/logout<br/>{refresh_token}
    API->>TokenService: Verify refresh token
    TokenService-->>API: Token info
    API->>TokenService: Revoke refresh token
    TokenService->>DB: Mark token revoked
    API->>AuditLogger: Log logout event
    API-->>Client: {ok: true}

    Note over Client,AuditLogger: Logout All Other Devices
    Client->>API: POST /users/me/logout-all-other-devices<br/>{current_device_id}
    API->>API: Verify access token
    API->>API: Check step-up auth
    API->>API: Rate limit check (10/hour)
    API->>DB: Mark other devices inactive
    API->>TokenService: Revoke tokens for other devices
    TokenService->>DB: Revoke tokens
    API->>AuditLogger: Log logout_all_other_devices
    API-->>Client: {ok: true, revoked_devices_count: N}

    Note over Client,AuditLogger: Logout All Devices (Global Logout)
    Client->>API: POST /users/me/logout-all-devices
    API->>API: Verify access token
    API->>API: Check step-up auth
    API->>API: Rate limit check (10/hour)
    API->>TokenService: Revoke all user tokens
    TokenService->>DB: Revoke all refresh tokens
    TokenService->>DB: Mark all devices inactive
    TokenService->>DB: Increment token_version
    API->>AuditLogger: Log logout_all_devices (HIGH_RISK)
    AuditLogger->>AuditLogger: Trigger security alert
    API-->>Client: {ok: true, revoked_tokens_count: N}
```

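The effect of incrementing `token_version` during a global logout can be shown with a few lines. This is a hedged sketch: the user record is a hypothetical in-memory `Map` instead of the `users` table, and signature and expiry checks on the access token are assumed to have happened already.

```javascript
// Hypothetical in-memory user record; the service stores token_version in the users table.
const users = new Map([['user-1', { tokenVersion: 0 }]]);

// Claims carry the version that was current when the access token was issued.
function issueClaims(userId) {
  return { sub: userId, token_version: users.get(userId).tokenVersion };
}

// An access token is accepted only while its version matches the stored one.
function isAccessTokenCurrent(claims) {
  const user = users.get(claims.sub);
  return Boolean(user) && claims.token_version === user.tokenVersion;
}

// Global logout: bumping the version invalidates every outstanding access token.
function logoutAllDevices(userId) {
  users.get(userId).tokenVersion += 1;
}
```

Tokens issued before `logoutAllDevices` fail the check even though their `exp` claim is still in the future; tokens issued afterwards pass.
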
### 3.4 Admin Security Events Flow

**Step-by-Step:**

1. **Admin requests security events** (`GET /admin/security-events`)
   - Requires authentication (access token)
   - Requires admin role (security_admin)
   - Rate limited (100 per 15 minutes per admin)
   - Validate and sanitize query parameters (risk_level, limit, offset, search)
   - Build parameterized SQL query (prevent injection)
   - Query auth_audit table with filters
   - Mask phone numbers (keep last 4 digits)
   - Sanitize all output fields
   - Get total count for pagination
   - Get statistics (last 24 hours: total, high_risk, suspicious, info)
   - Log admin access event (admin_view_security_events, INFO)
   - Return events, pagination info, and statistics

**Mermaid Sequence Diagram:**

```mermaid
sequenceDiagram
    participant Admin
    participant API
    participant AuthMiddleware
    participant AdminAuth
    participant AdminRateLimit
    participant DB
    participant AuditLogger

    Admin->>API: GET /admin/security-events<br/>?risk_level=HIGH_RISK&limit=200
    API->>AuthMiddleware: Verify access token
    AuthMiddleware-->>API: User info
    API->>AdminAuth: Check admin role
    AdminAuth-->>API: Authorized
    API->>AdminRateLimit: Check rate limit (100/15min)
    AdminRateLimit-->>API: Allowed
    API->>API: Sanitize query params
    API->>DB: Query auth_audit (parameterized)
    DB-->>API: Events data
    API->>API: Mask phone numbers
    API->>API: Sanitize output
    API->>DB: Get total count
    API->>DB: Get statistics (24h)
    API->>AuditLogger: Log admin access
    API-->>Admin: {events, pagination, stats}
```

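Two steps of the admin flow lend themselves to a short sketch: masking phone numbers down to their last four digits and sanitizing the query parameters before they reach the parameterized query. The bounds used here (default limit 50, maximum 200) are assumptions chosen for illustration, since the flow above does not state them, and both function names are hypothetical.

```javascript
// Keeps only the last four characters visible, e.g. for the admin dashboard.
function maskPhone(phone) {
  const value = String(phone);
  if (value.length <= 4) return '*'.repeat(value.length);
  return '*'.repeat(value.length - 4) + value.slice(-4);
}

const RISK_LEVELS = new Set(['INFO', 'SUSPICIOUS', 'HIGH_RISK']);

// Hypothetical bounds: the documented flow validates these parameters
// but does not state the actual limits.
function sanitizeQuery(query, { defaultLimit = 50, maxLimit = 200 } = {}) {
  const limit = Math.min(Math.max(parseInt(query.limit, 10) || defaultLimit, 1), maxLimit);
  const offset = Math.max(parseInt(query.offset, 10) || 0, 0);
  // Unknown risk levels are dropped instead of being passed to the query.
  const riskLevel = RISK_LEVELS.has(query.risk_level) ? query.risk_level : null;
  return { limit, offset, riskLevel };
}
```
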
---

## 4. Timeouts, Expiry, and Limits

| Name | ENV Variable / Config | Default Value | Defined In | What It Affects |
|------|----------------------|---------------|------------|-----------------|
| **OTP Expiry** | `OTP_TTL_SECONDS` | `120` (2 minutes) | `src/services/otpService.js:10` | OTP validity period |
| **OTP Resend Throttle** | (hardcoded) | `120` seconds | `src/middleware/rateLimitMiddleware.js:154` | Minimum time between OTP requests for same phone |
| **Max OTP Verification Attempts** | `OTP_VERIFY_MAX_ATTEMPTS` | `5` | `src/services/otpService.js:12` | Maximum attempts to verify an OTP before it's invalidated |
| **JWT Access Token Expiry** | `JWT_ACCESS_TTL` | `'15m'` (15 minutes) | `src/config.js:72` | Access token lifetime |
| **JWT Refresh Token Expiry** | `JWT_REFRESH_TTL` | `'7d'` (7 days) | `src/config.js:73` | Refresh token lifetime |
| **Refresh Token Max Idle** | `REFRESH_MAX_IDLE_MINUTES` | `4320` (3 days) | `src/config.js:58-60` | Maximum idle time before refresh token expires |
| **Step-Up Auth Window** | `STEP_UP_OTP_WINDOW_MINUTES` | `5` minutes | `src/middleware/stepUpAuth.js:26` | Time window for "recent" OTP verification for step-up auth |
| **OTP Request - Phone (10 min)** | `OTP_REQ_PHONE_10MIN_LIMIT` | `3` | `src/middleware/rateLimitMiddleware.js:24` | Max OTP requests per phone per 10 minutes |
| **OTP Request - Phone (24h)** | `OTP_REQ_PHONE_DAY_LIMIT` | `10` | `src/middleware/rateLimitMiddleware.js:25` | Max OTP requests per phone per 24 hours |
| **OTP Request - IP (10 min)** | `OTP_REQ_IP_10MIN_LIMIT` | `20` | `src/middleware/rateLimitMiddleware.js:26` | Max OTP requests per IP per 10 minutes |
| **OTP Request - IP (24h)** | `OTP_REQ_IP_DAY_LIMIT` | `100` | `src/middleware/rateLimitMiddleware.js:27` | Max OTP requests per IP per 24 hours |
| **OTP Verify Failed (1h)** | `OTP_VERIFY_FAILED_PER_HOUR_LIMIT` | `10` | `src/middleware/rateLimitMiddleware.js:31` | Max failed verification attempts per phone per hour |
| **Enumeration IP Block Duration** | `ENUMERATION_BLOCK_DURATION` | `3600` (1 hour) | `src/middleware/rateLimitMiddleware.js:40` | Duration IP is blocked after enumeration detection |
| **User Rate Limit - Read** | `USER_RATE_LIMIT_READ_MAX` | `100` | `src/middleware/userRateLimit.js:25` | Max read requests per user per 15 minutes |
| **User Rate Limit - Read Window** | `USER_RATE_LIMIT_READ_WINDOW` | `900` (15 min) | `src/middleware/userRateLimit.js:26` | Time window for read rate limit |
| **User Rate Limit - Write** | `USER_RATE_LIMIT_WRITE_MAX` | `20` | `src/middleware/userRateLimit.js:29` | Max write requests per user per 15 minutes |
| **User Rate Limit - Write Window** | `USER_RATE_LIMIT_WRITE_WINDOW` | `900` (15 min) | `src/middleware/userRateLimit.js:30` | Time window for write rate limit |
| **User Rate Limit - Sensitive** | `USER_RATE_LIMIT_SENSITIVE_MAX` | `10` | `src/middleware/userRateLimit.js:33` | Max sensitive requests per user per hour |
| **User Rate Limit - Sensitive Window** | `USER_RATE_LIMIT_SENSITIVE_WINDOW` | `3600` (1 hour) | `src/middleware/userRateLimit.js:34` | Time window for sensitive rate limit |
| **Admin Rate Limit** | `ADMIN_RATE_LIMIT_MAX` | `100` | `src/middleware/adminRateLimit.js:23` | Max admin requests per admin per 15 minutes |
| **Admin Rate Limit Window** | `ADMIN_RATE_LIMIT_WINDOW` | `900` (15 min) | `src/middleware/adminRateLimit.js:24` | Time window for admin rate limit |
| **Webhook HTTP Timeout** | (hardcoded) | `5000` ms | `src/services/auditLogger.js:459` | Webhook request timeout (also used for Twilio if configured) |
| **Webhook Retry Delay** | (hardcoded) | `3000` ms | `src/services/auditLogger.js:498` | Delay before retrying failed webhook alerts |
| **OTP Request Min Delay** | `OTP_REQUEST_MIN_DELAY` | `500` ms | `src/utils/timingProtection.js:26` | Minimum delay for OTP requests (timing attack protection) |
| **OTP Verify Min Delay** | `OTP_VERIFY_MIN_DELAY` | `300` ms | `src/utils/timingProtection.js:30` | Minimum delay for OTP verification (timing attack protection) |
| **Timing Max Jitter** | `TIMING_MAX_JITTER` | `100` ms | `src/utils/timingProtection.js:34` | Maximum random jitter added to delays |
| **Enumeration Max Phones/IP (10min)** | `ENUMERATION_MAX_PHONES_PER_IP_10MIN` | `5` | `src/utils/enumerationDetection.js:32` | Max unique phone numbers per IP in 10 minutes |
| **Enumeration Max Phones/IP (1h)** | `ENUMERATION_MAX_PHONES_PER_IP_HOUR` | `20` | `src/utils/enumerationDetection.js:33` | Max unique phone numbers per IP in 1 hour |
| **Enumeration Alert Threshold (10min)** | `ENUMERATION_ALERT_THRESHOLD_10MIN` | `10` | `src/utils/enumerationDetection.js:40` | Unique phones threshold for alert (10 min) |
| **Enumeration Alert Threshold (1h)** | `ENUMERATION_ALERT_THRESHOLD_HOUR` | `50` | `src/utils/enumerationDetection.js:41` | Unique phones threshold for alert (1 hour) |

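A few of the settings in the table can be used to show how environment overrides and defaults might interact. The loader below is a sketch, not the contents of `src/config.js`: the helper names are hypothetical, only three of the documented variables are read, and falling back to the default on a non-numeric or non-positive value is an assumption.

```javascript
// Reads a positive integer setting, falling back to the documented default.
function intFromEnv(env, name, fallback) {
  const parsed = Number.parseInt(env[name], 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadLimits(env = process.env) {
  return {
    otpTtlSeconds: intFromEnv(env, 'OTP_TTL_SECONDS', 120),
    otpVerifyMaxAttempts: intFromEnv(env, 'OTP_VERIFY_MAX_ATTEMPTS', 5),
    refreshMaxIdleMinutes: intFromEnv(env, 'REFRESH_MAX_IDLE_MINUTES', 4320),
  };
}

// A refresh token is idle-expired once it has gone unused for longer than the limit.
function isIdleExpired(lastUsedAtMs, limits, now = Date.now()) {
  return now - lastUsedAtMs > limits.refreshMaxIdleMinutes * 60 * 1000;
}
```
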
---

## 5. Security Features

### 5.1 CORS Behavior

**Configuration:**
- **Startup Validation**: CORS configuration is validated at startup (`src/index.js:29-34`)
- **Runtime Monitoring**: Runtime CORS checks log warnings for suspicious patterns (`src/index.js:58-63`)
- **Origin Whitelisting**: Only explicitly configured origins are allowed (never wildcard `*` when credentials are involved)
- **No-Origin Requests**: Requests without origin (mobile apps, Postman) are allowed

**Implementation:**
- `CORS_ALLOWED_ORIGINS`: Comma-separated list of allowed origins (required in production)
- Development mode: Allows all origins if no origins configured (with warning)
- Production mode: Throws error if `CORS_ALLOWED_ORIGINS` is empty

**Files:**
- `src/index.js:36-86` - CORS middleware configuration
- `src/utils/corsValidator.js` - CORS validation utilities

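The CORS rules above can be condensed into a small origin checker. It is a sketch under assumptions rather than the code in `corsValidator.js`: `createOriginChecker` is a hypothetical name, the environment is passed in explicitly, and rejecting a literal `*` entry at startup is one possible way to enforce the "never wildcard" rule.

```javascript
// Parses a CORS_ALLOWED_ORIGINS-style string and returns an origin checker.
function createOriginChecker({ allowedOrigins = '', nodeEnv = 'development' } = {}) {
  const origins = allowedOrigins.split(',').map((origin) => origin.trim()).filter(Boolean);
  if (nodeEnv === 'production' && origins.length === 0) {
    throw new Error('CORS_ALLOWED_ORIGINS must be set in production');
  }
  // Assumption: a wildcard entry is treated as a configuration error.
  if (origins.includes('*')) throw new Error('Wildcard origins are not allowed');

  return function isOriginAllowed(origin) {
    if (!origin) return true;              // no Origin header: mobile apps, Postman
    if (origins.length === 0) return true; // development fallback: allow all
    return origins.includes(origin);
  };
}
```

The returned function has the shape needed by an origin callback in a CORS middleware: it receives the request's `Origin` header value and answers with a boolean.
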
### 5.2 Security Headers

**Headers Set Globally:**
- `X-Frame-Options: DENY` - Prevents clickjacking
- `X-Content-Type-Options: nosniff` - Prevents MIME type sniffing
- `X-XSS-Protection: 1; mode=block` - Enables XSS filter (legacy browsers)
- `Strict-Transport-Security` - HSTS (only in production, max-age=31536000, includeSubDomains, preload)
- `Content-Security-Policy` - CSP with nonce support for inline scripts/styles
- `Referrer-Policy: strict-origin-when-cross-origin` - Controls referrer information
- `Permissions-Policy` - Restricts browser features (geolocation, microphone, camera, etc.)

**Files:**
- `src/middleware/securityHeaders.js` - Security headers middleware

### 5.3 Authentication & Authorization

**Authentication:**
- **OTP-Based**: Phone number + 6-digit OTP code
- **JWT Access Tokens**: Short-lived (15 minutes), signed with HS256, include `token_version` claim
- **JWT Refresh Tokens**: Long-lived (7 days), stored hashed in database, rotated on each use
- **Device Tracking**: Tracks device identifier, platform, model, OS version, app version
- **Token Versioning**: Access tokens include `token_version` claim that is validated against user's current version in database. When user logs out from all devices, `token_version` is incremented, invalidating all existing access tokens immediately.

**Authorization:**
- **Role-Based**: Admin routes require `role === 'security_admin'`
- **Step-Up Auth**: Sensitive operations require recent OTP verification or `high_assurance` token flag
- **Token Claims**: Validates `iss` (issuer), `aud` (audience), `exp` (expiration), `iat` (issued at), `token_version` (for access token invalidation)

**Files:**
- `src/middleware/authMiddleware.js` - Access token validation
- `src/middleware/adminAuth.js` - Admin role check
- `src/middleware/stepUpAuth.js` - Step-up authentication

### 5.4 Audit Logging

**Events Logged:**
- `otp_request` - OTP request (success/failed)
- `otp_verify` - OTP verification (success/failed)
- `login` - User login (success/blocked)
- `token_refresh` - Token refresh (success, with risk level)
- `logout` - User logout
- `device_revoked` - Device revocation
- `logout_all_other_devices` - Logout all other devices
- `logout_all_devices` - Logout from all devices (HIGH_RISK, triggers security alert)
- `admin_view_security_events` - Admin access to security dashboard

**Risk Levels:**
- `INFO` - Normal operations
- `SUSPICIOUS` - Unusual patterns (IP change, device change, multiple failures)
- `HIGH_RISK` - Blocked IPs, high risk scores (>=50), enumeration attempts

**Alerting:**
- **Webhook Integration**: Sends alerts to `SECURITY_ALERT_WEBHOOK_URL` for SUSPICIOUS/HIGH_RISK events
- **Anomaly Detection**: Detects patterns (multiple failed OTPs, multiple high-risk events from same IP)
- **Retry Logic**: Retries failed webhook alerts once after 3 seconds

**Files:**
- `src/services/auditLogger.js` - Audit logging and webhook alerting
- `src/services/riskScoring.js` - Risk score calculation

### 5.5 Data Protection

**Field-Level Encryption:**
- **Algorithm**: AES-256-GCM (authenticated encryption)
- **Fields Encrypted**: Phone numbers (before storing in database)
- **Key Management**: 32-byte key from `ENCRYPTION_KEY` (base64 encoded)
- **Backward Compatibility**: Handles both encrypted and plaintext data during migration

**Database Access Logging:**
- **Optional Feature**: Enabled with `DB_ACCESS_LOGGING_ENABLED=true`
- **Logs**: All database queries with context (user ID, IP, user agent)
- **Use Case**: Security auditing, compliance

**Files:**
- `src/utils/fieldEncryption.js` - Field-level encryption
- `src/middleware/dbAccessLogger.js` - Database access logging

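Field-level encryption with AES-256-GCM and a base64-encoded 32-byte key can be sketched with Node's built-in `crypto` module. The storage format (`enc:` prefix followed by base64 of IV, auth tag, and ciphertext) and the function names are assumptions made for this example; `fieldEncryption.js` may lay out its data differently. The prefix check illustrates how encrypted and plaintext values could coexist during a migration.

```javascript
const crypto = require('crypto');

// The key must decode to exactly 32 bytes, as expected of ENCRYPTION_KEY.
function loadKey(base64Key) {
  const key = Buffer.from(base64Key, 'base64');
  if (key.length !== 32) throw new Error('Encryption key must be 32 bytes');
  return key;
}

// Hypothetical storage format: "enc:" + base64(iv | authTag | ciphertext).
function encryptField(plaintext, key) {
  const iv = crypto.randomBytes(12); // 96-bit nonce, fresh for every encryption
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return 'enc:' + Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

function decryptField(stored, key) {
  if (!stored.startsWith('enc:')) return stored; // plaintext left over from before migration
  const raw = Buffer.from(stored.slice(4), 'base64');
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28)); // 16-byte GCM tag
  // final() throws if the ciphertext or tag was tampered with.
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString('utf8');
}
```

Because the IV is random, the same phone number encrypts to a different value each time, so equality lookups cannot compare ciphertexts directly; how `encryptedPhoneSearch.js` handles that is outside this sketch.
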
### 5.6 Protection Against Attacks

**Brute-Force / Enumeration:**
- Rate limiting at multiple levels (phone, IP, user)
- Enumeration detection (tracks unique phone numbers per IP)
- IP blocking for enumeration attempts (1 hour block)
- Stricter rate limits when enumeration detected

**Timing Attacks:**
- Constant-time OTP verification (always performs bcrypt.compare, uses dummy hash if OTP not found)
- Timing protection wrappers for OTP request and verification flows
- Minimum delay enforcement to prevent timing leaks

**Man-in-the-Middle:**
- HTTPS enforcement via HSTS header (production)
- Security headers (CSP, X-Frame-Options) add defense in depth against content injection and clickjacking
- JWT token validation with signature verification

**Token Replay:**
- Refresh token rotation (new token issued, old token revoked)
- Reuse detection (if old token is used, all tokens for device are revoked)
- Access token short expiry (15 minutes) limits replay window
- Token versioning: Access tokens include `token_version` claim that is validated on each request. When user logs out from all devices, version is incremented, immediately invalidating all existing access tokens (even if not expired)

**Files:**
- `src/utils/timingProtection.js` - Timing attack protection
- `src/utils/enumerationDetection.js` - Enumeration detection
- `src/services/tokenService.js` - Token rotation and reuse detection

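The minimum-delay enforcement mentioned above can be sketched as a wrapper that pads every call to at least a fixed duration and then adds random jitter. Names and structure are hypothetical; the defaults follow the documented `OTP_VERIFY_MIN_DELAY` (300 ms) and `TIMING_MAX_JITTER` (100 ms), and adding jitter on top of the padded duration is an assumption about how the two settings combine.

```javascript
// Computes how long to wait so a call takes at least minDelayMs, plus random jitter.
function remainingDelay(elapsedMs, { minDelayMs = 300, maxJitterMs = 100 } = {}, random = Math.random) {
  const jitter = Math.floor(random() * (maxJitterMs + 1));
  return Math.max(0, minDelayMs - elapsedMs) + jitter;
}

// Wraps an async handler so fast and slow paths finish after a similar wall-clock time.
async function withTimingProtection(handler, options) {
  const started = Date.now();
  try {
    return await handler();
  } finally {
    // The delay is applied on success and on error alike.
    const waitMs = remainingDelay(Date.now() - started, options);
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
}
```
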
---

## 6. Error Handling & Failure Modes

### 6.1 OTP Sending Failures

**Behavior:**
- If Twilio is not configured: OTP is logged to console, request still succeeds
- If Twilio fails: Error is logged, OTP is still generated and stored, request succeeds
- **Rationale**: OTP generation should not fail if SMS delivery fails (user can check logs in development)

**Error Response:**
- Success response returned even if SMS fails (for development/testing)
- Production recommendation: Return error if SMS fails (uncomment error return in `src/routes/authRoutes.js:213`)

**Files:**
- `src/services/smsService.js` - SMS sending with fallback logging

### 6.2 Database Failures

**Behavior:**
- Connection pool errors: Logged, process exits (`src/db.js:11-14`)
- Query errors: Propagated to route handler, return 500 error
- **No Retries**: Database queries are not retried automatically (application-level retries can be added)

**Error Response:**
- `500 Internal Server Error` with generic message: `{error: 'Internal server error'}`

**Files:**
- `src/db.js` - Database connection and query wrapper

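
One way to produce the generic 500 response while keeping the details server-side is an Express-style error-handling middleware. This is a sketch under assumptions: the factory function and its `logger` parameter are illustrative and may not match how the route handlers in this service are actually wired.

```javascript
// Express recognizes error-handling middleware by its four-argument signature.
function createErrorHandler(logger = console) {
  return function errorHandler(err, req, res, next) {
    // Full detail is logged on the server only.
    logger.error('Unhandled error', { path: req.path, message: err.message, stack: err.stack });

    // If a response is already in flight, hand off to Express's default handler.
    if (res.headersSent) return next(err);

    // The client sees a generic message with no query or schema details.
    return res.status(500).json({ error: 'Internal server error' });
  };
}
```

Such a handler would typically be registered last, after all routes, for example `app.use(createErrorHandler())`.
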
### 6.3 JWT Validation Errors

**Behavior:**
- Invalid token format: `401 Unauthorized` - `{error: 'Invalid token format'}`
- Invalid/expired token: `401 Unauthorized` - `{error: 'Invalid or expired token'}`
- Invalid claims: `401 Unauthorized` - `{error: 'Invalid token claims'}`
- Missing Authorization header: `401 Unauthorized` - `{error: 'Missing Authorization header'}`

**Key Rotation:**
- If key ID not found: Tries all available keys (for rotation support)
- If no key matches: Returns `401 Unauthorized`

**Files:**
- `src/middleware/authMiddleware.js` - JWT validation
- `src/services/tokenService.js` - Refresh token validation

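
The key-rotation fallback can be illustrated with a hand-rolled HS256 check built on Node's `crypto` module. This is only a sketch of the lookup order described above: the service presumably uses a JWT library, the HS256 algorithm is an assumption, the function names are hypothetical, and expiry and claim validation are deliberately left out.

```javascript
const crypto = require('node:crypto');

const encode = (value) => Buffer.from(value).toString('base64url');
const hmac = (data, secret) => crypto.createHmac('sha256', secret).update(data).digest();

// Helper for the example: signs a payload and records the key ID in the header.
function signToken(payload, kid, secret) {
  const header = encode(JSON.stringify({ alg: 'HS256', typ: 'JWT', kid }));
  const body = encode(JSON.stringify(payload));
  return `${header}.${body}.${hmac(`${header}.${body}`, secret).toString('base64url')}`;
}

function signatureMatches(signingInput, signature, secret) {
  const expected = hmac(signingInput, secret);
  const actual = Buffer.from(signature, 'base64url');
  // timingSafeEqual requires equal lengths, so check that first.
  return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}

// keys: object mapping key IDs to secrets, e.g. {"1": "...", "2": "..."}.
function verifyWithRotation(token, keys) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return { status: 401, error: 'Invalid token format' };
  const [header, body, signature] = parts;

  let kid;
  let claims;
  try {
    kid = JSON.parse(Buffer.from(header, 'base64url').toString('utf8')).kid;
    claims = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  } catch {
    return { status: 401, error: 'Invalid token format' };
  }

  // Known key ID: use that key only. Unknown key ID: try every configured key.
  const known = Object.prototype.hasOwnProperty.call(keys, kid);
  const candidates = known ? [keys[kid]] : Object.values(keys);
  const valid = candidates.some((secret) => signatureMatches(`${header}.${body}`, signature, secret));

  return valid ? { status: 200, claims } : { status: 401, error: 'Invalid or expired token' };
}
```
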
### 6.4 Rate Limit Exceeded

**Behavior:**
- OTP request rate limit: `429 Too Many Requests` - `{success: false, message: 'Too many OTP requests...'}`
- OTP verify rate limit: `429 Too Many Requests` - `{success: false, message: 'Too many attempts...'}`
- User route rate limit: `429 Too Many Requests` - `{error: 'Too many requests', retry_after: seconds}`
- Admin route rate limit: `429 Too Many Requests` - `{error: 'Too many requests', retry_after: seconds}`

**Headers:**
- `X-RateLimit-Limit`: Maximum requests allowed
- `X-RateLimit-Remaining`: Remaining requests in window
- `X-RateLimit-Reset`: ISO timestamp when limit resets
- `X-RateLimit-Type`: Type of rate limit (read/write/sensitive/admin)

**Files:**
- `src/middleware/rateLimitMiddleware.js` - OTP rate limiting
- `src/middleware/userRateLimit.js` - User route rate limiting
- `src/middleware/adminRateLimit.js` - Admin rate limiting

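
To show how the headers and the `429` body relate to a counter, here is a small fixed-window limiter. The fixed-window strategy, the `Map` store, and the `checkRateLimit` name are assumptions made for the example; the actual middleware may count requests differently.

```javascript
// store: Map of key -> { count, resetAt }. `now` is injectable to keep the logic testable.
function checkRateLimit(store, key, { limit, windowMs, type }, now = Date.now()) {
  let entry = store.get(key);
  if (!entry || now >= entry.resetAt) {
    // Start a new window when none exists or the previous one has ended.
    entry = { count: 0, resetAt: now + windowMs };
    store.set(key, entry);
  }
  entry.count += 1;

  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, limit - entry.count)),
    'X-RateLimit-Reset': new Date(entry.resetAt).toISOString(),
    'X-RateLimit-Type': type,
  };

  if (entry.count <= limit) return { allowed: true, status: 200, headers };

  // Over the limit: tell the client how many seconds remain in the window.
  const retryAfter = Math.ceil((entry.resetAt - now) / 1000);
  return {
    allowed: false,
    status: 429,
    headers,
    body: { error: 'Too many requests', retry_after: retryAfter },
  };
}
```
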
### 6.5 Retries & Fallbacks

**Redis Fallback:**
- If Redis unavailable: Falls back to in-memory store (per-process, not shared)
- Rate limiting continues to work (with per-instance limits, not global)
- Warning logged on first failure, then silent

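
The fallback behavior can be sketched as a counter store that prefers a shared backend and degrades to a local `Map`. The `primary` adapter and its `increment(key, ttlMs)` method are hypothetical; they stand in for whatever wrapper `src/services/redisClient.js` exposes.

```javascript
// `primary` is a hypothetical adapter over Redis; pass null to run in-memory only.
function createCounterStore(primary, logger = console) {
  const memory = new Map(); // key -> { count, expiresAt }; visible to this process only
  let warned = false;

  function incrementInMemory(key, ttlMs, now) {
    const entry = memory.get(key);
    if (!entry || now >= entry.expiresAt) {
      memory.set(key, { count: 1, expiresAt: now + ttlMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }

  return {
    async increment(key, ttlMs, now = Date.now()) {
      if (primary) {
        try {
          return await primary.increment(key, ttlMs);
        } catch (err) {
          if (!warned) {
            logger.warn('Redis unavailable, using in-memory counters:', err.message);
            warned = true; // warn on the first failure, then stay quiet
          }
        }
      }
      return incrementInMemory(key, ttlMs, now);
    },
  };
}
```

Because the fallback counters live in one process, running several instances multiplies the effective limit by the instance count.
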
**Webhook Alerting:**
- If webhook fails: Retries once after 3 seconds
- If retry fails: Error logged, but main request flow continues (non-blocking)

**Files:**
- `src/services/redisClient.js` - Redis client with graceful fallback
- `src/services/auditLogger.js:334-516` - Webhook alerting with retry

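
The retry-once, non-blocking webhook behavior might look like the following. The `post` function is an injected placeholder for the actual HTTP call, and both function names are assumptions; only the single retry and the 3-second delay come from the description above.

```javascript
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `post` is an injected function that delivers the alert and rejects on failure.
async function sendAlertWithRetry(post, alert, { retryDelayMs = 3000, logger = console } = {}) {
  try {
    await post(alert);
    return true;
  } catch {
    // First attempt failed: wait, then try exactly once more.
    await wait(retryDelayMs);
    try {
      await post(alert);
      return true;
    } catch (retryError) {
      logger.error('Security alert webhook failed after retry:', retryError.message);
      return false; // never throws, so the caller's request flow is unaffected
    }
  }
}

// Fire-and-forget wrapper: the request handler does not await delivery.
function alertInBackground(post, alert, options) {
  void sendAlertWithRetry(post, alert, options);
}
```
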
---

## 7. Configuration & Environment Variables

### 7.1 Required Variables

| Variable | Description | Example | Required |
|----------|-------------|---------|----------|
| `DATABASE_URL` | PostgreSQL connection string | `postgres://user:pass@localhost:5432/dbname` | ✅ Yes |
| `JWT_ACCESS_SECRET` | Secret for signing access tokens (min 32 chars) | `hex-string-32-chars-minimum` | ✅ Yes |
| `JWT_REFRESH_SECRET` | Secret for signing refresh tokens (min 32 chars) | `hex-string-32-chars-minimum` | ✅ Yes |

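
A startup check for the required variables could be as small as the sketch below. The function name and error wording are illustrative; the variable names and the 32-character minimum are taken from the table.

```javascript
const REQUIRED = ['DATABASE_URL', 'JWT_ACCESS_SECRET', 'JWT_REFRESH_SECRET'];
const MIN_SECRET_LENGTH = 32;

// Returns a list of human-readable problems; an empty list means the config is usable.
function validateRequiredEnv(env = process.env) {
  const problems = [];
  for (const name of REQUIRED) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is not set`);
    } else if (name.startsWith('JWT_') && value.length < MIN_SECRET_LENGTH) {
      problems.push(`${name} must be at least ${MIN_SECRET_LENGTH} characters`);
    }
  }
  return problems;
}
```

Failing fast at boot (logging the problems and exiting with a non-zero code) is usually preferable to discovering a missing secret on the first login attempt.
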
### 7.2 Optional Variables - Timeouts & Expiry

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `JWT_ACCESS_TTL` | Access token expiry | `15m` | `15m`, `1h` |
| `JWT_REFRESH_TTL` | Refresh token expiry | `7d` | `7d`, `30d` |
| `REFRESH_MAX_IDLE_MINUTES` | Refresh token max idle time | `4320` (3 days) | `4320` |
| `OTP_TTL_SECONDS` | OTP validity in seconds | `120` (2 min) | `120` |
| `STEP_UP_OTP_WINDOW_MINUTES` | Step-up auth window | `5` | `5` |

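
As a rough illustration of how these values interact, the sketch below converts TTL strings such as `15m` or `7d` to seconds and combines the absolute refresh expiry with the idle limit. Both helpers are hypothetical: the service may hand TTL strings directly to its JWT library, and the exact way it applies `REFRESH_MAX_IDLE_MINUTES` is an assumption here.

```javascript
const UNIT_SECONDS = { s: 1, m: 60, h: 3600, d: 86400 };

// Parses values such as "15m", "1h" or "7d" into seconds.
function parseTtl(value) {
  const match = /^(\d+)([smhd])$/.exec(String(value).trim());
  if (!match) throw new Error(`Unsupported TTL format: ${value}`);
  return Number(match[1]) * UNIT_SECONDS[match[2]];
}

// A refresh token is usable only while it is both unexpired and recently used.
function isRefreshTokenUsable(token, nowMs, maxIdleMinutes = 4320) {
  const notExpired = nowMs < token.expiresAtMs;
  const notIdle = nowMs - token.lastUsedAtMs <= maxIdleMinutes * 60 * 1000;
  return notExpired && notIdle;
}
```

With the defaults, a token that is valid for 7 days still stops working after 3 days without use.
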
### 7.3 Optional Variables - Rate Limits

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `OTP_REQ_PHONE_10MIN_LIMIT` | Max OTP requests per phone (10 min) | `3` | `3` |
| `OTP_REQ_PHONE_DAY_LIMIT` | Max OTP requests per phone (24h) | `10` | `10` |
| `OTP_REQ_IP_10MIN_LIMIT` | Max OTP requests per IP (10 min) | `20` | `20` |
| `OTP_REQ_IP_DAY_LIMIT` | Max OTP requests per IP (24h) | `100` | `100` |
| `OTP_VERIFY_MAX_ATTEMPTS` | Max OTP verification attempts | `5` | `5` |
| `OTP_VERIFY_FAILED_PER_HOUR_LIMIT` | Max failed verifications per phone (1h) | `10` | `10` |
| `USER_RATE_LIMIT_READ_MAX` | Max read requests per user (15 min) | `100` | `100` |
| `USER_RATE_LIMIT_WRITE_MAX` | Max write requests per user (15 min) | `20` | `20` |
| `USER_RATE_LIMIT_SENSITIVE_MAX` | Max sensitive requests per user (1h) | `10` | `10` |
| `ADMIN_RATE_LIMIT_MAX` | Max admin requests per admin (15 min) | `100` | `100` |

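
The OTP request limits apply across several windows at once. A minimal sketch of loading them with the documented defaults and checking all windows together follows; the helper names and the shape of the `counts` object are assumptions.

```javascript
// Reads a positive integer from the environment, falling back to the default.
function readInt(env, name, fallback) {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadOtpRequestLimits(env = process.env) {
  return {
    phone10Min: readInt(env, 'OTP_REQ_PHONE_10MIN_LIMIT', 3),
    phoneDay: readInt(env, 'OTP_REQ_PHONE_DAY_LIMIT', 10),
    ip10Min: readInt(env, 'OTP_REQ_IP_10MIN_LIMIT', 20),
    ipDay: readInt(env, 'OTP_REQ_IP_DAY_LIMIT', 100),
  };
}

// `counts` holds the number of requests already recorded in each window.
// A new request is allowed only if every window still has room.
function isOtpRequestAllowed(counts, limits) {
  return Object.keys(limits).every((window) => (counts[window] ?? 0) < limits[window]);
}
```
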
### 7.4 Optional Variables - Security Features

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `ENCRYPTION_ENABLED` | Enable field-level encryption | `false` | `true` |
| `ENCRYPTION_KEY` | 32-byte encryption key (base64) | - | `base64-encoded-32-byte-key` |
| `DB_ACCESS_LOGGING_ENABLED` | Enable database access logging | `false` | `true` |
| `DB_ACCESS_LOG_LEVEL` | DB access log level ('all' or 'sensitive') | `sensitive` | `all`, `sensitive` |
| `CORS_ALLOWED_ORIGINS` | Comma-separated allowed origins | - | `https://app.example.com,https://api.example.com` |
| `ENUMERATION_MAX_PHONES_PER_IP_10MIN` | Max unique phones per IP (10 min) | `5` | `5` |
| `ENUMERATION_MAX_PHONES_PER_IP_HOUR` | Max unique phones per IP (1h) | `20` | `20` |
| `ENUMERATION_ALERT_THRESHOLD_10MIN` | Alert threshold for enumeration (10 min) | `10` | `10` |
| `ENUMERATION_ALERT_THRESHOLD_HOUR` | Alert threshold for enumeration (1h) | `50` | `50` |
| `OTP_REQUEST_MIN_DELAY` | Min delay for OTP requests (ms) | `500` | `500` |
| `OTP_VERIFY_MIN_DELAY` | Min delay for OTP verify (ms) | `300` | `300` |
| `TIMING_MAX_JITTER` | Max jitter for timing protection (ms) | `100` | `100` |
| `BLOCKED_IP_RANGES` | Comma-separated CIDR blocks | - | `10.0.0.0/8,172.16.0.0/12` |
| `REQUIRE_OTP_ON_SUSPICIOUS_REFRESH` | Require OTP on suspicious refresh | `false` | `true` |
| `SECURITY_ALERT_WEBHOOK_URL` | Webhook URL for security alerts | - | `https://hooks.slack.com/...` |
| `SECURITY_ALERT_MIN_LEVEL` | Minimum risk level for alerts | `HIGH_RISK` | `SUSPICIOUS`, `HIGH_RISK` |

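
The minimum-delay and jitter settings suggest a wrapper that pads every response to a floor duration so that fast and slow code paths look alike from outside. The following is a sketch of that idea only; `withTimingProtection` is a hypothetical name and `src/utils/timingProtection.js` may work differently.

```javascript
const crypto = require('node:crypto');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `handler`, then pads the elapsed time up to minDelayMs plus random jitter.
// The padding is applied whether the handler resolves or throws.
async function withTimingProtection(handler, { minDelayMs, maxJitterMs }) {
  const startedAt = Date.now();
  // randomInt's upper bound is exclusive, hence the + 1.
  const jitter = maxJitterMs > 0 ? crypto.randomInt(0, maxJitterMs + 1) : 0;
  try {
    return await handler();
  } finally {
    const remaining = minDelayMs + jitter - (Date.now() - startedAt);
    if (remaining > 0) await sleep(remaining);
  }
}
```

With the defaults from the table, an OTP request would take at least 500 ms plus up to 100 ms of jitter, regardless of whether the phone number is known.
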
### 7.5 Optional Variables - JWT Key Rotation

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `JWT_ACTIVE_KEY_ID` | Key ID for signing new tokens | `1` | `1`, `2` |
| `JWT_KEYS_JSON` | JSON mapping key IDs to secrets | - | `{"1":"secret1","2":"secret2"}` |
| `JWT_REFRESH_KEY_ID` | Key ID for refresh tokens | Same as active | `1` |
| `JWT_ISSUER` | JWT issuer claim | `farm-auth-service` | `farm-auth-service` |
| `JWT_AUDIENCE` | JWT audience claim | `mobile-app` | `mobile-app` |

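
Parsing the key map might look like the sketch below. It covers only the case where `JWT_KEYS_JSON` is set; how the service behaves without it is not shown here, and the function name and error messages are assumptions rather than the contents of `src/services/jwtKeys.js`.

```javascript
function loadJwtKeyConfig(env = process.env) {
  // Without a key map there is nothing to rotate; the caller decides what to do.
  if (!env.JWT_KEYS_JSON) return null;

  let keys;
  try {
    keys = JSON.parse(env.JWT_KEYS_JSON);
  } catch {
    throw new Error('JWT_KEYS_JSON must be valid JSON');
  }
  if (keys === null || typeof keys !== 'object' || Array.isArray(keys)) {
    throw new Error('JWT_KEYS_JSON must be an object mapping key IDs to secrets');
  }

  // Defaults follow the table: active key "1", refresh key same as active.
  const activeKeyId = env.JWT_ACTIVE_KEY_ID ?? '1';
  const refreshKeyId = env.JWT_REFRESH_KEY_ID ?? activeKeyId;

  for (const id of [activeKeyId, refreshKeyId]) {
    if (typeof keys[id] !== 'string' || keys[id].length === 0) {
      throw new Error(`No secret configured for key ID "${id}"`);
    }
  }
  return { keys, activeKeyId, refreshKeyId };
}
```

During a rotation, the old key stays in the map so tokens signed with it remain verifiable, while `JWT_ACTIVE_KEY_ID` points new tokens at the new key.
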
### 7.6 Optional Variables - External Services

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `TWILIO_ACCOUNT_SID` | Twilio account SID | - | `ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx` |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | - | `your_auth_token` |
| `TWILIO_MESSAGING_SERVICE_SID` | Twilio messaging service SID | - | `MGxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx` |
| `TWILIO_FROM_NUMBER` | Twilio phone number (E.164) | - | `+1234567890` |
| `REDIS_URL` | Redis connection URL | - | `redis://localhost:6379` |
| `REDIS_HOST` | Redis host | `localhost` | `localhost` |
| `REDIS_PORT` | Redis port | `6379` | `6379` |
| `REDIS_PASSWORD` | Redis password | - | `password` |

### 7.7 Optional Variables - Server Configuration

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `PORT` | Server port | `3000` | `3000` |
| `NODE_ENV` | Environment | - | `development`, `production` |
| `TRUST_PROXY` | Trust proxy headers | `false` | `true` |
| `ENABLE_ADMIN_DASHBOARD` | Enable admin routes | `false` | `true` |

---

## 8. Future Improvements / Notes

### 8.1 Planned Improvements (from TODOs in code)

1. **Secrets Manager Integration**
   - Load JWT keys from AWS Secrets Manager / HashiCorp Vault (instead of environment variables)
   - Load encryption keys from secrets manager
   - **File**: `src/services/jwtKeys.js:161-174` (TODO comment)

2. **Automated Key Rotation**
   - Implement automated JWT key rotation without downtime
   - Re-encrypt existing data when encryption keys are rotated
   - **File**: `src/services/jwtKeys.js` (key rotation support exists, but automation needed)

3. **SIEM Integration**
   - Integrate with SIEM systems (Splunk, ELK, etc.) for centralized log aggregation
   - Export audit logs to SIEM for advanced threat detection
   - **File**: `src/services/auditLogger.js` (webhook exists, but SIEM integration needed)

4. **CSP Nonces**
   - Fully implement CSP nonces for inline scripts/styles (currently allows `unsafe-inline` for compatibility)
   - **File**: `src/middleware/securityHeaders.js:28-29` (nonce support exists but not fully utilized)

5. **Database Connection Pooling Tuning**
   - Add configuration for connection pool size, timeout, etc.
   - **File**: `src/db.js` (basic pool, no tuning options)

6. **Rate Limiting Improvements**
   - Implement distributed rate limiting (currently per-instance if Redis unavailable)
   - Add rate limit headers to all rate-limited endpoints
   - **File**: `src/middleware/rateLimitMiddleware.js` (Redis fallback exists, but distributed limiting needed)

7. **OTP Delivery Alternatives**
   - Support multiple SMS providers (fallback if Twilio fails)
   - Support email OTP delivery
   - Support push notification OTP delivery
   - **File**: `src/services/smsService.js` (only Twilio supported)

8. **Advanced Risk Scoring**
   - Machine learning-based risk scoring
   - Geographic anomaly detection (unusual locations)
   - Device fingerprinting improvements
   - **File**: `src/services/riskScoring.js` (basic scoring exists)

### 8.2 Potential Risks & Technical Debt

1. **In-Memory Rate Limiting**
   - If Redis is unavailable, rate limiting uses in-memory store (per-instance, not shared)
   - **Risk**: Rate limits are per-instance, not global (can be bypassed with multiple instances)
   - **Mitigation**: Always use Redis in production, or implement distributed rate limiting

2. **OTP Storage**
   - OTPs are stored in database (not just Redis)
   - **Risk**: Database can become a bottleneck for high-volume OTP requests
   - **Mitigation**: Consider moving OTP storage to Redis entirely (with DB backup for audit)

3. **Phone Number Encryption Migration**
   - The service handles both encrypted and plaintext phone numbers (backward compatibility)
   - **Risk**: Plaintext phone numbers remain in the database if encryption was enabled after data already existed
   - **Mitigation**: Implement migration script to encrypt all existing phone numbers

4. **Webhook Alerting**
   - Webhook failures are logged but don't block requests
   - **Risk**: Security alerts might be missed if webhook is down
   - **Mitigation**: Implement alert queue (Redis/RabbitMQ) with retry logic and dead-letter queue

5. **Database Access Logging**
   - Database access logging is optional and can impact performance
   - **Risk**: Performance degradation if enabled in high-traffic scenarios
   - **Mitigation**: Use async logging, batch writes, or separate logging database

6. **JWT Key Rotation**
   - Key rotation is supported, but the process is manual
   - **Risk**: Manual key rotation can cause downtime if not done correctly
   - **Mitigation**: Implement automated key rotation with gradual rollout

7. **CORS Configuration**
   - CORS configuration is validated at startup, but runtime checks only log warnings
   - **Risk**: Misconfiguration might not be caught until runtime
   - **Mitigation**: Add stricter runtime validation or fail-fast on suspicious patterns

8. **Error Messages**
   - Some error messages are generic to prevent information leakage
   - **Risk**: Generic errors can make debugging difficult
   - **Mitigation**: Log detailed errors server-side, return generic errors to clients

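
For the phone number encryption migration (item 3), one common approach is to tag ciphertext with a prefix so that readers can tell encrypted values from legacy plaintext. The sketch below assumes AES-256-GCM and an `enc:v1:` marker; neither is confirmed by this document, and the function names are hypothetical.

```javascript
const crypto = require('node:crypto');

const PREFIX = 'enc:v1:'; // hypothetical marker separating ciphertext from legacy plaintext

// key: 32-byte Buffer, e.g. Buffer.from(process.env.ENCRYPTION_KEY, 'base64').
function encryptField(plaintext, key) {
  const iv = crypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  // Layout: 12-byte IV, 16-byte auth tag, then the ciphertext.
  return PREFIX + Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64');
}

function decryptField(stored, key) {
  if (!stored.startsWith(PREFIX)) return stored; // legacy plaintext row
  const payload = Buffer.from(stored.slice(PREFIX.length), 'base64');
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, payload.subarray(0, 12));
  decipher.setAuthTag(payload.subarray(12, 28));
  return Buffer.concat([decipher.update(payload.subarray(28)), decipher.final()]).toString('utf8');
}

// Migration step: encrypt only values that are still plaintext, so reruns are safe.
function migrateValue(stored, key) {
  return stored.startsWith(PREFIX) ? stored : encryptField(stored, key);
}
```

Two practical consequences of this scheme: the stored value is considerably longer than the plaintext, so the column must be wide enough, and a random nonce means equal phone numbers encrypt to different values, so equality lookups need a separate mechanism.
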
---

## Appendix: Database Schema

### Key Tables

**users**
- `id` (UUID, PK)
- `phone_number` (VARCHAR(20), UNIQUE, encrypted if ENCRYPTION_ENABLED)
- `name` (VARCHAR(255))
- `role` (enum: 'user', 'admin', 'moderator')
- `user_type` (enum: 'seller', 'buyer', 'service_provider')
- `token_version` (INT, DEFAULT 1) - Incremented on logout-all-devices to invalidate all access tokens
- `created_at`, `updated_at`, `last_login_at`

**otp_codes**
- `id` (UUID, PK)
- `phone_number` (VARCHAR(20), encrypted if ENCRYPTION_ENABLED)
- `otp_hash` (VARCHAR(255), bcrypt hash)
- `expires_at` (TIMESTAMPTZ)
- `attempt_count` (INT)
- `created_at` (TIMESTAMPTZ)

**refresh_tokens**
- `id` (UUID, PK)
- `user_id` (UUID, FK)
- `token_id` (UUID, UNIQUE)
- `token_hash` (VARCHAR(255), bcrypt hash)
- `device_id` (VARCHAR(255))
- `user_agent` (TEXT)
- `ip_address` (VARCHAR(45))
- `expires_at` (TIMESTAMPTZ)
- `last_used_at` (TIMESTAMPTZ)
- `revoked_at` (TIMESTAMPTZ, NULL = active)
- `reuse_detected_at` (TIMESTAMPTZ)
- `rotated_from_id` (UUID, FK to refresh_tokens)

**user_devices**
- `id` (UUID, PK)
- `user_id` (UUID, FK)
- `device_identifier` (TEXT)
- `device_platform` (TEXT)
- `device_model` (TEXT)
- `os_version` (TEXT)
- `app_version` (TEXT)
- `language_code` (TEXT)
- `timezone` (TEXT)
- `first_seen_at` (TIMESTAMPTZ)
- `last_seen_at` (TIMESTAMPTZ)
- `is_active` (BOOLEAN)
- UNIQUE (user_id, device_identifier)

**auth_audit**
- `id` (UUID, PK)
- `user_id` (UUID, FK, nullable)
- `action` (VARCHAR(100))
- `status` (VARCHAR(50))
- `risk_level` (VARCHAR(20): 'INFO', 'SUSPICIOUS', 'HIGH_RISK')
- `ip_address` (VARCHAR(45))
- `user_agent` (TEXT)
- `device_id` (VARCHAR(255))
- `meta` (JSONB)
- `created_at` (TIMESTAMPTZ)

---

## Document Version

- **Version**: 1.0
- **Last Updated**: 2024
- **Author**: Architecture Documentation Generator
- **Maintained By**: Development Team