Security Review

Reviewed: 2026-03-08 Status: Schema hardening done — runtime fixes pending

CRITICAL

1. No validation of AI-generated SQL before DuckDB execution

AI output goes straight into DuckDB with no sanitization or validation
DuckDB supports COPY TO (file exfiltration), ATTACH (external DB access), INSTALL/LOAD (extensions), filesystem reads via read_csv_auto()
Prompt injection could trick the AI into generating malicious SQL
Fix: Open DuckDB in read-only mode ({ access_mode: 'READ_ONLY' }). Implement SQL keyword blocklist or AST-based validator rejecting COPY, ATTACH, INSTALL, LOAD, CREATE, DROP, ALTER, INSERT, UPDATE, DELETE, PRAGMA
FIXED: api/src/utils/sqlValidator.js implemented. DuckDB opened in read-only mode. Validator checks: no semicolons, no SQL comments, must start with SELECT, blocked keyword blocklist, no cross-schema references, max SQL length (2000 chars), no file path literals, AE role must include location_sectionCode filter. Blocked keyword regex strips quoted identifiers and string literals first (single-quoted, double-quoted, backtick) to avoid false positives on human-readable column aliases (e.g. "Installation Load (KVA)" contains the word LOAD but is not a SQL keyword). Keywords inside quotes cannot execute — defense-in-depth is preserved by the no-semicolons, must-start-with-SELECT, and read-only mode checks.

HIGH

2. limit/offset query params injectable into SQL

getWidgetData wraps stored SQL as SELECT * FROM ({stored_sql}) LIMIT {limit} OFFSET {offset}
No validation that limit/offset are positive integers or have a max cap
String interpolation could allow injection (e.g. limit=1; DROP TABLE ...)
Fix: Validate as positive integers. Use DuckDB parameterized queries ($1, $2). Enforce max limit (e.g. 1000)

3. No automatic userId scoping — relies on per-controller discipline

Every controller must manually include userId in queries
One missed check = IDOR (user A accesses user B's data)
Fix: Create a shared middleware or utility that always injects userId into queries. Add compound indexes like { canvasId: 1, userId: 1 } for defense-in-depth
Schema fix applied: Compound index { canvasId: 1, userId: 1 } added to widget. Runtime middleware still pending

4. Unbounded messages array in conversation documents

No max message count. Conversation can grow indefinitely
MongoDB 16MB document limit can be hit, causing write failures
Reopened conversations (edit flow) compound the problem
Fix: Cap messages per conversation (e.g. 50). Add maxlength on message content. Consider separate collection if conversations grow long
FIXED: Messages capped at 50 via schema validator. content maxlength set to 5000

MEDIUM

5. No maxlength on any string field across all schemas

canvas.name, widget.title, widget.sql, message.content, conversation.name — all unbounded
A user could send a 10MB string as a canvas name or message body
Fix: Add maxlength constraints — canvas/widget names (200 chars), message content (5000 chars), SQL (10000 chars). Add Fastify request body size limits
FIXED: maxlength added — canvas.name (200), widget.title (200), widget.sql (10000), widget.aiModel (100), conversation.name (200), message.content (5000), message.sql (10000), user.name (200). Fastify body size limits still pending

6. Path parameters not validated as ObjectId format

canvasId, widgetId, conversationId documented as Type: <string> with no ObjectId format validation
Invalid IDs cause Mongoose CastError (500) instead of clean 400 responses
Fix: Add Fastify schema validation on path params (pattern: "^[0-9a-fA-F]{24}$"). Return 400 for invalid IDs

7. Race condition on active conversation check (TOCTOU)

startConversation: (1) check no active conversation, (2) create new one — two separate ops
Two simultaneous requests can both pass the check
Fix: Use MongoDB unique partial index: { userId: 1 }, { unique: true, partialFilterExpression: { status: "active" } } on conversation collection
FIXED: Unique partial index added to conversation schema

8. Non-atomic cascade deletions

deleteCanvas does 3 sequential deletes (widgets, conversations, canvas) without a transaction
deleteWidget similarly: delete conversation, delete widget, reposition — non-atomic
Fix: Wrap cascade deletions in a MongoDB transaction

9. Raw SQL exposed in responses

startConversation, sendMessage, getConversation responses include sql field
Reveals DuckDB table/view names, column names, role-scoped WHERE logic
Fix: Evaluate if SQL needs to be in frontend responses. Consider admin-only access or aliasing internal names

10. No rate limiting on AI-triggering endpoints

startConversation and sendMessage both trigger Claude API calls (expensive)
No documented rate limiting. Edit flow bypasses active conversation check
Fix: Implement per-user rate limiting (e.g. 10 AI requests/min). Use @fastify/rate-limit

LOW

11. Missing compound database indexes

canvas: missing { userId: 1, displayOrder: 1 }
widget: missing { canvasId: 1, position: 1 } and { canvasId: 1, userId: 1 }
conversation: missing { userId: 1, status: 1 } (queried on every conversation start)
Fix: Add compound indexes to match actual query patterns
FIXED: Added { userId: 1, displayOrder: 1 } on canvas, { canvasId: 1, position: 1 } and { canvasId: 1, userId: 1 } on widget, { userId: 1, status: 1 } on conversation

12. Session schema fields not marked as required

sessionToken, userId, userType, expiresAt all lack required: true
A session without expiresAt would never be TTL-cleaned (immortal session)
Fix: Add required: true to all session fields
FIXED: required: true added to sessionToken, userId, userType, expiresAt

13. chartConfig accepts arbitrary unvalidated JSON

chartConfig: { type: Object } — any JSON structure stored and sent to frontend
If frontend renders string values as HTML, XSS is possible
Fix: Define strict JSON schema for chartConfig
Status: chartConfig JSON schema validation deferred until frontend contract is finalized

Security Review

CRITICAL​

1. No validation of AI-generated SQL before DuckDB execution​

HIGH​

2. limit/offset query params injectable into SQL​

3. No automatic userId scoping — relies on per-controller discipline​

4. Unbounded messages array in conversation documents​

MEDIUM​

5. No maxlength on any string field across all schemas​

6. Path parameters not validated as ObjectId format​

7. Race condition on active conversation check (TOCTOU)​

8. Non-atomic cascade deletions​

9. Raw SQL exposed in responses​

10. No rate limiting on AI-triggering endpoints​

LOW​

11. Missing compound database indexes​

12. Session schema fields not marked as required​

13. chartConfig accepts arbitrary unvalidated JSON​

CRITICAL

1. No validation of AI-generated SQL before DuckDB execution

HIGH

2. limit/offset query params injectable into SQL

3. No automatic userId scoping — relies on per-controller discipline

4. Unbounded messages array in conversation documents

MEDIUM

5. No maxlength on any string field across all schemas

6. Path parameters not validated as ObjectId format

7. Race condition on active conversation check (TOCTOU)

8. Non-atomic cascade deletions

9. Raw SQL exposed in responses

10. No rate limiting on AI-triggering endpoints

LOW

11. Missing compound database indexes

12. Session schema fields not marked as required

13. chartConfig accepts arbitrary unvalidated JSON