Workflow

Pipeline Overview

The workflow is a 10-step deterministic pipeline with agent-in-the-loop judgment calls. Steps labeled [CODE] are pure deterministic functions. Steps labeled [AGENT] involve LLM judgment (schema-validated). [HUMAN] steps require explicit approval.


Step Details

Step 1–3: Ingestion & Normalization [CODE]

StepDescription
fetch_listingsFetches 13 raw listings from 2 mock providers (Jaap, Funda)
normalize_listingsMaps provider schemas to NormalizedListing. Runs prompt injection scanner. Flags suspicious content.
deduplicate_listingsCross-provider dedup by address + listing key. Removes exact and near-duplicates.

Step 4–5: Enrichment

StepActorDescription
calculate_commutesCODEComputes commute time + distance in parallel with concurrency limit
research_neighbourhoodsAGENTAssesses each neighbourhood on 3 dimensions: quiet, transport, green space. Produces evidence-cited assessments.

Step 6–7: Scoring

StepActorDescription
calculate_deterministic_scoresCODEAffordability, commute, neighbourhood, space scores. Hard constraints applied. Recommendation tier.
generate_qualitative_evaluationsAGENTAgent evaluates fit against qualitative preferences. Cannot override hard constraint results.

Step 8–10: Delivery & Approval

StepActorDescription
build_shortlistCODE + AGENTRanks by combined score. Agent synthesizes summary with trade-offs.
draft_realtor_messageAGENTDrafts professional email. Agent cannot send — only draft.
create_pending_actionCODECreates pending action with idempotency key + payload hash. Enters awaiting_approval.

Approval & Execution [HUMAN + CODE]

StepActorDescription
await_human_approvalHUMANUser reviews draft. Can edit, approve, or reject.
execute_approved_actionCODEVerifies payload hash. Checks idempotency. Sends email. Records atomically.

State Machine

created → listings_fetched → listings_normalized → listings_deduplicated
  → enrichment_running → enrichment_complete → ranking_complete
  → shortlist_created → awaiting_approval → action_executed → completed

Any state can transition to: failed → retry

Transitions are validated by is_valid_transition() — invalid transitions rejected at the controller level.


Retry & Recovery

MechanismHow
Per-step retryTenacity-based with exponential backoff. Configurable per step.
Crash recoveryState persisted after each step. resume_workflow() reconstructs from DB.
IdempotencyEvery action keyed: email:listing_id:recipient:payload_hash. Duplicates blocked.
ReconciliationPost-crash: detects sent email, marks completed — no duplicate send.

Concurrency

Steps processing multiple items run in parallel with a configurable semaphore (max_concurrent_enrichments, default 4). Applies to: calculate_commutes, research_neighbourhoods, generate_qualitative_evaluations.


Audit Trail

Every transition, agent call, security event, and external action produces an audit event:

Event TypeActorTrigger
workflow.startedSYSTEMWorkflow creation
workflow.step_startedDETStep begins
workflow.step_completedDET / AGENTStep finishes
security.suspicious_contentDETPrompt injection detected
action.createdSYSTEMPending action created
action.approvedHUMANUser approves
action.executedTOOLExternal action complete
action.duplicate_preventedSYSTEMIdempotency check blocks duplicate
system.recoverySYSTEMCrash recovery reconciles