
Building Intelligence Import: How Project Culper Handles Legacy Data
One of the hardest problems in deploying any intelligence platform isn’t building the new system. It’s bringing years of existing data into it without losing the human context behind every record.
This week we shipped self-service data import for Project Culper. The import pipeline accepts CSV exports from legacy systems like iTrak and maps them to Culper’s normalized schema through a three-step process: file upload, column mapping, and commit. Before writing a single row, the system reads only the file headers and runs fuzzy string matching to suggest mappings automatically. An officer who exports “report_body” from their old system sees it pre-matched to Narrative. “incident_date” maps to Date/Time. They confirm, adjust if needed, and import.
The trickier problem was historical attribution. Legacy entries reference officers who may no longer work at the organization. Rather than orphan those records or assign them to a generic “imported” user, Culper creates ghost users: real rows in the users table with attribution-only accounts. James Smith’s 2019 reports still say James Smith wrote them. He just can’t log in.
On the intelligence extraction side, we replaced a rigid state machine in the backend with a simpler interaction model. When the NER model highlights a name in an entry, analysts interact naturally: confirm a known entity, create a new one, or dismiss a false positive. Uninteracted highlights generate no training signal at all, which keeps the per-tenant model honest.