
Project Culper Dev Blog: Entity Resolution and the Underline System
This week on Project Culper, the work centered on two connected problems: how the backend resolves a detected span against the entity registry, and how the frontend communicates that result to the agent visually.
The entity resolution pipeline now runs in three tiers. When the NER service returns a detected span, the backend first checks for an exact name match against the entities table. If that fails, it checks the aliases JSONB column using PostgreSQL’s containment operator (@>). If that also fails, it runs a fuzzy match using pg_trgm trigram similarity with a 0.7 threshold, returning the top three candidates ranked by score. The result travels back to the frontend as one of exact, alias, fuzzy, or unknown, and that value drives what the agent sees next.
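To make the tier order concrete, here is a minimal in-memory sketch of the three-tier lookup in Python. It is not the production code: the Entity class, the resolve function, and the sample names are hypothetical, the real lookups run as SQL against the entities table, and the trigrams helper only approximates pg_trgm (it splits on whitespace rather than on all non-alphanumeric characters). It also assumes the exact and alias checks are case-sensitive and that the 0.7 threshold is inclusive.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # Hypothetical stand-in for a row in the entities table.
    name: str
    aliases: list[str] = field(default_factory=list)


def trigrams(text: str) -> set[str]:
    # Approximates pg_trgm: lowercase each word, pad it with two leading
    # spaces and one trailing space, then collect every 3-character window.
    grams: set[str] = set()
    for word in text.lower().split():
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    # Shared trigrams divided by total distinct trigrams.
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(span: str, registry: list[Entity],
            threshold: float = 0.7, limit: int = 3) -> tuple[str, list[Entity]]:
    # Tier 1: exact name match.
    for entity in registry:
        if entity.name == span:
            return "exact", [entity]
    # Tier 2: alias match (the in-memory analogue of a JSONB containment check).
    for entity in registry:
        if span in entity.aliases:
            return "alias", [entity]
    # Tier 3: fuzzy match, keeping the top candidates at or above the threshold.
    scored = [(similarity(span, entity.name), entity) for entity in registry]
    ranked = sorted((pair for pair in scored if pair[0] >= threshold),
                    key=lambda pair: pair[0], reverse=True)[:limit]
    if ranked:
        return "fuzzy", [entity for _, entity in ranked]
    return "unknown", []


registry = [
    Entity("Abraham Woodhull", aliases=["Samuel Culper"]),
    Entity("Setauket"),
]
print(resolve("Abraham Woodhul", registry)[0])
```

Each tier returns as soon as it finds a match, so a span that matches a name exactly never reaches the alias or fuzzy checks.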

The underline system was redesigned this week to reflect that distinction visually. When a span resolves to a known registry entity, the underline color is determined by the object type assigned to that entity: amber for persons, cyan for locations, green for vehicles, red for organizations. That color is pulled from the registry record, not guessed by the NER model. When a span does not resolve to anything in the registry, the underline renders as a purple-to-blue left-to-right gradient, which is the agent’s signal that something was detected but not yet catalogued. Choosing a gradient here was deliberate: it reads as “the system noticed something” rather than “you made a mistake,” which is a meaningful distinction in an environment where agents are writing quickly.
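The color rule can be sketched as a small lookup. The snippet below is illustrative only: the underline_style function, the object type keys, and the hex values are hypothetical placeholders for whatever the registry record and stylesheet actually hold, and it assumes a span with no registry entity is represented by None.

```python
from typing import Optional

# Hypothetical color values; the real ones come from the registry record.
TYPE_COLORS = {
    "person": "#FFBF00",        # amber
    "location": "#00FFFF",      # cyan
    "vehicle": "#008000",       # green
    "organization": "#FF0000",  # red
}

# Purple-to-blue, left to right, written as a CSS gradient string.
UNRESOLVED_GRADIENT = "linear-gradient(to right, purple, blue)"


def underline_style(object_type: Optional[str]) -> dict:
    # No registry entity: show the "detected but not yet catalogued" gradient.
    if object_type is None:
        return {"kind": "gradient", "value": UNRESOLVED_GRADIENT}
    # Known entity: the color follows the registry's object type.
    # An unrecognized type raises KeyError rather than guessing a color.
    return {"kind": "solid", "value": TYPE_COLORS[object_type]}


print(underline_style("location"))
print(underline_style(None))
```

Keeping the mapping keyed on the registry's object type, rather than on the NER label, mirrors the point above that the color is never guessed by the model.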
The entry_entity_spans table was also updated this week to replace implicit confirmation state with an explicit status field that takes one of four values: detected, confirmed, linked, or rejected. Rejected spans become training negatives for the NER model.
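As a rough illustration of how the explicit status might be consumed, the sketch below models the four values as a Python enum and filters out the rejected spans. The SpanStatus, Span, and training_negatives names are hypothetical, and the real table presumably carries more columns than the two shown here.

```python
from dataclasses import dataclass
from enum import Enum


class SpanStatus(str, Enum):
    # The four status values named in the post.
    DETECTED = "detected"
    CONFIRMED = "confirmed"
    LINKED = "linked"
    REJECTED = "rejected"


@dataclass
class Span:
    # Hypothetical, trimmed-down view of an entry_entity_spans row.
    text: str
    status: SpanStatus


def training_negatives(spans: list[Span]) -> list[str]:
    # Only rejected spans are collected as negative examples.
    return [span.text for span in spans if span.status is SpanStatus.REJECTED]


spans = [
    Span("Setauket", SpanStatus.LINKED),
    Span("the harbor", SpanStatus.REJECTED),
    Span("Woodhull", SpanStatus.DETECTED),
]
print(training_negatives(spans))
```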