Spec map — the foundation everything routes through


This skill contains the complete conceptual architecture for Politogy VRM’s foundational data system. It defines how voter data enters, flows through, and is controlled within VRM. It is the single source of truth for all data-related engineering decisions.

When building from this spec: This document tells you HOW THE SYSTEM THINKS. For specific field names, data types, and schema details, read references/field-schema.md. For brand/design standards, use the politogy-vrm-brand skill.


1. Core Identity: VRM Is a Data Company

VRM is not a campaign management company. VRM is a big data company. The campaign management features — Relationship Mode, Campaign Mode, Petition Mode — are the hooks that bring users onto the platform. The data those users generate, combined with voter file data and third-party enrichment, is the primary asset and long-term value of the company.

Every design decision must serve the collection, enrichment, and control of that data. The system has two masters: it must be genuinely excellent campaign software (because usage generates data), and it must be world-class data infrastructure that collects, normalizes, and controls access to every piece of information flowing through it.


2. The Unified Voter Profile (UVP)

Everything in VRM orbits the Unified Voter Profile (UVP) — the canonical representation of one voter. Every feature, every mode, every interaction across every customer account writes data back to a UVP.

A UVP is Politogy’s record, not the customer’s. Customers interact with voter profiles through the software, but the underlying UVP belongs to Politogy. Customers see a curated view. Politogy sees everything.

2.1 Data Layers

A UVP is composed of nine data layers. Each has a defined source, update cadence, and visibility rules.

| Layer | Contents | Source | Default Visibility |
|---|---|---|---|
| Identity Core | Legal name, DOB, voter reg ID, party, reg status, reg date | State voter file | All users |
| Contact Info | Mailing address, residential address, phone(s), email(s) | Voter file + enrichment + user input | Controlled per field |
| Geographic | County, precinct, legislative districts, city/ward, school district, congressional district | Voter file + geocoding | All users |
| Vote History | Elections participated, method (mail/in-person/drop-off), ballot status | State voter file | All users |
| Enrichment | Estimated demographics, household composition, consumer data, social handles, donor history | Third-party providers | Controlled per field |
| Relationship Data | Tags, notes, contact log, relationship score, custom fields | Customer input (Relationship Mode) | Originating account only |
| Campaign Data | Walk/call lists, doors knocked, calls, survey responses, canvass outcomes | Customer input (Campaign Mode) | Originating account only |
| Petition Data | Signature status, petition assignments, follow-up, verification | Customer input (Petition Mode) | Originating account only |
| Aggregate Intelligence | Cross-account behavioral patterns, sentiment scoring, propensity models, engagement trends | Politogy-computed from all inputs | Politogy internal only (sellable asset) |

The Aggregate Intelligence layer is what makes Politogy a data company. It’s computed from patterns across ALL customer accounts. No single customer can replicate it.

2.2 Data Priority Rules

When the same field has values from multiple layers, conflicts resolve by strict priority:

  1. User-entered data wins. User input becomes the display value. Previous values are preserved as alternates, never deleted. User corrections are also signal: if a user changes a phone number, the voter file’s number may be stale across the board.
  2. Voter file is the baseline. Authoritative for registration-level data (party, status, districts, vote history) but secondary to user input for contact info. Never overwritten: when a user value outranks it, the voter file value is kept as an alternate.
  3. Enrichment fills gaps. Only appears when no higher-priority source has a value. Recedes to alternate when a user or voter file value arrives.

The user never thinks about this hierarchy. They see the best available value. The system tracks the full provenance stack.
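A minimal Python sketch of how the display value could be picked from the provenance stack for a contact field. The names `SOURCE_PRIORITY`, `FieldValue`, and `resolve` are hypothetical, and breaking ties within one source by the newest timestamp is an assumption, not something the spec states.

```python
from __future__ import annotations

from dataclasses import dataclass

# Lower number = higher priority (assumed encoding of the rules above).
SOURCE_PRIORITY = {"user": 0, "voter_file": 1, "enrichment": 2}


@dataclass(frozen=True)
class FieldValue:
    value: str
    source: str       # "user", "voter_file", or "enrichment"
    recorded_at: int  # illustrative timestamp, e.g. epoch seconds


def resolve(values: list[FieldValue]) -> tuple[FieldValue | None, list[FieldValue]]:
    """Return (display value, alternates). No value is ever dropped."""
    if not values:
        return None, []
    # Best source first; within one source, newest first (assumption).
    ranked = sorted(values, key=lambda v: (SOURCE_PRIORITY[v.source], -v.recorded_at))
    return ranked[0], ranked[1:]


phones = [
    FieldValue("555-0100", "voter_file", 100),
    FieldValue("555-0111", "enrichment", 200),
    FieldValue("555-0199", "user", 150),
]
display, alternates = resolve(phones)
```

Here the user-entered number becomes the display value even though the enrichment value is newer, and both other values survive as alternates.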

2.3 Provenance Tracking

Every value on a UVP carries metadata:

  • Source: Voter file, enrichment provider name, user ID + account ID, or Politogy-computed
  • Timestamp: When recorded or last confirmed
  • Confidence: For enrichment data, the provider’s confidence score
  • Superseded by: If overridden, which value replaced it and when

Provenance is internal infrastructure. Customers don’t see it. It makes VRM’s data defensible and sellable.
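One possible shape for that metadata, sketched in Python. The class and attribute names are illustrative only (for canonical field names see references/field-schema.md), and the source string format `user:42@acct:7` is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Provenance:
    source: str                              # voter file, provider name, user + account, or computed
    recorded_at: datetime                    # when recorded or last confirmed
    confidence: Optional[float] = None       # provider score, enrichment data only
    superseded_by: Optional[str] = None      # source of the replacing value, if overridden
    superseded_at: Optional[datetime] = None


@dataclass
class TrackedValue:
    value: str
    provenance: Provenance


def supersede(old: TrackedValue, new: TrackedValue) -> None:
    """Mark `old` as overridden by `new`; the old value itself is kept."""
    old.provenance.superseded_by = new.provenance.source
    old.provenance.superseded_at = new.provenance.recorded_at


now = datetime.now(timezone.utc)
file_phone = TrackedValue("555-0100", Provenance("voter_file", now))
user_phone = TrackedValue("555-0199", Provenance("user:42@acct:7", now))
supersede(file_phone, user_phone)
```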


3. Data Ingestion

Data enters VRM through three channels.

3.1 State Voter File Import

Foundational data load. Bulk dataset from Secretary of State. Populates Identity Core, Contact Info, Geographic, and Vote History layers.

This is a Politogy-tier operation. Customers never import voter files directly. Politogy ingests the file, processes it against the master UVP database, and results flow to customer accounts based on geographic scope.

  • Matching: Primary key = voter registration ID. Secondary = name + DOB + address.
  • Merge rules: Updates Identity Core, Contact Info, Geographic, Vote History. NEVER touches Relationship, Campaign, Petition, or Aggregate Intelligence.
  • Versioning: Every import is versioned with prior-state snapshots. Diffable.
  • Signal detection: When voter file contradicts user-entered data, the discrepancy is flagged for Politogy review. User value stays primary.
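The matching and merge rules can be sketched as follows. This is a simplified illustration: the dictionary layout, layer keys, and function names are assumptions, and versioning and signal detection are left out.

```python
from __future__ import annotations

# Layers a voter file import may update (per the merge rules above).
VOTER_FILE_LAYERS = {"identity_core", "contact_info", "geographic", "vote_history"}


def secondary_key(rec: dict) -> tuple:
    """Fallback match key: name + DOB + address, compared case-insensitively."""
    return (rec["name"].strip().lower(), rec["dob"], rec["address"].strip().lower())


def find_match(row: dict, profiles: list[dict]) -> dict | None:
    """Match on voter registration ID first, then on the secondary key."""
    for profile in profiles:
        if profile["reg_id"] == row["reg_id"]:
            return profile
    for profile in profiles:
        if secondary_key(profile) == secondary_key(row):
            return profile
    return None


def merge_import(profile: dict, row: dict) -> None:
    """Apply only voter-file layers; every other layer is left untouched."""
    for layer, fields in row["layers"].items():
        if layer in VOTER_FILE_LAYERS:
            profile["layers"].setdefault(layer, {}).update(fields)


profile = {
    "reg_id": "OR-1", "name": "Ann Lee", "dob": "1980-01-01", "address": "1 Main St",
    "layers": {"identity_core": {"party": "NAV"}, "relationship": {"notes": "met at fair"}},
}
row = {
    "reg_id": "OR-9", "name": "ann lee", "dob": "1980-01-01", "address": "1 MAIN ST",
    "layers": {"identity_core": {"party": "DEM"}, "relationship": {"notes": "SHOULD NOT APPLY"}},
}
match = find_match(row, [profile])
if match is not None:
    merge_import(match, row)
```

The incoming row matches on the secondary key because its registration ID differs, and the stray Relationship data in the row is ignored.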

3.2 User-Generated Data

The richest data source. Every note, tag, door knock, survey response, and contact update flows into the UVP. Always immediate — no batch processing. Visible to same-account users instantly. Invisible to other accounts. Visible to Politogy always.

User-generated data is the highest-value input because it represents real human interaction with real voters. A canvasser logging sentiment is a data point no voter file or enrichment provider can deliver.

Conflict behavior: User edits promote to primary; voter file value demotes to alternate, never deleted. Cross-account convergence (multiple accounts updating same voter’s phone to same new value) is strong signal.
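Cross-account convergence could be detected roughly like this. The function name, the tuple layout of an update, and the default threshold of two accounts are all assumptions for illustration.

```python
from collections import defaultdict


def convergent_values(updates, min_accounts=2):
    """Find values that several distinct accounts independently set on the same field.

    `updates` is an iterable of (account_id, voter_id, field, new_value).
    Returns {(voter_id, field, new_value): number_of_distinct_accounts}.
    """
    accounts = defaultdict(set)
    for account_id, voter_id, field, value in updates:
        accounts[(voter_id, field, value)].add(account_id)
    return {key: len(accts) for key, accts in accounts.items() if len(accts) >= min_accounts}


updates = [
    ("acct-1", "v1", "phone", "555-0199"),
    ("acct-2", "v1", "phone", "555-0199"),
    ("acct-1", "v1", "phone", "555-0199"),  # repeat from the same account counts once
    ("acct-3", "v2", "phone", "555-0000"),
]
signals = convergent_values(updates)
```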

3.3 Third-Party Enrichment

Augments UVPs with external data: phones, emails, demographics, consumer data. Politogy-tier operation. Fills gaps, never overwrites. Carries confidence score and last-enriched timestamp. Politogy controls which enrichment fields customers see.


4. The Two-Tier Data Architecture

Most important architectural concept in VRM.

| Dimension | Politogy Tier (Internal) | Customer Tier (Software Users) |
|---|---|---|
| Who | Politogy employees, backend admins | Campaign staff, consultants, field volunteers |
| Sees | ALL data, ALL accounts, ALL layers | Only what Politogy chooses to expose, within account scope |
| Controls | Exposure settings, field visibility, enrichment, imports, global schema | Their own tags, notes, contact logs, custom fields, team permissions |
| Owns | The UVP and all data in it | Licenses access to a curated view |

4.1 Data Exposure Control System

Politogy admins control, in real time, which data fields are visible to customer-tier users. Field-level granularity:

| Exposure Level | Behavior |
|---|---|
| Always Visible | Shown to all customer-tier users in all accounts |
| Tier-Gated | Visible only to customers on specific subscription tiers |
| Account-Specific | Visible only to explicitly authorized accounts |
| Internal Only | Never shown to any customer-tier user |
| Embargoed | Temporarily hidden pending review or QA |

Real-time switching required. No deployments, no cache flushes. Customer-tier users never see evidence of hidden fields — no blank fields, no lock icons. Their view is seamless.
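A sketch of field-level exposure filtering, assuming rules are looked up on every read so that a change takes effect on the next request. The `Exposure` enum mirrors the five levels above; the rule dictionary layout, the tier names, and the choice to hide fields that have no rule are assumptions.

```python
from enum import Enum


class Exposure(Enum):
    ALWAYS_VISIBLE = "always_visible"
    TIER_GATED = "tier_gated"
    ACCOUNT_SPECIFIC = "account_specific"
    INTERNAL_ONLY = "internal_only"
    EMBARGOED = "embargoed"


def is_visible(rule: dict, account_id: str, tier: str) -> bool:
    level = rule["level"]
    if level is Exposure.ALWAYS_VISIBLE:
        return True
    if level is Exposure.TIER_GATED:
        return tier in rule.get("tiers", ())
    if level is Exposure.ACCOUNT_SPECIFIC:
        return account_id in rule.get("accounts", ())
    return False  # INTERNAL_ONLY and EMBARGOED are never shown


def customer_view(profile: dict, rules: dict, account_id: str, tier: str) -> dict:
    """Hidden fields are omitted entirely, so no blanks or lock icons can leak."""
    return {
        field: value
        for field, value in profile.items()
        if field in rules and is_visible(rules[field], account_id, tier)
    }


rules = {
    "first_name": {"level": Exposure.ALWAYS_VISIBLE},
    "donor_history": {"level": Exposure.TIER_GATED, "tiers": {"premium"}},
    "propensity": {"level": Exposure.INTERNAL_ONLY},
}
profile = {"first_name": "Ann", "donor_history": "$250", "propensity": 0.8}
basic_view = customer_view(profile, rules, "acct-1", "basic")
premium_view = customer_view(profile, rules, "acct-1", "premium")
```

Because the view is built from the current rules on each call, switching `donor_history` to `Exposure.EMBARGOED` removes it from the very next view without touching stored data.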

4.2 Data Monetization

Revenue from data that customers can’t build alone:

  • Aggregated Sentiment Data: Cross-account canvass sentiment aggregated per voter
  • Behavioral Propensity Scores: Likelihood to vote, respond to door knock, switch parties, sign petition
  • Contact Freshness: Multi-account confirmation scoring
  • Enrichment Passthrough: Wholesale enrichment resold at field-level margin

Compliance: ToS and privacy policy must state that anonymized/aggregated data may be used commercially. Individual account data never shared with other customers in identifiable form.


5. The Field System

Balances data cleanliness (valuable at scale) with user flexibility (keeps data flowing).

5.1 Global Fields (Politogy-Controlled)

Canonical data fields defined by Politogy. Users cannot rename, delete, or redefine them.

  • Locked canonical name (e.g., voter_first_name), display label, data type, validation rules
  • Display label adjustable by Politogy; canonical name never changes
  • Appear in every customer account automatically
  • Users CAN write to global fields (subject to priority rules). Cannot change definition.

5.2 User Custom Fields

Users create their own fields for campaign-specific needs. Essential for flexibility.

  • Scoped to the creating account. Other accounts can’t see them.
  • User defines name, data type (text, number, date, yes/no, dropdown, multi-select), optional description
  • Stored in separate namespace from global fields, tagged as user-generated with full provenance
  • Visible to Politogy even though scoped to one account. Politogy sees all custom fields across all accounts.

5.3 Field Promotion: Custom → Global

The feedback loop that makes VRM’s schema smarter over time. Politogy monitors custom field usage across all accounts. When patterns emerge (many accounts creating the same field), Politogy promotes it to global.

Promotion Process:

  1. Detection: Dashboard shows custom fields across accounts, grouped by similarity. System flags candidates at threshold (e.g., 10+ accounts with similar name/type).
  2. Review: Politogy admin defines canonical name, display label, data type, validation.
  3. Migration: Existing custom field data maps into new global field. Custom fields retired and replaced.
  4. Rollout: New global field appears in ALL accounts going forward.

Example: 35 accounts create “preferred name” / “nickname” / “goes by” → Politogy promotes to voter_preferred_name globally.
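The detection step might look like the sketch below. The hard-coded `SYNONYMS` map is a hypothetical stand-in for whatever similarity grouping the dashboard actually uses, and the default threshold of 10 accounts comes from the example threshold in step 1.

```python
from collections import defaultdict

# Hypothetical synonym map; real similarity grouping would be more sophisticated.
SYNONYMS = {"nickname": "preferred name", "goes by": "preferred name"}


def normalize(name: str) -> str:
    """Lowercase, treat underscores as spaces, collapse whitespace, apply synonyms."""
    cleaned = " ".join(name.lower().replace("_", " ").split())
    return SYNONYMS.get(cleaned, cleaned)


def promotion_candidates(custom_fields, threshold=10):
    """Flag (normalized name, data type) groups used by `threshold`+ distinct accounts.

    `custom_fields` is an iterable of (account_id, field_name, data_type).
    """
    accounts = defaultdict(set)
    for account_id, name, data_type in custom_fields:
        accounts[(normalize(name), data_type)].add(account_id)
    return sorted((key, len(accts)) for key, accts in accounts.items() if len(accts) >= threshold)


variants = ["Preferred Name", "nickname", "Goes_By"]
fields = [(f"acct-{i}", variants[i % 3], "text") for i in range(12)]
fields += [(f"acct-{i}", "yard sign", "yes/no") for i in range(3)]
candidates = promotion_candidates(fields)
```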

Field promotion is one-way. Once global, always global. Data subject to same exposure controls as any other global field.

5.4 Why This Matters

Without this system: “First Name,” “Name,” “First,” “first_name,” “FName” — ungovernable and unsellable. The solution: let users create freely, watch what they create, promote valuable patterns into canonical schema. Custom field namespace = sandbox. Global field schema = production. Promotion = quality gate.


6. Cross-Mode Data Flow

Three modes — Relationship, Campaign, Petition — are different lenses on the SAME UVP. Actions in one mode are immediately visible in all others.

NON-NEGOTIABLE: There is ONE voter profile. Modes are views, not copies. If code duplicates voter data between modes, the architecture is wrong. Every mode reads from and writes to the same UVP.


7. Search, Filter & Segmentation

7.1 Search

Accessible from any screen. Searches all visible UVP fields. Sub-second response against full database. Handles fuzzy matching. Results respect access scope.

7.2 Filtering & List Building

Filterable dimensions: Geographic, Registration, Vote History, Relationship, Campaign, Petition, Custom Fields. Filters stack (AND default, OR available). Saved filters = living segments with dynamic membership.
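Stacked filters and living segments can be sketched with composable predicates. All names here are illustrative; the key assumption is that a saved segment stores its filter rather than a member list, so membership is recomputed on every read.

```python
def field_equals(field, expected):
    """Build a filter that checks one field for an exact value."""
    return lambda voter: voter.get(field) == expected


def all_of(*filters):
    """AND stacking (the default)."""
    return lambda voter: all(f(voter) for f in filters)


def any_of(*filters):
    """OR stacking."""
    return lambda voter: any(f(voter) for f in filters)


class SavedSegment:
    """Stores the filter, not the members, so membership stays dynamic."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate

    def members(self, voters):
        return [v["id"] for v in voters if self.predicate(v)]


voters = [
    {"id": 1, "county": "Lane", "party": "DEM"},
    {"id": 2, "county": "Lane", "party": "REP"},
    {"id": 3, "county": "Polk", "party": "DEM"},
]
lane_dems = SavedSegment("Lane Dems", all_of(field_equals("county", "Lane"), field_equals("party", "DEM")))
before = lane_dems.members(voters)
voters[1]["party"] = "DEM"  # a status change lands on the profile
after = lane_dems.members(voters)
```

The segment picks up voter 2 after the party change without being rebuilt.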


8. Data Quality & Hygiene

Automated Signals

  • Stale record detection (12+ months without voter file update)
  • USPS address validation
  • Duplicate detection with user-assisted merge
  • Status change alerts (party switches, active→inactive)
  • Cross-account consistency checks (conflicting user data across accounts)

Data Health Dashboard

Two versions: Customer-tier (simplified — totals, completeness, last refresh). Politogy-tier (full — cross-account conflicts, enrichment coverage, field promotion candidates, quality scores, audit logs).


9. Data Lifecycle & Retention

Record States

  1. Active: Current registration. Default working state.
  2. Inactive: Registration lapsed/suspended. Retained, excluded from default lists.
  3. Deceased: Flagged via voter file or enrichment. Archived, never deleted.
  4. Moved Out of Jurisdiction: Address change outside scope. Treated like Inactive.
  5. Merged: Duplicate consolidated. Pointer to surviving record.
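A small sketch of the record states and the merged-record pointer. The enum values and the `merged_into` key are hypothetical, and treating Deceased and Merged records as excluded from default lists is an assumption; the spec only states this explicitly for Inactive and Moved Out of Jurisdiction.

```python
from enum import Enum


class RecordState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    DECEASED = "deceased"
    MOVED_OUT = "moved_out_of_jurisdiction"
    MERGED = "merged"


def surviving_record(record_id, records):
    """Follow merge pointers until a non-merged record is reached."""
    seen = set()
    while records[record_id]["state"] is RecordState.MERGED:
        if record_id in seen:
            raise ValueError("merge pointer cycle")
        seen.add(record_id)
        record_id = records[record_id]["merged_into"]
    return record_id


def default_list(records):
    """Only Active records appear by default; nothing is deleted."""
    return [rid for rid, rec in records.items() if rec["state"] is RecordState.ACTIVE]


records = {
    "a": {"state": RecordState.MERGED, "merged_into": "b"},
    "b": {"state": RecordState.MERGED, "merged_into": "c"},
    "c": {"state": RecordState.ACTIVE},
    "d": {"state": RecordState.INACTIVE},
}
survivor = surviving_record("a", records)
```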

Retention Rules

  • No voter record is ever hard-deleted.
  • User-generated data retained indefinitely, tied to UVP.
  • Churned customer account data retained by Politogy. Continues feeding Aggregate Intelligence.

10. Import & Export

Customer-Tier Import

Supplemental CSVs, walk list results, donor lists, petition sheets. Match against existing UVPs. Unmatched records flagged, never auto-created.

Customer-Tier Export

Filtered segments to CSV/PDF. Respects role + geographic scope. Logged. Contains ONLY customer-visible data.

Politogy-Tier Extraction

Full UVP exports, cross-account aggregates, enrichment audits, field usage analytics. Basis for data sales.


11. AI & Intelligent Data Operations

AI transforms raw data into the Aggregate Intelligence layer. The data architecture in this spec is designed to feed AI from day one.

Principle: Build the tracks, not the train. The AI models are the train. The data architecture is the tracks. Clean normalized data, full provenance, cross-account visibility, a dedicated computed-data layer, and an event stream = tracks any AI model can run on.

11.1 AI-Ready Architecture Requirements

The following properties make the data AI-ready:

  • Clean normalized data: Global field system ensures consistent structure across all accounts.
  • Full provenance: Every data point carries source, timestamp, confidence. AI weights inputs by reliability.
  • Cross-account visibility: Politogy trains models on ALL accounts simultaneously. This is the data moat.
  • Aggregate Intelligence container: Schema-flexible layer accepting key-value pairs with full provenance metadata. Same exposure controls as any other layer.
  • Exposure-controlled output: AI-generated fields go through the same Data Exposure Control System. Monetization and AI infrastructure are the same system.

11.2 Planned AI Capabilities (Priority Order)

  1. Custom Field Normalization & Promotion Detection: AI clusters semantically similar custom fields across accounts. “2A stance” = “gun rights position”? Surfaces promotion candidates with confidence scores.
  2. Sentiment Extraction from Free Text: Extracts structured sentiment from canvass notes. “Very concerned about property taxes, will vote for anyone who cuts them” → issue: property taxes, position: wants cuts, intensity: strong. Feeds Aggregate Intelligence.
  3. Data Quality & Anomaly Detection: Likely duplicates (Robert vs. Bob), phone numbers on too many records, addresses geocoding to commercial buildings, user-vs-voter-file divergence patterns.
  4. Voter Propensity Scoring: Likelihood to vote, respond to door knock, switch parties, sign petition, estimated persuadability by issue. The premium sellable data product.
  5. Contact Freshness Scoring: Multi-account interaction history scores contact field accuracy. Recently-confirmed phone > three-year-old voter file phone.
  6. Predictive Segmentation: AI identifies natural voter clusters and suggests segments users haven’t built. Drives engagement, generates more data.

11.3 How AI Writes to the UVP

AI-generated values stored in Aggregate Intelligence layer with:

  • Source: Specific model + version (e.g., sentiment-extractor-v2.1)
  • Inputs: Which data points the model consumed, with lineage
  • Confidence: Model confidence score
  • Timestamp: When last computed (staleness clock)
  • Refresh cadence: Defined recomputation schedule per capability

AI values follow same priority rules — never overwrite user-entered or voter file data. Subject to same exposure controls.
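One way an AI-generated value and its staleness clock could be represented. The class, its attribute names, and the example key and input identifier are hypothetical; only the model string is taken from the example above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ComputedValue:
    key: str
    value: float
    model: str                # specific model + version
    inputs: tuple[str, ...]   # identifiers of the data points the model consumed
    confidence: float
    computed_at: datetime     # starts the staleness clock
    refresh_every: timedelta  # recomputation schedule for this capability

    def is_stale(self, now: datetime) -> bool:
        """True once the refresh interval has elapsed since the last computation."""
        return now - self.computed_at >= self.refresh_every


score = ComputedValue(
    key="sentiment.property_taxes",
    value=0.9,
    model="sentiment-extractor-v2.1",
    inputs=("canvass_note:123",),
    confidence=0.81,
    computed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    refresh_every=timedelta(days=7),
)
```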

11.4 Infrastructure to Build Now (Pre-AI)

These must be in the foundational architecture from day one:

  1. Aggregate Intelligence layer on the UVP: Schema-flexible, key-value with provenance, same exposure controls.
  2. Computed field data type: Read-only for all users (including Politogy admins). Only writable by system processes.
  3. Event stream / change log: Every UVP write emits an event for downstream AI consumers. Enables real-time model updates without full database scans.
  4. Custom field metadata API: Surfaces all custom fields across all accounts with usage counts, creation dates, sample values. Input for field normalization AI.
  5. Model output versioning: When models update, all affected values recompute. Old values archived, not deleted. Track which model version generated which values.
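The event stream in item 3 can be illustrated with an in-memory store where every write appends to a change log and notifies subscribers. `UVPStore` and its methods are hypothetical; a production system would presumably use a durable log or message queue rather than Python lists.

```python
from collections import defaultdict


class UVPStore:
    """Toy store: every write emits an event that downstream consumers can read."""

    def __init__(self):
        self._fields = defaultdict(dict)  # voter_id -> {field: value}
        self._log = []                    # append-only change log
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def write(self, voter_id, field, value, source):
        event = {
            "seq": len(self._log) + 1,
            "voter_id": voter_id,
            "field": field,
            "old": self._fields[voter_id].get(field),
            "new": value,
            "source": source,
        }
        self._fields[voter_id][field] = value
        self._log.append(event)
        for handler in self._subscribers:
            handler(event)
        return event

    def events_since(self, seq):
        """Let a consumer catch up from its last seen sequence number, no full scan."""
        return [e for e in self._log if e["seq"] > seq]


store = UVPStore()
seen = []
store.subscribe(seen.append)
store.write("v1", "phone", "555-0100", "voter_file")
store.write("v1", "phone", "555-0199", "user:42@acct:7")
```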

12. Permissions Summary

| Role | Tier | Can View | Can Edit | Can Import/Export |
|---|---|---|---|---|
| Politogy Super Admin | Politogy | Everything across all accounts | All data, controls, exposure, schema | Full extraction + import |
| Politogy Analyst | Politogy | All data, read-only | Nothing | Extraction only |
| Account Admin | Customer | All exposed data within account | User-generated fields, custom fields, team perms | Import + export within scope |
| Account Manager | Customer | Exposed data within assigned geography | Relationship + Campaign + Petition data | Export within scope |
| Field User | Customer | Voters on assigned lists only | Contact logs, surveys, canvass results | No |
| Viewer | Customer | Read-only within assigned geography | Nothing | No |
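A deliberately coarse sketch of the table as a lookup. It reduces each cell to a yes/no flag and does not model scope (account, geography, assigned lists) or which fields a role may edit; the role keys and the `can` helper are hypothetical.

```python
# Coarse yes/no flags derived from the permissions table; scope is not modeled.
PERMISSIONS = {
    "politogy_super_admin": {"tier": "politogy", "edit": True, "import": True, "export": True},
    "politogy_analyst": {"tier": "politogy", "edit": False, "import": False, "export": True},
    "account_admin": {"tier": "customer", "edit": True, "import": True, "export": True},
    "account_manager": {"tier": "customer", "edit": True, "import": False, "export": True},
    "field_user": {"tier": "customer", "edit": True, "import": False, "export": False},
    "viewer": {"tier": "customer", "edit": False, "import": False, "export": False},
}


def can(role: str, action: str) -> bool:
    """Return whether `role` may perform `action` ("edit", "import", or "export")."""
    return bool(PERMISSIONS[role][action])


def roles_allowed(action: str) -> list[str]:
    """List every role that may perform `action`, in table order."""
    return [role for role, perms in PERMISSIONS.items() if perms[action]]


importers = roles_allowed("import")
```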

13. Summary Mental Model

VRM is a data company that distributes campaign software. State voter files create the foundation — millions of UVPs owned by Politogy. Each profile is enriched with third-party data. Customers interact through three modes, generating the highest-value data: real human interactions with real voters. User data always takes priority (freshest intel). Politogy watches everything — aggregating sentiment, detecting patterns, scoring propensities, promoting custom fields into canonical schema. AI transforms raw intelligence into structured, scored, sellable data products no single campaign could build alone.

The data exposure system controls what customers see. Basic accounts get essentials. Premium accounts get richer data. National campaigns pay top dollar for Aggregate Intelligence. Every piece of data, from every source, from every account, flows into the UVP — the single source of truth that grows more valuable with every interaction.

One platform. One login. Total control. And underneath it all, the most comprehensive voter data asset in the country.