Spec map — the foundation everything routes through
Home: All V2 Specs · Schema: Field Schema
Consumed by: Tribes, Contacts, Polling Engine, Voter Contact, Forms, Surveys
This skill contains the complete conceptual architecture for Politogy VRM’s foundational data system. It defines how voter data enters, flows through, and is controlled within VRM. It is the single source of truth for all data-related engineering decisions.
When building from this spec: This document tells you HOW THE SYSTEM THINKS. For specific field names, data types, and schema details, read references/field-schema.md. For brand/design standards, use the politogy-vrm-brand skill.
1. Core Identity: VRM Is a Data Company
VRM is not a campaign management company. VRM is a big data company. The campaign management features — Relationship Mode, Campaign Mode, Petition Mode — are the hooks that bring users onto the platform. The data those users generate, combined with voter file data and third-party enrichment, is the primary asset and long-term value of the company.
Every design decision must serve the collection, enrichment, and control of that data. The system has two masters: it must be genuinely excellent campaign software (because usage generates data), and it must be world-class data infrastructure that collects, normalizes, and controls access to every piece of information flowing through it.
2. The Unified Voter Profile (UVP)
Everything in VRM orbits the Unified Voter Profile (UVP) — the canonical representation of one voter. Every feature, every mode, every interaction across every customer account writes data back to a UVP.
A UVP is Politogy’s record, not the customer’s. Customers interact with voter profiles through the software, but the underlying UVP belongs to Politogy. Customers see a curated view. Politogy sees everything.
2.1 Data Layers
A UVP is composed of nine data layers. Each has a defined source, update cadence, and visibility rules.
| Layer | Contents | Source | Default Visibility |
|---|---|---|---|
| Identity Core | Legal name, DOB, voter reg ID, party, reg status, reg date | State voter file | All users |
| Contact Info | Mailing address, residential address, phone(s), email(s) | Voter file + enrichment + user input | Controlled per field |
| Geographic | County, precinct, legislative districts, city/ward, school district, congressional district | Voter file + geocoding | All users |
| Vote History | Elections participated, method (mail/in-person/drop-off), ballot status | State voter file | All users |
| Enrichment | Estimated demographics, household composition, consumer data, social handles, donor history | Third-party providers | Controlled per field |
| Relationship Data | Tags, notes, contact log, relationship score, custom fields | Customer input (Relationship Mode) | Originating account only |
| Campaign Data | Walk/call lists, doors knocked, calls, survey responses, canvass outcomes | Customer input (Campaign Mode) | Originating account only |
| Petition Data | Signature status, petition assignments, follow-up, verification | Customer input (Petition Mode) | Originating account only |
| Aggregate Intelligence | Cross-account behavioral patterns, sentiment scoring, propensity models, engagement trends | Politogy-computed from all inputs | Politogy internal only (sellable asset) |
The Aggregate Intelligence layer is what makes Politogy a data company. It’s computed from patterns across ALL customer accounts. No single customer can replicate it.
2.2 Data Priority Rules
When the same field has values from multiple layers, conflicts resolve by strict priority, listed here from highest to lowest:
- User-entered data wins. User input becomes the display value. Previous values are preserved as alternates, never deleted. User corrections are also signal: if a user changes a phone number, the voter file’s number may be stale across the board.
- Voter file is the baseline. Authoritative for registration-level data (party, status, districts, vote history) but secondary to user input for contact info. Never overwritten: when a user value takes priority, the voter file value is demoted to an alternate, not replaced.
- Enrichment fills gaps. Only appears when no higher-priority source has a value. Recedes to alternate when a user or voter file value arrives.
The user never thinks about this hierarchy. They see the best available value. The system tracks the full provenance stack.
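The resolution rule above can be sketched in a few lines. This is a minimal illustration for a contact field, not the actual implementation: the `FieldValue` type, the `SOURCE_RANK` table, and the tie-break on recency are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical ranks for a contact field; lower number wins.
SOURCE_RANK = {"user": 0, "voter_file": 1, "enrichment": 2}

@dataclass(frozen=True)
class FieldValue:
    value: str
    source: str      # "user", "voter_file", or "enrichment"
    recorded: date

def resolve(values: list[FieldValue]) -> tuple[FieldValue, list[FieldValue]]:
    """Return (display value, alternates). Assumes at least one value.

    Nothing is dropped: every value that loses becomes an alternate.
    Ties within a source go to the most recent value (an assumption).
    """
    ranked = sorted(
        values,
        key=lambda v: (SOURCE_RANK[v.source], -v.recorded.toordinal()),
    )
    return ranked[0], ranked[1:]

phone_values = [
    FieldValue("555-0100", "voter_file", date(2023, 1, 10)),
    FieldValue("555-0199", "enrichment", date(2024, 3, 2)),
    FieldValue("555-0142", "user", date(2024, 6, 5)),
]
display, alternates = resolve(phone_values)
print(display.value, [a.source for a in alternates])
```

The user sees only `display`; the alternates stay on the profile as part of the provenance stack.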
2.3 Provenance Tracking
Every value on a UVP carries metadata:
- Source: Voter file, enrichment provider name, user ID + account ID, or Politogy-computed
- Timestamp: When recorded or last confirmed
- Confidence: For enrichment data, the provider’s confidence score
- Superseded by: If overridden, which value replaced it and when
Provenance is internal infrastructure. Customers don’t see it. It makes VRM’s data defensible and sellable.
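One possible shape for that metadata, as a sketch only. The class name `ProvenancedValue` and the `supersede` helper are hypothetical; the actual field names belong in references/field-schema.md.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ProvenancedValue:
    value: str
    source: str                      # e.g. "voter_file", a provider name, "user:<id>@<account>"
    timestamp: datetime              # when recorded or last confirmed
    confidence: Optional[float] = None                    # enrichment data only
    superseded_by: Optional["ProvenancedValue"] = None    # which value replaced it
    superseded_at: Optional[datetime] = None              # and when

def supersede(old: ProvenancedValue, new: ProvenancedValue) -> ProvenancedValue:
    """Mark `old` as overridden by `new` without deleting it."""
    old.superseded_by = new
    old.superseded_at = new.timestamp
    return new

file_phone = ProvenancedValue("555-0100", "voter_file", datetime(2023, 1, 10))
user_phone = ProvenancedValue("555-0142", "user:u17@acct-3", datetime(2024, 6, 5))
current = supersede(file_phone, user_phone)
```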
3. Data Ingestion
Data enters VRM through three channels.
3.1 State Voter File Import
Foundational data load. Bulk dataset from Secretary of State. Populates Identity Core, Contact Info, Geographic, and Vote History layers.
This is a Politogy-tier operation. Customers never import voter files directly. Politogy ingests the file, processes it against the master UVP database, and results flow to customer accounts based on geographic scope.
- Matching: Primary key = voter registration ID. Secondary = name + DOB + address.
- Merge rules: Updates Identity Core, Contact Info, Geographic, Vote History. NEVER touches Relationship, Campaign, Petition, or Aggregate Intelligence.
- Versioning: Every import is versioned with prior-state snapshots. Diffable.
- Signal detection: When voter file contradicts user-entered data, the discrepancy is flagged for Politogy review. User value stays primary.
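A rough sketch of the matching and merge guard described above. The dictionary-based indexes, the `secondary_key` normalization, and the layer key names are assumptions made for illustration; versioning and signal detection are left out.

```python
# Layers a voter file import may update, per the merge rules above.
IMPORTABLE_LAYERS = {"identity_core", "contact_info", "geographic", "vote_history"}

def secondary_key(record: dict) -> tuple:
    """Hypothetical normalization for the name + DOB + address fallback."""
    return (
        record["name"].strip().lower(),
        record["dob"],
        record["address"].strip().lower(),
    )

def find_uvp(record: dict, by_reg_id: dict, by_secondary: dict):
    """Match on voter registration ID first, then on name + DOB + address."""
    uvp = by_reg_id.get(record["voter_reg_id"])
    if uvp is None:
        uvp = by_secondary.get(secondary_key(record))
    return uvp

def apply_import(uvp: dict, layer: str, values: dict) -> None:
    """Write imported values, refusing any layer the import must never touch."""
    if layer not in IMPORTABLE_LAYERS:
        raise PermissionError(f"voter file import may not write to {layer!r}")
    uvp.setdefault(layer, {}).update(values)

existing = {"identity_core": {"party": "IND"}, "relationship": {"tags": ["donor"]}}
by_reg_id = {"R-001": existing}
by_secondary = {("jane doe", "1980-04-02", "12 elm st"): existing}

# The registration ID is unknown here, so the secondary key is used instead.
incoming = {"voter_reg_id": "R-999", "name": " Jane Doe",
            "dob": "1980-04-02", "address": "12 Elm St "}
match = find_uvp(incoming, by_reg_id, by_secondary)
apply_import(match, "identity_core", {"party": "DEM"})
```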
3.2 User-Generated Data
The richest data source. Every note, tag, door knock, survey response, and contact update flows into the UVP. Always immediate — no batch processing. Visible to same-account users instantly. Invisible to other accounts. Visible to Politogy always.
User-generated data is the highest-value input because it represents real human interaction with real voters. A canvasser logging sentiment is a data point no voter file or enrichment provider can deliver.
Conflict behavior: User edits promote to primary; voter file value demotes to alternate, never deleted. Cross-account convergence (multiple accounts updating same voter’s phone to same new value) is strong signal.
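Cross-account convergence could be detected along these lines. The function name and the `min_accounts` threshold are illustrative assumptions; the spec does not define how many accounts count as "strong signal".

```python
from collections import defaultdict

def converging_values(updates, min_accounts=2):
    """Count distinct accounts per new value; keep values seen in enough accounts.

    `updates` is an iterable of (account_id, new_value) pairs for one field
    on one voter. Repeat edits from the same account count once.
    """
    accounts_by_value = defaultdict(set)
    for account_id, value in updates:
        accounts_by_value[value].add(account_id)
    return {
        value: len(accounts)
        for value, accounts in accounts_by_value.items()
        if len(accounts) >= min_accounts
    }

phone_updates = [
    ("acct-1", "555-0142"),
    ("acct-2", "555-0142"),
    ("acct-2", "555-0142"),   # same account again, not extra signal
    ("acct-3", "555-0100"),
]
signal = converging_values(phone_updates)
```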
3.3 Third-Party Enrichment
Augments UVPs with external data: phones, emails, demographics, consumer data. Politogy-tier operation. Fills gaps, never overwrites. Carries confidence score and last-enriched timestamp. Politogy controls which enrichment fields customers see.
4. The Two-Tier Data Architecture
Most important architectural concept in VRM.
| Dimension | Politogy Tier (Internal) | Customer Tier (Software Users) |
|---|---|---|
| Who | Politogy employees, backend admins | Campaign staff, consultants, field volunteers |
| Sees | ALL data, ALL accounts, ALL layers | Only what Politogy chooses to expose, within account scope |
| Controls | Exposure settings, field visibility, enrichment, imports, global schema | Their own tags, notes, contact logs, custom fields, team permissions |
| Owns | The UVP and all data in it | Licenses access to a curated view |
4.1 Data Exposure Control System
Politogy admins control, in real time, which data fields are visible to customer-tier users. Field-level granularity:
| Exposure Level | Behavior |
|---|---|
| Always Visible | Shown to all customer-tier users in all accounts |
| Tier-Gated | Visible only to customers on specific subscription tiers |
| Account-Specific | Visible only to explicitly authorized accounts |
| Internal Only | Never shown to any customer-tier user |
| Embargoed | Temporarily hidden pending review or QA |
Real-time switching required. No deployments, no cache flushes. Customer-tier users never see evidence of hidden fields — no blank fields, no lock icons. Their view is seamless.
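The exposure levels above might be evaluated at read time roughly as follows. This is a sketch under assumptions: `ExposureRule`, `customer_view`, the tier names, and the default of hiding any field without a rule are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Exposure(Enum):
    ALWAYS_VISIBLE = "always_visible"
    TIER_GATED = "tier_gated"
    ACCOUNT_SPECIFIC = "account_specific"
    INTERNAL_ONLY = "internal_only"
    EMBARGOED = "embargoed"

@dataclass(frozen=True)
class ExposureRule:
    level: Exposure
    tiers: frozenset = frozenset()      # used by TIER_GATED
    accounts: frozenset = frozenset()   # used by ACCOUNT_SPECIFIC

def is_visible(rule: ExposureRule, account_id: str, tier: str) -> bool:
    if rule.level is Exposure.ALWAYS_VISIBLE:
        return True
    if rule.level is Exposure.TIER_GATED:
        return tier in rule.tiers
    if rule.level is Exposure.ACCOUNT_SPECIFIC:
        return account_id in rule.accounts
    return False  # INTERNAL_ONLY and EMBARGOED are never shown

def customer_view(profile: dict, rules: dict, account_id: str, tier: str) -> dict:
    """Hidden fields are omitted entirely, so the view shows no gaps or locks.

    Fields with no rule are treated as hidden (an assumption).
    """
    return {
        name: value
        for name, value in profile.items()
        if name in rules and is_visible(rules[name], account_id, tier)
    }

profile = {"voter_first_name": "Jane", "donor_history": "3 gifts", "propensity_vote": 0.8}
rules = {
    "voter_first_name": ExposureRule(Exposure.ALWAYS_VISIBLE),
    "donor_history": ExposureRule(Exposure.TIER_GATED, tiers=frozenset({"premium"})),
    "propensity_vote": ExposureRule(Exposure.INTERNAL_ONLY),
}
basic = customer_view(profile, rules, "acct-1", "basic")
premium = customer_view(profile, rules, "acct-2", "premium")
```

Because the rules are looked up on every read, changing a rule takes effect on the next request with no redeploy, which is the behavior the real-time requirement asks for.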
4.2 Data Monetization
Revenue from data that customers can’t build alone:
- Aggregated Sentiment Data: Cross-account canvass sentiment aggregated per voter
- Behavioral Propensity Scores: Likelihood to vote, respond to door knock, switch parties, sign petition
- Contact Freshness: Multi-account confirmation scoring
- Enrichment Passthrough: Wholesale enrichment resold at field-level margin
Compliance: ToS and privacy policy must state that anonymized/aggregated data may be used commercially. Individual account data never shared with other customers in identifiable form.
5. The Field System
Balances data cleanliness (valuable at scale) with user flexibility (keeps data flowing).
5.1 Global Fields (Politogy-Controlled)
Canonical data fields defined by Politogy. Users cannot rename, delete, or redefine them.
- Locked canonical name (e.g., voter_first_name), display label, data type, validation rules
- Display label adjustable by Politogy; canonical name never changes
- Appear in every customer account automatically
- Users CAN write to global fields (subject to priority rules). Cannot change definition.
5.2 User Custom Fields
Users create their own fields for campaign-specific needs. Essential for flexibility.
- Scoped to the creating account. Other accounts can’t see them.
- User defines name, data type (text, number, date, yes/no, dropdown, multi-select), optional description
- Stored in separate namespace from global fields, tagged as user-generated with full provenance
- Visible to Politogy even though scoped to one account. Politogy sees all custom fields across all accounts.
5.3 Field Promotion: Custom → Global
The feedback loop that makes VRM’s schema smarter over time. Politogy monitors custom field usage across all accounts. When patterns emerge (many accounts creating the same field), Politogy promotes it to global.
Promotion Process:
- Detection: Dashboard shows custom fields across accounts, grouped by similarity. System flags candidates at threshold (e.g., 10+ accounts with similar name/type).
- Review: Politogy admin defines canonical name, display label, data type, validation.
- Migration: Existing custom field data maps into new global field. Custom fields retired and replaced.
- Rollout: New global field appears in ALL accounts going forward.
Example: 35 accounts create “preferred name” / “nickname” / “goes by” → Politogy promotes to voter_preferred_name globally.
Field promotion is one-way. Once global, always global. Data subject to same exposure controls as any other global field.
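The detection step could start with something as simple as the sketch below. It groups custom fields by a normalized name plus data type and flags groups that reach the example threshold of 10 accounts. The `normalize` rule is an assumption, and it only catches spelling variants; grouping "nickname" with "goes by" would need the semantic clustering described in section 11.2.

```python
import re
from collections import defaultdict

def normalize(name: str) -> str:
    """Hypothetical normalization: lowercase, non-alphanumerics to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

def promotion_candidates(custom_fields, threshold=10):
    """Return (normalized_name, data_type) groups used by >= threshold accounts.

    `custom_fields` is an iterable of (account_id, field_name, data_type).
    """
    accounts = defaultdict(set)
    for account_id, name, data_type in custom_fields:
        accounts[(normalize(name), data_type)].add(account_id)
    return sorted(key for key, accts in accounts.items() if len(accts) >= threshold)

fields = [
    (f"acct-{i}", "Preferred Name" if i % 2 else "preferred_name", "text")
    for i in range(10)
] + [(f"acct-{i}", "Nickname", "text") for i in range(3)]

candidates = promotion_candidates(fields)
```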
5.4 Why This Matters
Without this system: “First Name,” “Name,” “First,” “first_name,” “FName” — ungovernable and unsellable. The solution: let users create freely, watch what they create, promote valuable patterns into canonical schema. Custom field namespace = sandbox. Global field schema = production. Promotion = quality gate.
6. Cross-Mode Data Flow
Three modes — Relationship, Campaign, Petition — are different lenses on the SAME UVP. Actions in one mode are immediately visible in all others.
NON-NEGOTIABLE: There is ONE voter profile. Modes are views, not copies. If code duplicates voter data between modes, the architecture is wrong. Every mode reads from and writes to the same UVP.
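To make "views, not copies" concrete, here is a minimal sketch in which each mode holds a reference to the same profile object. The `UVP` and `ModeView` classes and the layer names are hypothetical.

```python
class UVP:
    """One voter, one profile. Modes never get their own copy."""
    def __init__(self, voter_id: str):
        self.voter_id = voter_id
        self.layers = {"relationship": {}, "campaign": {}, "petition": {}}

class ModeView:
    """A lens on a shared UVP that writes into one layer and reads all of them."""
    def __init__(self, uvp: UVP, layer: str):
        self._uvp = uvp          # a reference, not a copy
        self._layer = layer

    def write(self, key: str, value) -> None:
        self._uvp.layers[self._layer][key] = value

    def read(self, layer: str, key: str):
        return self._uvp.layers[layer].get(key)

voter = UVP("R-001")
relationship = ModeView(voter, "relationship")
campaign = ModeView(voter, "campaign")

campaign.write("last_canvass_outcome", "supportive")
seen_in_relationship_mode = relationship.read("campaign", "last_canvass_outcome")
```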
7. Search, Filter & Segmentation
7.1 Universal Search
Accessible from any screen. Searches all visible UVP fields. Sub-second response against full database. Handles fuzzy matching. Results respect access scope.
7.2 Filtering & List Building
Filterable dimensions: Geographic, Registration, Vote History, Relationship, Campaign, Petition, Custom Fields. Filters stack (AND default, OR available). Saved filters = living segments with dynamic membership.
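Stacked filters and living segments might look like this in outline. The `all_of`/`any_of` combinators, the `Segment` class, and the sample voter fields are illustrative assumptions. The key idea is that a saved segment stores the filter, not the member list.

```python
from typing import Callable

Predicate = Callable[[dict], bool]

def all_of(*preds: Predicate) -> Predicate:
    """AND: the default way filters stack."""
    return lambda voter: all(p(voter) for p in preds)

def any_of(*preds: Predicate) -> Predicate:
    """OR: available when needed."""
    return lambda voter: any(p(voter) for p in preds)

class Segment:
    """A saved filter. Membership is recomputed on every read."""
    def __init__(self, name: str, predicate: Predicate):
        self.name = name
        self.predicate = predicate

    def members(self, voters: list) -> list:
        return [v["id"] for v in voters if self.predicate(v)]

voters = [
    {"id": 1, "precinct": "North", "party": "DEM", "voted_last": True},
    {"id": 2, "precinct": "North", "party": "REP", "voted_last": False},
    {"id": 3, "precinct": "South", "party": "DEM", "voted_last": True},
]

in_north = lambda v: v["precinct"] == "North"
is_dem = lambda v: v["party"] == "DEM"
voted_last = lambda v: v["voted_last"]

segment = Segment("North: DEM or voted last", all_of(in_north, any_of(is_dem, voted_last)))
before = segment.members(voters)

# A new matching voter joins the segment without anyone editing it.
voters.append({"id": 4, "precinct": "North", "party": "REP", "voted_last": True})
after = segment.members(voters)
```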
8. Data Quality & Hygiene
Automated Signals
- Stale record detection (12+ months without voter file update)
- USPS address validation
- Duplicate detection with user-assisted merge
- Status change alerts (party switches, active→inactive)
- Cross-account consistency checks (conflicting user data across accounts)
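As one example of these signals, stale record detection reduces to a date comparison. The sketch below counts whole calendar months; how the 12-month threshold is measured in practice is an assumption.

```python
from datetime import date

def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1   # the final month is not complete yet
    return months

def is_stale(last_voter_file_update: date, today: date, threshold_months: int = 12) -> bool:
    return months_between(last_voter_file_update, today) >= threshold_months

stale = is_stale(date(2023, 6, 15), date(2024, 6, 15))
fresh = is_stale(date(2023, 6, 15), date(2024, 6, 14))
```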
Data Health Dashboard
Two versions: Customer-tier (simplified — totals, completeness, last refresh). Politogy-tier (full — cross-account conflicts, enrichment coverage, field promotion candidates, quality scores, audit logs).
9. Data Lifecycle & Retention
Record States
- Active: Current registration. Default working state.
- Inactive: Registration lapsed/suspended. Retained, excluded from default lists.
- Deceased: Flagged via voter file or enrichment. Archived, never deleted.
- Moved Out of Jurisdiction: Address change outside scope. Treated like Inactive.
- Merged: Duplicate consolidated. Pointer to surviving record.
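Because merged records are kept and point at their survivor, lookups need to follow those pointers. A minimal sketch, assuming a `state` field and a hypothetical `merged_into` pointer on each record:

```python
def resolve_surviving(record_id: str, records: dict) -> str:
    """Follow merge pointers until a record that is not Merged is reached."""
    seen = set()
    while records[record_id]["state"] == "merged":
        if record_id in seen:
            raise ValueError(f"merge pointer cycle at {record_id!r}")
        seen.add(record_id)
        record_id = records[record_id]["merged_into"]
    return record_id

records = {
    "A": {"state": "merged", "merged_into": "B"},
    "B": {"state": "merged", "merged_into": "C"},
    "C": {"state": "active"},
}
survivor = resolve_surviving("A", records)
```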
Retention Rules
- No voter record is ever hard-deleted.
- User-generated data retained indefinitely, tied to UVP.
- Churned customer account data retained by Politogy. Continues feeding Aggregate Intelligence.
10. Import & Export
Customer-Tier Import
Supplemental CSVs, walk list results, donor lists, petition sheets. Match against existing UVPs. Unmatched records flagged, never auto-created.
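The "flag, never auto-create" rule can be sketched as below. Matching on `voter_reg_id` alone is an assumption made to keep the example short; the function and column names are hypothetical.

```python
import csv
import io

def import_rows(rows, uvps_by_reg_id: dict):
    """Split rows into matched and flagged. The UVP store is never added to."""
    matched, flagged = [], []
    for row in rows:
        if row.get("voter_reg_id") in uvps_by_reg_id:
            matched.append(row)
        else:
            flagged.append(row)   # held for review, no new UVP is created
    return matched, flagged

uvps_by_reg_id = {"R-001": {}, "R-002": {}}
upload = io.StringIO(
    "voter_reg_id,donation\n"
    "R-001,50\n"
    "R-404,25\n"
)
matched, flagged = import_rows(csv.DictReader(upload), uvps_by_reg_id)
```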
Customer-Tier Export
Filtered segments to CSV/PDF. Respects role + geographic scope. Logged. Contains ONLY customer-visible data.
Politogy-Tier Extraction
Full UVP exports, cross-account aggregates, enrichment audits, field usage analytics. Basis for data sales.
11. AI & Intelligent Data Operations
AI transforms raw data into the Aggregate Intelligence layer. The data architecture in this spec is designed to feed AI from day one.
Principle: Build the tracks, not the train. The AI models are the train. The data architecture is the tracks. Clean normalized data, full provenance, cross-account visibility, a dedicated computed-data layer, and an event stream = tracks any AI model can run on.
11.1 AI-Ready Architecture Requirements
The following properties make the data AI-ready:
- Clean normalized data: Global field system ensures consistent structure across all accounts.
- Full provenance: Every data point carries source, timestamp, confidence. AI weights inputs by reliability.
- Cross-account visibility: Politogy trains models on ALL accounts simultaneously. This is the data moat.
- Aggregate Intelligence container: Schema-flexible layer accepting key-value pairs with full provenance metadata. Same exposure controls as any other layer.
- Exposure-controlled output: AI-generated fields go through the same Data Exposure Control System. Monetization and AI infrastructure are the same system.
11.2 Planned AI Capabilities (Priority Order)
- Custom Field Normalization & Promotion Detection: AI clusters semantically similar custom fields across accounts. “2A stance” = “gun rights position”? Surfaces promotion candidates with confidence scores.
- Sentiment Extraction from Free Text: Extracts structured sentiment from canvass notes. “Very concerned about property taxes, will vote for anyone who cuts them” → issue: property taxes, position: wants cuts, intensity: strong. Feeds Aggregate Intelligence.
- Data Quality & Anomaly Detection: Likely duplicates (Robert vs. Bob), phone numbers on too many records, addresses geocoding to commercial buildings, user-vs-voter-file divergence patterns.
- Voter Propensity Scoring: Likelihood to vote, respond to door knock, switch parties, sign petition, estimated persuadability by issue. The premium sellable data product.
- Contact Freshness Scoring: Multi-account interaction history scores contact field accuracy. Recently-confirmed phone > three-year-old voter file phone.
- Predictive Segmentation: AI identifies natural voter clusters and suggests segments users haven’t built. Drives engagement, generates more data.
11.3 How AI Writes to the UVP
AI-generated values stored in Aggregate Intelligence layer with:
- Source: Specific model + version (e.g., sentiment-extractor-v2.1)
- Inputs: Which data points the model consumed, with lineage
- Confidence: Model confidence score
- Timestamp: When last computed (staleness clock)
- Refresh cadence: Defined recomputation schedule per capability
AI values follow same priority rules — never overwrite user-entered or voter file data. Subject to same exposure controls.
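A possible record shape for an AI-generated value, offered as a sketch. `ComputedValue`, its attribute names, the input reference format, and the seven-day cadence are assumptions; only the list of metadata comes from this section.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class ComputedValue:
    field: str
    value: object
    model: str                 # model name + version
    inputs: tuple              # references to the data points consumed
    confidence: float
    computed_at: datetime      # starts the staleness clock
    refresh_every: timedelta   # recomputation cadence for this capability

    def is_due(self, now: datetime) -> bool:
        """True once the value is older than its refresh cadence."""
        return now - self.computed_at >= self.refresh_every

sentiment = ComputedValue(
    field="issue_sentiment.property_taxes",
    value="wants cuts",
    model="sentiment-extractor-v2.1",
    inputs=("note:8841",),
    confidence=0.87,
    computed_at=datetime(2024, 6, 1),
    refresh_every=timedelta(days=7),
)
due_early = sentiment.is_due(datetime(2024, 6, 5))
due_later = sentiment.is_due(datetime(2024, 6, 8))
```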
11.4 Infrastructure to Build Now (Pre-AI)
These must be in the foundational architecture from day one:
- Aggregate Intelligence layer on the UVP: Schema-flexible, key-value with provenance, same exposure controls.
- Computed field data type: Read-only for all users (including Politogy admins). Only writable by system processes.
- Event stream / change log: Every UVP write emits an event for downstream AI consumers. Enables real-time model updates without full database scans.
- Custom field metadata API: Surfaces all custom fields across all accounts with usage counts, creation dates, sample values. Input for field normalization AI.
- Model output versioning: When models update, all affected values recompute. Old values archived, not deleted. Track which model version generated which values.
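The event stream requirement amounts to making every write path emit a change event. A minimal in-memory sketch follows; `UVPStore`, the event fields, and the callback-based subscription are hypothetical stand-ins for whatever queue or log is eventually chosen.

```python
from datetime import datetime, timezone

class UVPStore:
    """Toy store in which every write appends an event and notifies consumers."""
    def __init__(self):
        self._values = {}
        self.events = []          # append-only change log
        self._subscribers = []

    def subscribe(self, callback) -> None:
        self._subscribers.append(callback)

    def write(self, voter_id: str, field: str, value, source: str) -> dict:
        key = (voter_id, field)
        event = {
            "voter_id": voter_id,
            "field": field,
            "old": self._values.get(key),
            "new": value,
            "source": source,
            "at": datetime.now(timezone.utc),
        }
        self._values[key] = value
        self.events.append(event)
        for callback in self._subscribers:
            callback(event)       # downstream consumers react without a full scan
        return event

store = UVPStore()
received = []
store.subscribe(received.append)
store.write("R-001", "phone", "555-0100", source="voter_file")
store.write("R-001", "phone", "555-0142", source="user:u17@acct-3")
```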
12. Permissions Summary
| Role | Tier | Can View | Can Edit | Can Import/Export |
|---|---|---|---|---|
| Politogy Super Admin | Politogy | Everything across all accounts | All data, controls, exposure, schema | Full extraction + import |
| Politogy Analyst | Politogy | All data, read-only | Nothing | Extraction only |
| Account Admin | Customer | All exposed data within account | User-generated fields, custom fields, team perms | Import + export within scope |
| Account Manager | Customer | Exposed data within assigned geography | Relationship + Campaign + Petition data | Export within scope |
| Field User | Customer | Voters on assigned lists only | Contact logs, surveys, canvass results | No |
| Viewer | Customer | Read-only within assigned geography | Nothing | No |
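The import/export column of the table can be expressed as a lookup, sketched below. The role keys and the `can` helper are hypothetical; scope checks ("within scope", "within assigned geography") are deliberately left out.

```python
# (tier, can_import, can_export) per role, transcribed from the table above.
ROLES = {
    "politogy_super_admin": ("politogy", True, True),
    "politogy_analyst": ("politogy", False, True),    # extraction only
    "account_admin": ("customer", True, True),
    "account_manager": ("customer", False, True),
    "field_user": ("customer", False, False),
    "viewer": ("customer", False, False),
}

def can(role: str, action: str) -> bool:
    """Check an import or export permission. Unknown roles get nothing."""
    if role not in ROLES:
        return False
    _tier, can_import, can_export = ROLES[role]
    if action == "import":
        return can_import
    if action == "export":
        return can_export
    raise ValueError(f"unknown action {action!r}")

manager_can_export = can("account_manager", "export")
field_user_can_export = can("field_user", "export")
```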
13. Summary Mental Model
VRM is a data company that distributes campaign software. State voter files create the foundation — millions of UVPs owned by Politogy. Each profile is enriched with third-party data. Customers interact through three modes, generating the highest-value data: real human interactions with real voters. User data always takes priority (freshest intel). Politogy watches everything — aggregating sentiment, detecting patterns, scoring propensities, promoting custom fields into canonical schema. AI transforms raw intelligence into structured, scored, sellable data products no single campaign could build alone.
The data exposure system controls what customers see. Basic accounts get essentials. Premium accounts get richer data. National campaigns pay top dollar for Aggregate Intelligence. Every piece of data, from every source, from every account, flows into the UVP — the single source of truth that grows more valuable with every interaction.
One platform. One login. Total control. And underneath it all, the most comprehensive voter data asset in the country.