United States · Enterprise SaaS · Case Study

US Enterprise SaaS Platform: Migrating 8.4M Subscribers From Shared ESP to Dedicated Infrastructure

Q3 2025 · Cloud Server for Email Infrastructure
8.4M subscribers migrated · 41%→94% Gmail inbox placement · −71% per-message infrastructure cost · 18 dedicated sending IPs

Shared ESP at scale: the economics and deliverability break simultaneously

A San Francisco-based enterprise SaaS company with 8.4 million email subscribers had been on a major shared ESP platform for six years. At their volume — averaging 22 million messages per month — the commercial ESP cost had become the third-largest line item in their engineering budget. More critically, Gmail inbox placement had deteriorated to 41% over the preceding 14 months, driven by co-tenant spam activity on shared IP pools that the ESP could not isolate on their shared-infrastructure tier.

The finance team flagged the cost. The deliverability team flagged the inbox rates. The decision to migrate was made in the same quarter.

18-IP pool structure across US and EU datacenters

Given the volume and geographic distribution of recipients (62% US, 31% EU, 7% other), the dedicated infrastructure was deployed across two datacenter regions with ISP-specific routing:

# US datacenter — primary routing
virtual-mta-pool gmail-us {
    virtual-mta us-ip-1 through us-ip-6      # 6 IPs
}
virtual-mta-pool microsoft-us {
    virtual-mta us-ip-7 through us-ip-10     # 4 IPs
}
virtual-mta-pool yahoo-aol-us {
    virtual-mta us-ip-11 through us-ip-12    # 2 IPs
}

# EU datacenter — European ISP routing
virtual-mta-pool eu-isps {
    virtual-mta eu-ip-1 through eu-ip-4      # 4 IPs (GMX, T-Online, Orange, others)
}

# Transactional — protected pool, minimal volume
virtual-mta-pool transactional {
    virtual-mta trans-ip-1 through trans-ip-2    # 2 IPs
}

16-week migration to full dedicated operation

The migration used a percentage-of-traffic routing approach: weeks 1–4 ran 5% of traffic on the new infrastructure, then the share increased by roughly 10 percentage points per week until week 14, when all traffic shifted to dedicated. The old ESP was maintained in parallel and cancelled at week 16, after stable delivery metrics were confirmed.

Gmail inbox placement recovered from 41% to 94% over the migration period as new dedicated IPs built reputation through the engagement-led warming protocol. Monthly infrastructure cost at the conclusion was 29% of the original shared ESP contract — a 71% reduction on a per-message basis.

Inbox Placement by Major ISP

Before migration vs 90 days after
[Chart: before/after inbox placement for Gmail, Outlook, Yahoo, Apple Mail, and other ISPs]
Scale and Cost Findings

At 22 million messages per month, the economics of dedicated infrastructure become compelling independently of the deliverability argument. The monthly fixed cost of a fully managed dedicated environment — including 18 IPs, two datacenter regions, and full operational management — was substantially lower than the per-message pricing of any commercial ESP at that volume. The deliverability improvement was, in this case, an additional benefit rather than the primary driver.

Technical Assessment: Infrastructure Layers Examined

The infrastructure assessment for this engagement covered four layers: authentication configuration (SPF, DKIM, DMARC alignment), IP reputation status (Postmaster Tools, SNDS, blacklist check), PowerMTA configuration review (domain blocks, throttle settings, bounce handling), and operational practices (list hygiene frequency, bounce processing latency, FBL enrollment and processing status).

Authentication issues were the highest-priority finding. The DKIM key was 1024-bit (below current ISP recommendations of 2048-bit minimum), and DMARC was at p=none with no aggregate reports being collected or reviewed. The combination of outdated authentication and no visibility into sending path failures created an environment where reputation signals were degrading without detection.
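Closing the visibility gap is a one-line DNS change: adding an rua tag to the DMARC record directs ISPs to send aggregate reports. The record below is a generic sketch; the domain and report mailbox are placeholders, not values from this engagement:

```dns
; Illustrative DMARC record with aggregate reporting enabled.
; Domain and report mailbox are placeholders.
_dmarc.example.com. IN TXT "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com; fo=1"
```

With reports flowing, the p=none phase becomes a data-collection period rather than a blind spot, which is what makes the later move to quarantine and reject safe.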

Infrastructure Rebuild: Configuration Decisions

IP Pool Architecture

The IP pool was rebuilt with traffic type separation as the primary design principle. Transactional traffic (time-sensitive notifications, account events) was assigned a dedicated pool that was never shared with campaign traffic. This separation ensured that campaign performance issues — elevated deferral rates during high-volume sends — could not create queue delays affecting transactional delivery.

Pool          | Traffic Type                | IPs       | max-smtp-out | Protection Level
trans-pool    | Transactional notifications | 2         | 10 per IP    | Highest — never paused or degraded
campaign-pool | Marketing campaigns         | 3–4       | 8 per IP     | Standard — subject to reputation management
warming-pool  | New IP warming              | As needed | 2–3 per IP   | Conservative — warming schedule only

PowerMTA Domain Block Configuration

ISP-specific domain blocks were configured for each major destination: Gmail (max-smtp-out: 8, retry-after: 15m), Outlook (max-smtp-out: 5, retry-after: 20m), Yahoo (max-smtp-out: 6, retry-after: 15m), and ISP-specific configurations for European providers including GMX, Web.de, T-Online, and OVH. Each block included mx-rollup directives to prevent connection count multiplication across MX host variants.
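Using the throttle values cited above, the domain blocks would look roughly like the following. This is a sketch in PowerMTA-style syntax; exact directive spellings and the mx-rollup mechanics should be verified against the deployed PowerMTA version:

```
# Sketch of the ISP-specific domain blocks described in the text.
# Throttle values are the ones cited above; syntax follows PowerMTA conventions.
<domain gmail.com>
    max-smtp-out 8      # concurrent outbound connections
    retry-after 15m     # wait before retrying a deferred queue
</domain>

<domain outlook.com>
    max-smtp-out 5
    retry-after 20m
</domain>

<domain yahoo.com>
    max-smtp-out 6
    retry-after 15m
</domain>

# MX rollup (grouping domains that share MX infrastructure into one queue,
# so connection limits are not multiplied across MX host variants) is
# configured separately and omitted from this sketch.
```

The per-ISP split matters because each receiver enforces its own connection and rate policies; a single global throttle would either underuse Gmail's tolerance or overrun Outlook's.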

The smtp-pattern-list configuration was extended with custom patterns for ISP-specific diagnostic messages that were not being correctly classified by the default PowerMTA pattern library. These custom patterns ensured that permanent failures (invalid addresses, domain-level blocks) were bounced immediately rather than retried, and that greylisting responses from European ISPs were handled with appropriate retry intervals.
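A custom pattern list of the kind described might look like the sketch below. The regexes are illustrative, not the actual ISP response strings from the engagement, and the mode keywords should be checked against the PowerMTA reference for the installed version:

```
# Illustrative reply-pattern classification — patterns are placeholders.
<smtp-pattern-list custom-isp-patterns>
    # Permanent failures: bounce immediately instead of retrying
    reply /550 5\.1\.1 .*(user unknown|no such user)/ mode=fail
    # Greylisting responses from European ISPs: treat as temporary, retry later
    reply /greylist(ed|ing)/ mode=tempfail
</smtp-pattern-list>
```

Correct classification here protects both sides of the ledger: permanent failures leave the queue (and the list) quickly, while greylisted messages are not mistakenly bounced.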

Authentication Upgrade

DKIM keys were rotated to 2048-bit RSA on all sending domains. The rotation followed a zero-downtime procedure: publish the new public key under a new selector, wait 48 hours for DNS propagation, update the PowerMTA signing configuration, verify the new selector appears in Authentication-Results headers, then retire the old selector after 7 days. DMARC was progressed from p=none through p=quarantine to p=reject over a 12-week period.
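The signing-side step of that rotation is a small change in the PowerMTA configuration. The selector names, domain, and key paths below are placeholders:

```
# After the new selector's public key has propagated in DNS, switch signing
# to the new key. Selector names, domain, and key paths are placeholders.
# domain-key s2023,example.com,/etc/pmta/keys/example.com.s2023.pem    # old 1024-bit key, retired after 7 days
domain-key s2025a,example.com,/etc/pmta/keys/example.com.s2025a.pem   # new 2048-bit key
```

Keeping the old selector published in DNS during the overlap window means mail signed before the switch still verifies, which is what makes the procedure zero-downtime.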

Metric                | Before | After | Scope
Gmail Inbox Placement | 62%    | 93%   | Seed test improvement
Deferral Rate         | 14%    | 2.8%  | All major ISPs
Hard Bounce Rate      | 3.2%   | 0.7%  | Gmail
DMARC Alignment       | 88%    | 99.6% | All domains

Operational Monitoring: What Changed Permanently

The infrastructure changes produced immediate delivery improvement, but the operational changes — the monitoring discipline and response protocols — are what sustain that improvement over time. Daily Postmaster Tools review and SNDS checks are now part of the infrastructure team's operational routine. FBL reports are processed in real time and feed directly into the suppression system.

The monthly configuration review cycle catches ISP behavior changes before they accumulate into delivery incidents. When Gmail adjusted its bulk sender requirements in 2024, the infrastructure was already operating at the authentication standard required — because the review cycle had identified and addressed the relevant requirements months before the enforcement deadline.

The technical changes in this engagement were straightforward. The more significant work was establishing the monitoring discipline that prevents the gradual drift that caused the original problems — an infrastructure that meets today's ISP requirements but has no ongoing review process will fall behind those requirements within 12-18 months.

— Cloud Server for Email Infrastructure Team

Long-Term Infrastructure Management and Lessons

The infrastructure improvements achieved in this engagement represent a point-in-time improvement, not a permanent outcome. Email deliverability is an ongoing operational discipline — ISP filtering systems evolve, list composition changes with growth, and the configuration settings that are optimal today may need adjustment in six months. The monitoring and review processes established during this engagement are what sustain the improved performance over time.

Key ongoing practices established: daily Postmaster Tools and SNDS review integrated into the operations team's monitoring dashboard, real-time FBL complaint processing feeding directly into the suppression system, quarterly DKIM key rotation cadence, and monthly ISP-specific configuration review against current best practices. These practices take less time than a single delivery incident response — and they prevent the incidents.

The Compounding Effect of Clean Infrastructure

One of the less-visible benefits of well-managed dedicated infrastructure is that it compounds over time. ISP reputation systems give weight to consistent historical behavior — a sender with 18 months of clean sending history recovers from a single incident faster than a sender with inconsistent history. The reputation capital built over time becomes a form of infrastructure resilience that is not visible in day-to-day metrics but matters significantly during incidents.

Transferable Principles From This Engagement

  • Traffic type isolation (transactional vs marketing vs cold) should be implemented before volume grows to the point where reputation events in one stream affect others — not after
  • Authentication upgrades (DKIM key rotation, DMARC enforcement progression) have near-zero operational risk when sequenced correctly — but generate significant risk when rushed
  • Bounce processing latency is the most-overlooked list hygiene factor — every hour of delay between a hard bounce and suppression is another potential send to an invalid or trap address
  • ISP-specific throttle configuration must be calibrated to your current reputation tier, not to a target tier — over-ambitious settings at low reputation delay recovery rather than accelerating it
Similar challenges in your infrastructure?

The infrastructure patterns in this case study recur across different sender types and volumes. A technical assessment identifies which apply to your environment and what the remediation sequence looks like for your specific configuration.

Contact the technical team to discuss your specific situation. We assess each environment individually before recommending an architecture.