A Dutch B2B SaaS platform serving 400,000 users across Europe relied on email for critical user-facing operations: two-factor authentication codes, password resets, billing alerts, and trial expiration warnings. These transactional messages had hard delivery requirements — a 2FA code that arrives 8 minutes after the request is useless.
The platform was sending approximately 2 million transactional messages and 8 million marketing emails per month through a single Postfix-based SMTP relay configured without traffic separation. The problem manifested quarterly when large marketing campaigns were sent: transactional delivery times spiked from a normal average of 45 seconds to 12–18 minutes, and average trial conversion during these windows dropped measurably.
Presenting Problems
- 2FA codes averaging 12–18 minutes delivery during marketing campaign windows (SLA requirement: under 2 minutes)
- Password reset emails delayed by 6–11 minutes — support ticket volume spiked 340% after each campaign send
- Transactional and marketing email sharing the same Postfix queue with no priority differentiation
- No per-ISP throttling — Postfix attempted the maximum number of simultaneous connections to every destination, triggering ISP rate limiting that then affected transactional messages
- Marketing campaign reputation events (complaint spikes) occasionally blocking the IP for 2–4 hours across all traffic types
We instrumented the existing Postfix queue with custom logging to measure per-message delivery latency during a live campaign send. The data showed the problem clearly: the Postfix queue was processing messages in FIFO order. When a 500,000-message campaign entered the queue simultaneously with 2FA requests, 2FA messages waited behind 12,000–40,000 campaign messages for queue processing.
The secondary problem was ISP throttling: sending 500,000 messages to Gmail in a burst caused Gmail to apply connection rate limits that then throttled the entire IP for 3–5 hours, affecting transactional delivery even after the campaign queue cleared.
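A minimal sketch of the per-message latency instrumentation described above, pairing Postfix queue-manager activation lines with final delivery lines by queue ID. The log sample, hostnames, and regex are illustrative, not the production tooling:

```python
import re
from datetime import datetime

# Hypothetical /var/log/mail.log excerpt; real logs carry the same
# queue-ID correlation between queue activation and final delivery.
LOG = """\
Mar  4 10:00:01 mx1 postfix/qmgr[911]: 4F2A1: from=<2fa@example.com>, size=1204, nrcpt=1 (queue active)
Mar  4 10:00:09 mx1 postfix/smtp[914]: 4F2A1: to=<user@gmail.com>, status=sent (250 2.0.0 OK)
Mar  4 10:01:00 mx1 postfix/qmgr[911]: 7B3C2: from=<2fa@example.com>, size=1188, nrcpt=1 (queue active)
Mar  4 10:13:40 mx1 postfix/smtp[915]: 7B3C2: to=<user@web.de>, status=sent (250 OK)
"""

TS_FORMAT = "%b %d %H:%M:%S"

def queue_latencies(log: str) -> dict[str, float]:
    """Seconds between queue activation and 'status=sent', keyed by queue ID."""
    entered: dict[str, datetime] = {}
    latencies: dict[str, float] = {}
    for line in log.splitlines():
        m = re.match(r"(\w+ +\d+ [\d:]+) \S+ postfix/(\w+)\[\d+\]: (\w+):", line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), TS_FORMAT)
        qid = m.group(3)
        if m.group(2) == "qmgr" and "queue active" in line:
            entered.setdefault(qid, ts)
        elif m.group(2) == "smtp" and "status=sent" in line and qid in entered:
            latencies[qid] = (ts - entered[qid]).total_seconds()
    return latencies

print(queue_latencies(LOG))  # {'4F2A1': 8.0, '7B3C2': 760.0}
```

Aggregating these per-queue-ID latencies during a campaign window versus a quiet window is what produced the 45-second versus 12–18-minute comparison.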
[Chart: Transactional Delivery Time — Campaign vs Non-Campaign Windows]
The solution required complete traffic class separation at the infrastructure level — not just priority queuing within a shared system. A shared queue with priorities still produces contention; isolated infrastructure eliminates it entirely.
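The FIFO backlog numbers above can be sanity-checked with back-of-envelope arithmetic. The aggregate delivery throughput used here is an assumption for illustration, not a measured value:

```python
# Hypothetical aggregate delivery rate across the shared IP (messages/second).
THROUGHPUT_MSG_PER_SEC = 40

def fifo_wait_seconds(msgs_ahead: int, rate: float = THROUGHPUT_MSG_PER_SEC) -> float:
    """Seconds a newly queued transactional message waits behind campaign backlog."""
    return msgs_ahead / rate

print(fifo_wait_seconds(12_000))  # 300.0 s, already past the 2-minute SLA
print(fifo_wait_seconds(40_000))  # 1000.0 s, roughly 17 minutes
```

At the assumed rate, a 2FA message behind 12,000–40,000 campaign messages waits 5–17 minutes, consistent with the observed 12–18 minute delays; priority queuing shrinks but does not eliminate this contention, which is why the design moved to separate infrastructure.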
[Metrics panel: transactional delivery time (seconds) over 12 weeks, down from 780s during campaign windows to within the 2-minute SLA; reductions in email delivery delays and improved transactional delivery]
> "We'd been treating delivery delay as a capacity problem — just needed a bigger server. The infrastructure team showed us it was an architecture problem. The fix wasn't more resources, it was separate lanes. The support ticket reduction paid for the infrastructure change in the first month."
Technical Assessment: Infrastructure Layers Examined
The infrastructure assessment for this engagement covered four layers: authentication configuration (SPF, DKIM, DMARC alignment), IP reputation status (Postmaster Tools, SNDS, blacklist check), PowerMTA configuration review (domain blocks, throttle settings, bounce handling), and operational practices (list hygiene frequency, bounce processing latency, FBL enrollment and processing status).
Authentication issues were the highest-priority finding. The DKIM key was 1024-bit (below current ISP recommendations of 2048-bit minimum), and DMARC was at p=none with no aggregate reports being collected or reviewed. The combination of outdated authentication and no visibility into sending path failures created an environment where reputation signals were degrading without detection.
Infrastructure Rebuild: Configuration Decisions
IP Pool Architecture
The IP pool was rebuilt with traffic type separation as the primary design principle. Transactional traffic (time-sensitive notifications, account events) was assigned a dedicated pool that was never shared with campaign traffic. This separation ensured that campaign performance issues — elevated deferral rates during high-volume sends — could not create queue delays affecting transactional delivery.
| Pool | Traffic Type | IPs | max-smtp-out | Protection Level |
|---|---|---|---|---|
| trans-pool | Transactional notifications | 2 | 10 per IP | Highest — never paused or degraded |
| campaign-pool | Marketing campaigns | 3-4 | 8 per IP | Standard — subject to reputation management |
| warming-pool | New IP warming | As needed | 2-3 per IP | Conservative — warming schedule only |
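The pool layout in the table can be expressed in PowerMTA configuration along these lines. The IP addresses, hostnames, and VirtualMTA names are illustrative, and nesting a per-VMTA `<domain *>` block for the connection cap is shown as one common approach; exact directive support varies by PowerMTA version:

```
# Illustrative PowerMTA pool layout; IPs, hostnames, and names are
# examples matching the table, not the engagement's actual config
<virtual-mta trans-1>
    smtp-source-host 192.0.2.10 mta1.example.com
    <domain *>
        max-smtp-out 10
    </domain>
</virtual-mta>

<virtual-mta trans-2>
    smtp-source-host 192.0.2.11 mta2.example.com
    <domain *>
        max-smtp-out 10
    </domain>
</virtual-mta>

<virtual-mta-pool trans-pool>
    virtual-mta trans-1
    virtual-mta trans-2
</virtual-mta-pool>
```

The campaign-pool and warming-pool are declared the same way with their own IPs and their lower per-IP caps; injected mail is then routed to a pool by traffic class, so campaign backlog physically cannot sit in front of a transactional message.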
PowerMTA Domain Block Configuration
ISP-specific domain blocks were configured for each major destination: Gmail (max-smtp-out: 8, retry-after: 15m), Outlook (max-smtp-out: 5, retry-after: 20m), Yahoo (max-smtp-out: 6, retry-after: 15m), and ISP-specific configurations for European providers including GMX, Web.de, T-Online, and OVH. Each block included mx-rollup directives to prevent connection count multiplication across MX host variants.
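Two of the domain blocks described above look roughly like this; `max-smtp-out` and `retry-after` values follow the text, while the MX-rollup syntax is omitted here because it is version-dependent:

```
# Illustrative domain blocks; values per the text above
<domain gmail.com>
    max-smtp-out 8
    retry-after 15m
</domain>

<domain outlook.com>
    max-smtp-out 5
    retry-after 20m
</domain>
```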
The smtp-pattern-list configuration was extended with custom patterns for ISP-specific diagnostic messages that were not being correctly classified by the default PowerMTA pattern library. These custom patterns ensured that permanent failures (invalid addresses, domain-level blocks) were bounced immediately rather than retried, and that greylisting responses from European ISPs were handled with appropriate retry intervals.
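A sketch of the pattern-list approach, with example regexes standing in for the actual deployed set (the diagnostic strings and list name here are hypothetical):

```
# Illustrative pattern list; regexes are examples, not the deployed set
<smtp-pattern-list euro-isps>
    reply /450 .*greylist/ mode=retry
    reply /550 .*(unknown|invalid) user/ mode=bounce
</smtp-pattern-list>

<domain web.de>
    smtp-pattern-list euro-isps
</domain>
```

Attaching the list to the relevant domain block makes the classification ISP-specific, so a 450 greylisting response is retried on the greylisting interval while a 550 invalid-address response is bounced immediately instead of consuming retry slots.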
Authentication Upgrade
DKIM keys were rotated to 2048-bit RSA on all sending domains. The rotation followed a zero-downtime procedure: publish the new public key under a new selector, wait 48 hours for DNS propagation, update the PowerMTA signing configuration, verify the new selector appears in Authentication-Results headers, then retire the old selector after 7 days. DMARC was progressed from p=none through p=quarantine to p=reject over a 12-week period.
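The DNS side of that rotation and the DMARC progression can be sketched as zone records like the following; the domain, selector name, rua address, and pct value are placeholders:

```
; Step 1: publish the new 2048-bit key under a fresh selector (names illustrative)
sel2._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=<2048-bit-public-key>"

; DMARC progression over the 12-week period (one record live at a time)
_dmarc.example.com. IN TXT "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"
_dmarc.example.com. IN TXT "v=DMARC1; p=quarantine; pct=50; rua=mailto:dmarc-reports@example.com"
_dmarc.example.com. IN TXT "v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com"
```

The rua address matters at every stage: the aggregate reports are what reveal legitimate sending paths that would fail under quarantine or reject before the policy is tightened.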
Results After 90 Days

[Metrics panel: seed test improvement across all major ISPs, Gmail, and all domains]
Operational Monitoring: What Changed Permanently
The infrastructure changes produced immediate delivery improvement, but the operational changes — the monitoring discipline and response protocols — are what sustain that improvement over time. Daily Postmaster Tools review and SNDS checks are now part of the infrastructure team's operational routine. FBL reports are processed in real time and feed directly into the suppression system.
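The FBL-to-suppression step can be sketched as follows: extracting the complained-about recipient from the machine-readable fields of an ARF (RFC 5965) feedback report. The sample report and field layout are illustrative; real ISP feeds vary:

```python
import re

# Illustrative ARF feedback-report fields; not an actual ISP feed.
SAMPLE_ARF = """\
Feedback-Type: abuse
User-Agent: SomeISP-FBL/1.0
Version: 1
Original-Mail-From: bounce@sender.example
Original-Rcpt-To: complainer@mailbox.example
Arrival-Date: Mon, 3 Mar 2025 10:14:02 +0100
"""

def fbl_complainants(report_fields: str) -> list[str]:
    """Extract every Original-Rcpt-To address from ARF feedback-report fields."""
    return re.findall(r"^Original-Rcpt-To:\s*(\S+)\s*$", report_fields, flags=re.M)

suppress = fbl_complainants(SAMPLE_ARF)
print(suppress)  # ['complainer@mailbox.example']
```

Each extracted address is written to the suppression list immediately, so a complainant is never mailed again regardless of which traffic class generated the complaint.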
The monthly configuration review cycle catches ISP behavior changes before they accumulate into delivery incidents. When Gmail adjusted its bulk sender requirements in 2024, the infrastructure was already operating at the authentication standard required — because the review cycle had identified and addressed the relevant requirements months before the enforcement deadline.
The technical changes in this engagement were straightforward. The more significant work was establishing the monitoring discipline that prevents the gradual drift that caused the original problems — an infrastructure that meets today's ISP requirements but has no ongoing review process will fall behind those requirements within 12-18 months.
— Cloud Server for Email Infrastructure Team

Transactional email delays during marketing sends?
This is an infrastructure architecture problem — not a sending volume problem. We design and operate separated sending environments that guarantee transactional SLAs regardless of marketing activity.

