Self-Hosted Payment Gateway (PCI DSS & GDPR)
Architected a highly secure, self-hosted payment orchestrator to maintain data sovereignty and multi-processor routing.
01.Problem Statement
Initially, our platform was entirely dependent on Stripe, which offered limited merchant support outside of Western markets. This prevented us from expanding into high-growth regions like Indonesia where Stripe's local acceptance was poor. To scale globally, we needed to move away from vendor-specific hosted fields and implement our own orchestration layer that could route to multiple local PSPs (like Xendit) while maintaining a single, secure source of truth for card data.
02.Architecture Overview
The solution was built around a centralized, self-hosted Card Vault that tokenizes cardholder data independently of any specific payment provider. This decoupling lets us 'vault once' and then dynamically route each transaction to the most effective local processor: Indonesian payments go through Xendit's rails, while Western transactions stay on Stripe. The customer never re-enters card data, and the platform's core databases never see a raw PAN.
[ Client Devices ] ---> (PCI Scope)
         |
         v
[ API Gateway (WAF) ]
         |
         v
[ Chi Backend Core ] --> [ Payment Orchestrator ]
         |                          |
         v                          v
[ Encryption Svc ]        [ Card Vault (Rust) ]
      (Rust)                        |
                                    v
                          [ Isolated Vault DB ]
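As a rough illustration of the 'vault once, route anywhere' idea, the sketch below picks a processor from the card's billing country and builds a charge request that carries only the vault token. The type names, the routing table, and the request shape are assumptions made for this example, not the orchestrator's actual API.

```typescript
// Hypothetical types and processor names, for illustration only.
type Processor = "stripe" | "xendit";

interface VaultedCard {
  token: string;       // opaque vault token, never the raw PAN
  countryCode: string; // ISO 3166-1 alpha-2 billing country
}

// Assumed routing table: countries that have a preferred local PSP.
const LOCAL_PSP_BY_COUNTRY: Partial<Record<string, Processor>> = {
  ID: "xendit",
};

const DEFAULT_PROCESSOR: Processor = "stripe";

// Pick the local PSP when one is configured, otherwise fall back to the default.
function selectProcessor(card: VaultedCard): Processor {
  return LOCAL_PSP_BY_COUNTRY[card.countryCode.toUpperCase()] ?? DEFAULT_PROCESSOR;
}

// The request references the card by token only, so the same vaulted card
// can be sent to any processor without the customer re-entering it.
function buildChargeRequest(card: VaultedCard, amountMinor: number, currency: string) {
  return {
    processor: selectProcessor(card),
    card_token: card.token,
    amount: amountMinor,
    currency,
  };
}
```

Because the processor is chosen at charge time rather than at vaulting time, adding another local PSP in this sketch only means adding an entry to the routing table.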
03.Database Design
The main operational database is agnostic to payment data. A separate, network-isolated PostgreSQL instance is reachable only from the Card Vault. PANs are encrypted at rest. Card fingerprints are produced with deterministic AES encryption, so the same card always yields the same fingerprint and duplicate card additions can be rejected with an equality lookup, without decrypting the payload.
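One way to get a deterministic fingerprint out of AES is to derive the IV from the card number itself with a keyed HMAC, so equal PANs always encrypt to the same bytes. The sketch below does this with Node's built-in crypto module. The fixed keys, the function names, and the synthetic-IV construction are assumptions for illustration; the vault described here is written in Rust and its actual scheme is not shown in this write-up.

```typescript
import { createCipheriv, createHmac } from "node:crypto";

// Hypothetical fixed keys for the example only; real keys would come from a KMS.
const ENC_KEY = Buffer.alloc(32, 1);
const IV_KEY = Buffer.alloc(32, 2);

// Strip spaces and dashes so formatting differences do not change the fingerprint.
function normalizePan(pan: string): string {
  return pan.replace(/[\s-]/g, "");
}

// Deterministic AES-256-GCM: the 96-bit IV is derived from the PAN with HMAC-SHA256,
// so the same PAN always produces the same output.
function cardFingerprint(pan: string): string {
  const normalized = normalizePan(pan);
  const iv = createHmac("sha256", IV_KEY).update(normalized).digest().subarray(0, 12);
  const cipher = createCipheriv("aes-256-gcm", ENC_KEY, iv);
  const ciphertext = Buffer.concat([cipher.update(normalized, "utf8"), cipher.final()]);
  return Buffer.concat([iv, ciphertext, cipher.getAuthTag()]).toString("hex");
}

// Stand-in for a unique index on the fingerprint column in the vault database.
const knownFingerprints = new Set<string>();

// Returns false when the card is already vaulted; no decryption is needed to tell.
function addCard(pan: string): boolean {
  const fingerprint = cardFingerprint(pan);
  if (knownFingerprints.has(fingerprint)) return false;
  knownFingerprints.add(fingerprint);
  return true;
}
```

The tradeoff of any deterministic scheme is that equality of cards is visible to anyone who can read the fingerprint column, which is part of why that column belongs in the isolated vault database rather than the operational one.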
04.Key Decisions & Tradeoffs
Decisions
- Opted to self-host the orchestration layer's Card Vault within an isolated VPC subnet with zero outbound internet access, minimizing the blast radius of any potential compromise.
- Integrated a Rust-built Encryption Service utilizing AES-256-GCM via GCP KMS keys to proactively encrypt PII payloads. I personally contributed to the Hyperswitch open-source ecosystem by implementing the GCP KMS encryption provider.
- Expanded the orchestrator's capability to support local Southeast Asian markets; I authored and upstreamed the Xendit payment processor integration to the core routing engine.
- Used a strictly decoupled multi-database pattern. The primary operational DB stores only opaque transaction references, while the secure Vault DB handles cryptographic PAN mappings.
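For readers unfamiliar with AES-256-GCM, the sketch below shows the encrypt/decrypt round trip that the second decision relies on, using Node's built-in crypto module. It is written in TypeScript for illustration only (the actual Encryption Service is written in Rust), and the key is passed in as a plain buffer as a stand-in for a key managed by GCP KMS. The function and field names are assumptions.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape: all fields are base64-encoded.
interface EncryptedPayload {
  iv: string;
  ciphertext: string;
  tag: string;
}

// Encrypts with a fresh random 96-bit IV per message, as GCM requires.
function encryptPii(plaintext: string, key: Buffer): EncryptedPayload {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    ciphertext: ciphertext.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
  };
}

// Decrypts and verifies the authentication tag; throws if the payload was tampered with.
function decryptPii(payload: EncryptedPayload, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(payload.iv, "base64"));
  decipher.setAuthTag(Buffer.from(payload.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(payload.ciphertext, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

GCM authenticates as well as encrypts, so a modified ciphertext fails loudly at decryption instead of yielding corrupted PII.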
Tradeoffs
- Shouldered the immense compliance burden of an in-house PCI DSS Level 1 audit instead of delegating entirely to Stripe Elements, forcing strict CI/CD and infrastructural auditing.
- Increased local development friction. Replicating the production environment requires developers to run 5 heavy services (Vault, Router, Encryption, multiple DBs) locally.
05.Scaling Considerations
The stateless encryption and routing tier scales horizontally on Kubernetes based solely on CPU metrics. The Card Vault relies on heavily tuned PgBouncer instances for multiplexed connection pooling to PostgreSQL to handle high concurrent token exchanges during sales spikes.
06.Failure Scenarios & Mitigation
- Vault DB Failure: Automatic failover to a synchronous Hot Standby replica within 5 seconds. Checkout API handlers employ exponential backoff if they receive 503s.
- Encryption Service Down: PII-sensitive endpoints immediately hard-fail, rejecting new payment methods to strictly avoid writing unencrypted data to temporary memory or logs.
- Upstream PSP Outage: The payment router utilizes volume-based circuit breakers. If Stripe errors consistently, traffic is instantly redistributed to Adyen or Braintree.
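The PSP failover above can be sketched as a breaker that trips only once enough recent traffic has been observed and the failure rate crosses a threshold. The class name, the window size, the thresholds, and the cooldown below are all assumed values for illustration; the production router's actual parameters are not stated here.

```typescript
// A minimal volume-based circuit breaker (all thresholds are assumed values).
class VolumeCircuitBreaker {
  private outcomes: boolean[] = []; // rolling window, true = success
  private openedAt: number | null = null;
  private readonly windowSize = 20;
  private readonly minVolume = 10; // do not trip on a handful of requests
  private readonly failureRateThreshold = 0.5;
  private readonly cooldownMs = 30_000;

  // Record the outcome of one upstream call and trip the breaker if needed.
  record(success: boolean, now: number = Date.now()): void {
    this.outcomes.push(success);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
    const failures = this.outcomes.filter((ok) => !ok).length;
    const enoughVolume = this.outcomes.length >= this.minVolume;
    if (enoughVolume && failures / this.outcomes.length >= this.failureRateThreshold) {
      this.openedAt = now;
    }
  }

  // While open, the processor is skipped; after the cooldown it gets a fresh window.
  isOpen(now: number = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.cooldownMs) {
      this.openedAt = null;
      this.outcomes = [];
      return false;
    }
    return true;
  }
}

// Return the first processor in priority order whose breaker is not open.
// Processors without a registered breaker are treated as available.
function pickProcessor(
  priority: string[],
  breakers: Map<string, VolumeCircuitBreaker>,
  now: number = Date.now(),
): string | undefined {
  return priority.find((name) => !(breakers.get(name)?.isOpen(now) ?? false));
}
```

Requiring a minimum volume before tripping keeps a single failed request during a quiet period from diverting all traffic away from an otherwise healthy processor.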
07.Engineering Challenges
- Ensuring effectively exactly-once payment execution. We implemented a unified idempotency layer that protects users from being double-charged during network jitter or upstream PSP timeouts, regardless of which local processor the payment is routed to.
- Preventing PII/card data leakage. We architected a high-performance regex-based middleware that sanitizes all outgoing logs across the distributed system, preventing sensitive data from ever reaching our ELK stack.
- Latency budgets: Routing, encrypting, tokenizing, and communicating with upstream PSPs had to execute under 1000ms to prevent checkout abandonment.
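The first two challenges can be sketched in a few lines each. The idempotency part below keeps one result per idempotency key in memory, so a retried request reuses the original attempt instead of charging again. The redaction part masks card-like digit runs in a log line. Everything here is a simplified assumption: a real idempotency layer would persist keys in a shared store, and the Luhn check is added in this sketch to reduce false positives, not something the write-up describes.

```typescript
// In-memory idempotency layer: one stored promise per key.
class IdempotencyLayer<T> {
  private readonly attempts = new Map<string, Promise<T>>();

  // Runs the operation at most once per key. Concurrent or retried calls share
  // the first attempt's promise. Failed attempts stay recorded on purpose:
  // after a PSP timeout the charge may still have succeeded upstream.
  execute(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.attempts.get(key);
    if (existing) return existing;
    const attempt = operation();
    this.attempts.set(key, attempt);
    return attempt;
  }
}

// Matches 13 to 19 digits, optionally separated by single spaces or dashes.
const PAN_CANDIDATE = /\b(?:\d[ -]?){12,18}\d\b/g;

// Luhn checksum, used to avoid redacting order IDs and other long numbers.
function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Replace every Luhn-valid card-like number in a log line with a placeholder.
function redactPans(line: string): string {
  return line.replace(PAN_CANDIDATE, (match) => {
    const digits = match.replace(/\D/g, "");
    return passesLuhn(digits) ? "[REDACTED_PAN]" : match;
  });
}
```

A caller would wrap each charge as `layer.execute(idempotencyKey, () => chargeUpstream(...))`, where `chargeUpstream` is a hypothetical function, and pass every log line through `redactPans` before it leaves the process.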
08.Implementation Subsystem
import { Injectable } from '@nestjs/common';
import { PaymentOrchestratorClient } from '@payments/api';
import { EncryptionService } from './encryption.service';

// SensitiveData (the customer's PII payload) is assumed to be declared elsewhere in the codebase.

@Injectable()
export class PaymentService {
  constructor(
    private readonly orchestrator: PaymentOrchestratorClient,
    private readonly encryptionSvc: EncryptionService
  ) {}

  // Note: orderId is accepted but not used in this excerpt.
  async processPayment(orderId: string, customerData: SensitiveData) {
    // 1. Encrypt PII before it hits any DB or external logging
    const encryptedCustomer = await this.encryptionSvc.encrypt(customerData);

    // 2. Instruct orchestrator to vault and route the payment
    const paymentIntent = await this.orchestrator.payments.create({
      amount: 15000, // hardcoded here for illustration
      currency: 'USD',
      // Only the encrypted customer's reference is passed on, never raw PII
      customer_id: encryptedCustomer.referenceId,
      routing_algorithm: {
        type: 'cost_optimized',
        fallback: ['stripe', 'braintree']
      },
      capture_method: 'automatic'
    });

    // 3. Return the client secret so the frontend can complete the payment
    return paymentIntent.client_secret;
  }
}
Impact & Outcome
Moved 100% of payment volume onto multi-processor routing, achieving full self-hosted PCI DSS compliance and reducing vendor fees.