Architecture: Mainframe


Your bank's ATM runs 24 hours a day. The transaction you just made at the counter was confirmed in under a second. The insurance claim filed this morning will be processed overnight alongside three million others. None of that happens on a Kubernetes cluster. It happens on a mainframe.

For engineers who grew up on distributed systems, the mainframe is a foreign country. The vocabulary is different. The mental model is different. Even the pricing model is different. But the underlying engineering ideas are sound, battle-tested, and in several areas still ahead of anything the cloud-native world has built.

This post is the foundation for the entire Mainframe Decoded series. It covers the hardware, the operating system, the workload model, and the clustering architecture. If you are picking up any of the later posts in the series, start here.


The Big Picture

A mainframe is not simply a very large server. It is a purpose-built system designed around one core requirement: run a very large number of short, critical transactions simultaneously, with zero tolerance for data loss and near-zero tolerance for downtime.

The machine itself is called an IBM Z system (the current generation is the z16, announced in 2022). The operating system running on it is z/OS. On top of z/OS live the subsystems that do real work: CICS for online transaction processing, DB2 for relational data, JES for batch jobs, VSAM for file storage, and RACF for security. Each of these gets its own post in this series. This post explains the platform they all run on.

Figure 1: The mainframe stack — applications sit on subsystems, which sit on z/OS, which runs on IBM Z hardware.

The Hardware: IBM Z

IBM Z hardware looks nothing like a rack of servers. It ships in large frames — the z16 stands about two meters tall — and is designed as a single coherent system rather than a collection of independent nodes. The machine you interact with as a single "mainframe" is formally called a Central Processor Complex, or CPC.

Inside the CPC, the processors are not all the same. IBM distinguishes between general-purpose processors and specialty engines, and the distinction matters for both performance and cost.

Central Processors (CPs)

The CP is the general-purpose processor. It runs z/OS and can execute any workload. Software license costs on the mainframe are calculated from how much work runs on CPs, measured in MSUs (Million Service Units). Every unit of CPU work consumed on a CP therefore adds to the monthly bill, which makes the CP the most expensive path through the machine.

Specialty Engines

Specialty engines are physically the same processor chips as CPs but are configured by IBM to run specific workload types. Work running on a specialty engine is not counted toward the MSU-based software license charge. This is the mainframe's primary cost optimization lever.

The key specialty engines are:

  • zIIP (z Integrated Information Processor): Originally built to offload DB2 database work from CPs, zIIPs now handle a wide range of modern workloads including Java, Python, XML processing, IPSec encryption, z/OS container workloads, and REST API traffic via z/OS Connect. Most shops running modern workloads heavily exploit zIIPs to keep CP consumption down.
  • IFL (Integrated Facility for Linux): Dedicated to running Linux on IBM Z. An IFL cannot run z/OS. It exists to support Linux workloads without affecting the z/OS software bill.
  • SAP (System Assist Processor): Dedicated to I/O subsystem operations. Every mainframe has at least one SAP. It handles device number translation and I/O path selection so the CPs do not have to.

One important constraint: the number of zIIPs you can configure is governed by rules tied to the number of CPs on the system. You cannot simply add unlimited zIIPs to eliminate CP cost. The ratio rules are set by IBM and have changed over time, but the principle of "zIIPs complement CPs, not replace them" holds.

Memory

IBM Z uses RAIM (Redundant Array of Independent Memory), which does for RAM what RAID does for disks. Memory errors are detected and corrected without any interruption to running workloads. A z16 can hold several terabytes of physical memory. This is not a luxury; it is a requirement for running hundreds of isolated address spaces simultaneously without paging to disk.


Logical Partitions: PR/SM and LPARs

Long before cloud platforms made virtual machines ubiquitous, mainframes had LPARs. An LPAR (Logical Partition) is a hardware-level partition of the CPC: an isolated environment that behaves like an independent machine. The hypervisor that manages LPARs is called PR/SM (Processor Resource/Systems Manager) and operates in firmware, below the operating system.

Each LPAR gets an allocation of processors, memory, and I/O channels. It boots its own operating system instance. A single physical CPC might run four LPARs: one for production z/OS, one for test z/OS, one for Linux on Z via IFLs, and one as a Coupling Facility (more on that shortly). From the perspective of each operating system, it owns its hardware entirely.

LPARs are the primary isolation boundary on a mainframe. They are closer to hardware virtualization than to containers. The firmware enforces the partition boundaries; a software bug in one LPAR cannot corrupt memory in another.

Figure 2: A single CPC running four LPARs, each with its own OS and processor allocation.

z/OS: The Operating System

z/OS is the flagship operating system for IBM Z mainframes. It is a direct descendant of MVS (Multiple Virtual Storage), first introduced in 1974. The name "Multiple Virtual Storage" captures the core design principle: every program runs in its own virtual address space, isolated from every other program. The name changed to OS/390 and then to z/OS as the architecture moved to 64-bit, but structurally it remains the same operating system, refined continuously over five decades.

z/OS is built around a monolithic kernel. It is certified as a UNIX operating system by The Open Group and supports POSIX APIs, TCP/IP, SSH, and standard file protocols. But those UNIX-compatible layers sit alongside the mainframe-native interfaces rather than replacing them. A mainframe programmer typically uses both.

Address Spaces

The fundamental unit of isolation in z/OS is the address space. Every program, subsystem, and user session runs in its own address space with its own virtual memory map. Address spaces are protected from each other at the hardware level. A bug in one address space cannot corrupt another.

This is not equivalent to a Unix process, though the concept is similar. Address spaces have a more complex internal structure, with separate memory regions for the operating system, the application, and shared system services. The details matter when you get to CICS storage management (covered in a later post), but for now, think of an address space as a hard container around a unit of work.

z/OS runs dozens to hundreds of address spaces simultaneously. CICS itself runs as one address space (or several, in larger configurations). DB2 runs as a set of address spaces. JES2 runs as a single privileged address space. Each started task, batch job, and user TSO session gets its own.

The Workload Manager

WLM (Workload Manager) is z/OS's policy-based scheduler. Rather than giving the operating system fixed priorities, administrators define service classes with performance goals: "95% of online transactions must complete within 0.3 seconds" or "batch jobs in this class should execute with at least 70% velocity." WLM continuously monitors whether goals are being met and redistributes CPU time across address spaces accordingly.

WLM is also sysplex-aware. In a cluster of mainframes, WLM can route incoming work to whichever LPAR in the cluster has capacity to meet the goal. This is policy-driven load balancing that has no direct equivalent in the distributed systems world, where load balancing is typically a separate infrastructure concern.

JES: The Job Entry Subsystem

JES (Job Entry Subsystem) is z/OS's batch processing engine. When a batch job is submitted, JES accepts it, queues it, schedules it for execution, and manages its output. JES2 is the more common variant; JES3 offers more sophisticated scheduling features but is less widely deployed.

JES maintains a spool: a disk area holding input streams, output, and job metadata for every job in the system, active or queued. Every `SYSOUT` written by a batch job goes to the JES spool. Operators and automated tools use the spool to monitor, hold, release, and purge jobs. JES is discussed in depth in the batch processing post in this series.


The Workload Model

Mainframes handle two fundamentally different types of work, and understanding both is essential before any of the subsystem posts will make sense.

Online Transaction Processing (OLTP)

OLTP is the real-time, interactive workload. A user or an external system submits a request: check this account balance, process this payment, update this record. The mainframe executes a short, discrete unit of work and returns a response, typically in milliseconds. Thousands of these transactions execute in parallel.

CICS is the primary vehicle for OLTP on z/OS. Each transaction is a named program that executes under CICS, reads or writes data, and terminates. The transaction model is discussed in full in the CICS architecture post.
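To give a feel for what such a program looks like, here is a minimal sketch of a balance-inquiry transaction written in COBOL. Everything named here (the ACCTINQ program, the ACCTFILE VSAM file, the record layout) is hypothetical, and error handling is omitted; the real patterns are covered in the CICS post.

```cobol
      * Minimal sketch of a CICS inquiry transaction. All names are
      * illustrative; error handling is omitted for brevity.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ACCTINQ.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ACCT-ID         PIC X(10).
       01  WS-IN-LEN          PIC S9(4) COMP VALUE +10.
       01  WS-OUT-LEN         PIC S9(4) COMP VALUE +18.
       01  ACCT-REC.
           05  ACCT-ID        PIC X(10).
           05  ACCT-BALANCE   PIC S9(13)V99 COMP-3.
       PROCEDURE DIVISION.
      * Receive the input data that arrived with the transaction
           EXEC CICS RECEIVE INTO(WS-ACCT-ID) LENGTH(WS-IN-LEN)
           END-EXEC
      * Read the account record from a VSAM file, keyed on account id
           EXEC CICS READ FILE('ACCTFILE')
                          INTO(ACCT-REC)
                          RIDFLD(WS-ACCT-ID)
           END-EXEC
      * Send the record back to the caller and end the transaction
           EXEC CICS SEND FROM(ACCT-REC) LENGTH(WS-OUT-LEN)
           END-EXEC
           EXEC CICS RETURN
           END-EXEC.
```

The program holds no connection state and does no polling: it is dispatched by CICS, does its reads, sends a response, and returns control, which is what lets a single CICS address space run thousands of these per second.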

Batch Processing

Batch is the scheduled, non-interactive workload. A batch job processes a large volume of records sequentially: generate all the month-end statements, calculate interest on every account, reconcile every transaction from the trading day. Batch jobs run when the system is not handling peak OLTP load, typically overnight.

Batch jobs are submitted as JCL (Job Control Language), which specifies the programs to run, the data sets to read and write, and the conditions for success and failure. JES manages the scheduling and execution of batch jobs. The JCL post covers what that actually looks like.
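For a flavour of what that looks like, here is a minimal two-step job as a sketch. The job name, program names, and data set names are all hypothetical, and the installation-specific parts of the JOB card are simplified.

```jcl
//MONTHEND JOB (ACCT123),'MONTH END',CLASS=A,MSGCLASS=X
//*------------------------------------------------------------------
//* Hypothetical month-end job. Programs and data set names are
//* illustrative only.
//*------------------------------------------------------------------
//* Step 1: generate statements from the account master file.
//* Anything written to SYSOUT lands on the JES spool.
//STEP010  EXEC PGM=STMTGEN
//ACCTIN   DD  DSN=PROD.ACCT.MASTER,DISP=SHR
//STMTOUT  DD  DSN=PROD.STMT.MONTHEND,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(50,10)),RECFM=FB,LRECL=133
//SYSOUT   DD  SYSOUT=*
//* Step 2: runs only if STEP010 ended with return code 0
//STEP020  EXEC PGM=STMTPRT,COND=(0,NE,STEP010)
//STMTIN   DD  DSN=PROD.STMT.MONTHEND,DISP=SHR
//SYSOUT   DD  SYSOUT=*
```

The DD statements are the interesting part: the program refers to files only by DD name (ACCTIN, STMTOUT), and the JCL binds those names to actual data sets, which is why the same program can run against test or production data without recompilation.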

Most mainframe production environments run both workloads. The online day runs CICS transactions from 6am to 10pm. The batch window runs overnight. WLM manages the transition between these phases, protecting the OLTP response time goals during business hours and allowing batch to consume idle capacity at night.

Figure 3: A typical mainframe day — OLTP during business hours, batch in the overnight window.

Parallel Sysplex: Mainframe Clustering

A single IBM Z machine is already highly available. But for workloads that need continuous operation through planned maintenance, hardware failure, or disaster recovery, IBM provides Parallel Sysplex: a cluster of up to 32 systems, each running its own z/OS instance, that together appear as a single logical system.

The key enabler is the Coupling Facility (CF). The CF is a dedicated LPAR (or an external hardware unit) that provides three shared services to all members of the sysplex:

  • Caching: Shared in-memory data structures that all sysplex members can read and write with sub-millisecond latency.
  • Locking: Distributed lock management so that two systems in the sysplex cannot update the same record simultaneously.
  • List processing: Shared queues and lists used by subsystems like DB2 and CICS to coordinate across systems.

Because of the CF, applications running in a Parallel Sysplex can share data as if they were on the same machine, while actually running on separate hardware. DB2 can spread its data sharing group across multiple LPARs. CICS can route transactions to any available CICS region in the sysplex. If one machine fails, the others continue with no data loss, because the shared state lives in the CF rather than in any single system's memory.

IBM quotes up to 99.99999% availability for a properly configured Parallel Sysplex. Seven nines. That is less than four seconds of unplanned downtime per year.

Figure 4: Parallel Sysplex — two z/OS systems sharing state via the Coupling Facility.

How a Request Moves Through the System

To make this concrete, here is how a simple account balance inquiry moves through the mainframe stack.

A teller application running on a branch workstation sends a request over TCP/IP. The request arrives at the mainframe's VTAM or TCP/IP layer, which routes it to a CICS listener. CICS identifies the transaction code in the request and dispatches the corresponding COBOL program within its own address space.

The COBOL program issues an `EXEC CICS READ` command against a VSAM file or a DB2 SQL call. If it is a DB2 call, the request crosses from the CICS address space into the DB2 address space via a controlled inter-address-space communication path. DB2 reads the data, returns it to CICS, and the COBOL program formats the response.
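To show the DB2 variant concretely, a fragment like the one below could replace the VSAM read inside the same kind of COBOL program. The ACCOUNTS table and its columns are hypothetical; the :WS-... host variables are ordinary working-storage fields, and SQLCODE comes from the SQLCA that embedded-SQL programs include.

```cobol
      * Hypothetical embedded SQL: the DB2 variant of the account read.
      * The program would also need EXEC SQL INCLUDE SQLCA END-EXEC.
           EXEC SQL
               SELECT ACCT_BALANCE
                 INTO :WS-ACCT-BALANCE
                 FROM ACCOUNTS
                WHERE ACCT_ID = :WS-ACCT-ID
           END-EXEC
      * SQLCODE 0 means the row was found; +100 means no matching row
           IF SQLCODE = 0
               PERFORM 2000-FORMAT-RESPONSE
           ELSE
               PERFORM 2100-ACCOUNT-NOT-FOUND
           END-IF.
```

At run time the CICS-DB2 attachment facility carries this call across the address-space boundary described above; to the COBOL program it looks like a single synchronous statement.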

Throughout this sequence, WLM is monitoring elapsed time. If the transaction is approaching its response time goal, WLM increases its CPU allocation. If the system is under load, WLM trades off lower-priority work to protect the goal.

The entire chain, from request arrival to response, typically takes 10 to 50 milliseconds. CICS processes thousands of such transactions per second within a single address space. That combination of density and efficiency is the reason mainframes still exist.


Failure Modes and Fault Tolerance

Mainframes are designed around the assumption that components will fail and that the failure of any single component must not cause data loss or service interruption. Several mechanisms enforce this.

RAIM handles memory failures transparently. A failed memory chip is corrected in-flight with no application impact.

Redundant I/O paths mean every disk volume is reachable via multiple independent paths. If one path fails, the system automatically routes through an alternate without any operator intervention.

At the z/OS level, the Automatic Restart Manager (ARM) monitors critical subsystems and automatically restarts them if they fail. If CICS abends, ARM can restart it on the same system or on a different system in the sysplex, often before an operator has noticed the failure.

In a Parallel Sysplex, Sysplex Failure Management (SFM) monitors member systems and takes automated recovery action if a system stops responding. Work that was running on the failed system is redistributed to surviving members. The CF ensures that in-flight data is not lost, because application state is shared across systems rather than local to any one of them.

The failure scenario that deserves particular attention is failure of the CF itself. IBM's guidance for production sysplexes is to run at least two CFs, with CF structures duplexed between them. A failure of one CF then triggers automatic migration to the surviving CF with no application interruption.


Summary

A mainframe is a vertically integrated system built for one purpose: running very large numbers of critical transactions reliably, without data loss, and without unplanned downtime. The hardware model separates general-purpose CPs from specialty engines that handle specific workloads without contributing to software license costs. LPARs provide hardware-level isolation between environments. z/OS provides address-space-based isolation for every unit of work, with WLM managing performance goals across all of them.

The workload model divides into OLTP (real-time transactions handled by CICS) and batch (scheduled jobs handled by JES), with WLM managing the boundary between them dynamically throughout the day. Parallel Sysplex extends this model across up to 32 systems, using the Coupling Facility to share state and enable true active-active clustering at a scale and reliability level that the distributed systems world has spent decades trying to match.

Everything that follows in this series builds on this foundation. The next post covers CICS, the subsystem that handles all online transaction processing on z/OS.

Part of the Mainframe Decoded series — IBM Z and z/OS, clearly explained for engineers.


