Introduction
In this post I want to unpack a topic that keeps coming back in discussions about modular quantum architectures: how can I get really good multipartite entanglement between remote modules without drowning myself in slow, error-prone two-qubit gates on memories? In particular, I will focus on optical distillation protocols in spin–optical architectures, inspired by the single-shot, emission-based schemes that directly generate multipartite entanglement such as GHZ and W states between communication qubits.
Concretely, I will walk through the basic building blocks (emitters, photons, beam splitters), how single-shot protocols create Bell, W and GHZ states, and then zoom in on the optical distillation layer: how I can clean up raw, noisy GHZ-like resources using only more optics and communication-qubit rotations, and essentially no memory–memory entangling gates.
My goal is not to reproduce a full fault-tolerance analysis, but to build intuition: if I have spin qubits that can emit photons on demand, an optical mixing network, and some decent detectors, what are the main optical tricks that let me push GHZ-state fidelities towards the surface-code threshold region?
Spin–optical modules in a nutshell
Let me start with the basic hardware picture. I consider a modular architecture where each module contains:
- A communication qubit, typically an electronic spin (for instance an NV center or other color center) that can be optically excited.
- One or more memory qubits, often nuclear spins or other long-lived degrees of freedom used as data or auxiliary qubits.
- Local control (microwave / RF / lasers) to prepare the communication qubit and to swap states between communication and memory qubits.
The communication qubit couples to an optical mode. A resonant laser pulse selectively excites the "bright" state \(\ket{1}\) to an excited orbital \(\ket{1_e}\), from which it can decay and emit a single photon. The "dark" state \(\ket{0}\) remains almost unaffected. If I prepare the spin in a superposition
\[ \ket{\psi} = \sqrt{\alpha}\,\ket{1} + \sqrt{1-\alpha}\,\ket{0}, \]
then after the excitation and emission process, ideally I obtain an emitter–photon entangled state of the form
\[ \sqrt{\alpha}\,\ket{1}\ket{1_{\rm ph}} + \sqrt{1-\alpha}\,\ket{0}\ket{0_{\rm ph}}, \]
where \(\ket{0_{\rm ph}}\) and \(\ket{1_{\rm ph}}\) denote the vacuum and single-photon Fock states respectively. The tunable parameter \(\alpha\) is sometimes called the bright-state population, and it controls both the success probability and the fidelity of the eventual heralded entanglement.
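As a quick sanity check, the role of the bright-state population can be captured in a few lines of Python. This is a minimal toy model of the ideal emission step (no photon loss, no double excitation), with the \(\sqrt{\alpha}\) amplitude convention assumed for concreteness:

```python
from math import sqrt, isclose

def spin_photon_state(alpha):
    """Ideal spin-photon state after excitation and emission:
    sqrt(alpha)|1>|1_ph> + sqrt(1-alpha)|0>|0_ph>.
    Toy model: no photon loss and no double excitation; the sqrt(alpha)
    amplitude convention is an assumption for concreteness."""
    return {(1, 1): sqrt(alpha), (0, 0): sqrt(1 - alpha)}

state = spin_photon_state(0.2)
p_click = state[(1, 1)] ** 2   # probability a photon was emitted at all
assert isclose(p_click, 0.2)   # equals the bright-state population alpha
assert isclose(sum(a * a for a in state.values()), 1.0)
```

In this idealised picture the emission probability is exactly \(\alpha\), which is why \(\alpha\) simultaneously tunes success probability and (once noise enters) fidelity.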
These photons travel through optical fibers to a central "entangling module" containing beam splitters, delay lines and detectors. By interfering photons from different modules and conditioning on specific detection patterns, I can project the remote spins into entangled states. Crucially, I do not need to send qubits between modules directly; I only send photons and classical messages.
Single-shot generation of Bell, W and GHZ states
Before talking about distillation, I need resource states to distil. The single-shot emission-based architecture I am considering is powerful because it can directly create, in a single global optical round:
- Two-qubit Bell states between selected module pairs.
- Four-qubit W states across four modules.
- Four-qubit GHZ states across four modules.
Let me briefly sketch each of these, at an intuitive level.
Bell states from two emitters
Take two modules A and B. Each prepares its communication qubit in the bright superposition with parameter \(\alpha\), emits a photon, and the two photonic modes are interfered on a balanced beam splitter. In the Heisenberg picture, the input creation operators \(\hat p_1^\dagger, \hat p_2^\dagger\) transform to output modes \(\hat q_1^\dagger, \hat q_2^\dagger\) like
\[ \hat p_1^\dagger \to \tfrac{1}{\sqrt{2}}\left(\hat q_1^\dagger + \hat q_2^\dagger\right), \qquad \hat p_2^\dagger \to \tfrac{1}{\sqrt{2}}\left(\hat q_1^\dagger - \hat q_2^\dagger\right). \]
If I post-select on the case where exactly one detector clicks ("single-click" protocol), ideally the two spins A and B are projected into a Bell state
\[ \ket{\Psi^{\pm}} = \tfrac{1}{\sqrt{2}}\left(\ket{10} \pm \ket{01}\right), \]
with the sign fixed by which detector clicked.
Non-number-resolving (threshold) detectors cannot distinguish between one and two photons, so multi-photon events leak noise into the output state. A simple and powerful optical distillation trick already appears here: run a double-click (DC) protocol where I repeat the whole emission–interference–detection sequence twice with an interleaved bit flip on the spins. By conditioning on a specific pattern of two successful clicks, I can eliminate the unwanted two-photon contribution and recover a cleaner Bell pair, even with non-PNR detectors.
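To make the single-click vs double-click trade-off concrete, here is a toy Python model. It assumes lossless optics, perfectly indistinguishable photons and ideal threshold detectors — simplifications beyond anything the text guarantees:

```python
from math import isclose

def single_click(alpha):
    """Single-click Bell heralding with threshold detectors.
    Toy model: no photon loss, perfectly indistinguishable photons."""
    p_one = 2 * alpha * (1 - alpha)  # exactly one emitter fired: true Bell herald
    p_two = alpha ** 2               # both fired; HOM bunching sends both photons
                                     # to one detector, which still reads one "click"
    p_succ = p_one + p_two
    fidelity = p_one / p_succ        # only the one-photon sector is the Bell state
    return p_succ, fidelity

def double_click(alpha):
    """DC protocol in the same toy model: after the interleaved bit flip the
    two-photon branch cannot satisfy the second herald, so the surviving
    state is clean. Success probability quoted up to constant pattern factors."""
    p_one = 2 * alpha * (1 - alpha)
    return p_one ** 2, 1.0

p, f = single_click(0.1)   # f = 0.18 / 0.19, about 0.947
```

Even in this best case, single-click fidelity is capped at \(2(1-\alpha)/(2-\alpha) < 1\) for any \(\alpha > 0\), while the double-click variant trades a squared success probability for unit fidelity.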
Four-qubit W states from a 4\(\times\)4 beam-splitter network
Now consider four modules A, B, C, D. Each emits at most one photon, and these four photonic modes are injected into a 4\(\times\)4 interferometer built from two stages of 50:50 beam splitters. One concrete choice of unitary maps the inputs \(\hat p_i^\dagger\) to outputs \(\hat r_j^\dagger\) as
\[ \hat p_i^\dagger \;\to\; \sum_{j=1}^{4} U_{ji}\,\hat r_j^\dagger, \qquad U = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]
If I condition on the event that exactly one of the four detectors clicks (say detector 1), then, in the ideal PNR, noiseless limit, the four spins are projected into the W state
\[ \ket{W_4} = \tfrac{1}{2}\left(\ket{1000} + \ket{0100} + \ket{0010} + \ket{0001}\right), \]
up to local phases fixed by the interferometer.
Again, threshold detectors lead to admixtures of two-, three- and four-photon processes, reducing the fidelity, especially as \(\alpha\) increases. However, this single-shot access to nontrivial multipartite states is exactly what enables richer distillation strategies.
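The W-state herald can be checked in the single-photon sector with a small sketch. The two-stage 50:50 network is assumed here to implement \((H \otimes H)/2\), one concrete choice consistent with the description above:

```python
from math import sqrt

# One concrete 4x4 network: two stages of 50:50 beam splitters implementing
# U = (H (x) H) / 2 -- an assumed choice, not the only possible one.
H = [[1, 1], [1, -1]]
U = [[H[j // 2][i // 2] * H[j % 2][i % 2] / 2 for i in range(4)] for j in range(4)]

# Single-photon sector: the spin branch with its lone excitation on emitter i
# injects one photon into input i; the amplitude to click detector 0 is U[0][i].
amps = [U[0][i] for i in range(4)]
norm = sqrt(sum(a * a for a in amps))
w_state = [a / norm for a in amps]   # heralded spin state, one-excitation basis
# All four emitters appear with equal amplitude 1/2: a W state (up to phases).
```

Conditioning on a single click at one detector weights every emitter by the same column of \(U\), which is exactly why the heralded state is W-like rather than biased towards any one module.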
Single-shot four-qubit GHZ states
The same interferometer can be used to create four-qubit GHZ states in a single optical round. Here I post-select on patterns with two detectors clicking (two photons in total) in specific pairs, e.g. \((D_1, D_2)\). In the ideal PNR case, the underlying interference ensures that only certain two-photon input patterns contribute, and the four emitters end up in
\[ \ket{\Phi_4^+} = \tfrac{1}{\sqrt{2}}\left(\ket{1100} + \ket{0011}\right), \]
which I can convert via local single-qubit operations into the more standard
\[ \ket{\mathrm{GHZ}_4} = \tfrac{1}{\sqrt{2}}\left(\ket{0000} + \ket{1111}\right). \]
This is already a very attractive resource for distributed surface-code stabilizer measurements: a weight-4 GHZ state over four modules. However, as before, non-PNR detection lets higher-photon-number contributions sneak in, and realistic hardware introduces photon loss, spectral mismatch, state-preparation errors, and qubit decoherence. This is where both memory-based and optical distillation come into play.
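The local conversion between the two-excitation heralded state and the standard GHZ form can be verified directly. For concreteness I assume the double click heralds \((\ket{1100} + \ket{0011})/\sqrt{2}\), in which case an \(X\) on the first two qubits does the job:

```python
from math import sqrt

s = 1 / sqrt(2)
raw_ghz = {"1100": s, "0011": s}   # assumed two-excitation state from the double click
target  = {"0000": s, "1111": s}   # standard four-qubit GHZ state

def flip(bits, positions):
    """Apply X (bit flip) on the given qubit positions of a basis label."""
    b = list(bits)
    for p in positions:
        b[p] = "1" if b[p] == "0" else "0"
    return "".join(b)

# X on the first two qubits maps |1100> -> |0000> and |0011> -> |1111>.
converted = {flip(label, amp_positions := (0, 1)): amp for label, amp in raw_ghz.items()}
assert converted == target
```

The conversion is purely local (one \(X\) per module), so it adds no cross-module operations to the protocol.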
From raw to distilled GHZ: where optics alone can help
In general, distillation means that I start with several copies of imperfect entangled states, run a protocol involving local operations and classical communication (LOCC), and post-select on outputs that are "better" (higher fidelity) but produced with lower probability. In this modular, spin–optical setting I can:
- Use memory-based distillation: swap raw optical entanglement into memory qubits and apply memory–memory gates.
- Or leverage optical distillation: run additional emission rounds and local rotations on the communication qubits only, without ever entangling memories.
Given that memory two-qubit gates are typically much slower and noisier than emitter control, optical distillation is especially appealing if I am targeting near-term threshold crossings.
Double-click (DC) GHZ protocol
Let me now focus on the main optical workhorse: the double-click GHZ protocol. Conceptually, it is the natural generalisation of the double-click Bell protocol to four parties.
Idea in words
- Round 1 (raw GHZ generation). I run the single-shot GHZ protocol once. Conditioned on a suitable two-detector pattern, the four communication qubits are in a "raw" GHZ-like mixed state \[ \rho_{\rm raw} = F_{\rm raw}\,\ket{\Phi_4^+}\bra{\Phi_4^+} + (1-F_{\rm raw})\,\rho_{\rm noise}, \] where \(\rho_{\rm noise}\) mainly consists of basis states with three or four excitations (e.g., \(\ket{1111}\), states with Hamming weight 3), fueled by multi-photon events and imperfect interference.
- Bit-flip on all communication qubits. I apply \(X\) on each of the four emitters. This maps \(\ket{\Phi_4^+}\) to itself (up to global phase) but reshuffles the noise: high-Hamming-weight components (3 or 4 excitations) turn into low-Hamming-weight ones (1 or 0 excitations).
- Round 2 (optical filtering). I now perform a second emission–interference–detection round with the same 4\(\times\)4 optical network. Crucially, the re-emission properties of the different components differ:
  - The desired GHZ part has exactly two excitations after the bit flip and can give rise to the same two-detector patterns as in Round 1.
  - The noise terms with at most one excitation cannot produce the two-photon coincidences that characterise the GHZ clicks.
Why it works: a simple picture
It is useful to think in terms of excitation number. After the first round, ignoring decoherence and gate errors, the raw state can be decomposed as
\[ \rho_{\rm raw} = \sum_{k=0}^{4} p_k\,\rho_{(k)}, \]
where \(\rho_{(k)}\) denotes the part with Hamming weight \(k\) (i.e., \(k\) excitations among the four qubits). The GHZ component lives in \(\rho_{(2)}\), while \(\rho_{(3)}\) and \(\rho_{(4)}\) come from multi-photon processes. Applying \(X^{\otimes 4}\), I map
\[ \rho_{(2)} \mapsto \rho_{(2)}, \qquad \rho_{(3)} \mapsto \rho_{(1)}, \qquad \rho_{(4)} \mapsto \rho_{(0)}. \]
In the second emission round, I demand two-photon coincidences in the same style as Round 1. But a component with Hamming weight 0 or 1 simply cannot emit two indistinguishable photons into the right modes to satisfy the GHZ click pattern. As a result, those contributions are filtered out "for free" by the optical process itself; I do not need an explicit projection in spin space.
The net effect is an optical parity check that keeps the two-excitation sector (GHZ-like) and discards much of the rest.
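This filtering argument can be made quantitative with a toy diagonal model over Hamming-weight sectors. The population numbers below are illustrative assumptions, and I deliberately include a weight-2 noise term to show what the optical filter cannot remove:

```python
# Toy diagonal model of DC-GHZ filtering over Hamming-weight sectors of the
# four communication qubits. The populations are illustrative assumptions;
# "w2_noise" is unwanted weight-2 noise that the optical filter cannot remove.
pops = {"ghz": 0.80, "w2_noise": 0.05, "weight3": 0.10, "weight4": 0.05}

# X on all four qubits maps weight k -> 4 - k; the weight-2 sector is invariant.
flipped = {"ghz": pops["ghz"], "w2_noise": pops["w2_noise"],
           "weight1": pops["weight3"], "weight0": pops["weight4"]}

# Round 2 accepts only components that can emit a two-photon coincidence,
# i.e. the weight-2 sector (toy model: uniform acceptance within that sector).
kept = {k: v for k, v in flipped.items() if k in ("ghz", "w2_noise")}
p_accept = sum(kept.values())          # relative acceptance probability
fidelity = kept["ghz"] / p_accept      # 0.80 / 0.85, up from the raw 0.80

print(f"acceptance: {p_accept:.2f}, distilled fidelity: {fidelity:.3f}")
```

The weight-0/1 noise is rejected for free, but any noise that sits inside the two-excitation sector survives the parity check, which is why residual imperfections (dephasing, imperfect interference) still bound the final fidelity.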
Success probability vs fidelity
The main trade-off is that the double-click protocol squares the per-round success probability. If the raw GHZ success probability per attempt is \(P_{\rm raw}(\alpha)\), the DC-GHZ success probability is roughly
\[ P_{\rm DC}(\alpha) \approx P_{\rm raw}(\alpha)^2, \]
up to constant factors from the allowed detection patterns.
On the other hand, the GHZ fidelity after DC can be made almost independent of \(\alpha\) in the idealised optical-noise-dominated regime, because the protocol very effectively removes multi-photon noise. This is quite different from single-click protocols, where I often need to take \(\alpha\) extremely small to suppress higher-photon-number contributions.
From a surface-code perspective, what matters is whether I can simultaneously hit:
- \(F_{\rm GHZ} \gtrsim 0.98\) (to keep stabilizer measurement errors below threshold), and
- a per-stabilizer success probability high enough that the code can be scheduled without excessive decoherence due to retries.
One of the attractive features of the DC-GHZ scheme is that both conditions can be met for realistic parameter ranges, with only modest hardware improvements beyond current emission-based platforms.
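To get a feel for the second condition, here is a back-of-envelope latency check; every number below is a hypothetical assumption for illustration, not a measured or simulated value:

```python
# Back-of-envelope scheduling check. Every number below is a hypothetical
# assumption for illustration, not a measured or simulated value.
p_raw = 0.05                # assumed raw GHZ success probability per attempt
p_dc = p_raw ** 2           # the DC protocol squares the per-round probability
t_attempt_us = 5.0          # assumed duration of one optical attempt (microseconds)
t_budget_us = 1.0e4         # assumed memory coherence budget per stabilizer round

expected_latency_us = t_attempt_us / p_dc   # mean attempts needed = 1 / p_dc
budget_ok = expected_latency_us < t_budget_us

print(f"expected GHZ latency: {expected_latency_us:.0f} us, within budget: {budget_ok}")
```

The point of the exercise is the scaling: because \(P_{\rm DC} \sim P_{\rm raw}^2\), even modest improvements in per-attempt success probability translate quadratically into shorter stabilizer latencies.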
Double-click W-to-GHZ protocol
There is a second, slightly more subtle, optical distillation protocol that starts from W states rather than GHZ states and still ends up with GHZ-like resources.
Rotating a W state into a GHZ-containing superposition
Suppose I have a (noisy) four-qubit W state \(\rho_W\). If I apply Hadamards and phase flips to all four qubits, I obtain a rotated state of the ideal form
\[ \ket{\tilde W_4} = a\,\ket{W_4'} + b\,\ket{\Psi_4^-}, \qquad b \neq 0, \]
where \(\ket{W_4'}\) is the bit-flipped W (all bits inverted) and \(\ket{\Psi_4^-}\) is a GHZ-type state. The key point is that \(\ket{\tilde W_4}\) already contains a GHZ component with nonzero weight.
If I now let the rotated W state emit photons again through the same interferometer and post-select on two-photon coincidence patterns (as in GHZ generation), I effectively project more strongly onto the \(\ket{\Psi_4^-}\) component. This gives an optical W-to-GHZ distillation step.
Limitations and when it helps
Unlike the DC-GHZ protocol, the double-click W-to-GHZ scheme cannot in general reach a perfect GHZ state under realistic noise. The reason is that the noise subspace of the W state also contains components that transform into GHZ-like two-excitation patterns under the Hadamard and phase rotations, so optical filtering is less selective.
However, in regimes where W-state generation is much more probable than GHZ generation (for instance, when \(\alpha\) must be very small and losses are moderate), using W-based optical distillation can give a better rate–fidelity compromise. In other words, I might accept a slightly lower GHZ fidelity if I can generate usable resources substantially more often.
If my target is full-blown surface-code thresholds, the DC-GHZ approach tends to win. But if I am aiming for, say, small-code experiments or Bell-type tests that require good but not exceptional fidelities, or if my hardware is particularly well-optimised for W production, the optical W-to-GHZ protocol becomes interesting.
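The rate–fidelity compromise between the two routes can be put in numbers; all values below are assumptions chosen for illustration, not results from the text:

```python
# Hypothetical rate-fidelity comparison of the two optical routes.
# All probabilities and fidelities below are illustrative assumptions.
t_attempt_s = 5e-6   # assumed duration of one optical attempt
routes = {
    "DC-GHZ":   {"p_succ": 0.0025, "fidelity": 0.99},  # rarer but cleaner
    "W-to-GHZ": {"p_succ": 0.0250, "fidelity": 0.96},  # ~10x more frequent
}
for name, r in routes.items():
    rate_hz = r["p_succ"] / t_attempt_s   # accepted GHZ states per second
    print(f"{name}: {rate_hz:.0f} states/s at fidelity {r['fidelity']:.2f}")
```

Under these assumed numbers the W route delivers an order of magnitude more states per second at a few percent lower fidelity — exactly the kind of compromise that favours it for small-code experiments but not for threshold crossings.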
Why optical distillation matters for modular QEC
Let me close with a higher-level view: why should I, as someone interested in modular quantum error correction, care about these optical tricks?
Avoiding slow memory gates
In many solid-state platforms, communication qubits (electronic spins) have relatively fast, high-fidelity single-qubit control and emission processes, while memory qubits (nuclear spins or neighbouring electronic spins) have slower, more error-prone two-qubit gates. Fusion-based approaches to distributed surface codes typically:
- Generate Bell pairs between modules.
- Fuse them using memory–memory entangling gates to build larger GHZ states.
This can lead to deep circuits on the memories, with error thresholds saturating around \(\sim 0.1\% - 0.16\%\) when realistic gate errors and decoherence are included. By contrast, optical distillation directly bootstraps a GHZ-like resource on the communication layer, offloading most of the heavy lifting to optics and classical post-processing.
Threshold implications
Once I have a reasonably high-fidelity, on-demand (or at least high-rate) GHZ factory connecting four neighbouring modules, I can use the resulting states as the entangling resource for distributed stabilizer measurements in a surface code. Numerical simulations show that:
- Double-click GHZ protocols with photon-number-resolving detectors and modest hardware improvements can support surface-code thresholds around the few \(10^{-3}\) level in circuit-level error rate.
- Even without PNR detectors, thresholds close to or above \(2 \times 10^{-3}\) are reachable if I accept somewhat lower GHZ success probabilities and improve loss and indistinguishability moderately.
This is competitive with, and in some regimes better than, scattering-based proposals that require significantly more demanding hardware (strong coupling to waveguides or cavities, high-performance optical circulators, etc.). In other words, single-shot emission plus optical distillation turns a conceptually simple spin–photon interface into a serious candidate for fault-tolerant modular architectures.
Architectural simplicity
From an engineering perspective, there is also a simplification benefit. Optical distillation keeps the protocol space relatively uniform:
- Every entangling attempt looks like "prepare spins, emit photons, interfere, detect, herald".
- Classical control logic only needs to keep track of detection patterns and whether I am in Round 1 or Round 2 of a DC protocol.
- Memory qubits can be reserved primarily for storing data (surface-code qubits) rather than being used as an active entanglement-processing resource.
This kind of architectural regularity is extremely helpful when I move from theory diagrams to actual timing diagrams, FPGA firmware and cryostat wiring.
Outlook
To summarise, optical distillation protocols such as double-click GHZ and W-to-GHZ distillation show that, given a suitable multiport interferometer and spin–photon interfaces, I can:
- Generate high-fidelity four-qubit GHZ states in a small number of optical rounds.
- Offload most of the distillation burden from slow memory gates to fast optical operations.
- Achieve surface-code-relevant fidelities and success rates under realistic noise and loss assumptions.
Going forward, I see several interesting directions:
- Extending single-shot optical architectures to higher-weight GHZ states matched to higher-weight stabilizers or LDPC codes.
- Combining optical and minimal memory-based distillation in hybrid protocols that flexibly trade time, fidelity and hardware complexity.
- Embedding active feedback and adaptive routing directly into the optical layer, so that unsuccessful entanglement attempts are recycled or rerouted on the fly.
All of these directions share a common theme: leveraging the natural strengths of photonics (interference, fast detection, flexible routing) to push distributed quantum error correction closer to practical fault tolerance, without waiting for perfect gates on every memory qubit in the system.
References
- Singh, S., Kashiwagi, R., Tanji, K., Roga, W., Bhatti, D., Takeoka, M., & Elkouss, D. (2026). Fault-tolerant modular quantum computing with surface codes using single-shot emission-based hardware. arXiv:2601.07241 [quant-ph].
- Nickerson, N. H., Li, Y., & Benjamin, S. C. (2013). Topological quantum computing with a very noisy network and local error rates approaching one percent. Nature Communications, 4, 1756.
- Nemoto, K., et al. (2014). Photonic architecture for scalable quantum information processing in NV-diamond. Physical Review X, 4, 031022.
- Barrett, S. D., & Kok, P. (2005). Efficient high-fidelity quantum computation using matter qubits and linear optics. Physical Review A, 71, 060310(R).
- Briegel, H.-J., Dür, W., Cirac, J. I., & Zoller, P. (1998). Quantum repeaters: The role of imperfect local operations in quantum communication. Physical Review Letters, 81, 5932.
- Fowler, A. G., Mariantoni, M., Martinis, J. M., & Cleland, A. N. (2012). Surface codes: Towards practical large-scale quantum computation. Physical Review A, 86, 032324.