dybilar

Notes on Google Willow (2024) aka "Quantum Error Correction Below the Surface Code Threshold"

God willing (אם ירצה ה׳)

Status: Completed

Date: 30 Kislev 5785


Paper: https://doi.org/10.1038/s41586-024-08449-y

Printable PDF

AI-generated podcast

Layman TL;DR

Scientists at Google have built a more reliable quantum computer memory, taking a step towards building practical, large-scale quantum computers.

Quantum Computing SME TL;DR

Google demonstrates below-threshold error correction with a distance-7 surface code, achieving a logical qubit lifetime exceeding its best physical qubit lifetime and showcasing 'real-time' decoding.

Physics/Engineering TL;DR

Using their new Willow processors, Google achieves an error suppression factor of >2 in a surface code, exceeding break-even performance. A 10⁻¹⁰ error floor due to correlated bursts poses a scalability challenge.

Non-Hype TL;DR

Google's "below threshold" results don't yet demonstrate practical fault tolerance. Scaling to larger code distances and mitigating correlated errors remain major challenges. The need for massive resource overhead poses a significant obstacle. Lastly, the results are specific to Google's hardware and might not generalize to other platforms.

I. Prologue

Pars pro toto, the motivated reader is referred to the recent tomography study by AbuGhanem and Eleuch (2024) [9]. Their work investigated the performance of Google's Sycamore 2-qubit universal gate (one of the building blocks of the quantum circuits run on the 53-qubit Sycamore processor) using full quantum tomography on IBM's quantum computers. They observed a dramatic fidelity drop when transitioning from noise-free simulations to real-world quantum computers, even for a relatively simple five-qubit, eight-cycle circuit. Specifically, the state fidelity plummeted from 97.72% in noise-free simulations to a mere 15.95% on actual hardware.

If error correction techniques can successfully suppress logical error rates, as shown in the Willow experiments, then the dramatic fidelity drops observed by AbuGhanem and Eleuch might be mitigated in future, larger-scale, error-corrected quantum computers. The latest Google Quantum AI "Willow" research [1] aims to demonstrate quantum error correction operating below the critical noise threshold, where logical error rates decrease exponentially with increasing code distance.

II. A Surface Code Memory Below Threshold

The Willow processor is introduced as an advanced superconducting quantum processor with improved qubit coherence times and gate fidelities, owing to new fabrication techniques and design optimizations. Two surface codes were implemented: a distance-5 code on a 72-qubit processor with real-time decoding, and a distance-7 code on a 105-qubit processor, which showed significant error suppression. The distance-7 code achieved a logical error rate of 0.143% ± 0.003% per cycle, with the logical qubit lifetime exceeding that of the best physical qubit by a factor of 2.4. Real-time decoding maintained below-threshold performance with a decoder latency of 63 μs. The error suppression factor (Λ) was calculated to be 2.14 ± 0.02, indicating exponential error suppression with increasing code distance.
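The reported scaling can be sanity-checked with a short sketch. Note the d=3 and d=5 per-cycle error rates below are illustrative placeholders chosen to be consistent with the reported ε₇ = 0.143% and Λ ≈ 2.14; they are not the paper's exact figures:

```python
import math

# Estimate the error suppression factor Lambda from per-cycle logical
# error rates at odd code distances d = 3, 5, 7. The d=3 and d=5
# values are assumed placeholders consistent with the reported numbers.
eps = {3: 0.00655, 5: 0.00306, 7: 0.00143}  # per-cycle logical error rate

# Model: eps(d + 2) = eps(d) / Lambda, i.e. each step of two in code
# distance suppresses the logical error by one factor of Lambda.
ratios = [eps[3] / eps[5], eps[5] / eps[7]]
lam = math.exp(sum(math.log(r) for r in ratios) / len(ratios))  # geometric mean
print(f"Lambda ~ {lam:.2f}")  # -> Lambda ~ 2.14
```

In the paper's framing, staying "below threshold" means exactly that this ratio stays above 1 as the code grows.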

III. Logical Error Sensitivity

The paper explored how logical errors scale with physical errors through error injection experiments, showing that logical error rates exhibited expected behavior below a 20% detection probability. A detailed analysis of error sources contributing to Λ was conducted, identifying CZ gates and stray interactions as significant contributors. The importance of data qubit leakage removal (DQLR) for improving logical performance, especially at higher distances, was demonstrated. Additionally, the stability of logical qubits over hours was shown to be crucial for long-running quantum algorithms.

IV. A Repetition Code Memory in the Ultra-Low Error Regime

Repetition codes were employed to test logical performance at ultra-low error rates, revealing a new error floor at 10⁻¹⁰ due to correlated error bursts occurring roughly once per hour. Two types of failures were identified: transient failures, likely due to temporary noise or TLS interactions, and catastrophic failures, which involved large, localized error bursts affecting many qubits simultaneously. These results underscore the importance of understanding and mitigating rare but highly damaging error events for large-scale quantum computing.
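A back-of-envelope check of that floor, assuming a cycle time of roughly 1.1 μs (an assumed figure, not quoted from the paper):

```python
# If correlated bursts strike roughly once per hour, the per-cycle
# failure floor they impose is the burst rate divided by the number
# of error-correction cycles executed per hour.
cycle_time_s = 1.1e-6            # assumed cycle duration
bursts_per_hour = 1.0            # "roughly once per hour"
cycles_per_hour = 3600.0 / cycle_time_s
floor = bursts_per_hour / cycles_per_hour
print(f"implied per-cycle floor ~ {floor:.0e}")  # -> ~ 3e-10
```

Consistent, to within the assumed cycle time, with the 10⁻¹⁰ order of magnitude reported.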

V. Real-Time Decoding

The implementation of real-time decoding necessary for fault-tolerant quantum computing was described, emphasizing that the decoder must keep pace with the processor's cycle time. The decoder performance achieved a latency of 63 μs, with a slight reduction in accuracy compared to offline decoding, while still maintaining below-threshold performance.

Issues

Median vs. Mean Values: The use of median values rather than mean values for reporting errors might mask the distribution of errors, potentially hiding significant outliers or variability in performance. This could lead to an overly optimistic view of the system's performance if the errors are not normally distributed or if there are many outliers.
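A toy illustration of how a median can understate a heavy-tailed error distribution (the numbers are synthetic, not taken from the paper):

```python
import statistics

# Synthetic gate-error sample: most qubits perform well, a few
# outliers are an order of magnitude worse.
errors = [0.002] * 18 + [0.02, 0.03]  # 18 typical qubits, 2 outliers

med = statistics.median(errors)  # unaffected by the two outliers
avg = statistics.mean(errors)    # pulled up by them
print(f"median = {med:.4f}, mean = {avg:.4f}")
```

Here the mean is more than twice the median, so reporting only the median would make the device look markedly better than its average behavior.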

Error Budget Analysis: The error budget shows that incoherent errors constitute around 80% of the total error, which suggests that hardware improvements are the primary focus for error reduction. The authors do not deeply explore potential systematic errors or limitations in current control techniques that could also contribute significantly to this error.

Scaling and Crosstalk: The research notes widespread and nonlocal microwave crosstalk on the 72-qubit processor, which could become increasingly problematic as the system scales up. If crosstalk isn't adequately mitigated, it might severely limit how far the processors can be scaled while maintaining or improving error rates.

Decoding Performance: Different decoders are used, with varying performance metrics. However, the document does not clearly explain why certain decoders perform better or worse, nor does it discuss the potential biases or limitations of these decoders. Without understanding the underlying reasons for decoder performance, there's a risk of selecting a decoder based on current performance without considering its scalability or robustness against different types of errors.

Stability Over Time: There are mentions of stability optimization strategies to mitigate frequency collisions with TLS (two-level systems) over multiple days of data taking. The long-term stability of the system, especially as it scales, remains unaddressed. If the system's stability deteriorates over time or under different environmental conditions, this will significantly affect the reliability and reproducibility of the results.

Coherence Time Improvements: While T1 and T2 times have improved, the document does not discuss how these improvements might interact with other system parameters or how they might be affected by scaling up the number of qubits. The optimistic assumption is that these coherence times will scale linearly or similarly with system size.

Optimization Techniques: The use of the Snake optimizer and other frequency optimization strategies focuses heavily on data qubit coherence, potentially at the expense of other important factors like gate fidelity or measurement accuracy.

Real-time Decoding: The implementation of real-time decoding systems, while admittedly impressive, might introduce new challenges like increased latency, error propagation, or the need for more complex error correction strategies.

Red Flags, Omissions, Potential Shenanigans

Some of these points are extracted from the peer review file.

Selective Use of Data Across Systems: The real-time decoding and error-burst suppression were only tested on one of the two systems (Q72 for real-time decoding and Q105 for larger code distances). This selective testing could potentially skew results or lead to over-optimistic conclusions if these methods do not perform as well on the other system.

Not Really Real-Time Decoding: The claim of "real-time" decoding was debated due to the latency of 63 microseconds, which spans over 50 cycles, suggesting it might be more accurately described as 'bounded' or 'finite' latency decoding.
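The arithmetic behind the "over 50 cycles" objection, again assuming a cycle time of ~1.1 μs (an assumption, not a figure quoted from the paper):

```python
# How many error-correction cycles elapse during the reported
# 63 us decoder latency?
latency_s = 63e-6
cycle_time_s = 1.1e-6   # assumed surface-code cycle duration
cycles_spanned = latency_s / cycle_time_s
print(f"latency spans ~ {cycles_spanned:.0f} cycles")  # -> ~ 57 cycles
```

A constant latency of dozens of cycles does not by itself break fault tolerance, provided the decoder's sustained throughput matches the cycle rate; the dispute is over whether "real-time" is the right label for it.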

Unclear Proxy for Error Rate: The use of "detection probability" as a proxy for physical error rate was not clearly justified or quantified.

Scaling Concerns: The potential increase in physical error rates with system size (from d=3 to d=7) was highlighted. If this trend continues, it might hinder scaling up to larger code distances necessary for practical quantum computation. The authors' assertion that these effects might saturate lacks detailed evidence or analysis.

Too Simplified Noise Model: In the context of the discrepancy between the simulated and experimental values of 1/Λ (the inverse error suppression factor) in the surface code experiments, a reviewer noted:

A 14% discrepancy between simulated and experimental error suppression factors ... The authors attribute this discrepancy to "excess correlations": supporting this claim and characterizing correlated errors seems very important.

Google's response:

We apologize as this section of the paper was quite unclear, and contained an error. ... This corresponds to a 14% overestimate of Λ in the error budget. We have updated the text to clarify this point. Understanding the source of the discrepancy is an ongoing research area. ... We can expect that more accurate noise models will lead to a mixture of the correct Porter-Thomas distribution and a large number of uncorrelated Porter-Thomas distributions.

Willow and NISQ Challenges

Assuming the aforementioned issues are addressed and Google's 2024 Willow research does demonstrate fundamental progress in quantum error correction, how effectively does it address the noisy intermediate-scale quantum (NISQ) challenges regarding noise, as explained in [2] and elsewhere? The core issue is to what extent Willow tackles the limitations of current NISQ devices with respect to noise sources and scalability challenges.

The following table compares Google's simplified noise model (Formula 77) [3] with more nuanced analyses presented in other research papers, highlighting the different perspectives on noise in quantum systems:

| Noise Source | Google's Model (Formula 77) | Kalai et al. (2023, 2024) [7] [8] | Hirota (2021) [11] | Börner et al. (2023) [12] | Willow (2024) [1] |
|---|---|---|---|---|---|
| Gate Errors | Assumed independent | Non-stationary behavior, Fourier analysis discrepancies | Nonlinear scaling with qubit number | Not directly addressed | Partially addressed by below-threshold error correction, but correlated errors persist |
| Readout Errors | Included | Asymmetric discrepancies with model | Not explicitly modeled | Not directly addressed | Included in error budget analysis |
| Leakage Errors | Not explicitly modeled | Discussed in context of correlated errors | Modeled as part of CZ error budget | Included in simulations | Mitigated with DQLR, but still contributes to the error budget |
| Crosstalk | Not explicitly modeled | Observed in experiments | Not explicitly modeled | Studied in classical limit | Included in simulations |
| Correlated Errors | Initially assumed independent, later attempts to incorporate correlations | Observed in repetition codes, causing error floor | Emergent from collective decoherence (superradiance) | Not directly addressed | Observed in repetition codes, causing a new error floor |
| Classical Chaos | Not addressed | Not addressed | Not addressed | Observed in classical limit of transmon systems | Indirectly addressed through frequency engineering, but scalability of control is unclear |

Epilogue

While Willow addresses some aspects of noise, it doesn't fully resolve the challenges posed by correlated errors, leakage, crosstalk, and the potential for chaotic dynamics.

| Challenge | Addressed by Willow? | Extent of Problem Solved | Rationale |
|---|---|---|---|
| Noise Sensitivity (Kalai) | Partially | Significant progress, but error floor remains | Below-threshold error correction demonstrates noise suppression, but correlated errors persist, limiting the achievable fidelity. |
| Nonlinear Errors (Hirota) | Indirectly | Potentially mitigated by improved coherence and error correction, but further investigation needed | Not explicitly addressed, but improved hardware and error correction could reduce the impact. The extent of this mitigation remains to be quantified. |
| Reachability Deficits (Akshay et al.) | Indirectly | Improved qubit fidelity and lifetime could enhance VQA performance | Not directly addressed, but error correction could improve VQA reach by enabling more complex circuits. |
| Classical Chaos (Börner et al.) | Partially | Stable performance observed, but scalability of control unclear | Not explicitly addressed, but stable operation suggests some control over chaotic behavior. The long-term stability and scalability of this control remain open questions. |

There remains a significant gap between current capabilities and the requirements for fault-tolerant quantum computation.

| Metric | Google's Sycamore (2019) | Google's Willow (2024) | Required for Fault Tolerance (Estimate) |
|---|---|---|---|
| Qubit Count | 53 | 105 (maximum) | ~1500 (for a distance-27 surface code to achieve a 10⁻⁶ logical error rate) |
| Code Distance | Not directly comparable | 7 (maximum) | 27 (for a 10⁻⁶ logical error rate) |
| Error Suppression Factor (Λ) | ~1 (near threshold) | 2.14 ± 0.02 | ~10 (for practical fault-tolerant computation) |
| Logical Error Rate | ~0.002 | 0.00143 ± 0.00003 | 10⁻⁶ or lower |
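As a rough consistency check on these fault-tolerance targets, a minimal sketch assuming Λ stays constant as the code grows, an assumption the scaling concerns discussed earlier call into question:

```python
# Extrapolate the d=7 logical error rate to d=27: each increase of
# the code distance by 2 divides the error rate by Lambda once, so
# going from d=7 to d=27 takes (27 - 7) / 2 = 10 such steps.
eps_7 = 0.00143
lam = 2.14
steps = (27 - 7) // 2
eps_27 = eps_7 / lam**steps
print(f"projected eps at d=27 ~ {eps_27:.1e}")  # -> ~ 7.1e-07
```

Under that (optimistic) constant-Λ assumption, distance 27 lands near the 10⁻⁶ target; any degradation of Λ with system size would push the required distance and qubit count higher.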

Appendix

A. Willow Specifications

  • Processors: Experiments were conducted on two Willow processors: a 72-qubit and a 105-qubit processor, with both showing improved operational fidelities compared to previous Sycamore processors.

  • Coherence Times: T1 (relaxation time) and T2 (dephasing time) have been significantly enhanced. This improvement is attributed to changes in device architecture like increased capacitor size and better control of noise sources.

  • 72-qubit Processor:

    • Average T1: around 75-85 μs
    • Average T2: around 33-106 μs
  • 105-qubit Processor:

    • Average T1: around 37-86 μs
    • Average T2: around 17-78 μs
  • Gate Performance:

    • Single-qubit gates on the 105-qubit processor have a duration of 25 ns, while on the 72-qubit processor they range from 18 to 35 ns.

    • Two-qubit gates (CZ gates) are performed in 37 ns on the 72-qubit and 42 ns on the 105-qubit processor.

    • The median CZ Pauli error was reported at 2.6 × 10⁻³, with the majority of errors being incoherent.

Improvements in Control Techniques:

  • Frequency Optimization: Utilized the Snake optimizer for better coherence, focusing on error correction circuits and using intermediate-dimensional optimization.

  • Microwave Crosstalk: A protocol was developed to measure and compensate for microwave crosstalk, which is crucial for scaling up to larger quantum circuits.

Error Correction and Decoding:

  • Decoders: Various decoders were employed, including neural network, Libra, Harmony, correlated matching, and real-time decoding systems. Each decoder has its own performance metrics in terms of logical error rates per cycle.

  • Error Budget: A detailed analysis shows that incoherent errors dominate (80%), with coherent errors and leakage each contributing about 10% to the error budget.

Additional Insights:

  • Surface Code Simulation: The research discusses simulations for surface code performance at large code distances.

  • Uncertainty Analysis: Logical performance metrics and error per cycle analysis provide a comprehensive view of the system's reliability and areas for improvement.

References

[1]: Google Quantum AI and Collaborators. Quantum error correction below the surface code threshold. Nature (2024). https://doi.org/10.1038/s41586-024-08449-y (accepted).

[2]: Bilar, D. (2024), “Noise and Reachability Deficits: Challenging Quantum Supremacy Claims”. Preprint. DOI: 10.13140/RG.2.2.26243.520023.

[3]: Arute, F., Arya, K., Babbush, R., Bacon, D., Bardin, J. C., Barends, R., ... & Martinis, J. M. (2019). Quantum supremacy using a programmable superconducting processor. Nature, 574(7779), 505–510.

[7]: Kalai, G., Rinott, Y., & Shoham, T. (2023). Questions and concerns about Google’s quantum supremacy claim. arXiv preprint arXiv:2305.01064.

[8]: Kalai, G., Rinott, Y., & Shoham, T. (2024). Quantum Advantage Demonstrations via Random Circuit Sampling: Fourier Expansion and Statistics. arXiv preprint arXiv:2404.00935.

[9]: AbuGhanem, M., & Eleuch, H. (2024). Full quantum tomography study of Google’s Sycamore gate on IBM’s quantum computers. The European Physical Journal Quantum Technology, 10(1), 36.

[11]: Hirota, O. (2021). Introduction to semi-classical analysis for digital errors of qubit in quantum processor. Entropy, 23(12), 1577.

[12]: Börner, S.-D., Berke, C., DiVincenzo, D. P., Trebst, S., & Altland, A. (2023). Classical chaos in quantum computers. arXiv preprint arXiv:2304.14435.