NVIDIA/cuda-quantum

[RFC] Unitary Synthesis

khalatepradnya opened this issue · 4 comments

Describe the feature

Problem

Given an arbitrary user-provided unitary matrix, synthesize it into a sequence of quantum gates.
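To make the problem concrete for the smallest case, here is a minimal sketch of single-qubit synthesis using the textbook ZYZ Euler decomposition, U = e^{iα} Rz(β) Ry(γ) Rz(δ). It is not the CUDA-Q implementation; the function names `zyz_decompose` and `zyz_reconstruct` are made up for illustration, and multi-qubit unitaries need a different technique.

```python
import numpy as np

def rz(theta):
    # Rotation about Z: diag(e^{-i theta/2}, e^{+i theta/2})
    return np.array([[np.exp(-1j * theta / 2), 0],
                     [0, np.exp(1j * theta / 2)]])

def ry(theta):
    # Rotation about Y
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def zyz_decompose(u):
    """Return (alpha, beta, gamma, delta) with u = e^{i alpha} Rz(beta) Ry(gamma) Rz(delta)."""
    u = np.asarray(u, dtype=complex)
    # Strip the global phase so the remainder has determinant 1.
    alpha = np.angle(np.linalg.det(u)) / 2
    v = u * np.exp(-1j * alpha)
    # |v[0,0]| = cos(gamma/2), |v[1,0]| = sin(gamma/2)
    gamma = 2 * np.arctan2(abs(v[1, 0]), abs(v[0, 0]))
    # arg v[1,1] = (beta + delta)/2, arg v[1,0] = (beta - delta)/2
    beta = np.angle(v[1, 1]) + np.angle(v[1, 0])
    delta = np.angle(v[1, 1]) - np.angle(v[1, 0])
    return alpha, beta, gamma, delta

def zyz_reconstruct(alpha, beta, gamma, delta):
    # Multiply the gate sequence back into a matrix to check the result.
    return np.exp(1j * alpha) * rz(beta) @ ry(gamma) @ rz(delta)

custom_h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
angles = zyz_decompose(custom_h)
print(np.allclose(zyz_reconstruct(*angles), custom_h))
```

Multiplying the three rotations back together and comparing against the input is the same kind of check a tolerance-based acceptance test would perform.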

Expectations

  • User provides an arbitrary unitary matrix as a custom quantum operation.
  • The custom operation can be used as a regular CUDA-Q supported quantum operation.
    • Q: Broadcast (same operation on multiple qubits): Out of scope
  • The allowed set of quantum gates for synthesis depends on the backend target.
    • Q: Allow user to specify set of allowed gates: Out of scope
  • CUDA-Q throws an error if a unitary cannot be synthesized within reasonable limits.
    • "Reasonable" covers a time limit (timeout), a gate count limit (upper threshold), and how close the synthesized circuit must be to the input unitary (tolerance).

User API

  • Python
import cudaq
import numpy as np

custom_h = cudaq.register_operation(1. / np.sqrt(2.) *  np.array([[1, 1], [1, -1]])) 
custom_x = cudaq.register_operation(np.array([[0, 1], [1, 0]])) 

@cudaq.kernel 
def bell(): 
  qubits = cudaq.qvector(2) 
  custom_h(qubits[0]) 
  custom_x.ctrl(qubits[0], qubits[1]) 

counts = cudaq.sample(bell) 
counts.dump()

  • C++
#include <cmath>
#include <cudaq.h>
#include <iostream>

// Macro to specify the custom unitary operation
cudaq_register_op("custom_h",
                  {{M_SQRT1_2, M_SQRT1_2}, {M_SQRT1_2, -M_SQRT1_2}});
cudaq_register_op("custom_x", {{0, 1}, {1, 0}});

void custom_operation() __qpu__ {
  cudaq::qvector qubits(2);
  custom_h(qubits[0]);
  custom_x.ctrl(qubits[0], qubits[1]);
}

int main() {
  auto result = cudaq::sample(custom_operation);
  std::cout << result.most_probable() << '\n';
  return 0;
}
  • The user must provide a valid unitary matrix (CUDA-Q will not check or enforce this requirement).
  • Ordering: The user-provided matrix must be in row-major format.
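Since CUDA-Q will not validate the input, a user-side check along these lines could be run before registering an operation. This is a sketch using NumPy only; `is_unitary` and `row_major_flat` are hypothetical helpers, not part of the proposed API.

```python
import numpy as np

def is_unitary(matrix, atol=1e-10):
    """True if the matrix is square and U^dagger U equals the identity."""
    m = np.asarray(matrix, dtype=complex)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    return bool(np.allclose(m.conj().T @ m, np.eye(m.shape[0]), atol=atol))

def row_major_flat(matrix):
    """Flatten row by row: [[a, b], [c, d]] -> [a, b, c, d]."""
    return np.asarray(matrix, dtype=complex).ravel(order="C")

# A non-symmetric unitary, so row-major and column-major orderings differ.
u = np.array([[0, 1], [1j, 0]])
print(is_unitary(u))
print(row_major_flat(u))                    # row-major: 0, 1, 1j, 0
print(np.asarray(u).ravel(order="F"))       # column-major, for comparison
```

For symmetric matrices such as the custom_h and custom_x examples the ordering makes no difference, which is why a non-symmetric matrix is used here.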

Constraints

  • Size of the unitary matrix: limited to 8 qubits, i.e. at most 2^8 x 2^8 = 256 x 256.
  • The custom operation must be defined outside of a quantum kernel (e.g., a call to register_operation cannot be inside a function decorated with @cudaq.kernel).
  • The tolerance for the synthesized circuit and the gate count limit will use default values determined by CUDA-Q.
  • The custom operation definition is restricted to qubits (cudaq::qudit<2>).
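The size and qubit-only constraints above translate into a simple shape check: the matrix must be square, its dimension must be a power of two, and it may not exceed 256 x 256. A minimal sketch follows; `qubit_count` and `MAX_QUBITS` are illustrative names, not CUDA-Q identifiers.

```python
import numpy as np

MAX_QUBITS = 8  # limit stated in the constraints above

def qubit_count(matrix):
    """Return the number of qubits a matrix acts on, or raise ValueError."""
    m = np.asarray(matrix)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("matrix must be square")
    dim = m.shape[0]
    # Qubit operations have dimension 2^n; dim & (dim - 1) is 0 only for powers of two.
    if dim < 2 or dim & (dim - 1):
        raise ValueError("dimension must be a power of two (qubits only)")
    n = dim.bit_length() - 1
    if n > MAX_QUBITS:
        raise ValueError(f"{n} qubits exceeds the limit of {MAX_QUBITS}")
    return n

print(qubit_count(np.eye(2)))    # single-qubit operation
print(qubit_count(np.eye(256)))  # largest allowed size
```

A 3 x 3 matrix (a qutrit operation) or a 512 x 512 matrix would be rejected by this check.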

Workflow

  • In simulation, no synthesis will happen.
  • The compiler will automatically synthesize the matrix when targeting hardware.
  • Explicit synthesis mechanism (API or command-line argument): out of scope for the first iteration.
  • The NVQC target behaves the same as when running locally.
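To illustrate why simulation needs no synthesis, the sketch below applies the custom matrices from the Bell example directly to a state vector with NumPy. This is not how the CUDA-Q simulators are implemented; the helper `controlled` and the choice of qubit 0 as the most significant bit are assumptions made for the example.

```python
import numpy as np

custom_h = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
custom_x = np.array([[0, 1], [1, 0]], dtype=complex)

def controlled(u):
    """Controlled-U with the control as the most significant qubit."""
    dim = u.shape[0]
    out = np.eye(2 * dim, dtype=complex)
    out[dim:, dim:] = u  # apply u only when the control qubit is |1>
    return out

# Start in |00>, with qubit 0 as the most significant bit (assumed ordering).
state = np.zeros(4, dtype=complex)
state[0] = 1.0

# custom_h on qubit 0, then custom_x controlled on qubit 0 targeting qubit 1.
state = np.kron(custom_h, np.eye(2)) @ state
state = controlled(custom_x) @ state

probabilities = np.abs(state) ** 2
for label, p in zip(["00", "01", "10", "11"], probabilities):
    print(label, round(float(p), 3))
```

The matrices are used as-is, with no decomposition into a native gate set; only the outcomes 00 and 11 carry probability, as expected for a Bell state.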

Work items / TO-DOs

  • Support in simulation for Python
    • Kernel mode
    • Builder mode
    • State vector simulators
    • Tensornet simulators
  • Support in simulation for C++
    • Library mode
    • MLIR mode
  • Add generic synthesis for emulation
  • Error handling: Gracefully handle user errors, feature constraints and runtime errors
  • Comprehensive documentation and useful example(s)
  • Support synthesis per hardware backend