openquantumhardware/qick

Start time of pulses across multiple DACs misaligned

Hi all,

I have been using the RFSoC 4x2 to generate I and Q inputs for an external IQ mixer using DAC_A and DAC_B on the board. I have written a few calibration scripts to compensate for the frequency-dependent phase offset of the DACs, and they have been working well for 'periodic' outputs over a range of frequencies and phase offsets. However, I am having an issue when producing 'oneshot' pulses (the issue also applies at the beginning of 'periodic' pulses but is less detrimental there). The pulses that I am producing are ~10ns long.

There is significant jitter in the start time of pulses. The start times of the outputs of DAC_A and DAC_B are misaligned by up to 1.6ns, in both leading and lagging directions. This variation exists across each pair of pulse instructions (pulsing both DACs using prog.pulse()) within the same program. My suspicion is that this is due to the phase offset between the DAC fabric clocks (both running at f=614.4 MHz => T=1.63ns).
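As a quick check of the numbers above, here is a minimal sketch that converts the quoted fabric clock frequency into a period. The helper name `period_ns` is made up for this illustration and is not part of QICK.

```python
# Fabric clock frequency quoted in the post, in MHz.
FABRIC_MHZ = 614.4


def period_ns(freq_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / freq_mhz


fabric_period = period_ns(FABRIC_MHZ)
# The observed worst-case misalignment (~1.6 ns) is about one fabric period.
print(f"fabric period = {fabric_period:.3f} ns")
```

The result is close to the largest misalignment seen on the scope, which is what motivates the suspicion about the fabric clock phases.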

Has anyone had any success in overcoming this behaviour or is it a hardware limitation of the board since each DAC tile is independent?

Included below are a short demo script for reproducing the issue, the corresponding qick asm, and an oscilloscope capture to demonstrate the start time offset.

Thanks
Daniel

prog = QickProgram(soccfg)
for ch in range(2):
    prog.declare_gen(ch=ch, nqz=1)
    prog.set_pulse_registers(ch = ch,
                             gain = 20000,
                             freq = prog.freq2reg(100, gen_ch=ch),
                             phase = prog.deg2reg(0, gen_ch=ch),
                             style = "const",
                             length = prog.us2cycles(1, gen_ch=ch),
                            )
prog.synci(200)
for ch in range(2):
    prog.pulse(ch=ch, t=0)
prog.config_all(soc)
soc.tproc.start()
print(prog)

// Program

  regwi 0, $22, 43690667;                       //freq = 43690667
  regwi 0, $23, 0;                              //phase = 0
  regwi 0, $25, 20000;                          //gain = 20000
  regwi 0, $26, 590438;                         //phrst| stdysel | mode | | outsel = 0b01001 | length = 614 
  regwi 1, $22, 43690667;                       //freq = 43690667
  regwi 1, $23, 0;                              //phase = 0
  regwi 1, $25, 20000;                          //gain = 20000
  regwi 1, $26, 590438;                         //phrst| stdysel | mode | | outsel = 0b01001 | length = 614 
  synci 200;
  regwi 0, $27, 0;                              //t = 0
  set 0, 0, $22, $23, $0, $25, $26, $27;        //ch = 0, pulse @t = $27
  regwi 1, $27, 0;                              //t = 0
  set 1, 1, $22, $23, $0, $25, $26, $27;        //ch = 1, pulse @t = $27
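
The value written to $26 can be sanity-checked against its asm comment. The sketch below assumes, based only on that comment, that the low 16 bits hold the length in fabric cycles and the bits above hold the phrst/stdysel/mode/outsel flags; `split_mode_word` is a hypothetical helper, not a QICK function.

```python
def split_mode_word(value: int) -> tuple[int, int]:
    """Split the $26 word into (upper control bits, length in fabric cycles).

    Assumes the low 16 bits hold the length, as the asm comment suggests.
    """
    return value >> 16, value & 0xFFFF


control, length = split_mode_word(590438)
print(f"control = {control:#07b}, length = {length}")

# A 1 us pulse at the 614.4 MHz fabric clock is 614.4 cycles, so 614 once rounded.
print(round(1.0 * 614.4))
```

Both generators get the same word, so any start-time difference is not coming from the pulse registers themselves.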

[Oscilloscope capture: SDS00048]

meeg commented

I think your understanding of the issue is fundamentally correct, here are some thoughts:

  1. The standard 4x2 firmware has independent 614.4 MHz fabric clocks for the two generators, and a 409.6 MHz tProc clock. I suspect that the big problem for you is the jitter from the clock domain crossings (CDCs) from 409.6 to 614.4 - because the two 614.4 clocks have different phases, when your tProc sends simultaneous pulses to the two generators they pick up different random delays from their respective CDCs.
  2. If all three clocks were 409.6 MHz, your CDCs still exist but do not introduce jitter, just a phase shift. The phase shift between the generators would be stable until you reload the firmware, and you could calibrate it out. You might try https://s3df.slac.stanford.edu/people/meeg/qick/fw/2023-06-28_4x2_commonclk/ or https://s3df.slac.stanford.edu/people/meeg/qick/fw/2023-09-28_4x2_commonclk_4pt9/.
  3. If the two tProc outputs cross over into a common 614.4 clock, and then cross over to their respective generator clocks, you have a random but matching jitter at the first CDC and a fixed shift at the second CDC, so again this should be usable. This is the structure in the current standard ZCU111 and ZCU216 firmwares, but unfortunately we didn't do this in the 4x2; it is not a major change but we don't have time right now to make new firmware.
  4. Phase alignment across DAC tiles is possible in principle, using what Xilinx calls multi-tile sync, but we have not implemented this in QICK.
  5. Are you sure you need to use an IQ mixer? QICK is meant to work well with cheap double-sideband mixers or with no mixers at all. I would argue that IQ mixers are the best (or only) solution when you need to upconvert a slow IF, but if QICK can generate a fast IF or generate your RF signal directly, why deal with IQ mixers? Of course you may have specific reasons for needing an IQ mixer, I am just reciting the party line from the last two QICK papers. So I am curious what your goal is here that motivates your current direction.
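
To make points 1 to 3 concrete, here is a rough, idealized sketch of the three clocking schemes. It assumes a trigger is simply latched on the first destination-clock edge at or after it arrives, ignores metastability and synchronizer depth, and uses arbitrary made-up clock phases. Time is counted in units of one 1228.8 MHz period, so a 614.4 MHz period is 2 units and a 409.6 MHz period is 3 units. None of the names below come from QICK.

```python
from fractions import Fraction
from math import ceil

UNIT_NS = 1e3 / 1228.8  # one time unit in ns (1228.8 MHz = 2 x 614.4 = 3 x 409.6)


def latch(t, phase, period):
    """Time of the first destination-clock edge at or after t."""
    return phase + period * ceil((t - phase) / period)


def offsets(tproc_period, gen_period, n=12, via_common=False):
    """Set of start-time offsets (gen B minus gen A) over n tProc triggers."""
    # Arbitrary, hypothetical clock phases chosen only for illustration.
    t0, phase_a, phase_b, phase_c = (
        Fraction(1, 4), Fraction(0), Fraction(1, 2), Fraction(1, 3))
    seen = set()
    for k in range(n):
        t = t0 + k * tproc_period
        if via_common:
            # Scheme (3): cross into one shared generator-rate clock first.
            t = latch(t, phase_c, gen_period)
        seen.add(latch(t, phase_b, gen_period) - latch(t, phase_a, gen_period))
    return seen


for label, result in [
    ("(1) 409.6 -> two 614.4 clocks", offsets(3, 2)),
    ("(2) everything at 409.6", offsets(3, 3)),
    ("(3) via common 614.4 clock", offsets(3, 2, via_common=True)),
]:
    print(label, sorted(float(x) * UNIT_NS for x in result))
```

In this toy model, scheme (1) produces two different offsets that differ by one 614.4 MHz period, while schemes (2) and (3) each produce a single fixed offset that could be calibrated out. Real hardware adds logic-signal jitter on top of this, which the model does not capture.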

Thanks for your replies! Regarding (5), we were using an IQ LO of 14 GHz, didn't want to have to use additional filters, and already had an IQ mixer to hand. However, given the clock domain crossing jitter and overall less finicky setup, the method of generation that you propose is probably the most sensible.

Nevertheless, I gave the 409.6MHz 4x2 commonclk (2023-06-28) firmware a go this morning and am still unexpectedly encountering the jitter. Not sure why this is the case given that the phase relationship between the DAC fabric and tproc clocks should now be fixed (if I have understood the solution correctly). Any ideas as to why this may be?

While I will most likely implement the single-channel setup, it would still be ideal to have a jitter-free relationship between tiles for synchronising pulse offsets across multiple DAC sources for other experimental purposes.

Included below are captures from a style='arb' program in which both DACs have the same sinusoidal envelope (to more clearly compare the start time of pulses) demonstrating that the difference in start time is not consistent across runs.

QICK configuration:
[...]
	Firmware timestamp: Wed Jun 28 14:48:55 2023

	Global clocks (MHz): tProcessor 409.600, RF reference 409.600

	2 signal generator channels:
	0:	axis_signal_gen_v6 - envelope memory 65536 samples (10.000 us)
		fs=6553.600 MHz, fabric=409.600 MHz, 32-bit DDS, range=6553.600 MHz
		DAC tile 0, blk 0 is DAC_B
	1:	axis_signal_gen_v6 - envelope memory 65536 samples (10.000 us)
		fs=6553.600 MHz, fabric=409.600 MHz, 32-bit DDS, range=6553.600 MHz
		DAC tile 2, blk 0 is DAC_A
[...]

[Oscilloscope capture: SDS00053]
[Oscilloscope capture: SDS00052]

meeg commented

You're not reloading the firmware between runs, right? Meaning - you initialize the QickSoc once, then run your little program multiple times?

Yes, what you see is not what I expected. To be fair, the "textbook" answer is that (3) is the correct way to do this, maybe (2) doesn't work as well as I thought.

The firmware is not being reloaded between runs - I am using Pyro4 so I load the firmware once when the board is initialised and run all other code remotely.

Setting all of the clocks to 409.6 MHz as in (2) seems to be 'better' than before, in that the jitter is now an integer number of clock cycles. However, whether or not the time offset occurs still appears to be random. The offset is 1 cycle in a fixed direction for between 80% and 95% of runs and 0 cycles for the remainder (the ratio changes each time the firmware is reloaded).

I will attempt modifying the firmware to implement (3) myself. I believe it would be worth opening a follow-up issue to include (3) in the widely-distributed firmware at some point as it may be something that others would also find useful for precise alignment of pulses across multiple channels on the 4x2.

meeg commented

That's useful information and makes sense - there is always some jitter on the logic signals, and while the firmware design guarantees that this jitter doesn't cross clock edges within a clock domain, the jitter can cross clock edges at a CDC and get latched into the new clock. You would expect that most of the time, the time offset is what you would have in the absence of jitter; sometimes the jitter will be large enough to push you over the edge, probability depending on how big the jitter is and how close you are to the edge. I had guessed that the jitter would be small enough that the probability would be 0, but clearly I was wrong.
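
The dependence on jitter size and distance to the edge can be sketched with a simple model. This assumes the jitter is Gaussian, which is an assumption made here for illustration and not something measured in this thread; the function name and the picosecond values are hypothetical.

```python
from math import erfc, sqrt


def crossing_probability(margin_ps: float, jitter_rms_ps: float) -> float:
    """P(Gaussian jitter exceeds the timing margin to the nearest clock edge)."""
    return 0.5 * erfc(margin_ps / (jitter_rms_ps * sqrt(2)))


# Hypothetical numbers: 10 ps RMS jitter and a range of margins to the edge.
for margin in (0.0, 8.4, 16.45, 50.0):
    p = crossing_probability(margin, 10.0)
    print(f"margin = {margin:5.2f} ps -> P(cross) = {p:.4f}")
```

Under this model, a minority outcome seen in 5% to 20% of runs would correspond to a margin of roughly 0.8 to 1.6 times the RMS jitter, and a margin that changes with each firmware reload would change the ratio.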

The 2023-09-28 firmware might be better behaved because the clocks are slower - 409.6 MHz is near the maximum tProc clock speed. Since you need to mix your signals up anyway, I suppose you should be OK with the lower DAC sampling frequencies.

We are not going to implement (3) in the current standard 4x2 firmware because we're planning to redo all of the standard firmwares soon. At that point we will make sure to implement (3).

Taking a step back, again - it sounds like you are eventually going to want multiple RF outputs, surely you are not planning to use the 4x2 long-term? If you are playing with the 4x2 as a test drive before you decide whether to move to a 216, you should consider that the 216 has 4 DACs per tile and you may be spending a lot of time on a problem that you won't need to worry about later.

Sure, thanks for the explanation - that all makes sense. With the 2023-09-28 firmware there is no jitter present, and 20ns pulses are looking much better aligned (after an 800ps time offset was also applied to the 'arb' envelope). The lower clock speed is certainly fine for now; however, the redone firmware would be great for using the board at its maximum clock speeds.
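
For anyone wanting to try something similar, here is one possible way to apply a fixed time offset to an 'arb' envelope before loading it, not necessarily how it was done here. It delays the envelope by the nearest whole number of DAC samples and keeps the length unchanged. The function name is made up, and the sample rate in the example is the 6553.6 MHz from the 2023-06-28 config printout above; the 2023-09-28 firmware may well use a different value.

```python
def delay_envelope(envelope, delay_s, fs_hz):
    """Delay an envelope by the nearest whole number of DAC samples.

    Zeros are prepended and the tail is truncated so the length is unchanged,
    so the envelope should end with enough zero padding to absorb the shift.
    """
    n = round(delay_s * fs_hz)
    if n <= 0:
        return list(envelope)
    if n >= len(envelope):
        return [0] * len(envelope)
    return [0] * n + list(envelope[:len(envelope) - n])


# Assumed sample rate (from the 2023-06-28 config above); adjust for your firmware.
FS_HZ = 6553.6e6
shifted = delay_envelope([1, 2, 3, 4, 5, 6, 7, 8], 800e-12, FS_HZ)
print(shifted)
```

The shift is quantized to one DAC sample, so the residual error is at most half a sample period.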

I think that we are going to stick with the 4x2 as we will never need more than 2 RF outputs so the extra cost of the 216 is not justifiable. The original intent of using both outputs was for external IQ mixing, however, we are now using a single channel. Jitter-free pulsing is still important for consistent time-domain alignment with other equipment during experiments.

[Oscilloscope capture: SDS00057]

meeg commented

My thinking was - if you're using a single RF output, what other equipment could you be aligning with? But OK, never mind.