Add ability to enable/disable individual phases on Renesas power controllers
This issue tracks the logic analyzer dumps of the traffic PowerNavigator sends to the controller when I enable and disable phases through the GUI.
A DSLogic U3Pro16 was used to capture all of these. I included a csv decode of the transactions and the raw capture in each of the dumps below. You can get pre-compiled versions of the Dream Source Lab GUI for Windows, Linux, and OS X here
If you are so inclined, the source code for the GUI is theoretically here
All captures done on Gimlet: 0XV1:9130000019:006:BRM44220021
Folder containing all dumps is here
- Starting with a default config, disable Phase 19 of U350 (VDD_VCORE controller):
0x5A_VDD_VCORE_controller_phase19_disable
- Disable Phase 18:
0x5A_VDD_VCORE_controller_phase18_disable
- Disable Phase 17:
0x5A_VDD_VCORE_controller_phase17_disable
- Disable Phase 16:
0x5A_VDD_VCORE_controller_phase16_disable
- Disable Phase 15:
0x5A_VDD_VCORE_controller_phase15_disable
- Disable Phase 14:
0x5A_VDD_VCORE_controller_phase14_disable
- Disable Phase 13:
0x5A_VDD_VCORE_controller_phase13_disable
Now that everything but Phase 12 is disabled, enable the VDD_VCORE regulator in Fixed PWM Mode and then disable it:
0x5A_VDD_VCORE_fixed_pwm_mode_enable
0x5A_VDD_VCORE_fixed_pwm_mode_disable
Here I disable phase 12, which hopefully doesn't do anything different since there aren't any phases enabled after this is done:
0x5A_VDD_VCORE_controller_phase12_disable
And here are the phase enables for phases 12-19:
0x5A_VDD_VCORE_controller_phase12_enable
0x5A_VDD_VCORE_controller_phase13_enable
0x5A_VDD_VCORE_controller_phase14_enable
0x5A_VDD_VCORE_controller_phase15_enable
0x5A_VDD_VCORE_controller_phase16_enable
0x5A_VDD_VCORE_controller_phase17_enable
0x5A_VDD_VCORE_controller_phase18_enable
0x5A_VDD_VCORE_controller_phase19_enable
Here is the same thing for the other rail on this controller (VDD_MEM_ABCD).
0x5A_VDD_MEM_ABCD_controller_phase3_disable
0x5A_VDD_MEM_ABCD_controller_phase2_disable
0x5A_VDD_MEM_ABCD_controller_phase1_disable
0x5A_VDD_MEM_ABCD_controller_phase0_disable
0x5A_VDD_MEM_ABCD_controller_phase3_enable
0x5A_VDD_MEM_ABCD_controller_phase2_enable
0x5A_VDD_MEM_ABCD_controller_phase1_enable
0x5A_VDD_MEM_ABCD_controller_phase0_enable
0x5A_VDD_MEM_ABCD_fixed_pwm_mode_enable
0x5A_VDD_MEM_ABCD_fixed_pwm_mode_disable
Since there is only one phase per rail on the ISL68224, only the transactions for Enable and Disable of Fixed PWM Mode are captured.
0x5C_VPP_ABCD_fixed_pwm_mode_enable
0x5C_VPP_ABCD_fixed_pwm_mode_disable
0x5C_VPP_EFGH_fixed_pwm_mode_enable
0x5C_VPP_EFGH_fixed_pwm_mode_disable
0x5C_V1P8_SP3_fixed_pwm_mode_enable
0x5C_V1P8_SP3_fixed_pwm_mode_disable
I did some poking at the RAA CSVs, with mixed results.
Here's a Python script that goes from CSV to DMA writes:
Here's a selection of the most plausible registers:
disable18
4:E9C0 <= 000C0FF0
12:E9C1 <= 000C0FFF
21:E9C2 <= 0003F00F
25:E905 <= 00000F0A
disable17
12:E905 <= 00000F02
17:E9C2 <= 0001F00F
20:E9C1 <= 000E0FFF
22:E9C0 <= 000E0FF0
disable16
10:E9C1 <= 000F0FFF
16:E905 <= 00000F00
17:E9C2 <= 0000F00F
25:E9C0 <= 000F0FF0
disable15
8:E9C2 <= 0000700F
21:E9C0 <= 000F8FF0
23:E904 <= 2A0000AA
24:E9C1 <= 000F8FFF
25:E905 <= 00000F00
E904-E905 appears to have a 2-bit field for each phase; I see one bit being cleared from each two-bit region when a phase is disabled, extending into E904 for phase 15.
E9C0 appears to have 1 bit set per phase disabled
E9C1 is the same as E9C0, except there's another F in the lowest nibble?
E9C2 has one bit cleared each time we disable a phase.
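The bit-level observations above can be checked mechanically by XOR-ing successive values of each register. This is a minimal sketch using only the values quoted in this comment; the helper names `bit_changes` and `history` are illustrative and not part of any tool.

```python
# Values copied from the disable18 -> disable17 -> disable16 -> disable15
# listings above, in that order.
WRITES = {
    0xE9C0: [0x000C0FF0, 0x000E0FF0, 0x000F0FF0, 0x000F8FF0],
    0xE9C1: [0x000C0FFF, 0x000E0FFF, 0x000F0FFF, 0x000F8FFF],
    0xE9C2: [0x0003F00F, 0x0001F00F, 0x0000F00F, 0x0000700F],
}


def bit_changes(prev, curr):
    """Return (bits newly set, bits newly cleared) going from prev to curr."""
    diff = prev ^ curr
    newly_set = [b for b in range(32) if diff >> b & 1 and curr >> b & 1]
    newly_cleared = [b for b in range(32) if diff >> b & 1 and prev >> b & 1]
    return newly_set, newly_cleared


def history(addr):
    """Bit changes between each consecutive pair of captures for one register."""
    vals = WRITES[addr]
    return [bit_changes(a, b) for a, b in zip(vals, vals[1:])]


for addr in WRITES:
    print(f"{addr:04X}", history(addr))
```

In these four samples the changed bit index matches the phase being disabled at each step (17, 16, 15), E9C1 equals E9C0 | 0xF, and E9C2 equals the 20-bit complement of E9C0. That may not hold beyond these captures.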
There's a bunch of stuff going on with E9D* that I don't understand:
disable18
2:E9D4 <= 0001B015
3:E9D2 <= 0000C006
5:E9DA <= 00000000
7:E9DB <= 00000000
9:E9D6 <= 0002A024
10:E9D9 <= 0001B015
11:E9DC <= 00000000
13:E9D3 <= 0001400D
15:E9D7 <= 0000C006
16:E9DD <= 00000000
17:E9DF <= 00000000
19:E9D5 <= 0002301C
20:E9DE <= 00000000
24:E9D8 <= 0001400D
disable17
1:E9D4 <= 0001B015
3:E9DD <= 00000000
5:E9D2 <= 0000C006
6:E9D9 <= 00000000
7:E9DB <= 00000000
9:E9D3 <= 0001400D
11:E9DF <= 00000000
13:E9DC <= 00000000
15:E9DA <= 00000000
16:E9DE <= 00000000
19:E9D8 <= 0001B015
21:E9D7 <= 0001400D
24:E9D6 <= 0000C006
26:E9D5 <= 0002301C
disable16
1:E9D3 <= 0001400D
3:E9D4 <= 0001B015
5:E9DA <= 00000000
7:E9D5 <= 0000C006
8:E9D6 <= 0001400D
11:E9DD <= 00000000
12:E9DC <= 00000000
13:E9D2 <= 0000C006
14:E9DB <= 00000000
15:E9DF <= 00000000
20:E9D9 <= 00000000
21:E9D7 <= 0001B015
22:E9DE <= 00000000
26:E9D8 <= 00000000
disable15
2:E9DC <= 00000000
4:E9DE <= 00000000
5:E9D9 <= 00000000
6:E9D4 <= 0000C006
11:E9D7 <= 00000000
12:E9D6 <= 0001B015
13:E9D3 <= 0001400D
14:E9DD <= 00000000
16:E9DB <= 00000000
18:E9DF <= 00000000
19:E9DA <= 00000000
20:E9D2 <= 0000C006
22:E9D5 <= 0001400D
27:E9D8 <= 00000000
This almost looks like the E9D* registers are being used as scratch memory (?), e.g. 0001B015 is written to a bunch of them.
disable18
2:E9D4 <= 0001B015
10:E9D9 <= 0001B015
disable17
1:E9D4 <= 0001B015
19:E9D8 <= 0001B015
disable16
3:E9D4 <= 0001B015
21:E9D7 <= 0001B015
disable15
12:E9D6 <= 0001B015
In general, there seems to be no rhyme or reason to the order in which registers are written.
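Since the write order differs between captures, comparing runs is easier after keying the writes by register address. This is a small sketch that assumes the `seq:ADDR <= VALUE` line format used in the listings above; the function names are illustrative, and if a register is written more than once in a capture only the last value is kept.

```python
import re

# One write per line, e.g. "12:E905 <= 00000F02".
LINE = re.compile(r"^\s*(\d+):([0-9A-Fa-f]{4}) <= ([0-9A-Fa-f]{8})\s*$")


def parse_writes(text):
    """Parse 'seq:ADDR <= VALUE' lines into {addr: value}; skip other lines."""
    writes = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            _seq, addr, value = m.groups()
            writes[int(addr, 16)] = int(value, 16)  # last write wins
    return writes


def diff_runs(a, b):
    """Return {addr: (value_a, value_b)} for registers whose value differs."""
    return {
        addr: (a.get(addr), b.get(addr))
        for addr in sorted(a.keys() | b.keys())
        if a.get(addr) != b.get(addr)
    }


disable18 = parse_writes("""
4:E9C0 <= 000C0FF0
12:E9C1 <= 000C0FFF
21:E9C2 <= 0003F00F
25:E905 <= 00000F0A
""")
disable17 = parse_writes("""
12:E905 <= 00000F02
17:E9C2 <= 0001F00F
20:E9C1 <= 000E0FFF
22:E9C0 <= 000E0FF0
""")

for addr, (old, new) in diff_runs(disable18, disable17).items():
    print(f"{addr:04X}: {old:08X} -> {new:08X}")
```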
W.r.t. addresses E904 and E905: those are known to be associated with open pin detection, so I'd bet they are open pin detection fault mask registers rather than the actual detection bits. It could also be both, with a bit in one place telling the controller to pay attention to open pin detection and another bit showing the actual detection. But since the open-pin detection register address they gave us for Gen2 did not change when the open pin state changed, I lean towards it being a fault mask.
NOTE
One thing to note when thinking about how we test this in production: if there is an open pin (at least on the phase pins), no rail on the ISL68224 will even attempt to turn on. We will want to run the open pin detection before we try turning on individual phases; otherwise all 3 rails on the ISL68224 will fail because there will be no output voltage.
An annotated look at enabling and disabling fixed PWM mode, from
0x5A_VDD_MEM_ABCD_fixed_pwm_mode_enable
0x5A_VDD_MEM_ABCD_fixed_pwm_mode_disable
Fixed PWM enable:
02 10
ON_OFF_CONFIG <= undocumented option
EA0D <= 00296230
DMA operation
F0 40212010
LOOPCFG 10202140 (byte swap)
Bit 6: diode emulation enable
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
Bit 28: Enable diode emulation for PS0/1
F0 00212010
LOOPCFG 10202100
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
Bit 28: Enable diode emulation for PS0/1
F0 00212000
LOOPCFG 00202100
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
E9 0600
PEAK_OCUC_COUNT <= 0006
Number of consecutive switch cycles exceeding peak OC limit before fault = 6
Number of consecutive switch cycles exceeding peak UC limit before fault = 0
F0 00212000
LOOPCFG, same as above
EA5B <= 000007FE
DMA register set
36 0080
VIN_OFF <= 8000
Sets V_IN OFF = -327680 mV
E932 <= 0038C5E0
DMA register set
35 0080
VIN_ON <= 8000
Sets V_IN ON = -327680 mV
EA0D <= 00296231
DMA register set
02 00
ON_OFF_CONFIG <= 0
force enables output
EA0D <= 00296231
DMA register set
Fixed PWM disable:
02 10
ON_OFF_CONFIG <= undocumented option
EA0D <= 00296235
DMA operation
F0 00212000
LOOPCFG <= 00202100
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
F0 40212000
LOOPCFG <= 00202140
Bit 6: diode emulation enable
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
F0 40212010
LOOPCFG <= 10202140
Bit 6: diode emulation enable
Bit 8: minimum phase count = 1
Bit 13: Reserved
Bit 21: reserved
Bit 28: Enable diode emulation for PS0/1
E9 0606
PEAK_OCUC_COUNT <= 0606
Number of consecutive switch cycles exceeding peak OC limit before fault = 6
Number of consecutive switch cycles exceeding peak UC limit before fault = 6
(This is the default value)
F0 40212010
LOOPCFG <= 10202140, same as above
EA5B <= 000007FE
DMA operation
E932 <= 003EC5EF
DMA operation
35 BC02
VIN_ON <= 02BC
Sets V_IN ON to 700 mV
36 F401
VIN_OFF <= 01F4
Sets V_IN OFF to 500 mV
02 1E
ON_OFF_CONFIG <= 1E
Use configured TOFF_DELAY and TOFF_FALL settings
Active high enable pin
Enable requires enable pin AND OPERATION command
EA0D <= 00296234
DMA operation
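The byte swap used in the annotations above (data bytes appear LSB-first on the wire) is easy to get wrong by hand. This is a minimal sketch of that decode; the helper names are illustrative, and the sample strings are taken from the annotated transactions.

```python
def decode_le(wire_hex):
    """Convert data bytes as captured on the wire (LSB first) to an integer."""
    return int.from_bytes(bytes.fromhex(wire_hex), "little")


def set_bits(value):
    """List the bit positions that are set in value, lowest first."""
    return [b for b in range(value.bit_length()) if value >> b & 1]


loopcfg = decode_le("40212010")
print(f"LOOPCFG = {loopcfg:08X}, bits set: {set_bits(loopcfg)}")
print(f"PEAK_OCUC_COUNT = {decode_le('0600'):04X}")
print(f"VIN_ON raw = {decode_le('BC02')}")
```

The VIN_ON and VIN_OFF values are printed as raw counts only; the scaling to volts is deliberately left out of the sketch.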
The known PMBus operations all seem reasonable; I'm not sure if any of the mystery DMA operations are load-bearing here.
Given that the DMA accesses have different values between the two runs, I want a response from Renesas on those before we try it.
In summary, the address map and required sequence to enable a specific rail are detailed below, with notes about what the registers do and why we want to set them this way, as the guidance from Renesas was slightly lacking in this level of detail. Anything with a 4-digit (2-byte) address is a DMA access; anything with a 2-digit (1-byte) address is a regular PMBus transaction.
All addresses can be assumed to be the same for either the RAA229618 or the ISL68224 controller unless specifically noted.
NOTE: for any regular PMBus transactions, make sure to set the page to the correct rail before performing the transaction
=========================================
Enable fixed PWM mode order of operations
=========================================
PWM Pulse Width
Set to 0x133 for 50ns
Rail 0: 0xEA31
Rail 1: 0xEAB1
Rail 2: 0xEB31
0xF0 - loop_cfg - Read-Modify-Write
--disable diode emulation mode everywhere by setting bits 6 and 28 to 0
0xE9 - phase_current_limit_count
-set to 0x00 06
--set per-phase undercurrent behavior to limit output current instead of faulting after 6 events, and leave the overcurrent limit set to fault after 6 events
--the values of the over/undercurrent limits are set in registers 0xCD and 0xCE, respectively, at 0.1 A/LSB in two's complement; the defaults are 60 A and -60 A, respectively
--by disabling the undercurrent fault on phases, we can see whether the upper MOSFET is working; a failure shows up as a large negative current when enabled, and seems to be caused mainly by either a damaged bootstrap power supply or one of the three 5V bias pins on the power stage not being connected correctly
Fixed Pulse Width Enable
Read-Modify-Write 0x1 to enable
Rail 0: 0xEA0D
Rail 1: 0xEA8D
Rail 2: 0xEB0D
--On this register, bit 2 sets the ripple regulator to fixed-frequency PWM and bit 0 enables fixed PWM mode
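The enable steps above can be sketched as an ordered sequence against an in-memory stand-in for the bus. `FakeBus` and its methods are hypothetical (they are not a real driver API). The count register is written as PMBus command 0xE9, matching the `E9 0600` transaction in the captures. Clearing bit 2 while setting bit 0 of the fixed-PWM control word is my reading of the captured 0x00296231 value, not documented behavior.

```python
PAGE = 0x00                 # standard PMBus PAGE command
LOOP_CFG = 0xF0             # PMBus command, per the notes above
PHASE_LIMIT_COUNT = 0xE9    # PMBus command seen in the captures

# DMA addresses per rail, copied from the lists above.
PULSE_WIDTH = {0: 0xEA31, 1: 0xEAB1, 2: 0xEB31}
FIXED_PWM_CTRL = {0: 0xEA0D, 1: 0xEA8D, 2: 0xEB0D}

DIODE_EMULATION_BITS = (1 << 6) | (1 << 28)


class FakeBus:
    """Hypothetical in-memory stand-in for a real PMBus/DMA transport."""

    def __init__(self):
        self.page = 0
        self.pmbus = {}   # (page, command) -> value
        self.dma = {}     # address -> value

    def pmbus_write(self, cmd, value):
        if cmd == PAGE:
            self.page = value
        else:
            self.pmbus[(self.page, cmd)] = value

    def pmbus_read(self, cmd):
        return self.pmbus.get((self.page, cmd), 0)

    def dma_write(self, addr, value):
        self.dma[addr] = value

    def dma_read(self, addr):
        return self.dma.get(addr, 0)


def enable_fixed_pwm(bus, rail):
    bus.pmbus_write(PAGE, rail)                    # select the rail first
    bus.dma_write(PULSE_WIDTH[rail], 0x133)        # 50 ns pulse width
    cfg = bus.pmbus_read(LOOP_CFG)                 # read-modify-write
    bus.pmbus_write(LOOP_CFG, cfg & ~DIODE_EMULATION_BITS)
    bus.pmbus_write(PHASE_LIMIT_COUNT, 0x0006)     # value from the notes above
    ctrl = bus.dma_read(FIXED_PWM_CTRL[rail])      # read-modify-write
    bus.dma_write(FIXED_PWM_CTRL[rail], (ctrl & ~0x4) | 0x1)


# Seed the fake with the default values seen in the VDD_MEM_ABCD capture.
bus = FakeBus()
bus.pmbus[(0, LOOP_CFG)] = 0x10202140
bus.dma[FIXED_PWM_CTRL[0]] = 0x00296234
enable_fixed_pwm(bus, 0)
print(f"{bus.pmbus[(0, LOOP_CFG)]:08X} {bus.dma[FIXED_PWM_CTRL[0]]:08X}")
```

Starting from the captured defaults, the sketch ends with the same LOOPCFG and 0xEA0D values that PowerNavigator wrote in the fixed PWM enable capture.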
==========================================
Disable fixed PWM mode order of operations
==========================================
Fixed Pulse Width Disable
Read-Modify-Write 0x4 to disable
Rail 0: 0xEA0D
Rail 1: 0xEA8D
Rail 2: 0xEB0D
0xF0 - loop_cfg - Read-Modify-Write
--enable diode emulation mode everywhere by setting bits 6 and 28 to 1
0xE9 - phase_current_limit_count
-set to 0x06 06
--set per-phase undercurrent behavior to fault after 6 over/undercurrent events
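As a cross-check, the disable steps can be written as pure functions and compared against the values PowerNavigator wrote in the VDD_MEM_ABCD capture. The function names are illustrative. Treating the fixed-PWM control change as "set bit 2, clear bit 0" is my reading of the captured 0x00296231 -> 0x00296234 transition, and the byte order of the two event counts is inferred from the annotated `0006` and `0606` values.

```python
DIODE_EMULATION_BITS = (1 << 6) | (1 << 28)   # loop_cfg bits 6 and 28


def disable_fixed_pwm_ctrl(ctrl):
    """Fixed-PWM control word after disable: bit 2 set, bit 0 cleared."""
    return (ctrl | 0x4) & ~0x1


def restore_loop_cfg(cfg):
    """loop_cfg with diode emulation re-enabled (bits 6 and 28 set)."""
    return cfg | DIODE_EMULATION_BITS


def phase_limit_count(oc_events, uc_events):
    """Pack the event counts: overcurrent in the low byte, undercurrent high."""
    return (uc_events << 8) | oc_events


print(f"{disable_fixed_pwm_ctrl(0x00296231):08X}")
print(f"{restore_loop_cfg(0x00202100):08X}")
print(f"{phase_limit_count(6, 6):04X}")
```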
The one thing we might have to add, based on testing, is copying some of the rail fault register values. I can't find where in PN to modify those, so I can't decode all of the bits in the register. The DMA register is 0xE952 for Rail 0 on the ISL68224. PN disables input and output voltage faults and sets min and max voltages for things like the VMON pin to their absolute max values so those will never trip either.
If we see that we can't turn on a rail because it's having an output UV fault, we can try to dig into that more.
When trying this on a Gimlet, the rail does not seem happy. @mkeeter configured VDD_MEM_EFGH with phase 0 as the only enabled phase; I saw there was a blackbox event and pulled the info:
eric@niles ~ $ pfexec humility -t gimlet-b-matt rendmp --blackbox --device 0x5B
humility: attached to 0483:3754:000D00344741500820383733 via ST-Link V3
rail0 uptime: 31.6 sec
rail1 uptime: 31.6 sec
controller fault: 00000000000000000000000100000000 ()
rail0 fault: 00000000000000000000000000000000 ()
rail1 fault: 00000000000000000000000000000000 ()
phase fault uc: 00000000000000000000000000000000 ()
phase fault oc: 00000000000000000000000000000000 ()
adc fault uc: 00000000000000000000000000000000 ()
adc fault oc: 00000000000000000000000000000000 ()
rail0 status: 0001100001000011 MFR_SPECIFIC | POWER_GOOD# | off | CML | none of the above
rail1 status: 0001100001000011 MFR_SPECIFIC | POWER_GOOD# | off | CML | none of the above
status cml: 00001000 VIN_UV_FAULT
status mfr: 00001000 BBEVENT
rail1 status vout: 00000000 ()
rail0 status vout: 00000000 ()
rail1 status iout: 00000000 ()
rail0 status iout: 00000000 ()
rail1 status temperature: 00000000 ()
rail0 status temperature: 00000000 ()
rail1 status input: 00000000 ()
rail0 status input: 00000000 ()
| RAIL 0 | RAIL 1
-----|---------|-----------
VIN | 12.00 V | 12.00 V
VOUT | 0.000 V | 0.000 V
IIN | 0.00 A | 0.00 A
IOUT | 0.0 A | 0.0 A
TEMP | 24°C | 24°C
controller read temperature: 18°C
PHASE | TEMPERATURE | CURRENT
-------|-------------|----------
0 | 0°C | 0.0 A
1 | 0°C | 0.0 A
2 | 0°C | 0.0 A
3 | 0°C | 0.0 A
4 | 0°C | 0.0 A
5 | 0°C | 0.0 A
6 | 0°C | 0.0 A
7 | 0°C | 0.0 A
8 | 0°C | 0.0 A
9 | 0°C | 0.0 A
10 | 0°C | 0.0 A
11 | 0°C | 0.0 A
12 | 0°C | 0.0 A
13 | 0°C | 0.0 A
14 | 0°C | 0.0 A
15 | 0°C | 0.0 A
16 | 0°C | 0.0 A
17 | 0°C | 0.0 A
18 | 0°C | 0.0 A
19 | 24°C | 0.0 A
The VIN_UV_FAULT is interesting because it doesn't show up in the normal fault registers. In the normal fault registers, however, something called ProcessorFault is flagged:
0x7e STATUS_CML 0x88
|
| b7 0b1 = invalid command(s) <= InvalidCommand
| b6 0b0 = no invalid data <= InvalidData
| b5 0b0 = not failed <= PECFailed
| b4 0b0 = no fault <= MemoryFault
| b3 0b1 = fault <= ProcessorFault
| b1 0b0 = no error <= OtherCommunicationError
| b0 0b0 = no error <= OtherMemoryLogicError
+-----------------------------------------------------------------------
This might be related to the mystery railFltEn1_vinUnderVolt fault setting changes that PN was showing, which don't really seem to be changeable independently in the GUI.
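For reference, the STATUS_CML value of 0x88 shown above can be decoded with a simple table lookup. This is a minimal sketch; the bit names are copied from the decode above (bit 2 is not listed there, so it falls back to a generic label), and the function name is illustrative.

```python
# Bit names copied from the STATUS_CML decode above; bit 2 is not listed.
STATUS_CML_BITS = {
    7: "InvalidCommand",
    6: "InvalidData",
    5: "PECFailed",
    4: "MemoryFault",
    3: "ProcessorFault",
    1: "OtherCommunicationError",
    0: "OtherMemoryLogicError",
}


def decode_status_cml(value):
    """Return the names of the flags set in a STATUS_CML byte, MSB first."""
    return [
        STATUS_CML_BITS.get(bit, f"bit{bit}")
        for bit in range(7, -1, -1)
        if value >> bit & 1
    ]


print(decode_status_cml(0x88))
```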
One of the issues was that we weren't setting ON_OFF_CONFIG to 0x00 (which ignores all control from the pin or PMBus and always has the rail enabled). This was in the output from PN, but I did not understand the significance.
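To make the significance of 0x00 concrete, here is a small decoder for ON_OFF_CONFIG using the standard PMBus bit layout. This is a sketch under the assumption that the controller follows the standard layout; the captures above include a 0x10 write that was noted as an undocumented option, so the part may deviate. The function name and the description strings are mine.

```python
def decode_on_off_config(value):
    """Describe an ON_OFF_CONFIG byte using the standard PMBus bit layout."""
    if not value & 0x10:
        # Bit 4 clear: CONTROL pin and OPERATION command are both ignored.
        return ["powers up whenever input power is present"]
    return [
        "OPERATION command required" if value & 0x08
        else "OPERATION command ignored",
        "CONTROL pin required" if value & 0x04
        else "CONTROL pin ignored",
        "CONTROL pin active high" if value & 0x02
        else "CONTROL pin active low",
        "turn off immediately" if value & 0x01
        else "use programmed turn-off delay and fall time",
    ]


for raw in (0x00, 0x1E):
    print(f"{raw:02X}: {decode_on_off_config(raw)}")
```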