
Conversation

@lz-bro (Contributor) commented Sep 4, 2025

Under certain conditions the DMI scan stays busy, leading to a timeout and exit. When restarting OpenOCD, a hard reset of the DTM is then required.

@lz-bro force-pushed the hardreset-dtm-when-busy branch from 9675154 to c9193f8 on September 4, 2025 at 09:01
@lz-bro (Contributor, Author) commented Sep 4, 2025

Some logs are as follows:

Debug: 241 31 riscv-013.c:1922 examine(): [ACPU.cpu.0] dtmcontrol=0x7ce1
Debug: 242 31 riscv-013.c:1923 examine(): [ACPU.cpu.0] dtmcs=0x7ce1 {version=1_0 dmistat=3 idle=7 errinfo=not_implemented abits=0xe}
Debug: 243 31 riscv-013.c:491 check_dbgbase_exists(): [ACPU.cpu.0] Searching for DM with DMI base address (dbgbase) = 0x0
Debug: 244 31 riscv-013.c:244 get_dm(): [ACPU.cpu.0] Coreid [0] Allocating new DM
Debug: 245 31 batch.c:285 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [0, 2)
Debug: 255 31 batch.c:249 log_batch(): 48b r 00000000 @10 -> b 004003a3 @10; 0i
Debug: 256 31 batch.c:249 log_batch(): 48b - 00000000 @00 -> b 004003a3 @10; 0i
Debug: 266 31 riscv.c:444 dtmcontrol_scan(): [ACPU.cpu.0] DTMCS: 0x10000 -> ?
Debug: 267 31 batch.h:86 riscv_scan_set_delay(): DM access delay is set to 1.
Debug: 268 31 batch.c:128 add_idle_before_batch(): [ACPU.cpu.0] Adding 1 idle cycles before the batch.
Debug: 269 31 batch.c:285 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [0, 2)
Debug: 279 31 batch.c:249 log_batch(): 48b r 00000000 @10 -> b 004003a3 @10; 1i
Debug: 280 31 batch.c:249 log_batch(): 48b - 00000000 @00 -> b 004003a3 @10; 1i
Debug: 241 47 riscv-013.c:2010 examine(): [ACPU.cpu.0] dtmcontrol=0x7ce1
Debug: 242 47 riscv-013.c:2011 examine(): [ACPU.cpu.0] dtmcs=0x7ce1 {version=1_0 dmistat=3 idle=7 errinfo=not_implemented abits=0xe}
Debug: 252 47 riscv.c:463 dtmcs_scan(): TAP ACPU.cpu: DTMCS: 0x20000 -> ?
Debug: 253 47 riscv-013.c:543 check_dbgbase_exists(): [ACPU.cpu.0] Searching for DM with DMI base address (dbgbase) = 0x0
Debug: 254 47 riscv-013.c:296 get_dm(): [ACPU.cpu.0] Coreid [0] Allocating new DM
Debug: 255 47 batch.c:291 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [0, 2)
Debug: 265 47 batch.c:255 log_batch(): 48b r 00000000 @10 -> + 00000000 @00; 0i
Debug: 266 47 batch.c:255 log_batch(): 48b - 00000000 @00 -> b 00000000 @10; 0i
Debug: 276 47 riscv.c:463 dtmcs_scan(): TAP ACPU.cpu: DTMCS: 0x10000 -> ?
Debug: 277 47 batch.h:86 riscv_scan_set_delay(): DM access delay is set to 1.
Debug: 278 47 batch.c:139 add_idle_before_batch(): [ACPU.cpu.0] Adding 1 idle cycles before the batch.
Debug: 279 47 batch.c:291 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [1, 2)
Debug: 289 47 batch.c:255 log_batch(): 48b - 00000000 @00 -> + 00000000 @10; 1i
Debug: 290 47 batch.c:199 log_dmi_decoded(): read: dmcontrol=0 {}
Debug: 291 47 riscv-013.c:1879 reset_dm(): [ACPU.cpu.0] Activating the DM.
Debug: 292 47 batch.c:291 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [0, 2)
Debug: 302 47 batch.c:255 log_batch(): 48b w 00000001 @10 -> + 00000000 @10; 1i
Debug: 303 47 batch.c:199 log_dmi_decoded(): write: dmcontrol=1 {dmactive=active}
Debug: 304 47 batch.c:255 log_batch(): 48b - 00000000 @00 -> b 00000001 @10; 1i
Debug: 314 47 riscv.c:463 dtmcs_scan(): TAP ACPU.cpu: DTMCS: 0x10000 -> ?
Debug: 315 47 batch.h:86 riscv_scan_set_delay(): DM access delay is set to 2.
Debug: 316 47 batch.c:139 add_idle_before_batch(): [ACPU.cpu.0] Adding 1 idle cycles before the batch.
Debug: 317 47 batch.c:291 riscv_batch_run_from(): [ACPU.cpu.0] Running batch of scans [1, 2)
Debug: 327 47 batch.c:255 log_batch(): 48b - 00000000 @00 -> + 00000001 @10; 2i
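
A note on reading these logs: the two blocks are two separate OpenOCD sessions. In both, dtmcs=0x7ce1 decodes to dmistat=3, a sticky busy error. The DTMCS: 0x10000 scans write dmireset (bit 16), which clears the sticky flag, yet in the first session the DMI scans keep returning busy ('b'). In the second session the DTMCS: 0x20000 scan writes dmihardreset (bit 17), after which the DMI operations make progress ('+') and the DM is activated (write dmcontrol dmactive=1). Below is a minimal plain-Tcl sketch of the dtmcs field decoding (decode_dtmcs is just an illustrative name; field layout per the RISC-V Debug Specification):

proc decode_dtmcs value {
    set version [expr {$value & 0xf}]          ;# bits 3:0
    set abits   [expr {($value >> 4) & 0x3f}]  ;# bits 9:4
    set dmistat [expr {($value >> 10) & 0x3}]  ;# bits 11:10, 3 = sticky busy error
    set idle    [expr {($value >> 12) & 0x7}]  ;# bits 14:12, recommended idle cycles
    return "version=$version abits=0x[format %x $abits] dmistat=$dmistat idle=$idle"
}
# decode_dtmcs 0x7ce1 -> version=1 abits=0xe dmistat=3 idle=7, matching the log.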

@MarekVCodasip (Collaborator)

I don't think this should be done. I am not opposed to having some command that would force a reset if we don't have one already. But this sounds to me like a buggy chip. Which chip causes this, if you can share it?

@lz-bro (Contributor, Author) commented Sep 18, 2025

@MarekVCodasip Thank you for your reply.

But this sounds to me like a buggy chip.

Could you please tell me why you think this is a chip bug?

Which chip causes this, if you can share it?

This is a chip we are developing. It has a daisy-chain structure with an RCPU and an ACPU. Powering the ACPU DM requires the RCPU to manipulate hardware registers, and accessing the ACPU DM while it is powered down causes this problem.

About the usage scenarios of dtmhardreset, the spec says: "In general this should only be used when the Debugger has reason to expect that the outstanding DMI transaction will never complete (e.g. a reset condition caused an inflight DMI transaction to be cancelled)."

@MarekVCodasip (Collaborator)

I understand the use case, but a hard reset of the DTM is a big hammer: it aborts every in-flight DMI transaction and loses all debugger state. On large RTL simulations that can hide real bugs or corrupt the debug session, so I'd rather not make it the default recovery path for a plain timeout.

Could we expose it as an explicit TCL variable like riscv reset_on_busy?
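
(A hypothetical sketch of what that could look like in a board config; riscv reset_on_busy is only a name proposed in this thread, not an existing OpenOCD command.)

# Hypothetical knob, named only in this discussion: when enabled,
# a DMI timeout would trigger a hard reset of the DTM.
riscv reset_on_busy on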

@lz-bro (Contributor, Author) commented Sep 22, 2025

Could we expose it as an explicit TCL variable like riscv reset_on_busy?

I agree. So you'd prefer an optional hard reset of the DTM on a plain timeout, instead of this PR's approach?

@en-sc (Collaborator) commented Sep 22, 2025

@lz-bro, @MarekVCodasip, I'd suggest considering just implementing this as a Tcl script, something like:

proc reset_dtm target {
    set tap [$target cget -chain-position]
    # Stop background polling so nothing else drives the TAP mid-scan.
    poll disable
    try {
        # IR 0x10 selects the DTMCS register.
        irscan $tap 0x10
        # 32-bit DTMCS write: bits 16:0 = 0, bit 17 (dmihardreset) = 1,
        # bits 31:18 = 0.
        drscan $tap 17 0 1 1 14 0
    } finally {
        poll enable
    }
}

To achieve the behavior in this patch, this script can be run on the examine-start event (see the sketch below). The script can include a condition, but IMHO the condition from the patch is not correct: e.g. dtmcs.dmistat = busy indicates that an operation was started while another one was still in progress, meaning that status can only arise after an operation has already failed.
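
For example, a sketch of that event hook, assuming the usual $_TARGETNAME variable from a board config:

# Run the DTM reset before every examination, per the suggestion above.
# $_TARGETNAME is whatever the board config names the target.
$_TARGETNAME configure -event examine-start "reset_dtm $_TARGETNAME"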

We could add a target-type-specific event that is triggered on a DMI operation timeout, though I'd suggest waiting until the event interface is reworked (see https://review.openocd.org/c/openocd/+/9079/comment/5ef2cfe8_f3d174aa/).
