[subexp-daq] Report of a possible bug of the CAEN_V560 module

Weber, Guenter Dr. g.weber at hi-jena.gsi.de
Thu Feb 22 16:09:25 CET 2024


Dear Hans,


many thanks! And in particular for all the detailed explanations.


For the VULOM "sleep 1" did not do the trick, but "sleep 10" worked. Is there any chance to ask the VULOM if it feels ready to do the job, instead of using a random waiting time?


Also I noticed that when aksing the VULOM which firmware it is using, we get a slightly different reply than the actual firmware number:

RIO4-MCAL-1 mbsdaq > vulomflash --addr=3 --read
VULOM base address: 0x03000000
hwmap_mapvme.c:398: LOG: Virtual address for VULOM/TRIDI @ VME 0x03000000 is 0x3005e000.
Performing command 'read'...
VOLUM+0 => 0x14091f20
VOLUM+RANGE_REG(0x800000) => 0x0000006a
Released vme ptr.
But the actual firmware number is 1409285e.

For comparison should one look only at the first four hex numbers? Or is there more to take into account?


For the V560 module, misusing the bitmask for the counter resolved the issue. At the end of this mail, I attach the new log. Maybe you find something notable, but to me it looks fine now.


Our next steps would be as follows:


1) Wait for you to implement the bugfixes of the last days into NURDLIB.

2) Setting up the test system with the most recent version of NURDLIB and checking, if our minimal system with VULOM and V560 is now running smoothly.

3) Hammering the V767 TDC into NURDLIB.

4) Once we have achieved this, we would go back to testing the SIS3316 modules.



Best greetings

Günter




10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port: 56583).
Thread has no error buffer yet...
CPUS: 1
delay: 1
10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port: 56583).
Thread has no error buffer yet...
HOST: RIO4-MCAL-1
Token: d6d68d7b (d6d68d7b:d6d68d7b) [/mbsusr/mbsdaq/.drasi_tokens/mcal]
10: lwroc_hostname_util.c:457: Own address: 192.168.1.71/255.255.255.0 (eth1).
10: lwroc_data_pipe.c:145: Data buffer READOUT_PIPE, size 419430400 = 0x19000000, 1 consumers.
10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
10: lwroc_net_io.c:167: Started server on port 56583 (data port 39534).
client union size: 244 208 188 508 640 204 204  => 640
10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file: /tmp/drasi.u1001/drasi.hints.u1001.RIO4-MCAL-1:56583
10: lwroc_main.c:706: Log message rate limit not in effect.
10: lwroc_readout.c:112: call readout_init...
10: lwroc_thread_util.c:117: This is the triva control thread!
10: lwroc_thread_util.c:117: This is the net io thread!
10: lwroc_thread_util.c:117: This is the slow_async thread!
10: lwroc_thread_util.c:117: This is the data server thread!
8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB connection(s):
8: lwroc_triva_state.c:422: [EB lyserv] (state 0)
10: lwroc_message_internal.c:472: Message client connected!
10: lwroc_net_trans.c:1156: [drasi] Transport client connected (data) [192.168.1.1].
10: lwroc_triva_control.c:370: Setup TRIVA  (DISBUS, HALT, MASTER, RESET)
10: lwroc_triva_control.c:418: Minimum event time ctime(5000)+1*rd(686)+3*wr(634)+fctime(1000)=8588 ns (116.442 kHz)
10: lwroc_triva_state.c:1486: (Re)send ident messages...
10: lwroc_triva_control.c:495: START TEST ACQ: HALT, CLEAR=RESET, MT=1
9: lwroc_triva_control.c:507: TEST: GO
10: lwroc_triva_control.c:725: RUN: RESET
10: lwroc_triva_control.c:729: RUN: MT=14
9: lwroc_triva_control.c:737:   GO (1 good test triggers done) (max 116.4 kHz)
10: lwroc_triva_readout.c:376: Trigger 14 seen.
10: config/config.c:181: Will try default cfg path='/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default', can be set with NURDLIB_DEF_PATH.
8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10 (IN_READOUT).  EC: 1
10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
10: config/parser.c:287: Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/global.cfg' {
10: config/parser.c:299: Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/global.cfg' }
10: config/parser.c:287: Opened './main.cfg' {
10: config/config.c:1299: .Global log level=debug.
10: config/parser.c:287: .Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crate.cfg' {
10: config/parser.c:299: .Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crate.cfg' }
10: config/parser.c:287: .Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_vulom.cfg' {
10: config/parser.c:299: .Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_vulom.cfg' }
10: config/parser.c:287: .Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' {
10: config/parser.c:299: .Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' }
10: config/parser.c:287: .Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen_v560.cfg' {
10: config/parser.c:299: .Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen_v560.cfg' }
10: config/parser.c:287: .Opened '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' {
10: config/parser.c:299: .Closed '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' }
10: config/parser.c:299: Closed './main.cfg' }
10: crate/crate.c:348: crate_create {
10: crate/crate.c:674: crate_create(MCAL) }
10: crate/crate.c:900: crate_init(MCAL) {
10: crate/crate.c:924: .Slow-init module[0]=GSI_VULOM.
LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
10: crate/crate.c:924: .Slow-init module[1]=CAEN_V560.
10: module/map/map.c:224: ...rd(0x33333300+0xfa/16)=963ns wr(0x33333300+0x50/16)=713ns.
10: crate/crate.c:977: .Fast-init module[0]=GSI_VULOM.
10: crate/crate.c:977: .Fast-init module[1]=CAEN_V560.
10: crate/crate.c:1074: crate_init(MCAL) }
10: ctrl/ctrl.c:788: Control server online.
Thread has no error buffer yet...
10: f_user.c:559: WR ID=0x200.
10: f_user.c:565: TS offset unset. Will not modify stamp.
10: f_user.c:572: TPAT: No.
10: f_user.c:573: Sync-check: No.
10: f_user.c:575: Spill triggers: No.
10: f_user.c:576: LMU: No.
10: f_user.c:577: Timer latches: No.
10: f_user.c:578: Spill shape: No.
10: f_user.c:579: Micro-structure: No.
10: f_user.c:581: Multi-event flag: No.
10: f_user.c:586: UDP destination: None.
GSI_VULOM: diff_module: 0, module_crate_counter: 0, module_event_counter: 0, module_this_minus_crate: 0
CAEN_V560: diff_module: 0, module_crate_counter: 0, module_event_counter: 0, module_this_minus_crate: 0
...


________________________________
Von: Hans Toshihide Törnqvist <hans.tornqvist at chalmers.se>
Gesendet: Donnerstag, 22. Februar 2024 15:26:15
An: Weber, Guenter Dr.; Håkan T Johansson
Betreff: Re: AW: [subexp-daq] Report of a possible bug of the CAEN_V560 module

Dear Günter,

I have some ideas after your two e-mails, logs and config files are
really useful :) I'll shorten the log for my comments. for the tl;dr
version, just skip to the bottom code change suggestion.

On 2024-02-22 14:21, Weber, Guenter Dr. wrote:
> Dear Hans,
>
> here is the output of the DRASI log from the test system. I am using the
> most recent (or almost most recent) versions of the various software
> packages from GITLAB. The system currently only has a VULOM and a V560
> module.
>
> I added some comments to the output. To me it looks, the VULOM has some
> problems at the beginning. And, separate from the VULOM issue, the math
> of the difference in counters does not work out for the V560 module.
>
> The VULOM issue I did not notice before. So, maybe by adding a lot of
> output lines into crate.c and then removing them I have broken
> something. But it also possible that before I simply overlooked these
> error message as in the end it looks like the DAQ is working fine.
>
> Best greetings
>
> Günter
>
> 10: f_user.c:559: WR ID=0x200.
> 10: f_user.c:565: TS offset unset. Will not modify stamp.
> 10: f_user.c:572: TPAT: No.
> 10: f_user.c:573: Sync-check: No.
> 10: f_user.c:575: Spill triggers: No.
> 10: f_user.c:576: LMU: No.
> 10: f_user.c:577: Timer latches: No.
> 10: f_user.c:578: Spill shape: No.
> 10: f_user.c:579: Micro-structure: No.
> 10: f_user.c:581: Multi-event flag: No.
> 10: f_user.c:586: UDP destination: None.
> ***** looks like the VULOM has a problem. I did not notice this before *****
> 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver has had
> sync failure, status=0x000a8000.
> 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver not
> synced, status=0x000a8000.
> 5: crate/crate.c:1250: .MCAL[0]=GSI_VULOM: readout_dt failed = 0x00000004.
> 5: crate/crate.c:1322: .MCAL[0]=GSI_VULOM: Event counter:
> crate=0x00000000/32, this-crate=0x00000000, module=0x00000000/31
> diff=0xdeadbeef, shadow=0x00000000/31 diff=0xdeadbeef.

The sync could be bad if the DAQ starts too fast after setting up the
timestamp source. Try "sleep 1" after setting up the vulom so the
timestamp receiver in the vulom can latch onto its input, even if it's
wired internally.

> ***** here we see the counter mismatch - I did add a 250 ms delay to
> crate.c, before it does readout_dt again to avoid having thousands of
> output lines here******
> *
> CAEN_V560: diff_module: 4294967295, module_crate_counter: 0,
> module_event_counter: 1, module_this_minus_crate: 0
> CAEN_V560: diff_module: 4294967294, module_crate_counter: 0,
> module_event_counter: 2, module_this_minus_crate: 0
> CAEN_V560: diff_module: 4294967293, module_crate_counter: 0,
> module_event_counter: 3, module_this_minus_crate: 0
> CAEN_V560: diff_module: 4294967292, module_crate_counter: 0,
> module_event_counter: 4, module_this_minus_crate: 0
> ***** after four trials of readout_dt of V560, we reach the timeout of 1
> second*****
> 5: crate/crate.c:1287: .MCAL[1]=CAEN_V560: readout_dt timeout.
> 5: crate/crate.c:1322: .MCAL[1]=CAEN_V560: Event counter:
> crate=0x00000000/32, this-crate=0x00000000, module=0x00000004/32
> diff=0xfffffffc, shadow=0x00000000/32 diff=0x00000000.

This is an artifact of the soft counter in the v560, obviously we don't
expect the module to have more accepted events while polling it, but the
real problem comes a bit later.

> 5: crate/crate.c:1394: .MCAL: readout_dt failed!
> 5: crate/crate.c:1501: .MCAL: had problems, re-initializing.
> 10: crate/crate.c:684: .crate_deinit(MCAL) {
> 10: crate/crate.c:708: .crate_deinit(MCAL) }
> 8: lwroc_triva_state.c:2028: Master TRIVA/MI no progress last second,
> and in deadtime.
> 8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10
> (IN_READOUT).  EC: 2
> 10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
> 8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
> 10: crate/crate.c:900: .crate_init(MCAL) {
> 10: crate/crate.c:924: ..Slow-init module[0]=GSI_VULOM.
> LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
> 10: crate/crate.c:924: ..Slow-init module[1]=CAEN_V560.
> 10: module/map/map.c:224: ....rd(0x33333300+0xfa/16)=963ns
> wr(0x33333300+0x50/16)=713ns.
> 10: crate/crate.c:977: ..Fast-init module[0]=GSI_VULOM.
> 10: crate/crate.c:977: ..Fast-init module[1]=CAEN_V560.
> 10: crate/crate.c:1074: .crate_init(MCAL) }
> 5: f_user.c:1257: .had readout error, ret=0x14, trigger=14, prev=0

This is the last log from the readout part, it says trigger 14 was
handled. This trigger is always fired in MBS-like DAQ's before starting
the event loop, but no master start is delivered to the modules. No
physical event is associated with it, so we expect no counter increases
and no event data.

In general we read out "everything" for every trigger and rely on
modules reporting their status/content properly. Modules like the v560
ruin this since we always check it for all events and the counter always
increments, and it clearly shouldn't for trigger 14. So, in this case it
really did test the incorrect software logic...

I had a look in the v560 manual once more and only now did I realize
that it is not trigger based, the scalers are only available on-the-fly.
The event counter makes no sense then, so I will concede and suggest you
set the module counter mask to 0.

Have a look in module/gsi_vftx2/gsi_vftx2.c line 79, another module
without an event-counter which skips the whole counting stuff:

vftx2->module.event_counter.mask = 0;

Put something similar in module/caen_v560/caen_v560.c line 41, and feel
free to remove the increment in readout_dt.

Hope the extra info isn't too verbose...

Best regards,
Hans
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.chalmers.se/pipermail/subexp-daq/attachments/20240222/3c9bde3b/attachment-0001.html>


More information about the subexp-daq mailing list