[subexp-daq] Report of a possible bug of the CAEN_V560 module

Hans Toshihide Törnqvist hans.tornqvist at chalmers.se
Thu Feb 22 18:04:07 CET 2024


Dear Günter,

The fixes are up, I hope I didn't forget any. Try it, and in case of 
issues please do a "git diff" so we can see all the things that changed.

Thanks for helping out!

Cheers,
Hans

On 2024-02-22 16:09, Weber, Guenter Dr. wrote:
> Dear Hans,
> 
> 
> many thanks! And in particular for all the detailed explanations.
> 
> 
> For the VULOM "sleep 1" did not do the trick, but "sleep 10" worked. Is 
> there any chance to ask the VULOM if it feels ready to do the job, 
> instead of using a random waiting time?
> 
> 
> Also I noticed that when aksing the VULOM which firmware it is using, we 
> get a slightly different reply than the actual firmware number:
> 
> RIO4-MCAL-1 mbsdaq > vulomflash --addr=3 --read
> VULOM base address: 0x03000000
> hwmap_mapvme.c:398: LOG: Virtual address for VULOM/TRIDI @ VME 
> 0x03000000 is 0x3005e000.
> Performing command 'read'...
> *VOLUM+0 => 0x14091f20*
> VOLUM+RANGE_REG(0x800000) => 0x0000006a
> Released vme ptr.
> But the actual firmware number is *1409285e*.
> 
> For comparison should one look only at the first four hex numbers? Or is 
> there more to take into account?
> 
> 
> For the V560 module, misusing the bitmask for the counter resolved the 
> issue. At the end of this mail, I attach the new log. Maybe you find 
> something notable, but to me it looks fine now.
> 
> 
> Our next steps would be as follows:
> 
> 
> 1) Wait for you to implement the bugfixes of the last days into NURDLIB.
> 
> 2) Setting up the test system with the most recent version of NURDLIB 
> and checking, if our minimal system with VULOM and V560 is now running 
> smoothly.
> 
> 3) Hammering the V767 TDC into NURDLIB.
> 
> 4) Once we have achieved this, we would go back to testing the SIS3316 
> modules.
> 
> 
> 
> Best greetings
> 
> Günter
> 
> 
> 
> 
> 10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port: 
> 56583).
> Thread has no error buffer yet...
> CPUS: 1
> delay: 1
> 10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port: 
> 56583).
> Thread has no error buffer yet...
> HOST: RIO4-MCAL-1
> Token: d6d68d7b (d6d68d7b:d6d68d7b) [/mbsusr/mbsdaq/.drasi_tokens/mcal]
> 10: lwroc_hostname_util.c:457: Own address: 192.168.1.71/255.255.255.0 
> (eth1).
> 10: lwroc_data_pipe.c:145: Data buffer READOUT_PIPE, size 419430400 = 
> 0x19000000, 1 consumers.
> 10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
> 10: lwroc_net_io.c:167: Started server on port 56583 (data port 39534).
> client union size: 244 208 188 508 640 204 204  => 640
> 10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file: 
> /tmp/drasi.u1001/drasi.hints.u1001.RIO4-MCAL-1:56583
> 10: lwroc_main.c:706: Log message rate limit not in effect.
> 10: lwroc_readout.c:112: call readout_init...
> 10: lwroc_thread_util.c:117: This is the triva control thread!
> 10: lwroc_thread_util.c:117: This is the net io thread!
> 10: lwroc_thread_util.c:117: This is the slow_async thread!
> 10: lwroc_thread_util.c:117: This is the data server thread!
> 8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
> 8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB 
> connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv] (state 0)
> 10: lwroc_message_internal.c:472: Message client connected!
> 10: lwroc_net_trans.c:1156: [drasi] Transport client connected (data) 
> [192.168.1.1].
> 10: lwroc_triva_control.c:370: Setup TRIVA  (DISBUS, HALT, MASTER, RESET)
> 10: lwroc_triva_control.c:418: Minimum event time 
> ctime(5000)+1*rd(686)+3*wr(634)+fctime(1000)=8588 ns (116.442 kHz)
> 10: lwroc_triva_state.c:1486: (Re)send ident messages...
> 10: lwroc_triva_control.c:495: START TEST ACQ: HALT, CLEAR=RESET, MT=1
> 9: lwroc_triva_control.c:507: TEST: GO
> 10: lwroc_triva_control.c:725: RUN: RESET
> 10: lwroc_triva_control.c:729: RUN: MT=14
> 9: lwroc_triva_control.c:737:   GO (1 good test triggers done) (max 
> 116.4 kHz)
> 10: lwroc_triva_readout.c:376: Trigger 14 seen.
> 10: config/config.c:181: Will try default cfg 
> path='/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default', can be set with NURDLIB_DEF_PATH.
> 8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10 
> (IN_READOUT).  EC: 1
> 10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
> 8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
> 10: config/parser.c:287: Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/global.cfg' {
> 10: config/parser.c:299: Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/global.cfg' }
> 10: config/parser.c:287: Opened './main.cfg' {
> 10: config/config.c:1299: .Global log level=debug.
> 10: config/parser.c:287: .Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crate.cfg' {
> 10: config/parser.c:299: .Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crate.cfg' }
> 10: config/parser.c:287: .Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_vulom.cfg' {
> 10: config/parser.c:299: .Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_vulom.cfg' }
> 10: config/parser.c:287: .Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' {
> 10: config/parser.c:299: .Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' }
> 10: config/parser.c:287: .Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen_v560.cfg' {
> 10: config/parser.c:299: .Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen_v560.cfg' }
> 10: config/parser.c:287: .Opened 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' {
> 10: config/parser.c:299: .Closed 
> '/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/module_log_level.cfg' }
> 10: config/parser.c:299: Closed './main.cfg' }
> 10: crate/crate.c:348: crate_create {
> 10: crate/crate.c:674: crate_create(MCAL) }
> 10: crate/crate.c:900: crate_init(MCAL) {
> 10: crate/crate.c:924: .Slow-init module[0]=GSI_VULOM.
> LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
> 10: crate/crate.c:924: .Slow-init module[1]=CAEN_V560.
> 10: module/map/map.c:224: ...rd(0x33333300+0xfa/16)=963ns 
> wr(0x33333300+0x50/16)=713ns.
> 10: crate/crate.c:977: .Fast-init module[0]=GSI_VULOM.
> 10: crate/crate.c:977: .Fast-init module[1]=CAEN_V560.
> 10: crate/crate.c:1074: crate_init(MCAL) }
> 10: ctrl/ctrl.c:788: Control server online.
> Thread has no error buffer yet...
> 10: f_user.c:559: WR ID=0x200.
> 10: f_user.c:565: TS offset unset. Will not modify stamp.
> 10: f_user.c:572: TPAT: No.
> 10: f_user.c:573: Sync-check: No.
> 10: f_user.c:575: Spill triggers: No.
> 10: f_user.c:576: LMU: No.
> 10: f_user.c:577: Timer latches: No.
> 10: f_user.c:578: Spill shape: No.
> 10: f_user.c:579: Micro-structure: No.
> 10: f_user.c:581: Multi-event flag: No.
> 10: f_user.c:586: UDP destination: None.
> GSI_VULOM: diff_module: 0, module_crate_counter: 0, 
> module_event_counter: 0, module_this_minus_crate: 0
> CAEN_V560: diff_module: 0, module_crate_counter: 0, 
> module_event_counter: 0, module_this_minus_crate: 0
> ...
> 
> 
> 
> ------------------------------------------------------------------------
> *Von:* Hans Toshihide Törnqvist <hans.tornqvist at chalmers.se>
> *Gesendet:* Donnerstag, 22. Februar 2024 15:26:15
> *An:* Weber, Guenter Dr.; Håkan T Johansson
> *Betreff:* Re: AW: [subexp-daq] Report of a possible bug of the 
> CAEN_V560 module
> Dear Günter,
> 
> I have some ideas after your two e-mails, logs and config files are
> really useful :) I'll shorten the log for my comments. for the tl;dr
> version, just skip to the bottom code change suggestion.
> 
> On 2024-02-22 14:21, Weber, Guenter Dr. wrote:
>> Dear Hans,
>> 
>> here is the output of the DRASI log from the test system. I am using the 
>> most recent (or almost most recent) versions of the various software 
>> packages from GITLAB. The system currently only has a VULOM and a V560 
>> module.
>> 
>> I added some comments to the output. To me it looks, the VULOM has some 
>> problems at the beginning. And, separate from the VULOM issue, the math 
>> of the difference in counters does not work out for the V560 module.
>> 
>> The VULOM issue I did not notice before. So, maybe by adding a lot of 
>> output lines into crate.c and then removing them I have broken 
>> something. But it also possible that before I simply overlooked these 
>> error message as in the end it looks like the DAQ is working fine.
>> 
>> Best greetings
>> 
>> Günter
>> 
>> 10: f_user.c:559: WR ID=0x200.
>> 10: f_user.c:565: TS offset unset. Will not modify stamp.
>> 10: f_user.c:572: TPAT: No.
>> 10: f_user.c:573: Sync-check: No.
>> 10: f_user.c:575: Spill triggers: No.
>> 10: f_user.c:576: LMU: No.
>> 10: f_user.c:577: Timer latches: No.
>> 10: f_user.c:578: Spill shape: No.
>> 10: f_user.c:579: Micro-structure: No.
>> 10: f_user.c:581: Multi-event flag: No.
>> 10: f_user.c:586: UDP destination: None.
>> ***** looks like the VULOM has a problem. I did not notice this before *****
>> 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver has had 
>> sync failure, status=0x000a8000.
>> 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver not 
>> synced, status=0x000a8000.
>> 5: crate/crate.c:1250: .MCAL[0]=GSI_VULOM: readout_dt failed = 0x00000004.
>> 5: crate/crate.c:1322: .MCAL[0]=GSI_VULOM: Event counter: 
>> crate=0x00000000/32, this-crate=0x00000000, module=0x00000000/31 
>> diff=0xdeadbeef, shadow=0x00000000/31 diff=0xdeadbeef.
> 
> The sync could be bad if the DAQ starts too fast after setting up the
> timestamp source. Try "sleep 1" after setting up the vulom so the
> timestamp receiver in the vulom can latch onto its input, even if it's
> wired internally.
> 
>> ***** here we see the counter mismatch - I did add a 250 ms delay to 
>> crate.c, before it does readout_dt again to avoid having thousands of 
>> output lines here******
>> *
>> CAEN_V560: diff_module: 4294967295, module_crate_counter: 0, 
>> module_event_counter: 1, module_this_minus_crate: 0
>> CAEN_V560: diff_module: 4294967294, module_crate_counter: 0, 
>> module_event_counter: 2, module_this_minus_crate: 0
>> CAEN_V560: diff_module: 4294967293, module_crate_counter: 0, 
>> module_event_counter: 3, module_this_minus_crate: 0
>> CAEN_V560: diff_module: 4294967292, module_crate_counter: 0, 
>> module_event_counter: 4, module_this_minus_crate: 0
>> ***** after four trials of readout_dt of V560, we reach the timeout of 1 
>> second*****
>> 5: crate/crate.c:1287: .MCAL[1]=CAEN_V560: readout_dt timeout.
>> 5: crate/crate.c:1322: .MCAL[1]=CAEN_V560: Event counter: 
>> crate=0x00000000/32, this-crate=0x00000000, module=0x00000004/32 
>> diff=0xfffffffc, shadow=0x00000000/32 diff=0x00000000.
> 
> This is an artifact of the soft counter in the v560, obviously we don't
> expect the module to have more accepted events while polling it, but the
> real problem comes a bit later.
> 
>> 5: crate/crate.c:1394: .MCAL: readout_dt failed!
>> 5: crate/crate.c:1501: .MCAL: had problems, re-initializing.
>> 10: crate/crate.c:684: .crate_deinit(MCAL) {
>> 10: crate/crate.c:708: .crate_deinit(MCAL) }
>> 8: lwroc_triva_state.c:2028: Master TRIVA/MI no progress last second, 
>> and in deadtime.
>> 8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10 
>> (IN_READOUT).  EC: 2
>> 10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
>> 8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
>> 10: crate/crate.c:900: .crate_init(MCAL) {
>> 10: crate/crate.c:924: ..Slow-init module[0]=GSI_VULOM.
>> LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
>> 10: crate/crate.c:924: ..Slow-init module[1]=CAEN_V560.
>> 10: module/map/map.c:224: ....rd(0x33333300+0xfa/16)=963ns 
>> wr(0x33333300+0x50/16)=713ns.
>> 10: crate/crate.c:977: ..Fast-init module[0]=GSI_VULOM.
>> 10: crate/crate.c:977: ..Fast-init module[1]=CAEN_V560.
>> 10: crate/crate.c:1074: .crate_init(MCAL) }
>> 5: f_user.c:1257: .had readout error, ret=0x14, trigger=14, prev=0
> 
> This is the last log from the readout part, it says trigger 14 was
> handled. This trigger is always fired in MBS-like DAQ's before starting
> the event loop, but no master start is delivered to the modules. No
> physical event is associated with it, so we expect no counter increases
> and no event data.
> 
> In general we read out "everything" for every trigger and rely on
> modules reporting their status/content properly. Modules like the v560
> ruin this since we always check it for all events and the counter always
> increments, and it clearly shouldn't for trigger 14. So, in this case it
> really did test the incorrect software logic...
> 
> I had a look in the v560 manual once more and only now did I realize
> that it is not trigger based, the scalers are only available on-the-fly.
> The event counter makes no sense then, so I will concede and suggest you
> set the module counter mask to 0.
> 
> Have a look in module/gsi_vftx2/gsi_vftx2.c line 79, another module
> without an event-counter which skips the whole counting stuff:
> 
> vftx2->module.event_counter.mask = 0;
> 
> Put something similar in module/caen_v560/caen_v560.c line 41, and feel
> free to remove the increment in readout_dt.
> 
> Hope the extra info isn't too verbose...
> 
> Best regards,
> Hans
> 


More information about the subexp-daq mailing list