[subexp-daq] Report of a possible bug of the CAEN_V560 module

Håkan T Johansson f96hajo at chalmers.se
Thu Feb 22 17:06:19 CET 2024


On Thu, 22 Feb 2024, Weber, Guenter Dr. wrote:

> 
> Dear Hans,
> 
> 
> many thanks! And in particular for all the detailed explanations.
> 
> 
> For the VULOM "sleep 1" did not do the trick, but "sleep 10" worked. Is
> there any chance to ask the VULOM if it feels ready to do the job, instead
> of using a random waiting time?

Now there is!  Update trloii and recompile trlo_ctrl,

   --trig-status   will at the end show some extra lines:

Serial timestamp status:(0x000a8004) words:  4 badbits: 0 CHKsum:0x00
Serial timestamp: Sync: no  Bitstr. sync: no, had loss  Data ptn: no, had loss

Where it should say "Sync: ok" when the receiver has locked.

The 'had loss' and bad bits count can be cleared (when locked) by issuing
   "pulse = SERIAL_TSTAMP_FAIL_CLEAR"

> Also I noticed that when aksing the VULOM which firmware it is using, we get
> a slightly different reply than the actual firmware number:
> 
> RIO4-MCAL-1 mbsdaq > vulomflash --addr=3 --read
> VULOM base address: 0x03000000
> hwmap_mapvme.c:398: LOG: Virtual address for VULOM/TRIDI @ VME 0x03000000 is
> 0x3005e000.
> Performing command 'read'...
> VOLUM+0 => 0x14091f20
> VOLUM+RANGE_REG(0x800000) => 0x0000006a
> Released vme ptr.
> But the actual firmware number is 1409285e.
> 
> For comparison should one look only at the first four hex numbers? Or is
> there more to take into account?

Yes, vulomflash --read reads at offset 0, and at that offset is also a 
TRIVA module mimic, which only uses the low 16 bits however.  So the high 
16 bits give part of the firmware hash.

> For the V560 module, misusing the bitmask for the counter resolved the
> issue. At the end of this mail, I attach the new log. Maybe you find
> something notable, but to me it looks fine now.
> 
> 
> Our next steps would be as follows:
> 
> 
> 1) Wait for you to implement the bugfixes of the last days into NURDLIB.
> 
> 2) Setting up the test system with the most recent version of NURDLIB and
> checking, if our minimal system with VULOM and V560 is now running smoothly.
> 
> 3) Hammering the V767 TDC into NURDLIB.
> 
> 4) Once we have achieved this, we would go back to testing the SIS3316
> modules.
> 
> 
> 
> Best greetings
> 
> Günter

Cheers,
Håkan



> 
> 
> 
> 
> 10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port:
> 56583).
> Thread has no error buffer yet...
> CPUS: 1
> delay: 1
> 10: lwroc_hostname_util.c:108: Host 'lyserv' known as 192.168.1.1 (port:
> 56583).
> Thread has no error buffer yet...
> HOST: RIO4-MCAL-1
> Token: d6d68d7b (d6d68d7b:d6d68d7b) [/mbsusr/mbsdaq/.drasi_tokens/mcal]
> 10: lwroc_hostname_util.c:457: Own address: 192.168.1.71/255.255.255.0
> (eth1).
> 10: lwroc_data_pipe.c:145: Data buffer READOUT_PIPE, size 419430400 =
> 0x19000000, 1 consumers.
> 10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
> 10: lwroc_net_io.c:167: Started server on port 56583 (data port 39534).
> client union size: 244 208 188 508 640 204 204  => 640
> 10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file:
> /tmp/drasi.u1001/drasi.hints.u1001.RIO4-MCAL-1:56583
> 10: lwroc_main.c:706: Log message rate limit not in effect.
> 10: lwroc_readout.c:112: call readout_init...
> 10: lwroc_thread_util.c:117: This is the triva control thread!
> 10: lwroc_thread_util.c:117: This is the net io thread!
> 10: lwroc_thread_util.c:117: This is the slow_async thread!
> 10: lwroc_thread_util.c:117: This is the data server thread!
> 8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
> 8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB
> connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv] (state 0)
> 10: lwroc_message_internal.c:472: Message client connected!
> 10: lwroc_net_trans.c:1156: [drasi] Transport client connected (data)
> [192.168.1.1].
> 10: lwroc_triva_control.c:370: Setup TRIVA  (DISBUS, HALT, MASTER, RESET)
> 10: lwroc_triva_control.c:418: Minimum event time
> ctime(5000)+1*rd(686)+3*wr(634)+fctime(1000)=8588 ns (116.442 kHz)
> 10: lwroc_triva_state.c:1486: (Re)send ident messages...
> 10: lwroc_triva_control.c:495: START TEST ACQ: HALT, CLEAR=RESET, MT=1
> 9: lwroc_triva_control.c:507: TEST: GO
> 10: lwroc_triva_control.c:725: RUN: RESET
> 10: lwroc_triva_control.c:729: RUN: MT=14
> 9: lwroc_triva_control.c:737:   GO (1 good test triggers done) (max 116.4
> kHz)
> 10: lwroc_triva_readout.c:376: Trigger 14 seen.
> 10: config/config.c:181: Will try default cfgpath='/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default
> ', can be set with NURDLIB_DEF_PATH.
> 8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10
> (IN_READOUT).  EC: 1
> 10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
> 8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
> 10: config/parser.c:287: Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/glob
> al.cfg' {
> 10: config/parser.c:299: Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/glob
> al.cfg' }
> 10: config/parser.c:287: Opened './main.cfg' {
> 10: config/config.c:1299: .Global log level=debug.
> 10: config/parser.c:287: .Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crat
> e.cfg' {
> 10: config/parser.c:299: .Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/crat
> e.cfg' }
> 10: config/parser.c:287: .Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_
> vulom.cfg' {
> 10: config/parser.c:299: .Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/gsi_
> vulom.cfg' }
> 10: config/parser.c:287: .Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/modu
> le_log_level.cfg' {
> 10: config/parser.c:299: .Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/modu
> le_log_level.cfg' }
> 10: config/parser.c:287: .Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen
> _v560.cfg' {
> 10: config/parser.c:299: .Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/caen
> _v560.cfg' }
> 10: config/parser.c:287: .Opened'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/modu
> le_log_level.cfg' {
> 10: config/parser.c:299: .Closed'/LynxOS/mbsusr/mbsdaq/mbsrun/rio4/2024_mcalstruck/nurdlib/cfg/default/modu
> le_log_level.cfg' }
> 10: config/parser.c:299: Closed './main.cfg' }
> 10: crate/crate.c:348: crate_create {
> 10: crate/crate.c:674: crate_create(MCAL) }
> 10: crate/crate.c:900: crate_init(MCAL) {
> 10: crate/crate.c:924: .Slow-init module[0]=GSI_VULOM.
> LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
> 10: crate/crate.c:924: .Slow-init module[1]=CAEN_V560.
> 10: module/map/map.c:224: ...rd(0x33333300+0xfa/16)=963ns
> wr(0x33333300+0x50/16)=713ns.
> 10: crate/crate.c:977: .Fast-init module[0]=GSI_VULOM.
> 10: crate/crate.c:977: .Fast-init module[1]=CAEN_V560.
> 10: crate/crate.c:1074: crate_init(MCAL) }
> 10: ctrl/ctrl.c:788: Control server online.
> Thread has no error buffer yet...
> 10: f_user.c:559: WR ID=0x200.
> 10: f_user.c:565: TS offset unset. Will not modify stamp.
> 10: f_user.c:572: TPAT: No.
> 10: f_user.c:573: Sync-check: No.
> 10: f_user.c:575: Spill triggers: No.
> 10: f_user.c:576: LMU: No.
> 10: f_user.c:577: Timer latches: No.
> 10: f_user.c:578: Spill shape: No.
> 10: f_user.c:579: Micro-structure: No.
> 10: f_user.c:581: Multi-event flag: No.
> 10: f_user.c:586: UDP destination: None.
> GSI_VULOM: diff_module: 0, module_crate_counter: 0, module_event_counter: 0,
> module_this_minus_crate: 0
> CAEN_V560: diff_module: 0, module_crate_counter: 0, module_event_counter: 0,
> module_this_minus_crate: 0
> ...
> 
> 
> 
> ____________________________________________________________________________
> Von: Hans Toshihide Törnqvist <hans.tornqvist at chalmers.se>
> Gesendet: Donnerstag, 22. Februar 2024 15:26:15
> An: Weber, Guenter Dr.; Håkan T Johansson
> Betreff: Re: AW: [subexp-daq] Report of a possible bug of the CAEN_V560
> module  
> Dear Günter,
> 
> I have some ideas after your two e-mails, logs and config files are
> really useful :) I'll shorten the log for my comments. for the tl;dr
> version, just skip to the bottom code change suggestion.
> 
> On 2024-02-22 14:21, Weber, Guenter Dr. wrote:
> > Dear Hans,
> >
> > here is the output of the DRASI log from the test system. I am using the
> > most recent (or almost most recent) versions of the various software
> > packages from GITLAB. The system currently only has a VULOM and a V560
> > module.
> >
> > I added some comments to the output. To me it looks, the VULOM has some
> > problems at the beginning. And, separate from the VULOM issue, the math
> > of the difference in counters does not work out for the V560 module.
> >
> > The VULOM issue I did not notice before. So, maybe by adding a lot of
> > output lines into crate.c and then removing them I have broken
> > something. But it also possible that before I simply overlooked these
> > error message as in the end it looks like the DAQ is working fine.
> >
> > Best greetings
> >
> > Günter
> >
> > 10: f_user.c:559: WR ID=0x200.
> > 10: f_user.c:565: TS offset unset. Will not modify stamp.
> > 10: f_user.c:572: TPAT: No.
> > 10: f_user.c:573: Sync-check: No.
> > 10: f_user.c:575: Spill triggers: No.
> > 10: f_user.c:576: LMU: No.
> > 10: f_user.c:577: Timer latches: No.
> > 10: f_user.c:578: Spill shape: No.
> > 10: f_user.c:579: Micro-structure: No.
> > 10: f_user.c:581: Multi-event flag: No.
> > 10: f_user.c:586: UDP destination: None.
> > ***** looks like the VULOM has a problem. I did not notice this before
> *****
> > 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver has had
> > sync failure, status=0x000a8000.
> > 5: module/gsi_vulom/gsi_vulom.c:12: .Serial timestamp receiver not
> > synced, status=0x000a8000.
> > 5: crate/crate.c:1250: .MCAL[0]=GSI_VULOM: readout_dt failed = 0x00000004.
> > 5: crate/crate.c:1322: .MCAL[0]=GSI_VULOM: Event counter:
> > crate=0x00000000/32, this-crate=0x00000000, module=0x00000000/31
> > diff=0xdeadbeef, shadow=0x00000000/31 diff=0xdeadbeef.
> 
> The sync could be bad if the DAQ starts too fast after setting up the
> timestamp source. Try "sleep 1" after setting up the vulom so the
> timestamp receiver in the vulom can latch onto its input, even if it's
> wired internally.
> 
> > ***** here we see the counter mismatch - I did add a 250 ms delay to
> > crate.c, before it does readout_dt again to avoid having thousands of
> > output lines here******
> > *
> > CAEN_V560: diff_module: 4294967295, module_crate_counter: 0,
> > module_event_counter: 1, module_this_minus_crate: 0
> > CAEN_V560: diff_module: 4294967294, module_crate_counter: 0,
> > module_event_counter: 2, module_this_minus_crate: 0
> > CAEN_V560: diff_module: 4294967293, module_crate_counter: 0,
> > module_event_counter: 3, module_this_minus_crate: 0
> > CAEN_V560: diff_module: 4294967292, module_crate_counter: 0,
> > module_event_counter: 4, module_this_minus_crate: 0
> > ***** after four trials of readout_dt of V560, we reach the timeout of 1
> > second*****
> > 5: crate/crate.c:1287: .MCAL[1]=CAEN_V560: readout_dt timeout.
> > 5: crate/crate.c:1322: .MCAL[1]=CAEN_V560: Event counter:
> > crate=0x00000000/32, this-crate=0x00000000, module=0x00000004/32
> > diff=0xfffffffc, shadow=0x00000000/32 diff=0x00000000.
> 
> This is an artifact of the soft counter in the v560, obviously we don't
> expect the module to have more accepted events while polling it, but the
> real problem comes a bit later.
> 
> > 5: crate/crate.c:1394: .MCAL: readout_dt failed!
> > 5: crate/crate.c:1501: .MCAL: had problems, re-initializing.
> > 10: crate/crate.c:684: .crate_deinit(MCAL) {
> > 10: crate/crate.c:708: .crate_deinit(MCAL) }
> > 8: lwroc_triva_state.c:2028: Master TRIVA/MI no progress last second,
> > and in deadtime.
> > 8: lwroc_triva_state.c:2399: Master: deadtime: 1.  Status: 0x10
> > (IN_READOUT).  EC: 2
> > 10: lwroc_triva_state.c:2428: [EB lyserv] EB: Status: 0x0.
> > 8: lwroc_triva_state.c:2488: Node(s) busy in readout, waiting...
> > 10: crate/crate.c:900: .crate_init(MCAL) {
> > 10: crate/crate.c:924: ..Slow-init module[0]=GSI_VULOM.
> > LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
> > 10: crate/crate.c:924: ..Slow-init module[1]=CAEN_V560.
> > 10: module/map/map.c:224: ....rd(0x33333300+0xfa/16)=963ns
> > wr(0x33333300+0x50/16)=713ns.
> > 10: crate/crate.c:977: ..Fast-init module[0]=GSI_VULOM.
> > 10: crate/crate.c:977: ..Fast-init module[1]=CAEN_V560.
> > 10: crate/crate.c:1074: .crate_init(MCAL) }
> > 5: f_user.c:1257: .had readout error, ret=0x14, trigger=14, prev=0
> 
> This is the last log from the readout part, it says trigger 14 was
> handled. This trigger is always fired in MBS-like DAQ's before starting
> the event loop, but no master start is delivered to the modules. No
> physical event is associated with it, so we expect no counter increases
> and no event data.
> 
> In general we read out "everything" for every trigger and rely on
> modules reporting their status/content properly. Modules like the v560
> ruin this since we always check it for all events and the counter always
> increments, and it clearly shouldn't for trigger 14. So, in this case it
> really did test the incorrect software logic...
> 
> I had a look in the v560 manual once more and only now did I realize
> that it is not trigger based, the scalers are only available on-the-fly.
> The event counter makes no sense then, so I will concede and suggest you
> set the module counter mask to 0.
> 
> Have a look in module/gsi_vftx2/gsi_vftx2.c line 79, another module
> without an event-counter which skips the whole counting stuff:
> 
> vftx2->module.event_counter.mask = 0;
> 
> Put something similar in module/caen_v560/caen_v560.c line 41, and feel
> free to remove the increment in readout_dt.
> 
> Hope the extra info isn't too verbose...
> 
> Best regards,
> Hans
> 
>


More information about the subexp-daq mailing list