[subexp-daq] Report of a possible bug of the CAEN_V560 module
Hans Toshihide Törnqvist
hans.tornqvist at chalmers.se
Wed Feb 21 15:28:01 CET 2024
Dear Günter,
The most important thing is that these reads now return reasonable
values; the actual values don't mean a whole lot.
One of the manual reads that you did (ofs=0xfa) is what 'map_map' does
for "poke reading". The macros
MAP_POKE_ARGS(fixed_code), or the older
MAP_POKE_ARGS(*v560->read, fixed_code)
tell 'map_map' what address offset to poke, and it depends on each module.
The next thing that happens in 'map_map' is the "poke writing". Could
you try to write to the 'scale_clear' register next? That would be:
rwdump -a0x33333350 -w16,0
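For reference, the addresses in these rwdump calls are just the module base plus the register offset. A minimal C sketch of that arithmetic follows. The helper and enum names (reg_address, format_rwdump, OFS_...) are made up for illustration and are not part of nurdlib, and only the two offsets that appear in this thread (0xfa and 0x50) are used.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Base address used in this thread (all address switches set to "3"). */
#define V560_BASE 0x33333300u

/* Register offsets as they appear in this thread; names are illustrative. */
enum {
	OFS_SCALE_CLEAR = 0x50, /* poke-write target (rwdump -a0x33333350) */
	OFS_FIXED_CODE = 0xfa   /* poke-read target (rwdump -a0x333333fa) */
};

/* Absolute VME address of a 16-bit register at the given byte offset. */
static uint32_t
reg_address(uint32_t base, uint32_t ofs)
{
	/* 16-bit registers sit on even offsets. */
	assert(0 == (ofs & 1));
	return base + ofs;
}

/*
 * Writes the matching rwdump command line into buf and returns the
 * snprintf result. Only the -a, -r16 and -w16,<value> forms shown in
 * this thread are produced.
 */
static int
format_rwdump(char *buf, size_t len, uint32_t addr, int do_write,
    unsigned value)
{
	if (do_write) {
		return snprintf(buf, len, "rwdump -a0x%08x -w16,%u",
		    (unsigned)addr, value);
	}
	return snprintf(buf, len, "rwdump -a0x%08x -r16", (unsigned)addr);
}
```

Calling format_rwdump with reg_address(V560_BASE, OFS_SCALE_CLEAR), do_write=1 and value=0 reproduces the write command above.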
---
In case you would like to look deeper in 'map_map', you can find it in
module/map/map.c around line-number 103. It's not a very complicated
function that does the following:
-) Checks user-mapped memory. You don't need to worry about this, it's
mainly for simulating module memory for tests.
-) Performs the poke-read.
-) Performs the poke-write.
-) If it's a BLT mapping, asks the platform-specific code to do that
without further tests.
-) Otherwise accesses the poke registers many times and times this, to
get an idea about the speed of a single-cycle access.
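The steps above can be sketched roughly as follows. This is not the code in module/map/map.c: every name here (map_sketch, poke_read, poke_write, fake_module, ...) is hypothetical, and a plain array stands in for the module so that the sketch runs without VME hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for module memory (0x100 bytes of registers). */
static uint16_t fake_module[0x80];

enum map_kind { KIND_SICY, KIND_BLT };

/* A poke target: byte offset of a 16-bit register. */
struct poke { size_t ofs; };

struct map_result {
	int is_blt;          /* 1 if handed off to platform BLT code */
	uint16_t first_read; /* value seen by the poke-read */
	double ns_per_read;  /* rough single-cycle read time estimate */
};

static uint16_t
poke_read(const struct poke *p)
{
	assert(p->ofs < sizeof fake_module);
	return fake_module[p->ofs / 2];
}

static void
poke_write(const struct poke *p, uint16_t value)
{
	assert(p->ofs < sizeof fake_module);
	fake_module[p->ofs / 2] = value;
}

static struct map_result
map_sketch(enum map_kind kind, const struct poke *r, const struct poke *w)
{
	enum { N = 1000 };
	struct map_result res = { 0, 0, 0.0 };
	volatile uint16_t sink = 0;
	clock_t t0;
	int i;

	/* Poke-read: on real hardware a bad address would fail here. */
	res.first_read = poke_read(r);
	/* Poke-write. */
	poke_write(w, 0);
	/* BLT mappings go to platform-specific code without more tests. */
	if (KIND_BLT == kind) {
		res.is_blt = 1;
		return res;
	}
	/* Otherwise time many accesses to estimate single-cycle speed. */
	t0 = clock();
	for (i = 0; i < N; ++i) {
		sink = poke_read(r);
	}
	(void)sink;
	res.ns_per_read = 1e9 * (double)(clock() - t0) / CLOCKS_PER_SEC / N;
	return res;
}
```

With the V560 pokes from this thread, r would point at offset 0xfa (fixed_code) and w at offset 0x50 (scale_clear).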
If you want to dig even deeper, you can look in
module/map/map_xpc_3310.c which is what is used in the most recent Linux
Rio4's. It's mainly a wrapper around a proprietary black-box library, so
not scary and scary at the same time.
Best regards,
Hans
On 2024-02-21 14:32, Weber, Guenter Dr. wrote:
> Dear Hans,
>
>
> with the different register addresses it works.
>
>
> RIO4-MCAL-1 mbsdaq > rwdump -a0x333333fa -r16
> Address=0x333333fa
> Raw-read value=0xfaf5
>
>
> RIO4-MCAL-1 mbsdaq > rwdump -a0x333333fc -r16
> Address=0x333333fc
> Raw-read value=0x083a
>
>
> RIO4-MCAL-1 mbsdaq > rwdump -a0x333333fe -r16
> Address=0x333333fe
> Raw-read value=0x01bc
>
> What can we learn from these numbers?
>
>
>
>
> Best greetings
>
> Günter
>
>
>
> ------------------------------------------------------------------------
> *From:* Hans Toshihide Törnqvist <hans.tornqvist at chalmers.se>
> *Sent:* Wednesday, 21 February 2024 12:43:06
> *To:* Weber, Guenter Dr.
> *Subject:* Re: [subexp-daq] Report of a possible bug of the CAEN_V560
> module
> Hmm, looks like address offset 0 is "not used", could you try
> -a0x333333fa? Or fe and fc at the end, they should be some read-only
> registers.
>
>
> "Weber, Guenter Dr." <g.weber at hi-jena.gsi.de> wrote: (21 February 2024
> 12:06:00 CET)
>
> Different VME slot of the V560 module, same result. :-(
>
> ------------------------------------------------------------------------
> *From:* subexp-daq <subexp-daq-bounces at lists.chalmers.se> on behalf
> of Weber, Guenter Dr. <g.weber at hi-jena.gsi.de>
> *Sent:* Wednesday, 21 February 2024 11:40:25
> *To:* Hans Toshihide Törnqvist; Discuss use of Nurdlib, TRLO II,
> drasi and UCESB.
> *Subject:* Re: [subexp-daq] Report of a possible bug of the
> CAEN_V560 module
>
> Dear Hans,
>
>
> the output from manual reading of the module indeed shows a problem:
>
>
> RIO4-MCAL-1 mbsdaq > rwdump -a0x33333300 -r16
> Address=0x33333300
> Raw-read value=rwdump: line 28: 593 Bus error
> $PREFIX $f "$@"
>
>
> The module was working with this address in the other DAQ system (as
> we did not know the order of the individual switches, we set them
> all to "3"). But I can take it out and put it in again at a
> different slot, if maybe this particular slot has a hardware
> problem. (But I never heard of such thing.)
>
>
>
>
> Best greetings
>
> Günter
>
>
> ------------------------------------------------------------------------
> *From:* Hans Toshihide Törnqvist <hans.tornqvist at chalmers.se>
> *Sent:* Wednesday, 21 February 2024 11:14:44
> *To:* Discuss use of Nurdlib, TRLO II, drasi and UCESB.; Weber,
> Guenter Dr.
> *Subject:* Re: [subexp-daq] Report of a possible bug of the
> CAEN_V560 module
> Dear Günter,
>
> map_map before mapping tries to read and write some given registers
> with a "safe" but slower method of accessing registers, which is
> called "poking" in nurdlib. Maybe the method of access on the rio4
> you have is not safe enough and one of the two pokes fails horribly...
>
> Could you please double check the module address? Could you also try
> using bin/rwdump to read any register in the v560 to see if it's
> accessible at all and not a problem with the module implementation
> in nurdlib?
>
> Something like bin/rwdump -a0x33333300 -r16
>
> Actually the address 0x33333300 looks weird to me, maybe it should
> be 0x33330000?
> Also for reading, try register offsets fa, fc, fe, with 16-bit
> accesses, they should have some interesting values.
>
> Cheers,
> Hans
>
>
> "Weber, Guenter Dr." <g.weber at hi-jena.gsi.de> wrote: (21 February
> 2024 10:18:29 CET)
>
> Dear Håkan,
>
>
> thanks for the hint to flush and sleep. Indeed, I now see that
> the crash happens in init_slow of V560 at this line:
>
>
> v560->sicy_map=map_map(v560->address, MAP_SIZE, KW_NOBLT,
> 0, 0, MAP_POKE_ARGS(fixed_code), MAP_POKE_ARGS(scale_clear));
>
>
> Maybe the code is accessing/writing into a memory location that
> it should better not touch?
>
> This problematic line is then followed by:
>
>
> id=MAP_READ(v560->sicy_map, fixed_code);
>
> The corresponding line in the V560 code on the system that was
> running with this module looks like this:
>
>
> v560->sicy_map=map_map(v560->address, MAP_SIZE_MAX(*v560), KW_NOBLT,
> 0, 0, MAP_POKE_ARGS(*v560->read, fixed_code),
> MAP_POKE_ARGS(*v560->write, scale_clear));
>
> And is followed by:
>
>
> mapped_ptr =map_get_mapped_ptr(v560->sicy_map);
> v560->read=mapped_ptr;
> v560->write=mapped_ptr;
>
> Maybe you already have an idea what causes the problem here?
>
>
> I will now go to the system that was running with V560 and make
> a push of the NURDLIB.
>
>
>
>
> Best greetings
>
> Günter
>
>
>
> ------------------------------------------------------------------------
> *From:* subexp-daq <subexp-daq-bounces at lists.chalmers.se> on
> behalf of Håkan T Johansson <f96hajo at chalmers.se>
> *Sent:* Tuesday, 20 February 2024 20:13:32
> *To:* Discuss use of Nurdlib, TRLO II, drasi and UCESB.
> *Subject:* Re: [subexp-daq] Report of a possible bug of the
> CAEN_V560 module
>
> Dear Günter,
>
> I took the files you provided and for comparison put them in a
> branch
> 'old_caen_v560'.
>
> git diff origin/old_caen_v560..origin/master
>
> however does not show anything which is suspicious to me.
> Perhaps Hans
> can spot something.
>
> Otherwise, the only idea I can come up with is to continue to
> bisect the
> code inside slow init.
>
> However, before that, I would suggest to add
>
> fflush(stdout); sleep(1);
>
> after each printf statement, such that one can be quite sure
> that the
> printout is not eaten when the RIO crash happens. I.e. that it
> actually
> had gotten further than shown by the prints.
>
> Best regards,
> Håkan
>
>
>
>
> On Tue, 20 Feb 2024, Weber, Guenter Dr. wrote:
>
> >
> > Dear friends,
> >
> >
> > I now had a look at the system where the V560 was running. It was also set up
> > by Bastian. And there the code for the V560 module is slightly different
> > from the one included in the NURDLIB branch that I am using on the test
> > system.
> >
> >
> > Maybe you can have a look at it.
> >
> >
> > I also could push the complete NURDLIB from this system, if this helps.
> >
> >
> >
> >
> > Best greetings
> >
> > Günter
> >
> >
> >
> >
> > ____________________________________________________________________________
> > From: subexp-daq <subexp-daq-bounces at lists.chalmers.se> on behalf of Weber,
> > Guenter Dr. <g.weber at hi-jena.gsi.de>
> > Sent: Tuesday, 20 February 2024 10:58:27
> > To: Discuss use of Nurdlib, TRLO II, drasi and UCESB.
> > Subject: [subexp-daq] Report of a possible bug of the CAEN_V560 module
> >
> > Dear friends,
> >
> >
> > I now grabbed a V560 module that was working fine in another DAQ system and
> > put it into our test system.
> >
> >
> > The main.cfg looks like this:
> >
> >
> > log_level=spam # info, verbose, debug, spam
> >
> > CRATE("MCAL") {
> >     GSI_VULOM(0x03000000) {
> >         timestamp = true # needed to get timestamps in the data output
> >         # ecl=0..15
> >     }
> >     BARRIER
> >     CAEN_V560(0x333333300) {
> >         use_veto = true
> >     }
> >     # CAEN_V767A(0x03100000) {
> >     # }
> > }
> >
> > Starting the DAQ now results in a freeze of the RIO4. A reset of the crate
> > is necessary to talk to it again.
> >
> >
> > The problem occurs in the first slow init of the V560 module. To find the
> > exact line, I added some output to CRATE.C:
> >
> >
> > 10: crate/crate.c:923: .Slow-init module[0]=GSI_VULOM.
> > before push_log_level(module)
> > before a_crate->module_init_id = module->id
> > before module->props->init_slow(a_crate, module)
> > LOG: TRLO: MD5SUM: 0x1409285e (CT: 63bb1d44 = 2023-01-08 19:45:08 UTC)
> > before module_init_id_mark(a_crate, module)
> > before pop_log_level(module)
> > 10: crate/crate.c:923: .Slow-init module[1]=CAEN_V560.
> > before push_log_level(module)
> > before a_crate->module_init_id = module->id
> > before module->props->init_slow(a_crate, module)
> >
> >
> > The CRATE.C code now looks like this:
> >
> >
> > TAILQ_FOREACH(module, &a_crate->module_list, next) {
> >     if (NULL == module->props) {
> >         continue;
> >     }
> >     LOGF(info)(LOGL, "Slow-init module[%u]=%s.", module->id,
> >         keyword_get_string(module->type));
> >     printf("before push_log_level(module) \n");
> >     push_log_level(module);
> >     printf("before a_crate->module_init_id = module->id \n");
> >     a_crate->module_init_id = module->id;
> >     printf("before module->props->init_slow(a_crate, module) \n");
> >     if (!module->props->init_slow(a_crate, module)) {
> >         printf("before pop_log_level(module) \n");
> >         pop_log_level(module);
> >         printf("before goto crate_init_done \n");
> >         goto crate_init_done;
> >     }
> >     printf("before module_init_id_mark(a_crate, module) \n");
> >     module_init_id_mark(a_crate, module);
> >     printf("before pop_log_level(module) \n");
> >     pop_log_level(module);
> > }
> >
> > Thus, to me it looks like the check "if (!module->props->init_slow(a_crate,
> > module)) ..." is doing something quite horrible to the RIO4.
> >
> >
> > This is unfortunate, because my original aim was to show that there is also
> > a bug/mistake in readout_dt of the V560 module. But I did not come this far.
> >
> >
> > Do you have any idea what might cause the freezing of the RIO4?
> >
> >
> >
> >
> > Best greetings and many thanks
> >
> > Günter
> >
> >
> >
> >
> >
> >
>
>