[subexp-daq] problem when running two DAQs in parallel on one server

Håkan T Johansson f96hajo at chalmers.se
Mon May 13 17:22:28 CEST 2024


Dear Günter,

I think the issue is with the master-EB connection ("Waited ... seconds for 
initial slave and EB connection(s)"), i.e. the lines on the readout:

   --server=drasi,dest=lyserv:8000 \
   --eb=lyserv:8000 \

or on the EB:

   --eb-master=rio4l-1 \
   --drasi=rio4l-1 \

Questions:

What are the names of the two different readout computers as seen from 
the EB machine?  I see rio4l-1, is that the 2nd (not yet working) node?

What does the EB script for the first (running) DAQ look like?

If this does not help, please also provide the log of the EB process.
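
For the question about names, a minimal sketch of a check that could be run on the EB machine, assuming `getent` is available there. The helper name `show_name` is only illustrative, and rio4l-1 is the only readout name visible in the scripts so far, so the name of the other node would have to be added to the loop:

```shell
# Run on the EB machine (lyserv).
# show_name is a hypothetical helper, not part of drasi.
show_name() {
    # getent looks the name up via the system's name-service
    # configuration (/etc/hosts, DNS, ...).
    if addr=$(getent hosts "$1"); then
        printf '%s: %s\n' "$1" "$addr"
    else
        printf '%s: NOT RESOLVED\n' "$1"
    fi
}

# Only rio4l-1 appears in the scripts; add the other readout node here.
for name in rio4l-1; do
    show_name "$name"
done
```

If both readout nodes resolve to different addresses, the names themselves are probably not the problem, and the EB log becomes the more useful piece of information.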

Cheers,
Håkan



On Mon, 13 May 2024, Weber, Guenter Dr. wrote:

> 
> Dear friends,
> 
> 
> I am trying to run two DAQ systems from a single server. For this I first copied the experiment folder, then compiled everything on the second
> RIO and finally made some adjustments to the settings.
> 
> 
> The first system is started with the following command and runs just fine:
> 
> 
> ../r3bfuser/build_cc_ppc-linux_4.2.2_debug/m_read_meb.drasi \
>     --triva=master,@0x02,fctime=10,ctime=300 \
>     --log-no-rate-limit \
>     --server=stream:8003 \
>     --server=drasi,dest=lyserv:7000 \
>     --buf=size=200Mi \
>     --max-ev-size=0x100000 \
>     --eb=lyserv:7000 \
>     --subev=crate=0,type=88,subtype=8800,control=0,procid=12 \
>     "$@"
> 
> For the second system (which is very similar) I tried to just increase the port numbers by 1000:
> 
> 
> ../r3bfuser/build_cc_ppc-linux_4.2.2_debug/m_read_meb.drasi \
>     --triva=master,@0x02,fctime=10,ctime=300 \
>     --log-no-rate-limit \
>     --server=stream:9003 \
>     --server=drasi,dest=lyserv:8000 \
>     --buf=size=200Mi \
>     --max-ev-size=0x100000 \
>     --eb=lyserv:8000 \
>     --subev=crate=0,type=88,subtype=8800,control=0,procid=12 \
>     "$@"
> 
> 
> In the various auxiliary scripts I also updated the port numbers. For example in the event builder:
> 
> 
> ../drasi/bin/lwrocmerge \
>     --label=EB \
>     --port=8000 \
>     --merge-mode=event \
>     --server=trans \
>     --server=stream,flush=1 \
>     --buf=size=500Mi \
>     --max-ev-size=1Mi \
>     --eb-master=rio4l-1 \
>     --drasi=rio4l-1 \
>     --file-writer \
>     "$@"
> 
> However, it seems that something is missing because the DAQ fails at startup:
> 
> 
> Executing 'main'.
> CPUS: 1
> delay: 1
> 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> Message not logged - thread has no error buffer yet...
> CPUS: 1
> delay: 1
> 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> Message not logged - thread has no error buffer yet...
> HOST: RIO4L-1
> Token: 21301afe (21301afe:21301afe) [/mbsusr/mbsdaq/.drasi_tokens/blub]
> 10: lwroc_hostname_util.c:460: Own address: 192.168.1.71/255.255.255.0 (eth1).
> cfg: 'master,@0x02,fctime=10,ctime=300' => 33554432
> 10: lwroc_data_pipe.c:146: Data buffer READOUT_PIPE, fmt LMD, size 209715200 = 0x0c800000, 3 consumers.
> 10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
> 10: lwroc_net_io.c:169: Started server on port 56583 (data port 34116).
> 10: lwroc_net_trans.c:1808: [stream:9003] Started stream server on port 9003, data 56265.
> client union size: 244 240 188 508 640 204 204  => 640
> 10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file: /tmp/drasi.u1001/drasi.hints.u1001.RIO4L-1:56583
> 10: lwroc_main.c:706: Log message rate limit not in effect.
> 10: lwroc_readout.c:112: call readout_init...
> 10: lwroc_thread_util.c:118: This is the triva control thread!
> 10: lwroc_thread_util.c:118: This is the net io thread!
> 10: lwroc_thread_util.c:118: This is the slow_async thread!
> 10: lwroc_thread_util.c:118: This is the data server thread!
> 8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
> 8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> 10: lwroc_message_internal.c:485: Message client connected!
> 8: lwroc_triva_state.c:414: Waited 5 seconds for initial slave and EB connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 8: lwroc_triva_state.c:414: Waited 10 seconds for initial slave and EB connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 8: lwroc_triva_state.c:414: Waited 20 seconds for initial slave and EB connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 8: lwroc_triva_state.c:414: Waited 40 seconds for initial slave and EB connection(s):
> 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> ^C8: lwroc_main.c:105: SIGINT received.
> 10: lwroc_thread_util.c:62: Set terminate first!  (main)
> 10: lwroc_thread_util.c:82: main thread done!  (Next term: data server)
> 10: lwroc_thread_util.c:82: data server thread done!  (Next term: slow_async)
> 10: lwroc_thread_util.c:82: slow_async thread done!  (Next term: net io)
> 10: lwroc_thread_util.c:82: net io thread done!  (Next term: triva control)
> Performing hardware cleanup (TRIVA HALT, RESET)...
> 
> I would really appreciate it if you could give me a hint as to what is going on. Many thanks!
> 
> 
> 
> 
> 
> Best greetings
> 
> Günter
> 
> 
> 
> 
> 
> 
>

