[subexp-daq] problem when running two DAQs in parallel on one server

Håkan T Johansson f96hajo at chalmers.se
Tue May 14 12:55:27 CEST 2024


Dear Günter,

On Tue, 14 May 2024, Weber, Guenter Dr. wrote:

> Dear Håkan,
>
> Thank you very much for the reply. The explanation makes sense.
>
> This is the EB command for the DAQ already running:
>
> ../drasi/bin/lwrocmerge \
>     --label=EB \
>     --port=7000 \
>     --merge-mode=event \
>     --server=trans \
>     --server=stream,flush=1 \
>     --buf=size=500Mi \
>     --max-ev-size=1Mi \
>     --eb-master=rio4l-2 \
>     --drasi=rio4l-2 \
>     --file-writer \
>     "$@"
> 
> So the only differences are the name "rio4l-2" in the old system vs.
> "rio4l-1" in the new system, and the port number "7000" vs. "8000". If
> "--server=trans" implicitly assumes a standard port, then this will be the
> same for both commands.

A TCP port is an exclusive resource and can only be bound by one process 
(socket file descriptor) at a time.  The process that was started second 
should therefore have printed error messages like:

5: lwroc_net_io_util.c:85: Failure binding trans pmap socket to port 7234.
8: lwroc_net_io_util.c:87: Trying again in 10 s, attempt 3/20.

or

5: lwroc_net_io_util.c:85: Failure binding stream socket to port 6002.
8: lwroc_net_io_util.c:87: Trying again in 10 s, attempt 3/20.
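
The exclusivity is easy to reproduce outside drasi.  The following is 
only a minimal Python sketch, not drasi code: the helper name try_bind 
is made up for the illustration, and it binds on 127.0.0.1 with an 
OS-chosen port so that it does not disturb a running DAQ.  The second 
bind to the same port fails with EADDRINUSE, which is the condition 
behind the "Failure binding" messages above.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Bind a listening TCP socket (hypothetical helper for illustration).

    Returns (socket, None) on success or (None, errno) on failure.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        s.listen(1)
        return s, None
    except OSError as e:
        s.close()
        return None, e.errno


# First "process": port 0 lets the OS pick a free port; read it back.
first, err1 = try_bind(0)
port = first.getsockname()[1]

# Second "process": the same port is already bound, so bind() fails.
second, err2 = try_bind(port)
print("port", port, "second bind refused:", err2 == errno.EADDRINUSE)

first.close()
```

The same reasoning applies to two event builders on one host: if both 
ask for the same default server port, only the first one gets it.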

> Is there somewhere an illustration/explanation how exactly LWROCMERGE and
> M_READ_MEB work together? This would help a lot.

There is no illustration, but these may be the closest descriptions:

https://fy.chalmers.se/~f96hajo/drasi/doc/drasi_getting_started.html#single-node-operation-with-trigger-module-and-event-builder

and

https://fy.chalmers.se/~f96hajo/drasi/doc/drasi_security.html#tcp-port-verification

where I have at least added a table.

Agreed, a figure would be nice...

Cheers,
Håkan


> 
> Best greetings
> 
> Günter
> 
> 
> 
> 
> ____________________________________________________________________________
> From: subexp-daq <subexp-daq-bounces at lists.chalmers.se> on behalf of Håkan
> T Johansson <f96hajo at chalmers.se>
> Sent: Monday, 13 May 2024 17:24:19
> To: Discuss use of Nurdlib, TRLO II, drasi and UCESB.
> Subject: Re: [subexp-daq] problem when running two DAQs in parallel on one
> server
> 
> Possibly your second EB has not started because the first is already
> running.  The
> 
> --server=trans  and  --server=stream,...
> 
> lines below use the default ports.  If those are already in use by the
> first process, the second cannot start.  Try adding port=...
> 
> > ../drasi/bin/lwrocmerge \
> >     --label=EB \
> >     --port=8000 \
> >     --merge-mode=event \
> >     --server=trans \
> >     --server=stream,flush=1 \
> >     --buf=size=500Mi \
> >     --max-ev-size=1Mi \
> >     --eb-master=rio4l-1 \
> >     --drasi=rio4l-1 \
> >     --file-writer \
> >     "$@"
> 
> Cheers,
> Håkan
> 
> >
> > However, it seems that something is missing because the DAQ fails at
> > startup:
> >
> >
> > Executing 'main'.
> > CPUS: 1
> > delay: 1
> > 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> > Message not logged - thread has no error buffer yet...
> > CPUS: 1
> > delay: 1
> > 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> > Message not logged - thread has no error buffer yet...
> > HOST: RIO4L-1
> > Token: 21301afe (21301afe:21301afe) [/mbsusr/mbsdaq/.drasi_tokens/blub]
> > 10: lwroc_hostname_util.c:460: Own address: 192.168.1.71/255.255.255.0 (eth1).
> > cfg: 'master, at 0x02,fctime=10,ctime=300' => 33554432
> > 10: lwroc_data_pipe.c:146: Data buffer READOUT_PIPE, fmt LMD, size 209715200 = 0x0c800000, 3 consumers.
> > 10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
> > 10: lwroc_net_io.c:169: Started server on port 56583 (data port 34116).
> > 10: lwroc_net_trans.c:1808: [stream:9003] Started stream server on port 9003, data 56265.
> > client union size: 244 240 188 508 640 204 204  => 640
> > 10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file: /tmp/drasi.u1001/drasi.hints.u1001.RIO4L-1:56583
> > 10: lwroc_main.c:706: Log message rate limit not in effect.
> > 10: lwroc_readout.c:112: call readout_init...
> > 10: lwroc_thread_util.c:118: This is the triva control thread!
> > 10: lwroc_thread_util.c:118: This is the net io thread!
> > 10: lwroc_thread_util.c:118: This is the slow_async thread!
> > 10: lwroc_thread_util.c:118: This is the data server thread!
> > 8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
> > 8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_message_internal.c:485: Message client connected!
> > 8: lwroc_triva_state.c:414: Waited 5 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 10 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 20 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 40 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > ^C8: lwroc_main.c:105: SIGINT received.
> > 10: lwroc_thread_util.c:62: Set terminate first!  (main)
> > 10: lwroc_thread_util.c:82: main thread done!  (Next term: data server)
> > 10: lwroc_thread_util.c:82: data server thread done!  (Next term: slow_async)
> > 10: lwroc_thread_util.c:82: slow_async thread done!  (Next term: net io)
> > 10: lwroc_thread_util.c:82: net io thread done!  (Next term: triva control)
> > Performing hardware cleanup (TRIVA HALT, RESET)...
> >
> > I would really appreciate it if you could give me a hint about what is
> > going on. Many thanks!
> >
> >
> >
> >
> >
> > Best greetings
> >
> > Günter

