[subexp-daq] problem when running two DAQs in parallel on one server

Håkan T Johansson f96hajo at chalmers.se
Tue May 14 14:51:05 CEST 2024


On Tue, 14 May 2024, Weber, Guenter Dr. wrote:

> 
> Dear all,
> 
> 
> I now found in an old folder from Bastian something helpful that I modified
> a bit:
> 
> 
> name=2024_Na22
> rio=rio4l-1
> EXP_NAME=2023_na22
> EXP_PATH=/LynxOS/mbsusr/mbsdaq/daq/${EXP_NAME}
> MAIN_STREAM_PORT=8010
> EB_TRANS_PORT=8011
> EB_STREAM_PORT=8012
> SERV_PORT=8013
> EB_DRASI_PORT=7010
> EB_MAIN=${rio}
> 
> Starting DRASI now looks like this:
> 
> 
> ../r3bfuser/build_cc_ppc-linux_4.2.2_debug/m_read_meb.drasi \
>     --triva=master,@0x02,fctime=10,ctime=300 \
>     --log-no-rate-limit \

--log-no-rate-limit  should not really be used, and since things have 
changed I'd suggest removing it.  If a DAQ produces messages, then 
typically something is wrong...  And the DAQ will now limit the rate of 
produced messages.

>     --server=stream:${SERV_PORT} \

The option above is just used to 'spy' on the data, so it is not strictly 
needed.

>     --server=drasi,dest=lyserv:${EB_DRASI_PORT} \
>     --buf=size=200Mi \
>     --max-ev-size=0x100000 \
>     --eb=lyserv:${EB_DRASI_PORT} \
>     --subev=crate=0,type=88,subtype=8800,control=0,procid=12 \
>     "$@"
> 
> The EVENT BUILDER:
> 
> 
> ../drasi/bin/lwrocmerge \
>     --label=EB \
>     --port=${EB_DRASI_PORT} \
>     --merge-mode=event \
>     --server=trans:${EB_TRANS_PORT} \
>     --server=stream:${EB_STREAM_PORT},flush=1 \
>     --buf=size=500Mi \
>     --max-ev-size=1Mi \
>     --eb-master=${EB_MAIN} \
>     --drasi=${EB_MAIN} \
>     --file-writer \
>     "$@"
> 
> The LOGGING:
> 
> 
> $EXP_PATH/drasi/bin/lwrocmon ${EB_MAIN} localhost:${EB_DRASI_PORT} --log
> 
> 
> The SERVER:
> 
> 
> while : ; do
> ../ucesb/empty/empty \
>     stream://localhost:${EB_STREAM_PORT} \
>            --server=stream:${SERV_PORT},flush=1
> sleep 5
> done
> 
> It looks like the DAQ is running now. However, unfortunately, I still do not
> fully understand what I am doing here. For example, the MAIN_STREAM_PORT
> defined above is not used.

I guess it was used as the port for the stream server from the readout 
node (the one for 'spy' use above).

> And why is DRASI using the SERV_PORT as well as
> the UCESB SERVER?

Up to the user :-)

---

With several systems running partially on the same machine, it is probably 
a good idea to define the ports explicitly for all systems.
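
As a minimal sketch of that idea (not part of drasi or UCESB): the shell
fragment below collects the port numbers of both systems in one place and
reports any number that is used twice.  The helper name duplicate_ports and
the NEW_PORTS/OLD_PORTS variables are made up for illustration.  7000 is the
--port of the EB already running; its trans and stream servers currently use
default ports that are not listed in this thread, so they would have to be
added once they are set explicitly.

```shell
#!/bin/sh
# Hypothetical helper: print every port number that appears more than once
# among the arguments.  Prints nothing if all ports are distinct.
duplicate_ports() {
    printf '%s\n' "$@" | sort -n | uniq -d
}

# Ports of the new system, taken from the variables quoted above
# (MAIN_STREAM_PORT, EB_TRANS_PORT, EB_STREAM_PORT, SERV_PORT, EB_DRASI_PORT).
NEW_PORTS="8010 8011 8012 8013 7010"

# Ports of the system already running.  Only the EB --port is known here;
# this list is an assumption and is incomplete until the other ports are
# given explicitly.
OLD_PORTS="7000"

# Word splitting of the two lists is intended here.
dups=$(duplicate_ports $NEW_PORTS $OLD_PORTS)
if [ -n "$dups" ]; then
    echo "port(s) used twice: $dups" >&2
else
    echo "all ports distinct"
fi
```

This only compares the numbers written in the scripts.  It does not check
whether some other process on the machine already holds one of the ports.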

Cheers,
Håkan




> 
> 
> 
> Sorry for all the questions. And thank you so much!
> 
> 
> 
> 
> 
> 
> Best greetings
> 
> Günter
> 
> 
> 
> 
> ____________________________________________________________________________
> From: subexp-daq <subexp-daq-bounces at lists.chalmers.se> on behalf of Weber,
> Guenter Dr. <g.weber at hi-jena.gsi.de>
> Sent: Tuesday, 14 May 2024 12:13:01
> To: Discuss use of Nurdlib, TRLO II, drasi and UCESB.
> Subject: Re: [subexp-daq] problem when running two DAQs in parallel on one
> server  
> 
> 
> Dear Håkan,
> 
> 
> thank you very much for the reply. The explanation makes sense.
> 
> 
> This is the EB command for the DAQ already running:
> 
> 
> ../drasi/bin/lwrocmerge \
>     --label=EB \
>     --port=7000 \
>     --merge-mode=event \
>     --server=trans \
>     --server=stream,flush=1 \
>     --buf=size=500Mi \
>     --max-ev-size=1Mi \
>     --eb-master=rio4l-2 \
>     --drasi=rio4l-2 \
>     --file-writer \
>     "$@"
> 
> So, the only differences are the name "rio4l-2" in the old system vs.
> "rio4l-1" in the new system, and the port number "7000" vs. "8000". If
> "--server=trans" implicitly assumes a standard port, then this will be the
> same for both commands.
> 
> 
> Is there an illustration or explanation somewhere of how exactly LWROCMERGE
> and M_READ_MEB work together? This would help a lot.
> 
> 
> 
> 
> 
> 
> 
> Best greetings
> 
> Günter
> 
> 
> 
> 
> ____________________________________________________________________________
> From: subexp-daq <subexp-daq-bounces at lists.chalmers.se> on behalf of Håkan
> T Johansson <f96hajo at chalmers.se>
> Sent: Monday, 13 May 2024 17:24:19
> To: Discuss use of Nurdlib, TRLO II, drasi and UCESB.
> Subject: Re: [subexp-daq] problem when running two DAQs in parallel on one
> server  
> 
> Possibly your second EB has not started because the first is already
> running: the
> 
> --server=trans  and  --server=stream,...
> 
> lines below use the default ports, and if those are already in use by the
> first process, the second cannot start.  Try adding port=...
> 
> > ../drasi/bin/lwrocmerge \
> >     --label=EB \
> >     --port=8000 \
> >     --merge-mode=event \
> >     --server=trans \
> >     --server=stream,flush=1 \
> >     --buf=size=500Mi \
> >     --max-ev-size=1Mi \
> >     --eb-master=rio4l-1 \
> >     --drasi=rio4l-1 \
> >     --file-writer \
> >     "$@"
> 
> Cheers,
> Håkan
> 
> >
> > However, it seems that something is missing because the DAQ fails at
> > startup:
> >
> >
> > Executing 'main'.
> > CPUS: 1
> > delay: 1
> > 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> > Message not logged - thread has no error buffer yet...
> > CPUS: 1
> > delay: 1
> > 10: lwroc_hostname_util.c:109: Host 'lyserv' known as 192.168.1.1 (port: 8000).
> > Message not logged - thread has no error buffer yet...
> > HOST: RIO4L-1
> > Token: 21301afe (21301afe:21301afe) [/mbsusr/mbsdaq/.drasi_tokens/blub]
> > 10: lwroc_hostname_util.c:460: Own address: 192.168.1.71/255.255.255.0 (eth1).
> > cfg: 'master,@0x02,fctime=10,ctime=300' => 33554432
> > 10: lwroc_data_pipe.c:146: Data buffer READOUT_PIPE, fmt LMD, size 209715200 = 0x0c800000, 3 consumers.
> > 10: lwroc_triva_readout.c:66: Silence TRIVA  (HALT)
> > 10: lwroc_net_io.c:169: Started server on port 56583 (data port 34116).
> > 10: lwroc_net_trans.c:1808: [stream:9003] Started stream server on port 9003, data 56265.
> > client union size: 244 240 188 508 640 204 204  => 640
> > 10: lwroc_udp_awaken_hints.c:159: UDP awaken hints file: /tmp/drasi.u1001/drasi.hints.u1001.RIO4L-1:56583
> > 10: lwroc_main.c:706: Log message rate limit not in effect.
> > 10: lwroc_readout.c:112: call readout_init...
> > 10: lwroc_thread_util.c:118: This is the triva control thread!
> > 10: lwroc_thread_util.c:118: This is the net io thread!
> > 10: lwroc_thread_util.c:118: This is the slow_async thread!
> > 10: lwroc_thread_util.c:118: This is the data server thread!
> > 8: lwroc_message_wait.c:86: Waited 1 seconds for msg client.
> > 8: lwroc_triva_state.c:414: Waited 1 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_message_internal.c:485: Message client connected!
> > 8: lwroc_triva_state.c:414: Waited 5 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 10 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 20 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 8: lwroc_triva_state.c:414: Waited 40 seconds for initial slave and EB connection(s):
> > 8: lwroc_triva_state.c:422: [EB lyserv:8000] (state 0)
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > 10: lwroc_net_outgoing.c:383: [revlink: lyserv:8000] Timeout waiting for outgoing link establishment.
> > ^C8: lwroc_main.c:105: SIGINT received.
> > 10: lwroc_thread_util.c:62: Set terminate first!  (main)
> > 10: lwroc_thread_util.c:82: main thread done!  (Next term: data server)
> > 10: lwroc_thread_util.c:82: data server thread done!  (Next term: slow_async)
> > 10: lwroc_thread_util.c:82: slow_async thread done!  (Next term: net io)
> > 10: lwroc_thread_util.c:82: net io thread done!  (Next term: triva control)
> > Performing hardware cleanup (TRIVA HALT, RESET)...
> >
> > I would really appreciate it if you could give me a hint about what is
> > going on. Many thanks!
> >
> >
> >
> >
> >
> > Best greetings
> >
> > Günter
> >
> >
> >
> >
> >
> >
> >
> 
>

