Characterization of Felix server traffic
Meeting summary 15/11/2016
Participants: Jorn, Andres, Matias
Goal: Sync up knowledge in order to configure the simulation with 'realistic' traffic patterns, approximating the expected output of each Felix server
The file detector_data_rates.pdf contains the expected rates of the different subsystems
Felix server inputs (GBT links)
Felix servers will be connected to the detector through GBT links => a maximum of 24 GBT links per Felix card. The actual number of GBT links will depend on the traffic (so as not to exceed PCIe card and network capacities). "For LAr slice ~8 GBT links"
Each GBT link can be subdivided into many logical eLinks (also called 'data links').
A maximum of 9 eLinks per GBT link. Some subdetectors (LAr and L1Calo) will operate in 'Full Mode', which means there is no logical separation of the GBT link (1 GBT <--> 1 eLink)
Each eLink will receive packets at 100 kHz (coming from L1). The average message size will vary per subdetector (see detector_data_rates.pdf).
There is a considerable difference between subdetectors, for example: LAr will have messages of ~3-5 KB, while NSW will have messages of ~50 B. This impacts the amount of data out of the Felix and the Felix:SW-ROD ratio (1:1 or 2:1)
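As a back-of-the-envelope check, the per-eLink data rate follows directly from the 100 kHz L1 rate and the average message size. The sketch below is illustrative only; the message sizes are the rough figures from the meeting, and the real numbers come from detector_data_rates.pdf.

```python
# Back-of-the-envelope data-rate estimate per eLink.
# All figures are illustrative; actual values come from detector_data_rates.pdf.

L1_RATE_HZ = 100_000  # each eLink receives packets at 100 kHz

def elink_rate_bps(msg_size_bytes: int, rate_hz: int = L1_RATE_HZ) -> int:
    """Data rate of a single eLink in bits per second."""
    return msg_size_bytes * 8 * rate_hz

# Example subsystems (approximate message sizes from the meeting notes):
lar_elink = elink_rate_bps(4_000)  # LAr: ~4 KB messages -> 3.2 Gb/s per (full-mode) link
nsw_elink = elink_rate_bps(50)     # NSW: ~50 B messages -> 40 Mb/s per eLink
```

This order-of-magnitude gap (Gb/s vs. Mb/s per link) is what drives the different Felix:SW-ROD ratios mentioned above.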
Publish (Felix) - Subscriber (SWRODs)
Each Felix will publish which eLinks it handles, and each SWROD will subscribe to the eLinks which it is interested in.
At the time of subscription the subscribers can choose between a lowLatency or a highBandwidth mode (see next section for differences in how NetIO handles these modes)
Implementation: there might be 2 options: 1) using a Mapper node which holds the information of which eLinks are in which Felix; 2) each Felix periodically broadcasts its eLinks.
Most probably this publish-subscribe mechanism will not be modeled in the simulation for the time being.
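Even if it is not modeled for now, option 1 (a Mapper node) amounts to a simple lookup table from eLink to Felix. The class and method names below are purely hypothetical, just to record the idea:

```python
# Hypothetical sketch of option 1: a Mapper node holding the eLink -> Felix mapping.
# Names and structure are illustrative and do not reflect any real implementation.

class Mapper:
    def __init__(self):
        self._elink_to_felix = {}

    def publish(self, felix_id, elinks):
        """A Felix server registers the eLinks it handles."""
        for elink in elinks:
            self._elink_to_felix[elink] = felix_id

    def lookup(self, elink):
        """A SW ROD asks which Felix serves a given eLink before subscribing."""
        return self._elink_to_felix.get(elink)

mapper = Mapper()
mapper.publish("felix-01", ["elink-0", "elink-1"])
mapper.lookup("elink-1")  # -> "felix-01"
```

Option 2 (periodic broadcast) would build the same table on the subscriber side instead of in a central node.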
Felix internal handling of messages (NetIO)
Felix will have multiple threads running. Each thread will handle a few eLinks (~8). Each thread will receive messages from the GBT eLinks, do some processing and forward the messages to all subscribed clients. This is handled by the NetIO library.
So, in the case of Ethernet (not InfiniBand), there will be a TCP connection per <eLink, subscriber> pair.
For forwarding messages to subscribers, NetIO offers 2 modes:
- highBandwidth: buffers incoming messages and, when the buffer is full, sends the complete buffer to the OS. To avoid starvation there is also a timeout which triggers sending the buffer even if it is not full. This is to avoid sending very small messages. There is one separate buffer per connection. The buffer size and the timeout will be configurable; right now buffer=1MB and timeout=2s. This is the preferred mode, and it is the mode that the SWRODs will use.
- lowLatency: in this mode all intermediate buffering is avoided to reduce latency. This is the mode that DCS will use.
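The highBandwidth behaviour described above can be captured in a few lines. This is a simplified sketch, not NetIO's actual code: the class name and API are invented, and only the parameter values (1 MB buffer, 2 s timeout) mirror the notes.

```python
import time

# Simplified sketch of NetIO's highBandwidth mode: messages accumulate in a
# per-connection buffer that is flushed when full or when a timeout expires.
# Class and API are purely illustrative; parameter defaults mirror the notes.

class HighBandwidthBuffer:
    def __init__(self, send, capacity=1_000_000, timeout_s=2.0):
        self.send = send          # callable that ships a batch to the OS/socket
        self.capacity = capacity
        self.timeout_s = timeout_s
        self._buf = bytearray()
        self._first_msg_time = None

    def push(self, msg: bytes):
        if self._first_msg_time is None:
            self._first_msg_time = time.monotonic()
        self._buf.extend(msg)
        if len(self._buf) >= self.capacity:
            self.flush()

    def poll(self):
        """Called periodically: flush a non-empty buffer older than the timeout."""
        if self._buf and time.monotonic() - self._first_msg_time >= self.timeout_s:
            self.flush()

    def flush(self):
        self.send(bytes(self._buf))
        self._buf.clear()
        self._first_msg_time = None
```

In the lowLatency mode, by contrast, each push would translate directly into a send with no intermediate buffer.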
Felix generator applications
While discussing how to validate the simulation, Jorn mentioned they used 2 applications to generate traffic and get performance measurements:
- Felix generator: mocks incoming traffic to the Felix, so the whole Felix card is exercised. This requires the card to be installed in the machine
- Software Felix generator: this is a software mock of a Felix card. It does not require the Felix card to be installed. It can be configured with #eLinks, #GBT links, etc. It uses NetIO.
Ideas on how to use this information (abstraction level)
One idea is to model only the output traffic of the Felix servers into the network (no GBT links, no internal handling, no buffers or threads, etc.).
An approximate representation can be obtained by calculating how much time it would take to fill up the buffer: given the bandwidth of the eLink and the buffer capacity => we can calculate how long the buffer takes to fill. The timeout must also be taken into account.
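The fill-time calculation above reduces to one formula: the effective send interval is the minimum of the buffer fill time and the timeout. A minimal sketch, using the illustrative 1 MB buffer and 2 s timeout from the NetIO section:

```python
# Buffer fill-time estimate for modeling only the Felix output traffic.
# Defaults mirror the notes (1 MB buffer, 2 s timeout); bandwidths are examples.

def send_interval_s(elink_bw_bytes_per_s: float,
                    buffer_bytes: int = 1_000_000,
                    timeout_s: float = 2.0) -> float:
    """Time between sends: the buffer flushes when full or on timeout."""
    fill_time = buffer_bytes / elink_bw_bytes_per_s
    return min(fill_time, timeout_s)

# LAr-like link: ~4 KB at 100 kHz -> 400 MB/s -> buffer fills in 2.5 ms
send_interval_s(400e6)  # -> 0.0025
# Very slow eLink (e.g. 100 kB/s): fill time 10 s, so the 2 s timeout wins
send_interval_s(1e5)    # -> 2.0
```

In the simulation, each modeled Felix output connection could then emit one buffer-sized burst per interval instead of individual messages.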
--
MatiasAlejandroBonaventura - 2016-11-15