The D0 online monitoring and automatic DAQ recovery Page: 3 of 8
This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to UNT Digital Library by the UNT Libraries Government Documents Department.
Computing in High Energy and Nuclear Physics, La Jolla Ca, March 24-28, 2003
ACE_Svc_Handler object. This object manages a TCP/IP
stream and dispatches data to an input queue as it arrives.
There is an individual thread that picks off the data and
processes it.
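The queue-and-thread arrangement described here can be sketched in Python as follows. This is only an illustration of the pattern, not the C++ code used in the MS; the names `start_worker` and `handle` and the `None` stop sentinel are assumptions made for the example.

```python
import queue
import threading


def start_worker(inbox, handle):
    """Start a thread that drains `inbox` and calls `handle` on each item."""
    def run():
        while True:
            item = inbox.get()      # blocks until data arrives
            if item is None:        # hypothetical sentinel: stop the worker
                break
            handle(item)

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def demo():
    """Feed two chunks through the queue and return what the worker saw."""
    inbox = queue.Queue()
    seen = []
    worker = start_worker(inbox, seen.append)
    for chunk in (b"<a/>", b"<b/>"):
        inbox.put(chunk)            # the stream handler's role in this sketch
    inbox.put(None)
    worker.join()
    return seen
```

The thread that reads the socket only enqueues, so a slow processing step does not stall the network read.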
[Block diagram: three Display Handlers exchange data with the displays and feed a central Dispatcher, which connects to three Client Handlers, each with a link to its client.]
Figure 3: Block diagram of the MS's object
structure. The display handlers feed requests to a processing
queue. The dispatcher takes the requests off the queue,
parses them, and sends them to the clients for processing.
Once all the data has been received back by the dispatcher,
the data is sent back to the displays.
The display handlers, dispatcher, and receiver parts of the
client handlers all have an associated thread. The display
requests are linearly queued. The dispatcher removes them
one at a time from the queue and parses the XML. As it
parses the display's request, the dispatcher builds a request
for each client. Once built, the requests are handed off to the
client handlers. After the client handlers have assembled all
the requested data and handed it back to the dispatcher, the
dispatcher builds the complete reply message and sends it
directly back to the waiting display. If a client handler can't
find the data in the cache, it will request it directly from the
client. Most requests take less than 150 ms to complete, and
much less if they involve only cache hits.
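The request flow in this paragraph (parse the display's XML, build a per-client request, use the cache where possible, assemble one reply) might look roughly like the Python sketch below. The element and attribute names (`request`, `client`, `item`, `type`, `name`) are invented for illustration, since the real format is the one shown in Figure 2, and `fetch` is a hypothetical stand-in for the round trip through a client handler. The sketch also runs in a single thread, unlike the MS.

```python
import xml.etree.ElementTree as ET


def dispatch(request_xml, cache, fetch):
    """Answer one display request and return the reply as an XML string.

    cache: dict mapping (client_type, item_name) -> value string
    fetch: callable(client_type, item_names) -> dict of item_name -> value,
           a hypothetical stand-in for asking the live client
    """
    request = ET.fromstring(request_xml)
    reply = ET.Element("reply")
    for client_el in request.findall("client"):
        ctype = client_el.get("type")
        names = [item.get("name") for item in client_el.findall("item")]

        # Only items missing from the cache go to the client.
        missing = [n for n in names if (ctype, n) not in cache]
        if missing:
            for name, value in fetch(ctype, missing).items():
                cache[(ctype, name)] = value

        # Assemble this client's part of the complete reply.
        out = ET.SubElement(reply, "client", type=ctype)
        for n in names:
            item = ET.SubElement(out, "item", name=n)
            item.text = cache.get((ctype, n), "")
    return ET.tostring(reply, encoding="unicode")
```

A request that touches only cached items never calls `fetch`, which matches the observation that cache hits complete much faster.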
All components of the monitor system make connections
to the MS. If the MS is not available the client or display
will keep attempting to reconnect.
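A reconnect loop of this kind could be written as in the following Python sketch. The retry delay and the optional attempt limit are assumptions, as the text gives neither, and the `connect` and `sleep` parameters exist only so the loop can be exercised without a network.

```python
import socket
import time


def connect_with_retry(host, port, delay=1.0, max_attempts=None,
                       connect=socket.create_connection, sleep=time.sleep):
    """Keep trying to connect to (host, port) until it succeeds.

    delay and max_attempts are illustrative; max_attempts=None retries forever.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect((host, port))
        except OSError:
            # Server not available: give up only if a limit was set.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
```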
The protocol for the display is very simple. After making
the connection to the MS, the display sends XML-formatted
requests similar to Figure 2. After the MS retrieves the data
from its cache or requests it from the clients, it returns a
similarly formatted XML document that contains both the data
and all the machines that are of that monitor type. If the data
requested is from a machine type with many copies, such as an
l3xnode, then a copy of the data will be returned for each
machine. Data from a specific machine can also be requested.
The client communication protocol is very similar. The
MS will send the client an XML request very similar in
format to the display's request. The XML is designed such
that the client can just fill in the monitor items one at a time
and reply with that information (using the XML Document
Object Model (DOM)).
There are numerous timeouts in the system to keep it well
behaved even when a client or display misbehaves. If the
request cannot be queued by the display handler the display
gets an error message. If the request sits on the internal
queue longer than one second a timeout message is sent back
to the display. A client has 3 seconds to reply to a request for
data. If it fails to reply within 3 seconds 10 times in a row, it
is disconnected. The dispatcher thread allows 2 seconds for
all clients to return their data. If a client is still busy
processing the previous request when the dispatcher starts a
new one, the dispatcher marks that client as having timed out
in the display's reply.
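The per-client rule (a 3 second reply window, disconnection after 10 consecutive misses) can be expressed as a small piece of bookkeeping. The Python sketch below uses the two numbers from the text; the class and method names are hypothetical, and the reset of the counter after an on-time reply is inferred from the phrase "in a row".

```python
REPLY_TIMEOUT_S = 3.0          # reply window stated in the text
MAX_CONSECUTIVE_MISSES = 10    # consecutive misses before disconnection


class ClientHealth:
    """Track consecutive late or missing replies for one client."""

    def __init__(self):
        self.misses = 0
        self.connected = True

    def record(self, elapsed_s):
        """Record one request; elapsed_s is None if no reply ever arrived.

        Returns True if the reply was on time.
        """
        if elapsed_s is not None and elapsed_s <= REPLY_TIMEOUT_S:
            self.misses = 0    # an on-time reply breaks the run of misses
            return True
        self.misses += 1
        if self.misses >= MAX_CONSECUTIVE_MISSES:
            self.connected = False
        return False
```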
In order to correctly put monitor data in the cache the MS
must parse the reply from the client. This is done with a
high-speed, zero-copy, hand-coded parser.
2.3. Client and Display Design
It was recognized early in the monitor system project that
simple interfaces would make for wider adoption. The
TCP/IP client and display protocols were designed with this
in mind. We have also written APIs and libraries to
implement the protocols. We currently have APIs
implemented in C++ and Python for the client-side protocol,
and implementations in C++, Python, Java, and C# for the
display-side protocol.
When a client first connects it advertises its type and
machine name by sending an initial XML message. Clients
must have a thread listening to the port for incoming
messages and must serve them as fast as possible. If the
client takes longer than 3 seconds to respond, the MS will
flag an error. Repeated failure to respond in time will cause
the MS to drop the client's connection.
We have clients in the system that implement the TCP/IP
and XML protocol directly. We also have a collection of
objects that will take care of all required XML parsing and
data conversion. In fact, it is possible to declare an arbitrary
instance of a data type to be monitored. Using the common
C++ template traits technique, the underlying code will
render the data to XML whenever a request for the data
arrives. Integer counters, for example, can be declared as a
template and then used as a normal integer in most cases. We
have also written a compiled Python module that uses a
simple name-value pairing to set monitoring variables. The
servicing of monitor requests from the MS is invisible to the
user. Both API implementations use the Xerces XML parser
[3].
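The name-value idea, combined with filling in the items of an incoming XML request one at a time, can be sketched in Python as below. This is not the actual D0 module: the class `MonitorRegistry`, its methods, and the `item`/`name` XML layout are assumptions, and the standard-library `xml.etree.ElementTree` is used in place of Xerces to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


class MonitorRegistry:
    """Hold named monitor values and fill them into a request document."""

    def __init__(self):
        self._values = {}

    def set(self, name, value):
        """Publish or update one monitoring variable."""
        self._values[name] = value

    def fill(self, request_xml):
        """Return the request with each known item's value filled in."""
        root = ET.fromstring(request_xml)
        for item in root.iter("item"):
            name = item.get("name")
            if name in self._values:
                item.text = str(self._values[name])
        return ET.tostring(root, encoding="unicode")
```

User code would only call `set`; a listener thread (not shown) would call `fill` for each request, which keeps the servicing of requests out of the user's way.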
We have created a similar set of libraries for the display
writer. The request to the MS is usually part of the display's
main program loop. Displays often vary which MS items they
request depending on the view the user has chosen. The
libraries all incorporate XML parsing of one sort or another,
though further parsing of complex monitor data items is left
entirely up to the display writer. Display writers generally
use the package best suited to the language they are working
in.
A small set of monitor displays are also clients. These
frequently collate large amounts of information and publish
it back in a collated form. This reduces the amount of data
that has to be sent over the wire, especially to a display on
THGT004
Haas, A. et al. The D0 online monitoring and automatic DAQ recovery, article, April 6, 2004; Batavia, Illinois. (https://digital.library.unt.edu/ark:/67531/metadc779188/m1/3/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.