Troubleshooting the FVTX in Run-12

Some links:
  • FVTX elog
  • Run-12 elog
  • FVTX email archive

Online monitoring - DAQ summary:

The data monitor shift person watches all the detectors, and the first sign of trouble is often the DAQ summary of our data, shown here. What is shown is normal: parity errors at the 10^-3 level. The bars are:
  • Good
  • Bad
  • GL1Clk
  • FEMClk
  • Parity
  • EvtNo
  • DcmChkSum
  • Length
  • DCMFemPar
[note: during noise runs, 'normal' means xxxxx]

  • If there are GL1Clk errors: for the record, look at the DAQ -> FVTX plots to see which packets (cages) were down, and report this to the elog (later).

    Look at the Expert GUI:

    • If the outer ring is ~all red → do a Run Control -> Master Cleanup
    • If the outer ring is ~all green AND the FEM Latch ring (3rd) is all red → 1) stop the run, 2) do a Run Control -> VME start run, 3) Update Status. All should be green now.

    For a GL1Clk error (called GL1 sync on the Expert GUI), go to the Online Monitoring DAQ -> FVTX page, where you can find which packet is bad. Usually a whole cage is bad at the same time.
    Then go to the Expert GUI and follow the FEM IB Glink error instructions below.

  • If there are FEMClk errors: this error now rarely occurs, and there is no known cause or procedure.

  • If you see Parity errors > 10^-3, check Online Monitoring DAQ → FVTX, and look at Parity. Normally, there are a few packets with a small number of errors. If many packets show parity errors: 1) don't stop this run, 2) note it in the elog, 3) if it still happens in the next run, call Aaron or Jin.

  • If there are EvtNo (event number) errors, this is a big deal: 1) write to the elog, 2) call a discussion.

  • DcmChkSum, Length, and DCMFemPar errors occur never or very rarely; it is not known what causes them.
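
The checklist above can be sketched as a small lookup, purely as an illustration. This is not an existing FVTX tool: the function name `daq_summary_advice` is hypothetical, and it assumes that each bar is an event count and that the parity fraction is Parity / (Good + Bad), which the text above does not state.

```python
from typing import Dict, List

# "Normal" parity level quoted in the text above.
PARITY_NORMAL_FRACTION = 1e-3


def daq_summary_advice(counts: Dict[str, int]) -> List[str]:
    """Return the checklist items that apply to one set of DAQ summary bars.

    Assumption (not from the original page): the bars are event counts and
    Good + Bad is the total number of events.
    """
    total = counts.get("Good", 0) + counts.get("Bad", 0)
    if total == 0:
        return ["no events counted"]

    advice: List[str] = []
    if counts.get("GL1Clk", 0) > 0:
        advice.append("GL1Clk: find the bad packets on DAQ -> FVTX, then check the Expert GUI")
    if counts.get("FEMClk", 0) > 0:
        advice.append("FEMClk: rare, no known cause or procedure")
    if counts.get("Parity", 0) / total > PARITY_NORMAL_FRACTION:
        advice.append("Parity: do not stop the run, note in elog, call an expert if it persists")
    if counts.get("EvtNo", 0) > 0:
        advice.append("EvtNo: big deal, write to the elog")
    for bar in ("DcmChkSum", "Length", "DCMFemPar"):
        if counts.get(bar, 0) > 0:
            advice.append(f"{bar}: very rare, cause not known")
    return advice
```

With these assumptions, 50 parity errors in 100000 events is a fraction of 5 x 10^-4, below the 10^-3 level, so no parity advice is returned.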

Expert GUI:

  1. FEM IB Glink: if red, glink between the FEM IB and the GTM is lost.
    • go to the Expert GUI -> Run Control menu -> Master Cleanup
    • to check: Update Status

  2. FEM n presence: green if the FEM board is alive. Red: FEM is dead or missing.

  3. FEM latch: indicates the latch of the data fiber from the ROC to the FEM.
    - A 'Cleanup' or 'Master Cleanup' will turn these all red.
    - Some are red because they are turned off in the Latch Database (if you click on the Latch DB tab, the number in that section is <16)

    Other possible causes, to investigate in this order:
    • ROC is not talking
      • If ROC FO sync is green -> VME start run
      • If ROC FO sync is red -> do a Cleanup. If cleanup fails, cycle ROC power, do Cleanup again
    • FEM not responding -> power cycle FEM crate -> do a Cleanup, check status.
    • fiber is broken

  4. ROC FO sync:
    • Just one ROC has FO sync loss: click on the ROC and choose FO sync from the menu. If that does not help, proceed as above.
    • All ROCs have FO sync loss: do a Cleanup as above, cycle ...
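
The investigation order for a red FEM latch (item 3) can be written down as an ordered list of steps. This is only a sketch: the function `red_latch_steps` and its two boolean inputs are hypothetical names for what the shift person reads off the Expert GUI, not part of any FVTX software.

```python
from typing import List


def red_latch_steps(disabled_in_latch_db: bool, roc_fo_sync_green: bool) -> List[str]:
    """Ordered steps to try for one red FEM latch, following item 3 above.

    Both arguments are hypothetical: they stand for what is seen on the
    Latch DB tab and on the ROC FO sync indicator.
    """
    if disabled_in_latch_db:
        # Red is expected when the latch is turned off in the Latch Database.
        return ["expected: turned off in the Latch DB"]

    steps: List[str] = []
    # First suspect: the ROC is not talking.
    if roc_fo_sync_green:
        steps.append("VME start run")
    else:
        steps.append("Cleanup; if it fails, cycle ROC power and do Cleanup again")
    # Next suspects, in the order given in the text.
    steps.append("if still red: power cycle the FEM crate, do a Cleanup, check status")
    steps.append("if still red: suspect a broken fiber")
    return steps
```

For example, `red_latch_steps(False, True)` starts with "VME start run" and then falls through to the FEM crate and fiber checks.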

DCM and SEB problems:

If no error conditions were seen on the Expert GUI, the problem could be in the DCM or the SEB. First thing to try:
xxx
This is a normal picture for the North. You get phone calls when one or more of these turn mostly red. Massive trips are likely due to a thermocouple or cooling liquid flow interlock trip. See this page. Talk to the VC shift person to check the thermocouple status. If the outages are not widespread (whole quadrants), use the diagnostics above to isolate and fix the problem.

Same for South

This is how it looks now. In the past there were holes in this distribution; these have been solved. If there are big sections missing, check for a cooling system trip, same as above.

Same for the South
Low bins (inefficient package) may be due to the wedge come-and-go issue.


Hubert Van Hecke
Last modified: Sun Jun 10 00:06:55 EDT 2012