Description
Modern light sources produce too many signals for a small operations team to monitor in real time. As a result, recovering from faults can require long downtimes, or, even worse, subtle performance issues may persist undiscovered. Existing automated methods tend to rely on pre-set limits, which either miss subtle problems or produce too many false positives. AI methods can solve both problems, but deep learning techniques typically require extensive labeled training sets, which may not exist for anomaly detection tasks. Here we will show work on unsupervised AI methods developed to find problems at the Linac Coherent Light Source (LCLS). Whereas most unsupervised AI methods are based on distance or density metrics, we will describe a coincidence-based method that identifies faults through simultaneous changes in sub-system and beam behavior. We have applied the method to radio-frequency (RF) station faults, the most common cause of lost beam at LCLS, and find that it can be fully automated while identifying 50% more events with 6x fewer false positives than the existing alarm system. We will also show work on a general outlier detection method, including an example of finding a previously unknown beam-scraping event.
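To illustrate the coincidence idea described above, here is a minimal sketch in Python. It is not the method used at LCLS: the robust z-score change detector, the threshold, the function names, and the synthetic `rf_amplitude` and `beam_energy` signals are all assumptions made for illustration. The only point it demonstrates is that a sample is flagged when the sub-system signal and the beam signal deviate at the same time, and not when either deviates alone.

```python
import numpy as np


def robust_z(x):
    """Robust z-score based on the median and the median absolute deviation."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    scale = 1.4826 * mad  # makes the MAD comparable to a Gaussian sigma
    if scale == 0:
        return np.zeros_like(x)
    return (x - med) / scale


def coincident_anomalies(subsystem, beam, threshold=5.0):
    """Return indices where BOTH signals deviate beyond the threshold.

    Hypothetical helper: the threshold and the z-score detector are
    illustrative assumptions, not values taken from the talk.
    """
    z_sub = np.abs(robust_z(subsystem))
    z_beam = np.abs(robust_z(beam))
    return np.flatnonzero((z_sub > threshold) & (z_beam > threshold))


# Synthetic, deterministic stand-ins for an RF station readback and a beam signal.
t = np.arange(1000)
rf_amplitude = 0.1 * np.sin(t)
beam_energy = 0.1 * np.cos(t)

rf_amplitude[300] += 5.0  # RF glitch ...
beam_energy[300] -= 5.0   # ... with a simultaneous beam excursion
rf_amplitude[700] += 5.0  # RF glitch with no effect on the beam
beam_energy[850] -= 5.0   # beam excursion with no RF change

events = coincident_anomalies(rf_amplitude, beam_energy)
print(events.tolist())  # [300]
```

Requiring both signals to move together is what suppresses the isolated excursions at indices 700 and 850, which a fixed limit on either signal alone would have reported.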