USRC’s machine learning research is related to the data analytics research area. The work focuses on applying machine learning techniques to gain insight into supercomputers and their usage.
USRC’s machine learning research is led by Lissa Baseman.
Below are the staff considered to be in the “Machine Learning” group.
Sean works on kernel-level support of systems in research and production at Los Alamos. At USRC he researches soft-error resilience and scalable system software.
Lissa is an applied machine learning researcher and data scientist working on the resilience and fault-tolerance team. At USRC, her work spans using statistical relational models for fault characterization and mitigation as well as developing anomaly detection techniques for large-scale monitoring of supercomputing facilities. Before joining USRC, Lissa contributed to quantum algorithms for machine learning at LANL’s Center for Nonlinear Studies. Her background, including work on social network analysis with the Human Language Technology group at MIT Lincoln Laboratory and a short time at a startup back in Massachusetts, is primarily in the development and application of probabilistic graphical models to new relational and/or temporal domains. Lissa received her MS in Computer Science from the University of Massachusetts Amherst and her BA, also in Computer Science, from Amherst College.
Dr. Song Fu
Assistant Professor in Computer Science and Engineering, University of North Texas
Song Fu is an Assistant Professor in the Department of Computer Science and Engineering at the University of North Texas. His research focuses on reliability and energy efficiency of parallel and distributed systems. Song works with Nathan DeBardeleben, Mike Lang, and the USRC Systems Group on resilience, fault tolerance, and power management of ultra-scale computers. The goal is to reduce the vulnerability of HPC applications and systems to soft errors and failures and to improve power utilization to maximize machine room throughput.
Post Bachelor, Los Alamos National Laboratory
Olena graduated with a B.S. in Computer Science from FIU's School of Computing and Information Sciences in Miami. At FIU she did research at the VISA lab as a URA working on masquerading network traffic for Mission Critical Cloud Computing, and isolation benchmarking of containers. While working at LANL as a PostBac she designed an application model (IMCSim) of the implicit Monte Carlo particle code IMC using the Performance Prediction Toolkit (PPT), a discrete-event simulation-based modeling framework for predicting code performance on a large range of parallel platforms. At USRC she is currently working on predicting DRAM fault locations in HPC systems using structured learning and various ML techniques. Her research interests include HPC, ML, and fault prediction/mitigation.
PhD Student, Computer Science, North Carolina State University
Abida will be a PhD student in computer science at North Carolina State University. She has a bachelor's degree in mathematics from Carnegie Mellon University and a master's degree in computer science from Georgia Tech. During her time at USRC, Abida will help with the project Latent Anomaly Detection for Supercomputing System Performance.
Undergraduate Student, Rollins College
Alexandra is working on a bachelor's degree in Computer Science and Mathematics at Rollins College. At USRC she is creating a model of system logs from high-performance computers. The model will later be used in anomaly detection.
PhD Student, New Mexico State University
Ashley is currently a PhD student studying Computer Science at New Mexico State University. She has a BS and MS in Electrical Engineering, also from NMSU. Her research is on multivariate time series prediction and segmentation. At USRC she is working on an anomaly detection project focused on energy data.