USRC’s data analytics research focuses more on humans looking at raw data to gain insights, whereas USRC’s machine learning research focuses on using tools to discover knowledge. We consider the distinction to be that the analytics research is somewhat more about data exploration.
The data analytics research is led jointly by Dr. Nathan DeBardeleben and Lissa Baseman.
Below are the staff considered to be in the “Data Analytics” group.
Sean works on kernel-level support of systems in research and production at Los Alamos. At USRC he researches soft-error resilience and scalable system software.
Hugh participated in the design and implementation of the Linux Noise Detective. The Linux Noise Detective is a Linux kernel module and a GUI that collect process data directly from the kernel (on multiple cluster nodes simultaneously) and analyze the data to determine the sources of system noise. He also participated in the design and development of the XGet file transfer software. XGet scalably transfers files to nodes within a cluster by building a tree of participants and delegating serving duties to optimal slave nodes. He participated in the development of the XCPU cluster management system. XCPU keeps the state of the cluster distributed across all nodes, allowing easy configuration of hot-spare management nodes and graceful failover that doesn't require canceling the running jobs in case of head node failure.
Lissa is an applied machine learning researcher and data scientist working on the resilience and fault-tolerance team. At USRC, her work spans using statistical relational models for fault characterization and mitigation as well as developing anomaly detection techniques for large-scale monitoring of supercomputing facilities. Before joining USRC, Lissa contributed to quantum algorithms for machine learning at LANL’s Center for Nonlinear Studies. Her background, including work on social network analysis with the Human Language Technology group at MIT Lincoln Laboratory and a short time at a startup back in Massachusetts, is primarily in the development and application of probabilistic graphical models to new relational and/or temporal domains. Lissa received her MS in Computer Science from the University of Massachusetts Amherst and her BA, also in Computer Science, from Amherst College.
Dr. Song Fu
Assistant Professor in Computer Science and Engineering, University of North Texas
Song Fu is an Assistant Professor in the Department of Computer Science and Engineering at the University of North Texas. His research focuses on reliability and energy efficiency of parallel and distributed systems. Song works with Nathan DeBardeleben, Mike Lang, and the USRC Systems Group on resilience, fault tolerance, and power management of ultra-scale computers. The goal is to reduce the vulnerability of HPC applications and systems to soft errors and failures and to improve power utilization to maximize machine room throughput.
Post Bachelor, Los Alamos National Laboratory
Olena graduated with a B.S. in Computer Science from FIU's School of Computing and Information Sciences in Miami. At FIU she did research at the VISA lab as a URA working on masquerading network traffic for Mission Critical Cloud Computing, and isolation benchmarking of containers. While working at LANL as a PostBac she designed an application model (IMCSim) of the implicit Monte Carlo particle code IMC using the Performance Prediction Toolkit (PPT), a discrete-event simulation-based modeling framework for predicting code performance on a large range of parallel platforms. At USRC she is currently working on predicting DRAM fault locations in HPC systems using structured learning and various ML techniques. Her research interests include HPC, ML, and fault prediction/mitigation.
Graduate Student, Ohio State University
Scott is a graduate student studying network errors on LANL's Trinity supercomputer. While obtaining his BS in Computer Science with a minor in Applied Mathematics from Coastal Carolina University, Scott worked on various projects for the USRC, ranging from fault injection studies with F-SEFI to analyzing ECC of interest to the team. In the fall, Scott will begin the direct PhD track at The Ohio State University.
PhD Student, Computer Science, North Carolina State University
Abida will be a PhD student in computer science at North Carolina State University. She has a bachelor's degree in mathematics from Carnegie Mellon University and a master's degree in computer science from Georgia Tech. During her time at USRC, Abida will help with the project Latent Anomaly Detection for Supercomputing System Performance.
Undergraduate Student, Rollins College
Alexandra is working on a bachelor's degree in Computer Science and Mathematics at Rollins College. At the USRC she is working on creating a model of system logs from high-performance computers. The model will later be used in anomaly detection.
PhD Student, New Mexico State University
Ashley is currently a PhD student studying Computer Science at New Mexico State University. She has a BS and an MS in Electrical Engineering, also from NMSU. Her research is on multivariate time series prediction and segmentation. At USRC she is working on a project focused on detecting anomalies in energy data.
Post Bachelor, Los Alamos National Laboratory
Heather graduated from the University of Georgia with a B.S. in Computer Science. At USRC, she will be studying faults that occur in computer memory.