Гущин Михаил Иванович

Faculty of Computer Science

Profile on hse.ru · tel.: 27261

Professional interests

data analysis, high-energy physics

Positions

Deputy Head of Laboratory: Faculty of Computer Science, Institute of Artificial Intelligence and Digital Sciences, Laboratory of Methods for Big Data Analysis
Leading Research Fellow: Faculty of Computer Science, Institute of Artificial Intelligence and Digital Sciences, Laboratory of Methods for Big Data Analysis
Associate Professor: Faculty of Computer Science, Department of Big Data and Information Retrieval

Bio

· Started working at HSE University in 2017.
· Academic and teaching experience: 8 years.

Education

2020 · Candidate of Sciences: Moscow Institute of Physics and Technology (National Research University)
2019 · Doctoral studies: Moscow Institute of Physics and Technology (National Research University), specialty "Informatics and Computer Engineering"
2015 · Master's: Moscow Institute of Physics and Technology (State University), specialty "Applied Mathematics and Physics", qualification "Master"
2013 · Bachelor's: Moscow Institute of Physics and Technology (State University), specialty "Applied Mathematics and Physics", qualification "Bachelor"

Work experience

· 2014 - 2017: Research Developer at Yandex LLC

Awards and commendations

· Letter of gratitude from the First Vice Rector of HSE University (August 2024)
· Letter of gratitude from HSE University (May 2024)
· Letter of gratitude from the Vice Rector of HSE University (September 2022)
· Letter of gratitude from the HSE University Faculty of Computer Science (August 2022)
· Bonus for a publication in a List A journal (or an equivalent scholarly publication) (2025–2026, 2024–2025, 2023–2024)
· Bonus for a publication in an international peer-reviewed scholarly publication (2022–2023, 2021–2022, 2020–2022, 2018–2019)
· Best Teacher 2024

Grants and projects

· Toward the degree of Candidate of Sciences

Conferences (1)

· 2021: ACAT 2021 (Daejeon). Talk: Robust Neural Particle Identification Models

Researcher identifiers

ORCID: 0000-0002-8894-6292
ResearcherID: V-4864-2019
SPIN (RSCI): 3997-5907
Google Scholar: https://scholar.google.ru/citations?user=RfWYT08AAAAJ&hl=ru
Scopus AuthorID: 57208118316

Publications (313)

Measurement of the Branching Fraction Ratios $R(D^{+})$ and $R(D^{*+})$ Using Muonic $\tau$ Decays

2025 · ARTICLE · en

The branching fraction ratios of $\bar{B}^0 \to D^{+}\tau^{-}\bar{\nu}_{\tau}$ and $\bar{B}^0 \to D^{*+}\tau^{-}\bar{\nu}_{\tau}$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining $D^{+}$ mesons with $\tau^{-} \to \mu^{-}\bar{\nu}_{\mu}\nu_{\tau}$ candidates, where the $D^{+}$ is reconstructed via the $D^{+} \to K^{-}\pi^{+}\pi^{+}$ decay. The results are $R(D^{+}) = 0.249 \pm 0.043 \pm 0.047$ and $R(D^{*+}) = 0.402 \pm 0.081 \pm 0.085$, where the first uncertainties are statistical and the second systematic. The two measurements have a correlation coefficient of $-0.39$ and are compatible with the standard model.

Long-lived particle reconstruction downstream of the LHCb magnet

2025 · ARTICLE · en

Charged-particle trajectories are usually reconstructed with the LHCb detector using combined information from the tracking devices placed upstream and downstream of the 4 Tm dipole magnet. Trajectories reconstructed using only information from the tracker downstream of the dipole magnet, which are referred to as T tracks, have not been used for physics analysis to date. The challenges of the reconstruction of long-lived particles with T tracks for physics use are discussed and solutions are proposed. The feasibility and the tracking performance are studied using samples of long-lived $\Lambda$ and $K_S^0$ hadrons decaying between 6.0 and 7.6 m downstream of the proton–proton collision point, thereby traversing most of the magnetic field region and providing maximal sensitivity to magnetic and electric dipole moments. The reconstruction can be expanded upstream to about 2.5 m for use in direct searches of exotic long-lived particles. The data used in this analysis have been recorded between 2015 and 2018 and correspond to an integrated luminosity of 6 fb$^{-1}$. The results obtained demonstrate the possibility to further extend the decay volume and the physics reach of the LHCb experiment.

First Determination of the Spin-Parity of $\Xi_c(3055)^{+,0}$ Baryons

2025 · ARTICLE · en

The $\Xi_b^{0(-)} \to \Xi_c(3055)^{+(0)}(\to D^{+(0)}\Lambda)\pi^{-}$ decay chains are observed, and the spin-parity of $\Xi_c(3055)^{+(0)}$ baryons is determined for the first time. The measurement is performed using proton-proton collision data at a center-of-mass energy of $\sqrt{s} = 13$ TeV, corresponding to an integrated luminosity of 5.4 fb$^{-1}$, recorded by the LHCb experiment between 2016 and 2018. The spin-parity of the $\Xi_c(3055)^{+(0)}$ baryons is determined to be $3/2^{+}$ with a significance of more than $6.5\sigma$ ($3.5\sigma$) compared to all other tested hypotheses. The up-down asymmetries of the $\Xi_b^{0(-)} \to \Xi_c(3055)^{+(0)}\pi^{-}$ transitions are measured to be $-0.92 \pm 0.10 \pm 0.05$ ($-0.92 \pm 0.16 \pm 0.22$), consistent with maximal parity violation, where the first uncertainty is statistical and the second is systematic. These results support the hypothesis that the $\Xi_c(3055)^{+(0)}$ baryons correspond to the first $D$-wave $\lambda$-mode excitation of the $\Xi_c$ flavor triplet.

Digital Twin for Predictive Anomaly Detection in Data Center Cooling Systems: A Modelica-Based Approach with Synthetic Data Generation

2025 · CHAPTER · en

Data centers are critical energy-intensive infrastructures where cooling systems account for up to 40% of total energy consumption. While refrigerant leaks are a known issue, gradual air-side fouling and clogging of condensers present a more insidious and costly challenge, leading to persistent energy waste and a significant carbon footprint. The development of predictive machine learning (ML) models for early detection of such anomalies is hampered by the scarcity of labeled real-world performance degradation data. This paper addresses this challenge by proposing a digital twin framework for a DC cooling system. The core of the twin is a high-fidelity dynamic model developed in OpenModelica using the DLR ThermoFluidStream library, capable of simulating both standard operational regimes and the slow progression of condenser fouling. We present an efficient two-stage methodology for generating realistic synthetic training data: Initial simulation in OpenModelica is followed by a Python-based postprocessing stage where sensor noise is synthesized and applied via identified system transfer functions. This approach achieves a 300-fold speedup compared to native stochastic cosimulation, enabling the rapid creation of large, labeled datasets. The resulting digital twin serves as a versatile tool for training robust ML models to detect efficiency loss, testing energy-optimizing control algorithms, and conducting safe "what-if" analyses during the design phase, ultimately enhancing DC resilience and energy efficiency.

Refrigerant Leak Detection in Data Centers Using Topologically Determined Graph Neural Networks

2025 · CHAPTER · en

This paper investigates the problem of detecting slow refrigerant leaks in a data center cooling system using a graph neural network. The study addresses the challenge of early fault identification, proposing a method for constructing a topological graph based on the engineering diagram, the physical layout, and the cause-and-effect relationships in the cooling system. This graph structure effectively captures the spatial and functional dependencies between system components. Comparative testing of the GConvGRU model with topological, fully connected and correlation graphs, as well as the classic LSTM, was conducted on a real dataset from an industrial container-based data center. The experiments showed that the topological graph approach demonstrates superiority in all metrics: accuracy, F1, and detection time. Furthermore, the model proves effective even with limited labeled anomaly data, highlighting its robustness and practical applicability for real-world monitoring systems. The results confirm that incorporating domain knowledge of the system’s physics can significantly improve the quality of slow anomaly detection, reducing time to detection while minimizing false positives.

Diffusion Models for Synthetic Tabular Data Generation

2025 · ARTICLE · ru

Generating high-quality synthetic data is of key importance for many data science problems. A generated dataset can reduce the cost of augmenting existing data, for example in physics, or help protect privacy, for example in banking. However, generating tabular data is difficult because the data contain both numerical and categorical features. In this article we examine current approaches to tabular data generation and evaluate several modifications of a state-of-the-art model and whether they affect the quality of the synthesized data. The modifications include using Gaussian diffusion models to generate both numerical and categorical features, and using Gaussian noise for regularization during training. Comprehensive experiments and an evaluation of tabular data generation quality metrics on five public datasets show that the proposed modified model retains a quality of synthesized data similar to that of the original model while requiring less time to generate synthetic data.

Astronomical Data Approximation Based on Neural Network Models

2024 · ARTICLE · en

In this study, we apply shallow neural networks, Bayesian neural networks, and normalizing flows to approximate light curves of astronomical objects. The study shows that the approximation quality of the proposed methods outperforms the existing approaches based on Gaussian processes. We assess the quality of the solution using two physics-motivated analyses: type Ia supernova classification and bolometric intensity peak estimation. For both problems, convolutional neural networks are trained on approximated light curves. The results show that the proposed methods help to improve the quality of supernova type identification and increase the accuracy of the intensity peak estimation compared to the Gaussian process model.

Prediction of Industrial Cyber Attacks Using Normalizing Flows

2024 · ARTICLE · en

This paper presents the development and evaluation of methods for detecting cyberattacks on industrial systems using neural network approaches. The focus is on the task of detecting anomalies in multivariate time series, where the diversity and complexity of potential attack scenarios require the use of advanced models. To address these challenges, a transformer-based autoencoder architecture was used, which was further enhanced by transitioning to a variational autoencoder (VAE) and integrating normalizing flows. These modifications allowed the model to better capture the data distribution, enabling effective anomaly detection, including those not present in the training set. As a result, high performance was achieved, with an F1 score of 0.93 and a ROC-AUC of 0.87. The results underscore the effectiveness of the proposed methodology and provide valuable contributions to the field of anomaly detection and cybersecurity in industrial systems.

Comprehensive analysis of local and nonlocal amplitudes in the $B^0 \to K^{*0}\mu^{+}\mu^{-}$ decay

2024 · ARTICLE · en

A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0 \to K^{*0}(\to K^{+}\pi^{-})\mu^{+}\mu^{-}$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on $pp$ collision data corresponding to an integrated luminosity of 8.4 fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a model of both one-particle and two-particle nonlocal amplitudes, and utilises the complete dimuon mass spectrum without any veto regions around the narrow charmonium resonances. In this way it is possible to explicitly isolate the local and nonlocal contributions and capture the interference between them. The results show that interference with nonlocal contributions, although larger than predicted, only has a minor impact on the Wilson Coefficients determined from the fit to the data. For the local contributions, the Wilson Coefficient $C_9$, responsible for vector dimuon currents, exhibits a $2.1\sigma$ deviation from the Standard Model expectation. The Wilson Coefficients $C_{10}$, $C_9'$ and $C_{10}'$ are all in better agreement than $C_9$ with the Standard Model and the global significance is at the level of $1.5\sigma$. The model used also accounts for nonlocal contributions from $B^0 \to K^{*0}[\tau^{+}\tau^{-} \to \mu^{+}\mu^{-}]$ rescattering, resulting in the first direct measurement of the $bs\tau\tau$ vector effective-coupling $C_{9\tau}$.

Study of $b$-hadron decays to $\Lambda_c^+h^-h^{\prime -}$ final states

2024 · ARTICLE · en

Decays of $\Xi_b^-$ and $\Omega_b^-$ baryons to $\Lambda_c^+h^-h^{\prime -}$ final states, with $h^-h^{\prime -}$ being $\pi^-\pi^-$, $K^-\pi^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of 8.7 fb$^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, 8 and 13 TeV. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to \Lambda_c^+\bar{p}\pi^-$ mode, are measured, with $\Xi_b^- \to \Lambda_c^+K^-\pi^-$, $\Xi_b^- \to \Lambda_c^+K^-K^-$ and $\Omega_b^- \to \Lambda_c^+K^-K^-$ decays being observed at over $5\sigma$ significance. The $\Xi_b^- \to \Lambda_c^+K^-\pi^-$ mode is also used to measure the $\Xi_b^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to \Lambda_c^+\bar{p}K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to \Lambda_c^+\bar{p}\pi^-$ mode.

Courses (8)

Deep Learning · 3 times

2025/2026, 2024/2025, 2023/2024 · Master's / Mago-Lego · Russian
Machine Learning 1 · 3 times

2025/2026, 2024/2025, 2023/2024 · Bachelor's · Russian
Generative Models in Machine Learning

2024/2025 · Master's / Master's, field of study: 01.04.02 Applied Mathematics and Informatics / Mago-Lego · Russian
Fundamentals of Deep Learning · 2 times

2023/2024, 2022/2023 · Minor · Russian
Machine Learning

2022/2023 · Bachelor's · Russian
Research Seminar "Data Analysis in the Natural Sciences"

2022/2023 · Bachelor's · English
Research Seminar "Applied Problems in Data Analysis"

2022/2023 · Master's · Russian
Applied Problems in Data Analysis

2022/2023 · Minor · Russian