Ilya Andreevich Makarov

Faculty of Computer Science

Tel.: +7(495) 772-95-90*27282 | +7(915)152-4532

Professional interests

· 27.03.19 Mathematical logic
· 27.15.00 Number theory
· 28.23.00 Artificial intelligence
· 28.17.33 Computer modelling of reality. Virtual reality

Positions

Associate Professor: Faculty of Computer Science, School of Data Analysis and Artificial Intelligence

Bio

· Started working at HSE University in 2011.
· Academic and teaching experience: 14 years.

Education

2021 · PhD: University of Ljubljana
2015 · Postgraduate studies: Lomonosov Moscow State University, Faculty of Mechanics and Mathematics
2011 · Specialist degree: Lomonosov Moscow State University, Faculty of Mechanics and Mathematics, specialty "Mathematics", qualification "Mathematician"

Work experience

· 2011: HSE University, School of Data Analysis and Artificial Intelligence: senior lecturer, research fellow (to present), deputy head (2012–2017)

Awards and honours

· Nominee for "Best Teachers 2014" (July 2014)
· Bonus for academic work (2017–2018)
· Bonus for a publication in a List A journal (or an equivalent scholarly publication) (2025–2026, 2024–2025, 2023–2024)
· Bonus for a publication in an international peer-reviewed scholarly publication (2022–2023, 2021–2022, 2018–2020)
· Bonus for an article in a foreign peer-reviewed scholarly publication (2016–2017)
· Best Teacher: 2022, 2017–2018
· High Professional Potential Group (HSE University talent pool): category "New Lecturers" (2013–2014), category "Future Lecturers" (2012)

Grants and projects

· Grant of the President of the Russian Federation MK-5016.2012.1 "Multidimensional Diophantine Approximations" (2012), team member
· Russian Science Foundation (RSF) grant 17-11-01294 "Representation, Discovery, and Processing of Knowledge: A Logical Approach"

Conferences (10)

· 2016: The 5th International Conference on Analysis of Images, Social Networks, and Texts (AIST) (Yekaterinburg). Talk: Smoothing Voronoi-based Path with Minimized Length and Visibility using Composite Bezier Curves
· 2016: Third International Workshop on Experimental Economics and Machine Learning (EEML 2016) (Moscow). Talk: Modelling Human-like Behavior through Reward-based Approach in a First-Person Shooter Game
· 2016: The 6th International Conference on Network Analysis (Nizhny Novgorod). Talk: Co-author Recommender System
· 2016: ACM Multimedia 2016 (Amsterdam). Talk: First-Person Shooter Game for Virtual Reality Headset with Advanced Multi-Agent Intelligent System
· 2015: The 4th International Conference on Analysis of Images, Social Networks, and Texts (AIST) (Yekaterinburg). Talk: Imitation of human behavior in 3D-shooter game
· 2015: 10th Panhellenic Logic Symposium (Karlovasi, Samos). Talk: Total Equivalence Systems for Classes of 3-valued Projection Logic whose Projections Equal to the Class of Linear Boolean Functions
· 2015: 10th Panhellenic Logic Symposium (Karlovasi, Samos). Talk: Logical Generalized Continued Fractions
· 2015: 5th World Congress on Universal Logic (Istanbul). Talk: Separator Method for Constructing Canonical Types of Formulas
· 2014: Conference of Academic Staff of the National Research University Higher School of Economics (Moscow). Talk: Elections of the HSE University Academic Council
· 2012: Lomonosov Readings 2012 (Moscow). Talk: On Some Properties of Inner Klein Polyhedra

Researcher identifiers

ORCID: 0000-0002-3308-8825
ResearcherID: G-9195-2015
RSCI SPIN: 3151-9176
Google Scholar: https://scholar.google.com/citations?user=cFpDMzIAAAAJ&hl=en
Scopus AuthorID: 57203060623

Publications (117)

CA-SER: Cross-Attention Feature Fusion for Speech Emotion Recognition

2024 · CHAPTER · en

In this paper, we introduce a novel tool for speech emotion recognition, CA-SER, that borrows self-supervised learning to extract semantic speech representations from a pre-trained wav2vec 2.0 model and combine them with spectral audio features to improve speech emotion recognition. Our approach involves a self-attention encoder on MFCC features to capture meaningful patterns in audio sequences. These MFCC features are combined with high-level representations using a multi-head cross-attention mechanism. Evaluation of speech emotion recognition on the IEMOCAP dataset shows that our system achieves a weighted accuracy of 74.6%, outperforming most existing techniques.

Plug-and-play unsupervised fault detection and diagnosis for complex industrial monitoring

2024 · CHAPTER · en

Today, industrial facilities are equipped with many sensors along the entire production line for monitoring purposes. The gathered data can be used to detect and predict failures; however, manually labeling large amounts of data for supervised learning is difficult. This paper introduces an innovative approach to unsupervised fault detection and diagnosis tailored for monitoring industrial chemical processes. We showcase the efficacy of our model using two publicly accessible datasets from the Tennessee Eastman Process, each containing various faults. Furthermore, we illustrate that by fine-tuning the model on a limited amount of labeled data, it achieves performance close to that of a state-of-the-art model trained on the entire dataset.

VIA AI: Reliable Deep Reinforcement Learning for Traffic Signal Control

2024 · CHAPTER · en

Traffic signal control optimization is an integral part of any modern transportation system. However, modern traffic signal control systems often rely on predetermined fixed rules to adjust traffic signal timings. This paper presents VIA AI - an intelligent traffic signal control system that leverages deep reinforcement learning (RL) applied to count-based traffic data. Our solution offers additional adaptability and flexibility by allowing the system to learn and adjust its strategies based on real-time feedback and environmental changes. We test our approach using real-world traffic data and show that it outperforms classical methods of intersection control.

Benchmarking and Data Synthesis for Colorization of Manga Sequential Pages for Augmented Reality

2024 · CHAPTER · en

This paper introduces an innovative approach to manga colorization within augmented reality (AR) environments, focusing on the unique challenges posed by colorizing photos of manga books. We present a novel method using diffusion models to generate a synthetic dataset that accurately replicates photographed manga pages. Additionally, we have compiled a dataset of real manga photographs, capturing diverse environmental conditions. Integrating these datasets, we established a comprehensive benchmark to evaluate colorization models in scenarios that simulate AR applications. This benchmark was validated through a human study, confirming the accuracy of our metrics across both datasets. We also showed that domain adaptation may improve model performance. Paving the way for practical applications, our framework enables the creation of an AR application designed to execute manga colorization effectively.

Do You Remember the Future? Weak-to-Strong Generalization in 3D Object Detection

2024 · CHAPTER · en

This paper demonstrates a novel method for LiDAR-based 3D object detection, addressing two major challenges in the field: sparsity and occlusion. Our approach leverages temporal point cloud sequences to generate frames that provide comprehensive views of objects from multiple angles. To address the challenge of generating these frames in real time, we employ Knowledge Distillation within a Teacher-Student framework, allowing the Student model to emulate the Teacher's advanced perception. We pioneered the application of weak-to-strong generalization in computer vision by training our Teacher model on enriched, object-complete data. In this demo, we showcase the exceptional quality of labels produced by the X-Ray Teacher on object-complete frames and show how our method distills its knowledge to enhance 3D object detection models.

SwiftDepth++: An Efficient and Lightweight Model for Accurate Depth Estimation

2024, in press · ARTICLE · en

Depth estimation is a crucial task across various domains, but the high cost of collecting labeled depth data has led to growing interest in self-supervised monocular depth estimation methods. In this paper, we introduce SwiftDepth++, a lightweight depth estimation model that delivers competitive results while maintaining a low computational budget. The core innovation of SwiftDepth++ lies in its novel depth decoder, which enhances efficiency by rapidly compressing features while preserving essential information. Additionally, we incorporate a teacher-student knowledge distillation framework that guides the student model in refining its predictions.

GEEF: A neural network model for automatic essay feedback generation by integrating writing skills assessment

2024 · ARTICLE · en

Assessing students' essay-writing ability has always been an important part of research on automated essay scoring. Existing methods usually focus on producing numerical scores, and few studies address providing feedback on writing quality using text generation techniques. We believe that generating high-quality feedback on essay writing is more valuable as reference material. In this study, we address the problem of essay feedback generation by proposing an encoder-decoder neural network model called GEEF (Generate Essay Feedback), and we assume that feedback is written on the basis of the source essay text and an assessment of important writing skills. Besides the input essay text, our model also takes into account additional characteristics, including fluency, coherence, richness, and literary talent. We build an essay feedback corpus together with human-labeled resources to facilitate the research. Experimental results demonstrate that the proposed approach shows promising performance compared with other baseline methods on automatic and human evaluation metrics. The generated feedback sentences on different aspects of essay writing can help students understand their strengths and weaknesses in writing skills. In addition, a potential application of this research is that it can help experts write more sophisticated feedback on this basis, since it is often difficult to remember many commonly used feedback templates when writing comments.

Refining the ONCE Benchmark With Hyperparameter Tuning

2024 · ARTICLE · en

In response to the growing demand for 3D object detection in applications such as autonomous driving, robotics, and augmented reality, this work focuses on the evaluation of semi-supervised learning approaches for point cloud data. The point cloud representation provides reliable and consistent observations regardless of lighting conditions, thanks to advances in LiDAR sensors. Data annotation is of paramount importance in the context of LiDAR applications, and automating 3D data annotation with semi-supervised methods is a pivotal challenge that promises to reduce the associated workload and facilitate the emergence of cost-effective LiDAR solutions. Nevertheless, the task of semi-supervised learning in the context of unordered point cloud data remains formidable due to the inherent sparsity and incomplete shapes that hinder the generation of accurate pseudo-labels. In this study, we consider these challenges by posing the question: “To what extent does unlabelled data contribute to the enhancement of model performance?” We show that improvements from previous semi-supervised methods may not be as profound as previously thought. Our results suggest that simple grid search hyperparameter tuning applied to a supervised model can lead to state-of-the-art performance on the ONCE dataset, while the contribution of unlabelled data appears to be comparatively less exceptional.

Adversarial Attacks and Defenses in Fault Detection and Diagnosis: A Comprehensive Benchmark on the Tennessee Eastman Process

2024, in press · ARTICLE · en

Integrating machine learning into Automated Control Systems (ACS) enhances decision-making in industrial process management. One of the limitations to the widespread adoption of these technologies in industry is the vulnerability of neural networks to adversarial attacks. This study explores the threats in deploying deep learning models for Fault Detection and Diagnosis (FDD) in ACS using the Tennessee Eastman Process dataset. By evaluating three neural networks with different architectures, we subject them to six types of adversarial attacks and explore five different defense methods. Our results highlight the strong vulnerability of models to adversarial samples and the varying effectiveness of defense strategies. We also propose a new defense strategy based on combining adversarial training and data quantization. This research contributes several insights into securing machine learning within ACS, ensuring robust FDD in industrial processes.

Weak-to-Strong 3D Object Detection with X-Ray Distillation

2024 · CHAPTER · en

Courses (7)

Research Seminar in Financial Economics

2025/2026 · Master's programme · Eng

Literature of Ancient Egypt

2024/2025 · Bachelor's programme · Rus

Visual geometry and 3D image processing

2022/2023 · Mago-Lego / Nizhny Novgorod · Eng

Network Science

2021/2022 · Master's programme · Eng

Project Seminar "Intelligent Systems and Structural Analysis"

2021/2022 · Master's programme · Eng

Social Networks

2021/2022 · Master's programme · Eng

Structural Analysis and Visualization of Networks

2021/2022 · Master's programme · Eng