Sergei Olegovich Kuznetsov
Faculty of Computer Science
Professional Interests
Positions
- Center Director: Faculty of Computer Science, Center for Language and Semantic Technologies
- Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
- Laboratory Head: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis
- Academic Supervisor of the degree programme: Data Science
Bio
- Started working at HSE University in 2006.
- Research and teaching experience: 46 years.
Education
- 2002 · Doctor of Sciences in Physics and Mathematics: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Theory of Machine Learning in Formal Concept Lattices
- 1990 · Candidate of Sciences: All-Russian Institute for Scientific and Technical Information of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Improving the Validity of Hypotheses in Intelligent Systems of a Special Type
- 1985 · Specialist degree: Moscow Institute of Physics and Technology, specialty "Automatic Control Systems", qualification "Engineer-Physicist"
Work Experience
- 2014–present: Head of the Department of Data Analysis and Artificial Intelligence; Head of the International Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
- 2006–2014: Head of the Division of Applied Mathematics and Informatics, HSE University; Head of the Chair of Data Analysis and Artificial Intelligence; Head of the Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
- 2006–2011: (part-time) Head of the Chair of Image Recognition and Text Processing (ABBYY base chair), Faculty of Innovations and High Technologies, Moscow Institute of Physics and Technology (MIPT)
- 2003–2004: Research fellow, Institute of Algebra, Dresden University of Technology (Alexander von Humboldt Foundation fellowship)
- 1985–2006: Senior laboratory assistant, junior researcher, researcher, senior researcher, leading researcher, All-Russian (until 1991: All-Union) Institute for Scientific and Technical Information (VINITI RAS)
- 1995–1999: (part-time) Senior translator and scientific editor of the International Journal of Computer and System Sciences, MAIK Nauka - Interperiodica
Awards and Honours
- Commendation from the Vice Rector of HSE University (October 2024)
- Commendation from the Vice Rector of HSE University (November 2023)
- Medal "Recognition: 15 Years of Successful Work", HSE University (December 2022)
- Certificate of Honour of the Ministry of Science and Higher Education of the Russian Federation (July 2022)
- Letter of Appreciation from the Vice Rector of HSE University (November 2021)
- Letter of Appreciation from the Rector of HSE University (March 2021)
- Letter of Appreciation from the Rector of HSE University (November 2020)
- Commendation from the Vice Rector of HSE University (May 2019)
- Certificate of Honour of the Higher School of Economics (April 2019)
- Commendation from the Minister of Economic Development of the Russian Federation (September 2017)
- Certificate of Honour of the Higher School of Economics (November 2013)
- Honorary title "Honorary Worker of Science and Technology of the Russian Federation" (November 2012)
- Commendation from the Higher School of Economics (April 2012)
- Bonus for academic work (2009–2010, 2008–2009, 2007–2008)
- Bonus for a publication in an international peer-reviewed scholarly publication (2020–2021, 2018–2020)
- Bonus for regular publications in international peer-reviewed scholarly publications (2021–2026)
- Bonus for an article in a foreign peer-reviewed journal (2014–2016, 2012–2014, 2010–2012)
- Bonus for an article in a foreign peer-reviewed scholarly publication (2016–2018)
- Winner of the 2017 "Golden HSE" award in the "Teaching Success" category
- Best Academic Supervisor in the "Students' Digital Skills" category (2024–2025)
- Best Academic Supervisor in the "Admission of International Students" category (2023–2024)
- Best Academic Supervisor in the "Student Recruitment" category (2024)
Grants and Projects
- — · for the academic degree of Candidate of Sciences
Conferences (8)
- 2018: The 14th International Conference on Concept Lattices and Their Applications (Olomouc). Talk: A First Study on What MDL Can Do for FCA
- 2017: 14th International Conference on Formal Concept Analysis, ICFCA 2017 (Rennes). Talk: On Overfitting of Classifiers Making a Lattice
- 2016: The 13th International Conference on Concept Lattices and Their Applications (CLA2016) (Moscow). Talk: Stability for triadic concepts
- 2015: 13th International Conference on Formal Concept Analysis, ICFCA 2015 (Nerja). Talk: Revisiting Pattern Structure Projection
- 2015: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 (Porto). Talk: Fast Generation of Best Interval Patterns for Nonmonotonic Constraint
- 2015: 4th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2015 (Buenos Aires). Talk: SOFIA: How to Make FCA Polynomial?
- 2015: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015) (Stellenbosch). Talk: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction
- 2013: 11th International Conference on Formal Concept Analysis (Dresden). Talk: FCART: A New FCA-based System for Data Analysis and Knowledge Discovery
Researcher Identifiers
- ORCID: 0000-0003-3284-9001
- ResearcherID: I-9058-2012
- SPIN (RSCI): 4864-2308
- Google Scholar: https://scholar.google.ru/citations?user=YcvE4i4AAAAJ&hl=en
- Scopus AuthorID: 7202573378
Publications (82)
Explainable Document Classification via Concept Whitening and Stable Graph Patterns
2025 · ARTICLE · en
This paper proposes a novel explainable document classification framework that integrates Concept Whitening (CW) with graph concepts that are derived from stable graph patterns, and extracted via methods based on Formal Concept Analysis (FCA) and pattern structures. Document graphs are constructed using Abstract Meaning Representation (AMR) graphs, from which graph concepts are extracted and aligned with the latent space axes of Graph Neural Networks (GNNs) using CW. We investigate four types of graph concepts for their effect on concept alignment: frequent subgraphs, graph pattern concepts, filtered equivalence classes, and closed subgraphs. A novel filtration mechanism based on support, along with a custom penalty metric, is proposed to refine graph concepts for maximizing concept alignment. Experiments on the 10 Newsgroups and BBC Sport datasets show that our document graphs effectively capture both structural and semantic information, thereby supporting competitive classification performance across multiple GNN model architectures and configurations. For the 10 Newsgroups dataset, GNN models equipped with a CW module show an average increase of 0.7599 in the macro-averaged F1 score of the Concept Alignment Performance (CAP) metric, with an average drop of only 0.0025 in the document classification macro-averaged F1 score. Similarly, on the BBC Sport dataset, the average CAP improvement is 0.6998, with an average drop of 0.0894 in document classification performance. Additionally, concept gradient importance analyses and concept similarity heatmaps provide insights into the interpretability and structural separability of the GNN’s latent representations, achieved using CW.
Recovery degree constrained equiconcept/pseudo-equiconcept reduction in symmetric formal contexts
2025 · ARTICLE · en
In Formal Concept Analysis (FCA), concept reduction serves as an important means of simplification. The application scenarios of concept reduction cover various aspects such as data mining, knowledge discovery, strategic decision-making, and rule learning. For symmetric formal contexts, a specialized class of concept reduction exists that can fully recover all knowledge. However, most existing concept reduction algorithms are designed to recover complete knowledge, which poses limitations in various real-world applications. To this end, this paper proposes a basic recovery degree-constrained equiconcept reduction algorithm, termed RdER, enabling knowledge recovery to a specified extent. Additionally, to reduce its running time, an evolutionary algorithm, termed RdER+, is further developed. Meanwhile, given the simplicity of pseudo-equiconcepts, we also develop a recovery degree-constrained pseudo-equiconcept reduction algorithm, termed RdPR. A large number of experiments have demonstrated that RdER+ and RdPR have significantly reduced the running time while maintaining a relatively low redundancy and ensuring the average recovery degree. Particularly, when the dimension reaches 17, the running speeds of RdER+ and RdPR are, on average, 150 times faster than that of RdER. Moreover, as the dimension continues to increase, this advantage in speed will become even more pronounced.
Binary relations-preserving incremental pseudo-equiconcept reduction for symmetric formal context
2025 · ARTICLE · en
Concept reduct refers to the minimal subset of concepts that preserves the binary relation of the binary data table (formal context). Importantly, it reduces the complexity of problem-solving and improves the efficiency of concept-cognition using formal concept analysis (FCA). Particularly, for a symmetric formal context, there exists a significant class of concept reducts given by equiconcepts. The existing concept reduction algorithms suffer from low efficiency. To this end, this paper proposes an efficient incremental equiconcept-driven concept reduction algorithm. Here we introduce pseudo-equiconcept reduction that preserves the binary relation of the context and present an incremental pseudo-equiconcept reduction algorithm. Extensive experiments demonstrate that the performance of the resulting pseudo-equiconcept reduct is better than that of the concept reduct formed by equiconcepts, in terms of redundancy, average correlation, and running time.
Traffic Prediction Based on Formal Concept-Enhanced Federated Graph Learning
2025 · ARTICLE · en
Aiming to improve the efficiency of urban traffic management, previous studies have achieved considerable traffic prediction accuracy. For example, methods based on time series analysis perform well in short-term traffic prediction, and neural networks show strong capabilities in processing complex nonlinear relationships within traffic data. However, previous studies also have the following two limitations: 1) a large amount of complex traffic data will increase the complexity of the model during training and further reduce the accuracy of the training results; 2) the large-scale distribution of traffic data leads to incomplete model training and data security issues. To address these issues, we propose a Formal Concept-enhanced Federated Graph Convolutional Network (FC-FedGCN), which adopts formal concept analysis to fully mine graph data and improve the training accuracy of the GCNs model. Under federated learning, the GCNs model can be trained independently on different clients, and the local model is optimized by sharing model parameters. Coupled with the premise of protecting data privacy, the integrity of the data is guaranteed and the training accuracy of the GCNs model is improved. We compare our model with various baseline models based on the PEMS datasets, and the results demonstrate that FC-FedGCN has significant advantages in traffic prediction, outperforming the comparison methods in multiple indicators.
Atomic Patterns for Efficient Computation with Pattern Structures
2025 · CHAPTER · en
Pattern Structures is a framework in FCA allowing objects to have complex descriptions, only requiring that the set of descriptions forms a complete meet-semi-lattice. However, some particular descriptions or patterns, such as subgraphs and subsequences, do not necessarily ensure that every pair of descriptions has a unique infimum and ask for additional operations, e.g., anti-chain completion. Moreover, meet-based approaches struggle to generate non-trivial implications for complex data since, in general, they only output closed descriptions. For overcoming such limitations, we introduce in this paper an alternative view of pattern structures based on the join operation and the so-called “atomic patterns”. Such atomic patterns correspond to join-irreducible descriptions in the join-semi-lattice of all possible descriptions. They enable an efficient traversal of the description space and the computation of closures, minimal generators, pseudo-intents, implications among others, while showing very good computational performance.
Data complexity: An FCA-based approach
2024 · ARTICLE · en
In this paper we propose different indices for measuring the complexity of a dataset in terms of Formal Concept Analysis (FCA). We extend the lines of the research about the “closure structure” and the “closure index” based on minimum generators of intents (aka closed itemsets). We would try to capture statistical properties of a dataset, not just extremal characteristics, such as the size of a passkey. For doing so we introduce an alternative approach where we measure the complexity of a dataset w.r.t. five significant elements that can be computed in a concept lattice, namely intents (closed sets of attributes), pseudo-intents, proper premises, keys (minimal generators), and passkeys (minimum generators). Then we define several original indices allowing us to estimate the complexity of a dataset. Moreover we study the distribution of all these different elements and indices in various real-world and synthetic datasets. Finally, we investigate the relations existing between these significant elements and indices, and as well the relations with implications and association rules.
Fairness-Aware Maximal Cliques Identification in Attributed Social Networks With Concept-Cognitive Learning
2024 · ARTICLE · en
Attributed social networks are pervasive in real life and play a crucial role in shaping various aspects of society. These networks not only capture the connections between individuals but also encompass the associated attributes and characteristics. Analyzing and understanding these attributes provide insights into social behaviors, information diffusion patterns, and the formation of influential communities. Consequently, we propose a novel algorithm for detecting fairness-aware maximal cliques in the attributed social networks. We extract the concept lattice of attributed social networks and quantify these concepts using the concept stability and fairness measures defined in this article. By utilizing the proposed fairness-aware distance, we identify fairness-aware maximal cliques within attributed social networks. The effectiveness of the algorithm is then validated using five real-world network datasets. Experimental results fully demonstrate the effectiveness and scalability of our approach in identifying key structures, analyzing attribute networks, and promoting the development of responsible computational systems.
Clustering with Stable Pattern Concepts
2024 · CHAPTER · en
Clustering aims at finding disjoint groups of similar objects in data and is one major task in Machine Learning. It is also gaining more attention in Formal Concept Analysis community in these last years. This paper proposes an original approach to the clustering of complex data based on Formal Concept Analysis (FCA) and Pattern Structures. Stable concepts are considered as cluster candidates and the SOFIA algorithm is used to discover the set of stable concepts in linear time. Then an algorithm inspired by a rare itemset mining algorithm is designed to build a clustering with good properties, i.e., high internal cohesion within a cluster and high external separation between the clusters. Some interestingness measures allowing us to choose the best clustering are discussed. Finally the present approach is compared to some other well-known algorithms such as KMeans, DBScan, and Optic.
Introductory Remarks to the Special Issue Devoted to DAMDID/RCDL-2023
2024 · ARTICLE · en
This special issue of Automation and Remote Control contains some scientific results presented at the 25th International Conference on Data Analytics and Management in Data-Intensive Domains (DAMDID/RCDL-2023). The conference was held on October 24–27, 2023, at National Research University Higher School of Economics (HSE) in Moscow, Russia. The reader is offered the full texts of selected DAMDID/RCDL-2023 papers relevant to the journal’s topics.
Document Classification via Stable Graph Patterns and Conceptual AMR Graphs
2024 · ARTICLE · en
This paper proposes an approach and an associated system based on pattern structures, aimed at the classification of documents represented as graphs. The representation of documents relies on Abstract Meaning Representation (AMR) document graphs. Given a set of AMR document graphs, the system learns characteristic graph patterns, that can be reused by an aggregate rule classifier to predict the class of a document. The selection of the most stable graph patterns is based on the gSOFIA algorithm and the Δ−stability measure. In the experiments, two document datasets are considered for validating the approach. The first includes documents belonging to 10 different newsgroups and the second contains sports news articles belonging to 5 topical areas. The results in terms of the macro-averaged F1 scores, are quite satisfactory and show that the approach is well-founded and useful.
Courses (7)
- Research Seminar "Data Analysis in Applied Research"
  2025/2026 · Bachelor's · English
- Research Seminar "Data Analysis and Artificial Intelligence" · 3 times
  2025/2026, 2024/2025, 2023/2024 · Bachelor's · English
- Mentor's Seminar · 3 times
  2025/2026, 2024/2025, 2023/2024 · Master's · English
- Ordered Sets in Data Analysis · 5 times
  2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Master's / Mago-Lego · English
- Postgraduate Seminar
  2024/2025 · Postgraduate studies, field: 00.00.00. Postgraduate studies · English
- Reading the Best Doctoral Dissertations in Computer Science
  2024/2025 · Postgraduate studies · English
- 00.00.00. Postgraduate studies
  2023/2024 · Postgraduate studies, field: 00.00.00. Postgraduate studies · English