
Sergei Olegovich Kuznetsov

Faculty of Computer Science

Tel.: +7 (495) 772-95-90 | 27318
Publications: 71 | Languages: 3 | Awards: 22 | Conferences: 8

Professional interests

  • Formal concept analysis
  • 27.00.00 Mathematics
  • 27.47.23 Mathematical problems of artificial intelligence
  • 28.23.17 Logic in artificial intelligence

Positions

  • Center Director: Faculty of Computer Science, Center for Language and Semantic Technologies
  • Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
  • Laboratory Head: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis
  • Academic Supervisor of the degree programme: Data Science

Bio

  • Joined HSE University in 2006.
  • Research and teaching experience: 46 years.

Education

  • 2002 · Doctor of Sciences in Physics and Mathematics: Dorodnicyn Computing Centre of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation: Theory of Machine Learning in Lattices of Formal Concepts
  • 1990 · Candidate of Sciences: All-Russian Institute for Scientific and Technical Information of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation: Improving the Justification of Hypotheses in Intelligent Systems of a Special Type
  • 1985 · Specialist degree: Moscow Institute of Physics and Technology, specialty "Automatic Control Systems", qualification "Engineer-Physicist"

Work experience

  • 2014–present: Head of the Department of Data Analysis and Artificial Intelligence; Head of the International Research and Teaching Laboratory "Intelligent Systems and Structural Analysis"
  • 2006–2014: Head of the Division of Applied Mathematics and Informatics, HSE University; Head of the Chair of Data Analysis and Artificial Intelligence; Head of the Research and Teaching Laboratory "Intelligent Systems and Structural Analysis"
  • 2006–2011: (part-time) Head of the Chair "Image Recognition and Text Processing" (ABBYY base chair), Faculty of Innovation and High Technology, Moscow Institute of Physics and Technology (MIPT)
  • 2003–2004: Research Fellow, Institute of Algebra, Dresden University of Technology (Alexander von Humboldt Foundation fellowship)
  • 1985–2006: Senior Laboratory Assistant, Junior Researcher, Researcher, Senior Researcher, Leading Researcher, All-Russian (until 1991: All-Union) Institute for Scientific and Technical Information (VINITI RAS)
  • 1995–1999: (part-time) Senior Translator, Scientific Editor of the International Journal of Computer and System Sciences, MAIK Nauka - Interperiodica

Awards and honours

  • Commendation from the Vice Rector of HSE University (October 2024)
  • Commendation from the Vice Rector of HSE University (November 2023)
  • Medal "Recognition: 15 Years of Successful Work", HSE University (December 2022)
  • Certificate of Honour from the Ministry of Science and Higher Education of the Russian Federation (July 2022)
  • Letter of appreciation from the Vice Rector of HSE University (November 2021)
  • Letter of appreciation from the Rector of HSE University (March 2021)
  • Letter of appreciation from the Rector of HSE University (November 2020)
  • Commendation from the Vice Rector of HSE University (May 2019)
  • Certificate of Honour from the Higher School of Economics (April 2019)
  • Commendation from the Minister of Economic Development of the Russian Federation (September 2017)
  • Certificate of Honour from the Higher School of Economics (November 2013)
  • Honorary title "Honorary Worker of Science and Technology of the Russian Federation" (November 2012)
  • Commendation from the Higher School of Economics (April 2012)
  • Bonus for academic work (2009–2010, 2008–2009, 2007–2008)
  • Bonus for a publication in an international peer-reviewed scientific publication (2020–2021, 2018–2020)
  • Bonus for regular publications in international peer-reviewed scientific publications (2021–2026)
  • Bonus for an article in a foreign peer-reviewed journal (2014–2016, 2012–2014, 2010–2012)
  • Bonus for an article in a foreign peer-reviewed scientific publication (2016–2018)
  • Winner of the "Zolotaya Vyshka" (Golden HSE) award 2017 in the category "Success of a Teacher"
  • Best Academic Supervisor in the category "Students' Digital Skills", 2024–2025
  • Best Academic Supervisor in the category "Admission of International Students", 2023–2024
  • Best Academic Supervisor in the category "Student Recruitment", 2024

Grants and projects

  • for the degree of Candidate of Sciences

Conferences (8)

  • 2018: The 14th International Conference on Concept Lattices and Their Applications (Olomouc). Talk: A First Study on What MDL Can Do for FCA
  • 2017: 14th International Conference on Formal Concept Analysis, ICFCA 2017 (Rennes). Talk: On Overfitting of Classifiers Making a Lattice
  • 2016: The 13th International Conference on Concept Lattices and Their Applications (CLA2016) (Moscow). Talk: Stability for triadic concepts
  • 2015: 13th International Conference on Formal Concept Analysis, ICFCA 2015 (Nerja). Talk: Revisiting Pattern Structure Projection
  • 2015: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 (Porto). Talk: Fast Generation of Best Interval Patterns for Nonmonotonic Constraint
  • 2015: 4th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2015 (Buenos Aires). Talk: SOFIA: How to Make FCA Polynomial?
  • 2015: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015) (Stellenbosch). Talk: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction
  • 2013: 11th International Conference on Formal Concept Analysis (Dresden). Talk: FCART: A New FCA-based System for Data Analysis and Knowledge Discovery

Publications (82)

Data Analytics and Management in Data Intensive Domains: 25th International Conference, DAMDID/RCDL 2023, Moscow, Russia, October 24–27, 2023, Revised Selected Papers

2024 · BOOK · en

This book constitutes the post-conference proceedings of the 25th International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2023, held in Moscow, Russia, during 24-27 October 2023. The 21 papers presented here were carefully reviewed and selected from 75 submissions. These papers are organized in the following topical sections: Data Models and Knowledge Graphs; Databases in Data Intensive Domains; Machine learning methods and applications; Data Analysis in Astronomy & Information extraction from text. Papers from keynote talks have also been included in this book.

Interactive Research Toolbox for Chemical Compounds Analysis Based on Well-interpretable ML Methods

2024 · CHAPTER · en

This paper provides an overview of tools for computational experiments used in compound property analysis. It also outlines the initial outcomes of developing a novel research toolbox system that concentrates on exploring the interpretability and explainability of machine learning outcomes. The proposed solution is based on open-source tools and offers a convenient approach to addressing specific material science issues. The efficacy of this solution is currently being tested on problems related to compound synthesis optimization, antibacterial activity analysis, and other related areas.

Description Quivers for Compact Representation of Concept Lattices and Ensembles of Decision Trees

2023 · CHAPTER · en

In this paper we introduce and study description quivers as compact representations of concept lattices and respective ensembles of decision trees. Formally, description quivers are directed multigraphs where vertices represent concept intents and (multiple) edges represent generators of intents. We study some properties of description quivers and shed light on their use for describing state-of-the-art symbolic machine learning models based on decision trees. We also argue that a concept lattice can be considered as a cornerstone in constructing an efficient machine learning model. We show that the proposed description quivers allow us to fuse decision trees just as we can sum linear regressions, while proposing a way to select the most important rules in decision models, just as we can select the most important coefficients in regressions.

Constructing decision quivers

2023 · CHAPTER · en

Rule Learning and Formal Concept Analysis (FCA) are two fields of science that study similar topics yet speak in very different terms. This paper describes rule-based machine learning models in FCA-based terminology, which results in the decision quiver model. A decision quiver, as discussed in the paper, is a supervised machine learning model that is based on intents, generators of intents, and predictions for each intent (or generator). We show that finding the optimal set of intents is a cornerstone task in constructing a decision quiver (and thus, any rule-based model). The paper finishes with a baseline algorithm to construct decision quivers. The algorithm produces machine learning models that are much smaller than the state-of-the-art ensembles of decision trees, yet offer similar quality of predictions.

Formal Concept Analysis for Evaluating Intrinsic Dimension of a Natural Language

2023 · ARTICLE · en

Some results of a computational experiment for determining the intrinsic dimension of linguistic varieties for the Bengali and Russian languages are presented. Sets of words and sets of bigrams in these languages were considered separately. The method used to solve this problem was based on formal concept analysis algorithms. It was found that the intrinsic dimensions of these languages are significantly less than the dimensions used in popular neural network models in natural language processing.

Explainable Document Classification via Pattern Structures

2023 · ARTICLE · en

Inherently explainable Machine Learning (ML) models are able to provide explanations for their predictions by virtue of their construction. The explanations of an ML model are more comprehensible if they are expressed in terms of its input features. Our paper proposes an inherently explainable pipeline for document classification using pattern structures and Abstract Meaning Representation (AMR) graphs. The pipeline generates two kinds of explanations, intermediate and final ones, which justify its classifications. Intermediate explanations are represented as significant subgraphs found in the document graphs of test documents. Final explanations are the sentences of the test documents that correspond to the significant subgraphs.

2022 IEEE International Conference on Data Mining (ICDM)

2022 · BOOK · en

In this paper, we revisit pattern mining and study the distribution underlying a binary dataset thanks to the closure structure which is based on passkeys, i.e., minimum generators in equivalence classes robust to noise. We introduce △-closedness, a generalization of the closure operator, where △ measures how a closed set differs from its upper neighbors in the partial order induced by closure. A △-class of equivalence includes minimum and maximum elements and allows us to characterize the distribution underlying the data. Moreover, the set of △-classes of equivalence can be partitioned into the so-called △-closure structure. In particular, a △-class of equivalence with a high △ is supported by more observations and thus is more stable. In the experiments, we study the △-closure structure of several real-world datasets and show that this structure is very stable for large △ and does not substantially depend on the data sampling used for the analysis.

Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets

2022 · ARTICLE · en

Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work on numerical pattern mining, and this research area remains under-explored. In this paper we propose MINT, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliable framework widely used in pattern mining, as well as in subgroup discovery. In MINT we reuse MDL for discovering useful patterns and returning a set of non-redundant overlapping patterns with well-defined boundaries and covering meaningful groups of objects. MINT is not alone in the category of numerical pattern miners based on MDL. In the experiments presented in the paper we show that MINT outperforms competitors including IPD, REALKRIMP, and SLIM.

Pattern Structures for Knowledge Processing and Information Retrieval

2022 · CHAPTER · en

Introducing the closure structure and the GDPM algorithm for mining and understanding a tabular dataset

2022 · BOOK · en

Pattern mining is one of the most studied fields in data mining. Being mostly motivated by practitioners, pattern mining algorithms are often based on heuristics and lack suitable formalization. In this paper, we revisit pattern mining, and especially itemset mining, which allows one to analyze binary datasets in search of interesting and meaningful itemsets and respective association rules. We introduce a concise representation, the closure structure, based on closed itemsets and their minimum generators (called "passkeys") for capturing the intrinsic content of a dataset. The closure structure allows one to understand the content of the dataset in terms of closed sets and equivalence classes of itemsets. We discuss theoretical properties of passkeys, which are concise representatives of closed itemsets. We propose a formalization of the closure structure and passkeys in terms of Formal Concept Analysis, which is well adapted to studying such elements. Besides theoretical results, we present the GDPM algorithm for enumerating passkeys and discovering the closure structure. GDPM is rather unique as it returns a characterization of a dataset's content in terms of complexity levels, highlighting the diversity and the distribution of the itemsets. Finally, some experiments show how the GDPM algorithm and the closure structure can be used in practice.
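
As an illustration of the notions named in this abstract, the sketch below computes closed itemsets and their minimum generators ("passkeys") on a toy binary dataset by brute force. It is not the GDPM algorithm from the paper: the function names `closure` and `passkeys_by_closed_set` and the toy data are hypothetical, and exhaustive enumeration is only feasible for a handful of items.

```python
from itertools import combinations


def closure(itemset, transactions, all_items):
    """Closure of an itemset: the items shared by every transaction containing it."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # Convention assumed here: an itemset covered by no transaction
        # closes to the full item set.
        return frozenset(all_items)
    return frozenset.intersection(*covering)


def passkeys_by_closed_set(transactions):
    """Map each closed itemset to its minimum-cardinality generators (passkeys).

    Brute force over all itemsets in order of increasing size, so the first
    generators found for a closed set are guaranteed to be the smallest ones.
    """
    all_items = frozenset().union(*transactions)
    result = {}
    for size in range(len(all_items) + 1):
        for combo in combinations(sorted(all_items), size):
            candidate = frozenset(combo)
            closed = closure(candidate, transactions, all_items)
            best = result.get(closed)
            if best is None:
                result[closed] = [candidate]      # first (smallest) generator
            elif len(best[0]) == size:
                best.append(candidate)            # another generator of the same size
    return result


if __name__ == "__main__":
    # Hypothetical toy dataset: four transactions over the items a, b, c.
    data = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("a")]
    for closed, keys in passkeys_by_closed_set(data).items():
        print(sorted(closed), "<-", [sorted(k) for k in keys])
```

Grouping the closed itemsets by the size of their passkeys gives one simple reading of the "complexity levels" mentioned in the abstract; that reading is an assumption of this sketch, not a statement of the paper's exact definition.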

Courses (7)