Sergei Olegovich Kuznetsov

Faculty of Computer Science

Phone: +7 (495) 772-95-90 | 27318

Professional interests

· Formal concept analysis
· 27.00.00 Mathematics
· 27.47.23 Mathematical problems of artificial intelligence
· 28.23.17 Logic in artificial intelligence

Positions

Center Director: Faculty of Computer Science, Center for Language and Semantic Technologies
Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Laboratory Head: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis
Academic Supervisor of the educational programme: Data Science

Bio

· Joined HSE University in 2006.
· Research and teaching experience: 46 years.

Education

2002 · Doctor of Sciences in Physics and Mathematics: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Theory of Machine Learning in Formal Concept Lattices
1990 · Candidate of Sciences: All-Russian Institute for Scientific and Technical Information of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Increasing the Validity of Hypotheses in Intelligent Systems of a Special Type
1985 · Specialist degree: Moscow Institute of Physics and Technology, specialty "Automatic Control Systems", qualification "Engineer-Physicist"

Work experience

· 2014-present: Head of the Department of Data Analysis and Artificial Intelligence, Head of the International Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
· 2006-2014: Head of the Division of Applied Mathematics and Informatics at HSE University, Head of the Chair of Data Analysis and Artificial Intelligence, Head of the Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
· 2006-2011: (part-time) Head of the Chair of Image Recognition and Text Processing (ABBYY base chair), Faculty of Innovations and High Technologies, Moscow Institute of Physics and Technology (MIPT)
· 2003-2004: Research Fellow at the Institute of Algebra, Dresden University of Technology (Alexander von Humboldt Foundation fellowship)
· 1985-2006: Senior Laboratory Assistant, Junior Researcher, Researcher, Senior Researcher, Leading Researcher at the All-Russian (until 1991: All-Union) Institute for Scientific and Technical Information (VINITI RAS)
· 1995-1999: (part-time) Senior Translator, Scientific Editor of the International Journal of Computer and System Sciences, MAIK Nauka/Interperiodica

Awards and honors

· Commendation from the Vice Rector of HSE University (October 2024)
· Commendation from the Vice Rector of HSE University (November 2023)
· Medal "Recognition - 15 Years of Successful Work", HSE University (December 2022)
· Certificate of Honor of the Ministry of Science and Higher Education of the Russian Federation (July 2022)
· Letter of Gratitude from the Vice Rector of HSE University (November 2021)
· Letter of Gratitude from the Rector of HSE University (March 2021)
· Letter of Gratitude from the Rector of HSE University (November 2020)
· Commendation from the Vice Rector of HSE University (May 2019)
· Certificate of Honor of the Higher School of Economics (April 2019)
· Commendation from the Minister of Economic Development of the Russian Federation (September 2017)
· Certificate of Honor of the Higher School of Economics (November 2013)
· Honorary title "Honorary Worker of Science and Technology of the Russian Federation" (November 2012)
· Commendation from the Higher School of Economics (April 2012)
· Bonus for academic work (2009–2010, 2008–2009, 2007–2008)
· Bonus for a publication in an international peer-reviewed scholarly publication (2020–2021, 2018–2020)
· Bonus for regular publications in international peer-reviewed scholarly publications (2021–2026)
· Bonus for an article in a foreign peer-reviewed journal (2014–2016, 2012–2014, 2010–2012)
· Bonus for an article in a foreign peer-reviewed scholarly publication (2016–2018)
· Winner of the 2017 "Golden HSE" award in the "Teaching Success" category
· Best Academic Supervisor in the "Students' Digital Skills" category (2024–2025)
· Best Academic Supervisor in the "International Student Admissions" category (2023–2024)
· Best Academic Supervisor in the "Student Recruitment" category (2024)

Grants and projects

· For the degree of Candidate of Sciences

Conferences (8)

· 2018: The 14th International Conference on Concept Lattices and Their Applications (Olomouc). Talk: A First Study on What MDL Can Do for FCA
· 2017: 14th International Conference on Formal Concept Analysis, ICFCA 2017 (Rennes). Talk: On Overfitting of Classifiers Making a Lattice
· 2016: The 13th International Conference on Concept Lattices and Their Applications (CLA2016) (Moscow). Talk: Stability for triadic concepts
· 2015: 13th International Conference on Formal Concept Analysis, ICFCA 2015 (Nerja). Talk: Revisiting Pattern Structure Projection
· 2015: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 (Porto). Talk: Fast Generation of Best Interval Patterns for Nonmonotonic Constraint
· 2015: 4th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2015 (Buenos Aires). Talk: SOFIA: How to Make FCA Polynomial?
· 2015: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015) (Stellenbosch). Talk: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction
· 2013: 11th International Conference on Formal Concept Analysis (Dresden). Talk: FCART: A New FCA-based System for Data Analysis and Knowledge Discovery

Researcher identifiers

ORCID: 0000-0003-3284-9001
ResearcherID: I-9058-2012
SPIN (RSCI): 4864-2308
Google Scholar: https://scholar.google.ru/citations?user=YcvE4i4AAAAJ&hl=en
Scopus AuthorID: 7202573378

Publications (82)

A First Study on What MDL Can Do for FCA

2018 · CHAPTER · en

Dualization in lattices given by ordered sets of irreducibles

2017 · ARTICLE · en

Dualization of a monotone Boolean function on a finite lattice can be represented by transforming the set of its minimal 1 values to the set of its maximal 0 values. In this paper we consider finite lattices given by ordered sets of their meet and join irreducibles (i.e., as a concept lattice of a formal context). We show that in this case dualization is equivalent to the enumeration of so-called minimal hypotheses. In contrast to the usual dualization setting, where a lattice is given by the ordered set of its elements, dualization in this case is shown to be impossible in output polynomial time unless P = NP. However, if the lattice is distributive, dualization is shown to be possible in subexponential time.

Efficient Mining of Subsample-Stable Graph Patterns

2017 · CHAPTER · en

A scalable method for mining graph patterns stable under subsampling is proposed. The existing subsample stability and robustness measures are not antimonotonic according to definitions known so far. We study a broader notion of antimonotonicity for graph patterns, so that measures of subsample stability become antimonotonic. Then we propose gSOFIA for mining the most subsample-stable graph patterns. The experiments on numerous graph datasets show that gSOFIA is very efficient for discovering subsample-stable graph patterns.

Pattern Structures for Risk Group Identification

2017 · CHAPTER · en

Today personalized medicine is one of the most popular interdisciplinary research fields, risk group identification being one of its most important tasks. Even though the first attempts to estimate the effect of patient characteristics on the outcome were proposed in statistics in the middle of the twentieth century, it is still an open question how to explore such effects properly. In this paper we propose a trial version of the approach to risk group specification based on pattern structures and competing risk estimation, and discuss further steps of research on its performance and specificity.

On Overfitting of Classifiers Making a Lattice

2017 · CHAPTER · en

Mining convex polygon patterns with Formal Concept Analysis

2017 · CHAPTER · en

Pattern mining is an important task in AI for eliciting hypotheses from the data. When it comes to spatial data, the geo-coordinates are often considered independently as two different attributes. Consequently, rectangular shapes are searched for. Such an arbitrary form is not able to capture interesting regions in general. We thus introduce convex polygons, a good trade-off between expressivity and algorithmic complexity. Our contribution is threefold: (i) We formally introduce such patterns in Formal Concept Analysis (FCA), (ii) we give all the basic bricks for mining convex polygons with exhaustive search and pattern sampling, and (iii) we design several algorithms, which we compare experimentally.

On mining complex sequential data by means of FCA and pattern structures

2016 · ARTICLE · en

Nowadays data-sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of “complex” sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of formal concept analysis and its extension based on “pattern structures”. Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures along with projections (i.e. a data reduction of sequential structures) are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analysing interesting patient patterns from a French healthcare data-set on cancer. The quantitative and qualitative results (with annotations and analysis from a physician) are reported in this use-case which is the main motivation for this work.

Style and Genre Classification by Means of Deep Textual Parsing

2016 · CHAPTER · en

In this paper we show that using deep textual parsing, that is, finding complex features such as syntactic and discourse structures of the text, helps to improve the quality of style and genre classification. These results confirm achievements of many researchers who have repeatedly stated that using syntactic or morphological patterns for style and genre classification results in poor precision and recall. The best practice so far is to use n-gram patterns for this type of text classification problem. However, syntactic and discourse structures make it possible to capture some style- or genre-specific patterns of texts and to reach average precision higher than 95% on binary multi-genre classification.

Global Optimization in Learning with Important Data: an FCA-Based Approach

2016 · CHAPTER · en

Nowadays decision tree learning is one of the most popular classification and regression techniques. Though decision trees are not accurate on their own, they make very good base learners for advanced tree-based methods such as random forests and gradient boosted trees. However, applying ensembles of trees deteriorates interpretability of the final model. Another problem is that decision tree learning can be seen as a greedy search for a good classification hypothesis in terms of some information-based criterion such as Gini impurity or information gain. But in case of small data sets the global search might be possible. In this paper, we propose an FCA-based lazy classification technique where each test instance is classified with a set of the best (in terms of some information-based criterion) rules. In a set of benchmarking experiments, the proposed strategy is compared with decision tree and nearest neighbor learning.

Interval Pattern Concept Lattice as a Classifier Ensemble

2016 · CHAPTER · en

Decision tree learning is one of the most popular classification techniques. However, by its nature it is a greedy approach to finding a classification hypothesis that optimizes some information-based criterion. It is very fast but may lead to finding suboptimal classification hypotheses. Moreover, in spite of decision trees being easily interpretable, ensembles of trees (random forests and gradient-boosted trees) are not, which is crucial in some domains, like medical diagnostics or bank credit scoring. In case of such “small, but important-data” problems one is not obliged to perform a greedy search for classification hypotheses, and therefore alternatives to decision tree learning techniques may be considered. In this paper, we propose an FCA-based classification technique where each test instance is classified with a set of the best (in terms of some information-based criterion) classification rules. In a set of benchmarking experiments, the proposed strategy is compared with decision tree and nearest neighbor learning.

Courses (7)

Research Seminar "Data Analysis in Applied Research"

2025/2026 · Bachelor's · English
Research Seminar "Data Analysis and Artificial Intelligence" · 3 times

2025/2026, 2024/2025, 2023/2024 · Bachelor's · English
Mentor's Seminar · 3 times

2025/2026, 2024/2025, 2023/2024 · Master's · English
Ordered Sets in Data Analysis · 5 times

2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Master's / Mago-Lego · English
Postgraduate Seminar

2024/2025 · Doctoral programme, field: 00.00.00. Doctoral programme · English
Reading the Best Doctoral Dissertations in Computer Science

2024/2025 · Doctoral programme · English
00.00.00. Doctoral programme

2023/2024 · Doctoral programme, field: 00.00.00. Doctoral programme · English