Sergei Olegovich Kuznetsov

Faculty of Computer Science

Phone: +7 (495) 772-95-90 | 27318

Professional interests

· Formal concept analysis
· 27.00.00 Mathematics
· 27.47.23 Mathematical problems of artificial intelligence
· 28.23.17 Logic in artificial intelligence

Positions

Centre director: Faculty of Computer Science, Centre for Language and Semantic Technologies
Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Laboratory head: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis
Academic supervisor of the educational programme: Data Science

Bio

· Joined HSE University in 2006.
· Research and teaching experience: 46 years.

Education

2002 · Doctor of Physical and Mathematical Sciences: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Theory of Machine Learning in Formal Concept Lattices
1990 · Candidate of Sciences: All-Russian Institute for Scientific and Technical Information of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Improving the Validity of Hypotheses in Intelligent Systems of a Special Type
1985 · Specialist degree: Moscow Institute of Physics and Technology, specialty "Automatic Control Systems", qualification "Engineer-Physicist"

Work experience

· 2014-present: Head of the Department of Data Analysis and Artificial Intelligence, head of the International Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
· 2006-2014: Head of the Division of Applied Mathematics and Informatics at HSE University, head of the Chair of Data Analysis and Artificial Intelligence, head of the Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
· 2006-2011: (part-time) Head of the Chair "Image Recognition and Text Processing" (ABBYY base chair), Faculty of Innovation and High Technology, Moscow Institute of Physics and Technology (MIPT)
· 2003-2004: Research fellow at the Institute of Algebra, Dresden University of Technology (Alexander von Humboldt Foundation fellowship)
· 1985-2006: Senior laboratory assistant, junior researcher, researcher, senior researcher, leading researcher at the All-Russian (until 1991: All-Union) Institute for Scientific and Technical Information (VINITI RAS)
· 1995-1999: (part-time) Senior translator, science editor of the journal International Journal of Computer and System Sciences, MAIK Nauka - Interperiodica

Awards and commendations

· Commendation from the HSE University Vice Rector (October 2024)
· Commendation from the HSE University Vice Rector (November 2023)
· Medal "Recognition - 15 Years of Successful Work", HSE University (December 2022)
· Certificate of Honour from the Ministry of Science and Higher Education of the Russian Federation (July 2022)
· Letter of thanks from the HSE University Vice Rector (November 2021)
· Letter of thanks from the HSE University Rector (March 2021)
· Letter of thanks from the HSE University Rector (November 2020)
· Commendation from the HSE University Vice Rector (May 2019)
· Certificate of Honour from the Higher School of Economics (April 2019)
· Commendation from the Minister of Economic Development of the Russian Federation (September 2017)
· Certificate of Honour from the Higher School of Economics (November 2013)
· Honorary title "Honorary Worker of Science and Technology of the Russian Federation" (November 2012)
· Commendation from the Higher School of Economics (April 2012)
· Bonus for academic work (2009–2010, 2008–2009, 2007–2008)
· Bonus for a publication in an international peer-reviewed scientific publication (2020–2021, 2018–2020)
· Bonus for regular publications in international peer-reviewed scientific publications (2021–2026)
· Bonus for an article in a foreign peer-reviewed journal (2014–2016, 2012–2014, 2010–2012)
· Bonus for an article in a foreign peer-reviewed scientific publication (2016–2018)
· Winner of the "Golden HSE" ("Золотая Вышка") award 2017 in the category "Success in Teaching"
· Best academic supervisor in the category "Students' Digital Skills", 2024–2025
· Best academic supervisor in the category "Admission of International Students", 2023–2024
· Best academic supervisor in the category "Student Recruitment", 2024

Conferences (8)

· 2018: The 14th International Conference on Concept Lattices and Their Applications (Olomouc). Talk: A First Study on What MDL Can Do for FCA
· 2017: 14th International Conference on Formal Concept Analysis, ICFCA 2017 (Rennes). Talk: On Overfitting of Classifiers Making a Lattice
· 2016: The 13th International Conference on Concept Lattices and Their Applications (CLA2016) (Moscow). Talk: Stability for triadic concepts
· 2015: 13th International Conference on Formal Concept Analysis, ICFCA 2015 (Nerja). Talk: Revisiting Pattern Structure Projection
· 2015: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 (Porto). Talk: Fast Generation of Best Interval Patterns for Nonmonotonic Constraint
· 2015: 4th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2015 (Buenos Aires). Talk: SOFIA: How to Make FCA Polynomial?
· 2015: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015) (Stellenbosch). Talk: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction
· 2013: 11th International Conference on Formal Concept Analysis (Dresden). Talk: FCART: A New FCA-based System for Data Analysis and Knowledge Discovery

Researcher identifiers

ORCID: 0000-0003-3284-9001
ResearcherID: I-9058-2012
SPIN (RSCI): 4864-2308
Google Scholar: https://scholar.google.ru/citations?user=YcvE4i4AAAAJ&hl=en
Scopus AuthorID: 7202573378

Publications (82)

Introducing the closure structure and the GDPM algorithm for mining and understanding a tabular dataset

2022 · ARTICLE · en

Pattern mining is one of the most developed areas of data analysis. Pattern mining algorithms often contain many heuristics and are insufficiently formalized. In this paper we revisit pattern mining, and in particular itemset mining, which makes it possible to analyze binary data and find meaningful sets of attributes and association rules. The paper describes a special closure structure based on closed patterns and their so-called passkeys. This structure defines patterns in terms of equivalence classes. In addition to this theoretical work, the paper introduces and describes a practical algorithm, GDPM, for computing such a structure. A distinctive feature of the algorithm is that its output characterizes the dataset as a whole rather than individual patterns. The results of the algorithm are verified on real datasets.

Human knowledge models: Learning applied knowledge from the data

2022 · ARTICLE · en

Artificial intelligence and machine learning have demonstrated remarkable results in science and applied work. However, present AI models, developed to be run on computers but used in human-driven applications, create a visible disconnect between AI forms of processing and human ways of discovering and using knowledge. In this work, we introduce a new concept of “Human Knowledge Models” (HKMs), designed to reproduce human computational abilities. Departing from a vast body of cognitive research, we formalized the definition of HKMs into a new form of machine learning. Then, by training the models with human processing capabilities, we learned human-like knowledge that humans can not only understand but also compute, modify, and apply. We used several datasets from different applied fields to demonstrate the advantages of HKMs, including their high predictive power and resistance to noise and overfitting. Our results proved that HKMs can efficiently mine knowledge directly from the data and can compete with complex AI models in explaining the main data patterns. As a result, our study reveals the great potential of HKMs, particularly in decision-making applications where “black box” models cannot be accepted. Moreover, this improves our understanding of how well human decision-making, modeled by HKMs, can approach the ideal solutions in real-life problems.

Towards Fast Finding Optimal Short Classifiers

2022 · CHAPTER · en

Studies on Explainable Artificial Intelligence show that a model should be small in order to be human understandable. The restriction on the size of a model drastically reduces the space of possible solutions. Many rule learning models still rely on greedy algorithms for generating ensembles of decision trees. This paper discusses FCA-inspired mathematical and engineering techniques to efficiently find most optimal short binary classifiers, i.e., classifiers that consist of no more than three binary attributes and are optimal w.r.t. F1 score.

Intrinsically Interpretable Document Classification via Concept Lattices

2022 · CHAPTER · en

Explanations for the predictions made by Machine Learning (ML) models are best framed in terms of abstract, high-level concepts that are easily comprehensible to human beings. The use of such concepts constitutes a subfield of interpretability methods known as concept-based explanations. This work uses concept-based explanations to build an intrinsically interpretable document classifier using a combination of Formal Concept Analysis (FCA) and approaches from applied graph theory. FCA is used to formalize the vague notion of concepts in terms of the formal concepts found in the concept lattices of various document classes. The graph of the lattice covering relation helps to utilize the topological information present in the document-class concept lattices for classifying documents. Finally, the formal concepts that made the strongest contributions to the predictions of the document classifier are revealed, along with their intents; thereby making their contribution more comprehensible to human beings.

Lazy classification of underground forums messages using pattern structures

2022, in press · ARTICLE · en

Underground forums are monitored platforms where hackers announce attacks and tools to carry out attacks on businesses or organizations. In this paper, we experiment with assessing the risk of a dataset of these messages, using pattern structures and a lazy classification scheme, with some introduced complexity-reducing elements and natural language analysis techniques. The results show a promising application of this method to this problem and serve as an introductory step for deeper investigation.

Decision Concept Lattice vs. Decision Trees and Random Forests

2021 · CHAPTER · en

Exploring the dataset structure by means of delta-classes of equivalence. The case of the Titanic dataset?

2021 · CHAPTER · en

Summation of Decision Trees

2021 · CHAPTER · en

Ensembles of decision trees, like Random Forests, are efficient machine learning models with state-of-the-art prediction quality. However, their predictions are much less transparent than those of a single decision tree. In this paper, we describe a prediction model based on a single decision tree in terms of Formal Concept Analysis. We define a differential way of describing a decision rule. We conclude by presenting an approach to summing an ensemble of decision trees into a single decision semilattice with the same predictions.

Ensemble Techniques for Lazy Classification Based on Pattern Structures

2021 · CHAPTER · en

This paper presents different versions of classification ensemble methods based on pattern structures. Each of these methods is described and tested on multiple datasets (including datasets with exclusively numerical and exclusively nominal features). As a baseline model Random Forest generation is used. For some classification tasks the classification algorithms based on pattern structures showed better performance than Random Forest. The quality of the algorithms is noticeably dependent on ensemble aggregation function and on boosting weighting scheme.

Next Priority Concept: A new and generic algorithm computing concepts from complex and heterogeneous data

2020 · CHAPTER · en

In this article, we present a new data type agnostic algorithm calculating a concept lattice from heterogeneous and complex data. Our NextPriorityConcept algorithm is first introduced and proved in the binary case as an extension of Bordat's algorithm with the notion of strategies to select only some predecessors of each concept, avoiding the generation of unreasonably large lattices. The algorithm is then extended to any type of data in a generic way. It is inspired from pattern structure theory, where data are locally described by predicates independent of their types, allowing the management of heterogeneous data.

Courses (7)

Research Seminar "Data Analysis in Applied Research"

2025/2026 · Bachelor's · English
Research Seminar "Data Analysis and Artificial Intelligence" · 3 times

2025/2026, 2024/2025, 2023/2024 · Bachelor's · English
Mentor's Seminar · 3 times

2025/2026, 2024/2025, 2023/2024 · Master's · English
Ordered Sets in Data Analysis · 5 times

2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Master's / Mago-Lego · English
Postgraduate Seminar

2024/2025 · Doctoral programme, field: 00.00.00. Doctoral programme · English
Reading the Best Doctoral Dissertations in Computer Science

2024/2025 · Doctoral programme · English
00.00.00. Doctoral programme

2023/2024 · Doctoral programme, field: 00.00.00. Doctoral programme · English