Sergei Olegovich Kuznetsov
Faculty of Computer Science
Professional Interests
Positions
- Centre Director: Faculty of Computer Science, Centre for Language and Semantic Technologies
- Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
- Laboratory Head: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis
- Academic Supervisor of the educational programme: Data Science
Bio
- Started working at HSE University in 2006.
- Research and teaching experience: 46 years.
Education
- 2002 · Doctor of Sciences in Physics and Mathematics: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Theory of Machine Learning in Formal Concept Lattices
- 1990 · Candidate of Sciences: All-Russian Institute for Scientific and Technical Information of the Russian Academy of Sciences, specialty 05.13.17 "Theoretical Foundations of Computer Science", dissertation topic: Increasing the Validity of Hypotheses in Intelligent Systems of a Special Type
- 1985 · Specialist degree: Moscow Institute of Physics and Technology, specialty "Automatic Control Systems", qualification "Engineer-Physicist"
Work Experience
- 2014-present: Head of the Department of Data Analysis and Artificial Intelligence; Head of the International Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
- 2006-2014: Head of the Division of Applied Mathematics and Informatics, HSE University; Head of the Chair of Data Analysis and Artificial Intelligence; Head of the Research and Educational Laboratory "Intelligent Systems and Structural Analysis"
- 2006-2011: (part-time) Head of the Chair "Image Recognition and Text Processing" (ABBYY base chair), Department of Innovation and High Technology, Moscow Institute of Physics and Technology (MIPT)
- 2003-2004: Research fellow, Institute of Algebra, Dresden University of Technology (Alexander von Humboldt Foundation fellowship)
- 1985-2006: Senior laboratory assistant, junior researcher, researcher, senior researcher, leading researcher, All-Russian (until 1991: All-Union) Institute for Scientific and Technical Information (VINITI RAS)
- 1995-1999: (part-time) Senior translator and scientific editor of the International Journal of Computer and System Sciences, MAIK Nauka-Interperiodica
Awards and Accomplishments
- Commendation from the HSE University Vice Rector (October 2024)
- Commendation from the HSE University Vice Rector (November 2023)
- Medal "Recognition: 15 Years of Successful Work", HSE University (December 2022)
- Certificate of Honour from the Ministry of Science and Higher Education of the Russian Federation (July 2022)
- Letter of Appreciation from the HSE University Vice Rector (November 2021)
- Letter of Appreciation from the HSE University Rector (March 2021)
- Letter of Appreciation from the HSE University Rector (November 2020)
- Commendation from the HSE University Vice Rector (May 2019)
- Certificate of Honour from the Higher School of Economics (April 2019)
- Commendation from the Minister of Economic Development of the Russian Federation (September 2017)
- Certificate of Honour from the Higher School of Economics (November 2013)
- Honorary title "Honorary Worker of Science and Technology of the Russian Federation" (November 2012)
- Commendation from the Higher School of Economics (April 2012)
- Bonus for academic work (2009–2010, 2008–2009, 2007–2008)
- Bonus for a publication in an international peer-reviewed scholarly publication (2020–2021, 2018–2020)
- Bonus for regular publications in international peer-reviewed scholarly publications (2021–2026)
- Bonus for an article in a foreign peer-reviewed journal (2014–2016, 2012–2014, 2010–2012)
- Bonus for an article in a foreign peer-reviewed scholarly publication (2016–2018)
- Winner of the 2017 "Zolotaya Vyshka" (Golden HSE) award in the "Teaching Success" category
- Best Academic Supervisor in the "Students' Digital Skills" category, 2024–2025
- Best Academic Supervisor in the "Admission of International Students" category, 2023–2024
- Best Academic Supervisor in the "Student Recruitment" category, 2024
Grants and Projects
- for the degree of Candidate of Sciences
Conferences (8)
- 2018: The 14th International Conference on Concept Lattices and Their Applications (Olomouc). Talk: A First Study on What MDL Can Do for FCA
- 2017: 14th International Conference on Formal Concept Analysis, ICFCA 2017 (Rennes). Talk: On Overfitting of Classifiers Making a Lattice
- 2016: The 13th International Conference on Concept Lattices and Their Applications (CLA2016) (Moscow). Talk: Stability for triadic concepts
- 2015: 13th International Conference on Formal Concept Analysis, ICFCA 2015 (Nerja). Talk: Revisiting Pattern Structure Projection
- 2015: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015 (Porto). Talk: Fast Generation of Best Interval Patterns for Nonmonotonic Constraint
- 2015: 4th International Workshop "What can FCA do for Artificial Intelligence?", FCA4AI 2015 (Buenos Aires). Talk: SOFIA: How to Make FCA Polynomial?
- 2015: Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis (RuZA 2015) (Stellenbosch). Talk: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction
- 2013: 11th International Conference on Formal Concept Analysis (Dresden). Talk: FCART: A New FCA-based System for Data Analysis and Knowledge Discovery
Researcher Identifiers
- ORCID: 0000-0003-3284-9001
- ResearcherID: I-9058-2012
- RSCI SPIN: 4864-2308
- Google Scholar: https://scholar.google.ru/citations?user=YcvE4i4AAAAJ&hl=en
- Scopus AuthorID: 7202573378
Publications (82)
Gradual Discovery with Closure Structure of a Concept Lattice
2020 · CHAPTER · en
An approximate discovery of closed itemsets is usually based on either setting a frequency threshold or computing a sequence of projections. Both approaches, being incremental, do not provide any estimate of the size of the next output and do not ensure that “more interesting patterns” will be generated first. We propose to generate closed itemsets incrementally, w.r.t. the size of the smallest (cardinality-minimal or minimum) generators and show that this approach (i) exhibits anytime property, and (ii) first generates itemsets of better quality and then those of lower quality.
Adaptive Multi-model Approaches to Pattern Set Mining
2020 · CHAPTER · en
On pattern setups and pattern multistructures
2020 · ARTICLE · en
Order and lattice theory provides convenient mathematical tools for pattern mining, in particular for condensed irredundant representations of pattern spaces and their efficient generation. Formal Concept Analysis (FCA) offers a generic framework, called pattern structures, to formalize many types of patterns, such as itemsets, intervals, graphs, and sequence sets. Moreover, FCA provides generic algorithms to generate irredundantly all closed patterns, the only condition being that the pattern space is a meet-semilattice. This does not always hold, e.g. for sequential and graph patterns. Here, we discuss pattern setups consisting of descriptions making just a partial order. Such a framework can be too broad, causing several problems, so we propose a new model, dubbed pattern multistructures, lying between pattern setups and pattern structures, which relies on multilattices. Finally, we consider some techniques, namely completions, transforming pattern setups to pattern structures using sets/antichains of patterns.
Reduced vs. standard dose native E. coli-asparaginase therapy in childhood acute lymphoblastic leukemia: long-term results of the randomized trial Moscow–Berlin 2002
2019 · ARTICLE · en
Purpose: Favorable outcomes were achieved for children with acute lymphoblastic leukemia (ALL) with the first Russian multicenter trial Moscow–Berlin (ALL-MB) 91. One major component of this regimen included a total of 18 doses of weekly intramuscular (IM) native Escherichia coli-derived asparaginase (E. coli-ASP) at 10000 U/m² during three consolidation courses. ASP was initially available from Latvia, but had to be purchased from abroad at substantial costs after the collapse of the Soviet Union. Therefore, the subsequent trial ALL-MB 2002 aimed at limiting costs to a reasonable extent and also at reducing toxicity by lowering the dose for standard-risk (SR) patients to 5000 U/m² without jeopardizing efficacy. Methods: Between April 2002 and November 2006, 774 SR patients were registered in 34 centers across Russia and Belarus, 688 of whom were randomized. In arm ASP-5000 (n = 334), patients received 5000 U/m² and in arm ASP-10000 (n = 354) 10000 U/m² IM. Results: Probabilities of disease-free survival, overall survival and cumulative incidence of relapse at 10 years were comparable: 79 ± 2%, 86 ± 2% and 17.4 ± 2.1% (ASP-5000) vs. 75 ± 2%, 82 ± 2% and 17.9 ± 2.0% (ASP-10000), while death in complete remission was significantly lower in arm ASP-5000 (2.7% vs. 6.5%; p = 0.029). Conclusion: Our findings suggest that weekly 5000 U/m² E. coli-ASP IM during consolidation therapy is equally effective, more cost-efficient and less toxic than 10000 U/m² for SR patients with childhood ALL.
Numerical Pattern Mining Through Compression
2019 · CHAPTER · en
Pattern Mining (PM) has a prominent place in Data Science and finds its application in a wide range of domains. To avoid the exponential explosion of patterns different methods have been proposed. They are based on assumptions on interestingness and usually return very different pattern sets. In this paper, we propose to use a compression-based objective as a well-justified and robust interestingness measure. We define the description lengths for datasets and use the Minimum Description Length principle (MDL) to find patterns that ensure the best compression. Our experiments show that the application of MDL to numerical data provides a small and characteristic subset of patterns describing data in a compact way.
Increasing the efficiency of packet classifiers with closed descriptions
2019 · CHAPTER · en
Efficient representation of packet classifiers has become a significant challenge due to the rapid growth of data stored and processed in the forwarding, or routing, tables. In our work we propose two algorithms for reducing the size of forwarding tables both in length and width by the deletion of redundant bits and unreachable rules based on FCA analysis. We consider the task of transferring the forwarding packet to the correct destination as the task of multinomial classification. Thus, the process of reducing the forwarding table size corresponds to feature selection procedure with slight modifications. The presented techniques are based on closed descriptions and decision trees. The main challenge in applying decision trees to the task is processing the overlapping rules. To overcome this challenge we propose to employ concept-based hypotheses to delete unreachable actions assigned to the overlapping rules. The experiments were performed on data generated by the ClassBench software. The proposed approach results in significant decrease in bits in the forwarding tables as features.
On interestingness measures of formal concepts
2018 · ARTICLE · en
Formal concepts and closed itemsets proved to be of big importance for knowledge discovery, both as a tool for concise representation of association rules and a tool for clustering and constructing domain taxonomies and ontologies. Exponential explosion makes it difficult to consider the whole concept lattice arising from data, one needs to select most useful and interesting concepts. In this paper interestingness measures of concepts are considered and compared with respect to various aspects, such as efficiency of computation and applicability to noisy data and performing ranking correlation.
MDL for FCA: is there a place for background knowledge?
2018 · CHAPTER · en
Detecting logical argumentation in text via communicative discourse tree
2018 · ARTICLE · en
We solve the argument mining problem by investigating discourse and communicative text structure. A new formal graph-based structure called communicative discourse tree (CDT) is defined. It consists of a discourse tree with additional labels on edges, which stand for verbs. These verbs represent communicative actions. Discourse trees are based on rhetoric relations, extracted from a text according to Rhetoric Structure Theory. The problem is tackled as a binary classification task, where the positive class corresponds to texts with arguments and the negative class corresponds to texts with no arguments. The feature engineering for the classification task is conducted, deciding on which syntactic and discourse features are associated with logical argumentation. Text classification framework based on syntactic, discourse and communicative discourse text structures with a number of learning approaches is implemented. Evaluation on a combined data-set is provided.
How to improve the evaluation of feature sets using the minimum description length principle?
2018 · CHAPTER · ru
Courses (7)
- Research Seminar "Data Analysis in Applied Research"
  2025/2026 · Bachelor's · English
- Research Seminar "Data Analysis and Artificial Intelligence" · 3 times
  2025/2026, 2024/2025, 2023/2024 · Bachelor's · English
- Mentor's Seminar · 3 times
  2025/2026, 2024/2025, 2023/2024 · Master's · English
- Ordered Sets in Data Analysis · 5 times
  2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Master's / Mago-Lego · English
- Postgraduate Seminar
  2024/2025 · Doctoral programme, field of study: 00.00.00. Doctoral programme · English
- Reading the Best Doctoral Dissertations in Computer Science
  2024/2025 · Doctoral programme · English
- 00.00.00. Doctoral programme
  2023/2024 · Doctoral programme, field of study: 00.00.00. Doctoral programme · English