Ильвовский Дмитрий Алексеевич

Faculty of Computer Science

Tel.: +7 (495) 772-95-90 | 27319 | +7 (916) 569-70-22

Professional interests

computational linguistics, formal concept analysis

Positions

Associate Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Research Fellow: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis

Bio

· Started working at HSE University in 2011.
· Academic and teaching experience: 10 years.

Education

2017 · Candidate of Sciences
2010 · Specialist degree: Moscow Aviation Institute, specialty "Applied Mathematics and Informatics", qualification "Mathematician. Systems Programmer"

Work experience

· 2012 - present: Research and Educational Laboratory for Intelligent Systems and Structural Analysis (Junior Research Fellow)
· 2007 - present: Expert, Corporate Internet Solutions division, FORS Development Center

Awards and acknowledgements

· Acknowledgement from HSE University (January 2024)
· Acknowledgement from the Vice Rector of HSE University (August 2021)
· Acknowledgement from the Faculty of Computer Science, HSE University (August 2017)
· Rector's personal bonus (2016–2017)
· Bonus for academic work (2020–2021)
· Bonus for a publication in a List A journal (or an equivalent scholarly publication) (2025–2026, 2024–2025, 2023–2024)
· Bonus for a publication in an international peer-reviewed scholarly publication (2022–2023, 2021–2022, 2017–2019)

Grants and projects

· for the degree of Candidate of Sciences

Conferences (5)

· 2023: Design of Interdisciplinary Research in the Context of Converging Models of Natural-Science and Humanities and Social-Science Knowledge (Moscow). Talk: Artificial Intelligence as a Utility for Basic News Literacy
· 2016: Computational Linguistics and Intellectual Technologies (Dialogue 22) (Moscow). Talk: Style and Genre Classification by Means of Deep Textual Parsing
· 2016: Fifteenth National Conference on Artificial Intelligence with International Participation (КИИ-2016) (Smolensk). Talk: Discovering disinformation: discourse-level approach
· 2015: 7th International Joint Conference on Natural Language Processing, ACL 2015 (Beijing). Talk: Rhetoric Map of an Answer to Compound Queries
· 2015: Recent Advances in Natural Language Processing, RANLP 2015 (Hissar). Talk: Text Classification into Abstract Classes Based on Discourse Structure

Researcher identifiers

ORCID: 0000-0002-5484-372X
ResearcherID: D-9852-2014
SPIN (RSCI): 3208-3161
Google Scholar: https://scholar.google.ru/citations?hl=ru&user=n7VSUf8AAAAJ
Scopus AuthorID: 55967196200

Publications (67)

Shaped-Charge Learning Architecture for the Human–Machine Teams

2023 · ARTICLE · en

In spite of great progress in recent years, deep learning (DNN) and transformers have strong limitations for supporting human–machine teams due to a lack of explainability, information on what exactly was generalized, and machinery to be integrated with various reasoning techniques, and weak defense against possible adversarial attacks of opponent team members. Due to these shortcomings, stand-alone DNNs have limited support for human–machine teams. We propose a Meta-learning/DNN → kNN architecture that overcomes these limitations by integrating deep learning with explainable nearest neighbor learning (kNN) to form the object level, having a deductive reasoning-based meta-level control learning process, and performing validation and correction of predictions in a way that is more interpretable by peer team members. We address our proposal from structural and maximum entropy production perspectives.

Sense-Annotated Corpus for Russian

2022 · CHAPTER · en

We present a sense-annotated corpus for Russian. The resource was obtained by manually annotating texts from OpenCorpora, an open corpus for the Russian language, with senses from the Russian wordnet RuWordNet. The annotation was used as a test collection for comparing unsupervised (Personalized PageRank) and pseudo-labeling methods for Russian word sense disambiguation.

Batch-Softmax Contrastive Loss for Pairwise Sentence Scoring Tasks

2022 · CHAPTER · en

The use of contrastive loss for representation learning has become prominent in computer vision, and it is now getting attention in Natural Language Processing (NLP). Here, we explore the idea of using a batch-softmax contrastive loss when fine-tuning large-scale pre-trained transformer models to learn better task-specific sentence embeddings for pairwise sentence scoring tasks. We introduce and study a number of variations in the calculation of the loss as well as in the overall training procedure; in particular, we find that a special data shuffling can be quite important. Our experimental results show sizable improvements on a number of datasets and pairwise sentence scoring tasks including classification, ranking, and regression. Finally, we offer detailed analysis and discussion, which should be useful for researchers aiming to explore the utility of contrastive loss in NLP.

Organizing Contexts as a Lattice of Decision Trees for Machine Reading Comprehension

2022 · CHAPTER · en

Supported decision trees, first proposed to boost the performance and explainability of expert systems built upon texts, can become a strong basis for machine reading comprehension (MRC) systems. A supported decision tree is built by constructing and combining the corresponding discourse trees for the text passage. In this work, we build an environment of supported decision trees for the MRC task. Each answer is represented by a path of a supported decision tree, and the whole corpus of answers then forms a lattice of supported decision trees. This environment gives a boost to MRC performance, handling cases where it is nontrivial to determine which document/passage MRC needs to be applied to.

CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media

2022 · CHAPTER · en

While there has been substantial progress in developing systems to automate fact-checking, they still lack credibility in the eyes of the users. Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return an article that explains their decision. This is a sensible approach as people trust manual fact-checking, and as many claims are repeated multiple times. Yet, a major issue when building such systems is the small number of known tweet–verifying article pairs available for training. Here, we aim to bridge this gap by making use of crowd fact-checking, i.e., mining claims in social media for which users have responded with a link to a fact-checking article. In particular, we mine a large-scale collection of 330,000 tweets paired with a corresponding fact-checking article. We further propose an end-to-end framework to learn from this noisy data based on modified self-adaptive training, in a distant supervision scenario. Our experiments on the CLEF’21 CheckThat! test set show improvements over the state of the art by two points absolute. Our code and datasets are available at https://github.com/mhardalov/crowdchecked-claims

A Survey of Methods for Assessing Text Complexity in Banking Regulation

2022 · CHAPTER · ru

Assessing text complexity is an important and timely task in natural language processing. In banking, for example, experts observe a trend toward increasing text complexity across all areas of financial regulation, which makes the texts hard to understand even for professionals. This can lead to divergent interpretations, so a text should be written in plain language and be understandable to its intended readers. But to manage text complexity, one must be able to measure it. This paper surveys methods for assessing text complexity, covering both classical approaches based on syntactic characteristics of the text and advanced methods based on universal vector language models.

Concept-based chatbot for interactive query refinement in product search

2021 · CHAPTER · en

Relying on Discourse Analysis to Answer Complex Questions by Neural Machine Reading Comprehension

2021 · CHAPTER · en

Relying on Discourse Trees to Extract Medical Ontologies from Text

2021 · CHAPTER · en

Transformers: “The End of History” for Natural Language Processing?

2021 · CHAPTER · en

Courses (5)

Automatic Text Processing · 5 times

2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Bachelor's · Rus

Mentor's Seminar · 3 times

2025/2026, 2024/2025, 2023/2024 · Master's · Eng

“Sincere Communications”, Artificial Intelligence, and Creative Practices

2023/2024 · Mago-Lego · Rus

SAS Technologies for Data Mining

2021/2022 · Bachelor's · Eng

01.03.02. Applied Mathematics and Informatics

2021/2022 · Bachelor's · Eng