Dmitry Alekseevich Ilvovsky (Ильвовский Дмитрий Алексеевич)

Faculty of Computer Science

Tel.: +7 (495) 772-95-90 | 27319 | +7 (916) 569-70-22

Professional interests

Computational linguistics, formal concept analysis

Positions

Associate Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Research Fellow: Faculty of Computer Science, International Laboratory for Intelligent Systems and Structural Analysis

Bio

· Started working at HSE University in 2011.
· Research and teaching experience: 10 years.

Education

2017 · Candidate of Sciences
2010 · Specialist degree: Moscow Aviation Institute, specialty "Applied Mathematics and Informatics", qualification "Mathematician. Systems Programmer"

Work experience

· 2012 - present: Research and Study Laboratory of Intelligent Systems and Structural Analysis (Junior Research Fellow)
· 2007 - present: Expert, "Corporate Internet Solutions" division, FORS Development Center

Awards and honors

· Letter of Gratitude from HSE University (January 2024)
· Letter of Gratitude from the Vice Rector of HSE University (August 2021)
· Letter of Gratitude from the Faculty of Computer Science, HSE University (August 2017)
· Rector's personal bonus (2016–2017)
· Bonus for academic work (2020–2021)
· Bonus for a publication in a List A journal (or an equivalent scholarly publication) (2025–2026, 2024–2025, 2023–2024)
· Bonus for a publication in an international peer-reviewed scholarly publication (2022–2023, 2021–2022, 2017–2019)

Grants and projects

· For the degree of Candidate of Sciences

Conferences (5)

· 2023: Design of Interdisciplinary Research in the Context of Converging Models of Natural-Science and Humanities and Social Knowledge (Moscow). Talk: Artificial Intelligence as a Utility of Basic News Literacy
· 2016: Computational Linguistics and Intellectual Technologies (Dialogue 22) (Moscow). Talk: Style and Genre Classification by Means of Deep Textual Parsing
· 2016: Fifteenth National Conference on Artificial Intelligence with International Participation (KII-2016) (Smolensk). Talk: Discovering disinformation: discourse-level approach
· 2015: 7th International Joint Conference on Natural Language Processing, ACL 2015 (Beijing). Talk: Rhetoric Map of an Answer to Compound Queries
· 2015: Recent Advances in Natural Language Processing, RANLP 2015 (Hissar). Talk: Text Classification into Abstract Classes Based on Discourse Structure

Researcher identifiers

ORCID: 0000-0002-5484-372X
ResearcherID: D-9852-2014
SPIN (RSCI): 3208-3161
Google Scholar: https://scholar.google.ru/citations?hl=ru&user=n7VSUf8AAAAJ
Scopus AuthorID: 55967196200

Publications (67)

WhatTheWikiFact: Fact-Checking Claims Against Wikipedia

2021 · CHAPTER · en

Correcting Texts Generated by Transformers using Discourse Features and Web Mining

2021 · CHAPTER · en

Recent transformer-based approaches to NLG like GPT-2 can generate syntactically coherent original texts. However, these generated texts have serious flaws: global discourse incoherence and meaninglessness of sentences in terms of entity values. We address both of these flaws: they are independent but can be combined to generate original texts that will be both consistent and truthful. This paper presents an approach to estimate the quality of discourse structure. Empirical results confirm that the discourse structure of currently generated texts is inaccurate. We propose research directions to correct it using discourse features during the fine-tuning procedure. The suggested approach is universal and can be applied to different languages. Apart from that, we suggest a method to correct wrong entity values based on Web Mining and text alignment.

Aschern at CheckThat! 2021: Lambda-Calculus of Fact-Checked Claims

2021 · CHAPTER · en

We describe our system for the CLEF 2021 CheckThat! Lab Task 2 Subtask A on detecting previously fact-checked claims. We developed a pipeline using TF.IDF, sentence-BERT fine-tuned on the training data, and reranking using LambdaMART and the predicted similarity scores and positions in the ranked list as features. We examined the quality of each model on the validation set and analyzed its contribution to the final result using the trained LambdaMART. The official evaluation ranked our system 1st by a wide margin over other participants and the organizers' baseline.

FCA-based Approach for Interactive Query Refinement with IR-chatbots

2020 · CHAPTER · en

An information retrieval (IR) chatbot is a special class of virtual assistant that is widely used in customer support services. However, modern IR systems are limited to simple database queries, which do not use the full potential of interaction with the user. In this paper we implement an FCA-based approach to deliver the relevant information the user has requested. The approach integrates a concept-based model built upon the database with intelligent traversal through it. The proposed algorithm has been implemented as an additional function within an existing IR chatbot. We also outline perspectives for further development of the proposed system. Formal Concept Analysis (FCA) and Pattern Structures, its extension, are proposed to process unstructured data (objects with a text description), which has recently become a common way of presenting various items.

aschern at SemEval-2020 Task 11: It Takes Three to Tango: RoBERTa, CRF, and Transfer Learning

2020 · CHAPTER · en

Enriching Question Context with Knowledge from ConceptNet to Improve Answer Accuracy

2020 · CHAPTER · ru

Modern models for the Question Answering task can answer factual questions about a given English text fragment with accuracy close to human performance. However, such models cannot reach the same quality on datasets that require additional information not present in the question context. This paper describes an experimental evaluation of a simple method for enriching the question context by collecting ConceptNet relations, and proposes a further direction of work on creating a dataset for Russian.

Recursive Neural Text Classification Using Discourse Tree Structure for Argumentation Mining and Sentiment Analysis Tasks

2020 · CHAPTER · en

This paper considers sentiment classification of movie reviews and two argument mining tasks: verification of political statements and categorization of quotes from an Internet forum corresponding to argumentation (factual or emotional). In the case of the fact-checking problem, justifications can be used additionally in one of its sub-tasks. A strong model for solving these and similar problems still does not exist. It requires the style-based approach to achieve the best results. The proposed model effectively encodes parsed discourse trees due to the recursive neural network. The novel siamese model based on it is suggested to analyze discourse structures for the pairs of texts. In the paper, the comparison with state-of-the-art methods is given. Experiments illustrate that the proposed models are effective and reach the best results in the assigned tasks. The evaluation also demonstrates that discourse analysis improves quality for the classification of longer texts.

DSNDM: Deep Siamese Neural Discourse Model with Attention for Text Pairs Categorization and Ranking

2020 · CHAPTER · en

In this paper, the utility and advantages of the discourse analysis for text pairs categorization and ranking are investigated. We consider two tasks in which discourse structure seems useful and important: automatic verification of political statements, and ranking in question answering systems. We propose a neural network based approach to learn the match between pairs of discourse tree structures. To this end, the neural TreeLSTM model is modified to effectively encode discourse trees and DSNDM model based on it is suggested to analyze pairs of texts. In addition, the integration of the attention mechanism in the model is proposed. Moreover, different ranking approaches are investigated for the second task. In the paper, the comparison with state-of-the-art methods is given. Experiments illustrate that combination of neural networks and discourse structure in DSNDM is effective since it reaches top results in the assigned tasks. The evaluation also demonstrates that discourse analysis improves quality for the processing of longer texts.

Dialogue management using extended discourse trees

2020 · CHAPTER · en

In this paper we learn how to manage a dialogue relying on the discourse of its utterances. We consider two complementary approaches to dialogue management based on discourse text analysis to extend the abilities of an interactive information retrieval-based chatbot.

Tatar WordNet: The sources and the component parts

2020 · CHAPTER · en

We describe an ongoing project to construct the Tatar Wordnet. The Tatar Wordnet is being constructed on the basis of three source resources developed by us. The first source is TatThes, a bilingual Russian-Tatar Social-Political Thesaurus. TatThes, in turn, has been constructed by manual translation and extension of RuThes, a linguistic ontology for Russian. The second source is a Tatar translation of RuWordNet, a wordnet for Russian. This translation was carried out automatically on the basis of a Russian-Tatar dictionary and was then manually verified. The third source is a semantic classification of Tatar verbs, developed from scratch. We discuss the structure, compilation methodology, and current state of these source resources, and justify their choice as the initial resources for building the Tatar Wordnet. Our ultimate goal is to publish the Tatar Wordnet on the Linguistic Linked Open Data cloud and integrate it into the Global WordNet Grid.

Courses (5)

Automatic Text Processing · 5 times

2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Bachelor's · Russian
Mentor's Seminar · 3 times

2025/2026, 2024/2025, 2023/2024 · Master's · English
“Sincere Communications”, Artificial Intelligence, and Creative Practices

2023/2024 · Mago-Lego · Russian
SAS Technologies for Data Mining

2021/2022 · Bachelor's · English
01.03.02. Applied Mathematics and Informatics

2021/2022 · Bachelor's · English