Dmitry Valerievich Soshnikov
Faculty of Computer Science
Professional Interests
Positions
- Associate Professor: Faculty of Computer Science, School of Software Engineering
Bio
- Joined HSE University in 2009.
- Teaching and research experience: 11 years.
Education
- 2006 · Academic title: Associate Professor
- 2002 · Candidate of Sciences in Physics and Mathematics; thesis: Methods and Tools for Building Distributed Intelligent Systems Based on Production-Frame Knowledge Representation
- 2002 · Postgraduate studies: Moscow State Aviation Institute, specialty "05.13.11 Mathematical and Software Support for Computers, Computer Complexes, and Computer Networks"
- 1999 · Specialist degree: Moscow State Aviation Institute (Technical University), specialty "Applied Mathematics", qualification "Mathematician-Engineer"
Awards and Honors
- Letter of Appreciation from the First Vice Rector of HSE University (May 2024)
- Letter of Appreciation from the HSE Faculty of Computer Science (August 2017)
- Best Teacher, 2011
Researcher IDs
- ORCID: 0000-0003-1021-091X
- ResearcherID: N-4321-2017
- Google Scholar: https://scholar.google.com/citations?hl=en&user=g8512yQAAAAJ
- Scopus AuthorID: 23394045400
Publications (16)
Selected Issues in the Digital Transformation of Education
2024 · BOOK · ru
The monograph addresses current research problems in the digital transformation of education. It covers directions related to the search for a common path of development for the digital transformation of education, such as artificial intelligence in education and the analysis of learning analytics data. It is intended for educational researchers, doctoral students, master's students, and undergraduates, to orient them toward new and relevant research, and for teaching staff of educational and research organizations interested in applying information technology in education.
Experimental use of educational materials developed using artificial intelligence in natural science education
2024 · ARTICLE · en
This study explores the potential of modern generative models for automatically generating educational task texts. Building on prior research in educational task generation, we focused on leveraging generative artificial intelligence to create tasks based on textbook content, leading to the development of a multiple-choice educational task generator. This tool, powered by a large language model, empowers educators to independently craft tasks for their courses. In the experiments, teachers from various disciplines were involved in selecting topics for the generation of educational materials. The results demonstrate the capability of modern large language models to generate simple text-based multiple-choice questions suitable for use. While the current need for manual verification and refinement of distractors by educators presents a challenge, it is anticipated that generative AI will address this soon. The study sheds light on the potential of generative AI in education.
EXPERIMENTAL GENERATION OF EDUCATIONAL TASKS IN NATURAL SCIENCE DISCIPLINES USING ARTIFICIAL INTELLIGENCE
2023 · ARTICLE · en
This work examined the suitability of modern generative models for automatically creating the texts of educational tasks. In the first part, we carried out a bibliometric mapping of the research field related to automatic question generation, using three databases as sources: Lens, Dimensions, and the ACM Digital Library. In the second part, we compared the ability of three generative systems (ChatGPT-3.5, YaGPT, GigaChat) to formulate various kinds of tasks from a textbook text: multiple-choice questions, open-ended questions, and essay topics for a given text fragment. The source material was a fragment of a fifth-grade biology textbook describing the difference between living and non-living things. For each task, we assessed the generative model's ability to formulate diverse question variants and to write questions in JSON format, as well as the correctness of the questions the models produced.
Analyzing COVID-19 Medical Papers Using Artificial Intelligence: Insights for Researchers and Medical Professionals
2022 · ARTICLE · en
Since the beginning of the COVID-19 pandemic almost two years ago, more than 700,000 scientific papers have been published on the subject. An individual researcher cannot possibly get acquainted with such a huge text corpus, so some help from artificial intelligence (AI) is highly needed. We propose an AI-based tool to help researchers navigate collections of medical papers in a meaningful way and extract knowledge from scientific COVID-19 papers. The main idea of our approach is to get as much semi-structured information from the text corpus as possible, using named entity recognition (NER) with a model called PubMedBERT and the Text Analytics for Health service, and then store the data in a NoSQL database for fast further processing and insight generation. Additionally, the contexts in which the entities were used (neutral or negative) are determined. Applying NLP and text-based emotion detection (TBED) methods to the COVID-19 text corpus allows us to gain insights on important issues of diagnosis and treatment (such as changes in medical treatment over time, joint treatment strategies using several medications, and the connection between signs and symptoms of coronavirus).
COVID-19 reproduction number estimated from SEIR model: association with people's mobility in 2020
2021 · PREPRINT · en
This paper is an exploratory study of two epidemiological questions on a worldwide basis. How fast is the disease spreading? Do the restrictions on people (especially mobility restrictions) bring the expected effect? To answer the first question, we propose a tool for estimating the reproduction number of an epidemic (the number of secondary infections Rt) based on the SEIR model and compare it with a non-model Rt estimation. To measure the Rt of COVID-19 for different countries, real-time data on daily coronavirus infections, recoveries, and deaths are retrieved from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. To assess the effectiveness of mobility restrictions during the COVID-19 pandemic in 2020, the correlations between the Rt and people's mobility (based on the Apple mobility index) are presented. The correlations were considered for 12 countries, and for most of them the correlations are negative. This shows a delay in the implementation of mobility restrictions: the countries imposed them in response to growth in new COVID-19 cases, rather than preventively.
mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing
2021 · PREPRINT · en
In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more 'fields' in the process of data preparation and feature extraction. Thus, most data preparation tasks can be expressed in the form of a neat linear 'pipeline', similar in syntax to UNIX pipes, or the |> functional composition operator in F#. We define basic operations on multi-field data streams, which resemble classical monadic operations, and show the similarity of the proposed approach to monads in functional programming. We also show how the library was used in complex deep learning tasks of event detection in video, and discuss different evaluation strategies that allow for different compromises in terms of memory and performance.
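The idea of a lazy stream of named dictionaries that is enriched field by field can be illustrated with plain Python generators. The sketch below is not mPyPl's API: the names `Stage`, `as_records`, `apply`, and `keep` are hypothetical helpers invented for this illustration, and the `|` chaining only imitates the UNIX-pipe style the abstract mentions.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Record = Dict[str, Any]


class Stage:
    """Wraps a stream-to-stream function so stages can be chained with `|`."""

    def __init__(self, fn: Callable[[Iterable[Record]], Iterator[Record]]):
        self.fn = fn

    def __ror__(self, stream: Iterable[Record]) -> Iterator[Record]:
        # Called for `stream | stage` because generators define no `__or__`.
        return self.fn(stream)


def as_records(field: str, values: Iterable[Any]) -> Iterator[Record]:
    """Hypothetical helper: turn plain values into one-field records, lazily."""
    return ({field: v} for v in values)


def apply(src: str, dst: str, fn: Callable[[Any], Any]) -> Stage:
    """Hypothetical helper: add field `dst`, computed from field `src`."""
    def run(stream: Iterable[Record]) -> Iterator[Record]:
        for rec in stream:
            yield {**rec, dst: fn(rec[src])}
    return Stage(run)


def keep(pred: Callable[[Record], bool]) -> Stage:
    """Hypothetical helper: lazily drop records that fail the predicate."""
    def run(stream: Iterable[Record]) -> Iterator[Record]:
        return (rec for rec in stream if pred(rec))
    return Stage(run)


# Nothing is computed until `list` pulls records through the pipeline.
result = list(
    as_records("text", ["lazy streams", "of", "named dictionaries"])
    | apply("text", "n_words", lambda s: len(s.split()))
    | keep(lambda rec: rec["n_words"] > 1)
    | apply("text", "upper", str.upper)
)
```

Each stage receives and returns a generator, so records flow through one at a time and every `apply` adds a new named field without touching the existing ones.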
Using Text Analytics for Health to Get Meaningful Insights from a Corpus of COVID Scientific Papers
2021 · PREPRINT · en
Since the beginning of the COVID pandemic, around 700,000 scientific papers have been published on the subject. A human researcher cannot possibly get acquainted with such a huge text corpus, so AI-based tools that help navigate this corpus and derive useful insights from it are highly needed. In this paper, we use the Text Analytics for Health pre-trained service together with some cloud tools to extract knowledge from scientific papers, gain insights, and build a tool to help researchers navigate the paper collection in a meaningful way.
Estimation of Time-Dependent Reproduction Number for Global COVID-19 Outbreak
2020 · PREPRINT · en
Real-time estimation of the parameters characterising infectious disease transmission is important for optimizing quarantine interventions during outbreaks. One of the most significant parameters is the effective reproduction number, the number of secondary cases produced by a single infection. The current study presents an approach for estimating the effective reproduction number and its application to the COVID-19 outbreak. The method is based on fitting the SIR epidemic model to observation data in a sliding time window and makes it possible to show the real-time dynamics of the reproduction number at any phase of the epidemic for countries globally. Online data on daily COVID-19 infections, recoveries, and deaths are used. Finally, the time-dependent reproduction number is explored in connection with the dynamics of people's mobility. The method makes it possible to assess the disease transmission potential and understand the effect of interventions on epidemic spread. It can also be easily adapted to future outbreaks of different pathogens. The tool is available online as Python code from the GitHub repository.
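A sliding-window SIR fit of the kind described above can be sketched in a few lines. This is not the authors' code: it assumes a discrete-time SIR model, uses synthetic data generated by that same model instead of real case counts, and fits the transmission rate and recovery rate by least squares through the origin on daily increments. All function names and parameter values are illustrative assumptions.

```python
from typing import List, Tuple

State = Tuple[float, float, float]  # (susceptible, infected, recovered)


def simulate_sir(beta: float, gamma: float, n: float, i0: float, days: int) -> List[State]:
    """Generate synthetic daily (S, I, R) values from a discrete-time SIR model."""
    s, i, r = n - i0, i0, 0.0
    out = [(s, i, r)]
    for _ in range(days):
        new_inf = beta * s * i / n   # new infections this day
        new_rec = gamma * i          # new recoveries this day
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        out.append((s, i, r))
    return out


def slope_through_origin(xs: List[float], ys: List[float]) -> float:
    """Least-squares slope of y = k * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def sliding_rt(series: List[State], n: float, window: int = 7) -> List[float]:
    """Fit beta and gamma in each window; return Rt = (beta / gamma) * S / N."""
    rts = []
    for start in range(len(series) - window):
        chunk = series[start:start + window + 1]
        force = [s * i / n for s, i, _ in chunk[:-1]]
        infected = [i for _, i, _ in chunk[:-1]]
        new_inf = [chunk[k][0] - chunk[k + 1][0] for k in range(window)]
        new_rec = [chunk[k + 1][2] - chunk[k][2] for k in range(window)]
        beta = slope_through_origin(force, new_inf)
        gamma = slope_through_origin(infected, new_rec)
        s_mid = chunk[window // 2][0]  # susceptibles near the window centre
        rts.append(beta / gamma * s_mid / n)
    return rts


# Synthetic epidemic with assumed parameters (basic reproduction number 3).
series = simulate_sir(beta=0.3, gamma=0.1, n=1_000_000, i0=10, days=120)
rts = sliding_rt(series, n=1_000_000, window=7)
print(f"Rt at start: {rts[0]:.2f}, at end: {rts[-1]:.2f}")
```

On real data the daily increments are noisy and recoveries are often under-reported, so a practical fit would need smoothing and a more careful error model than this toy version.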
mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing
2019 · ARTICLE · en
In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more ’fields’ in the process of data preparation and feature extraction. Thus, most data preparation tasks can be expressed in the form of a neat linear ’pipeline’, similar in syntax to UNIX pipes, or |> functional composition operator in F#. We define basic operations on multi-field data streams, which resemble classical monadic operations, and show similarity of the proposed approach to monads in functional programming. We also show how the library was used in complex deep learning tasks of event detection in video, and discuss different evaluation strategies that allow for different compromises in terms of memory and performance.
Microsoft Programs in Modern IT Education for School and University Students.
2015 · CHAPTER · ru
The volume presents abstracts of papers and talks by participants of the Thirteenth Open All-Russian Conference "Teaching Information Technology in the Russian Federation". The conference was organized by the Association of Computer and Information Technology Enterprises (APKIT, www.apkit.ru) jointly with Perm State National Research University (PGNIU, www.psu.ru), with the support of the Ministry of Education and Science of the Russian Federation, the Ministry of Communications and Mass Media of the Russian Federation, the Government of Perm Krai, the Ministry of Education and Science of Perm Krai, and the Russian Union of Rectors.
Courses (3)
- Applied Use of Generative Neural Networks in Creative Industries and Prompt Design · 2 times
  2025/2026, 2024/2025 · Bachelor's / Bachelor's, field of study: 54.03.01 Design / Nizhny Novgorod / Perm · Rus
- Building Creative Production Infrastructure with Artificial Intelligence Tools
  2025/2026 · Bachelor's · Rus
- Functional and Logic Programming · 5 times
  2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Bachelor's · Eng