DSA Faculty

Dmitry Petrovich Vetrov (Ветров Дмитрий Петрович)

Faculty of Computer Science

Tel.: +7 (495) 772-95-90 | 27252
Publications: 86 · Languages: 1 · Awards: 11 · Conferences: 1

Positions

  • Academic Supervisor: Faculty of Computer Science, Institute of Artificial Intelligence and Digital Sciences
  • Research Professor: Faculty of Computer Science, Department of Big Data and Information Retrieval

Bio

  • Began working at HSE University in 2014.
  • Research and teaching experience: 15 years.

Education

  • 2007 · Candidate of Sciences in Physics and Mathematics: Lomonosov Moscow State University, specialty 01.01.09 "Discrete Mathematics and Mathematical Cybernetics", dissertation topic: The Influence of the Stability of Classification Algorithms on Their Accuracy
  • 2003 · Specialist degree: Lomonosov Moscow State University, specialty "Applied Mathematics and Informatics", qualification "Mathematician. Systems Programmer"

Work Experience

  • 2017-present: Head of the Centre of Deep Learning and Bayesian Methods (HSE University, Faculty of Computer Science)
  • 2018-2020: Head of the Samsung Laboratory (HSE University, Faculty of Computer Science)
  • 2016-present: Research Professor (HSE University, Faculty of Computer Science)
  • 2016-2018: Yandex, Leading Researcher (half-time)
  • 2015-2016: Skoltech, Associate Professor
  • 2014-2016: HSE University, Faculty of Computer Science, Associate Professor (part-time)
  • 2014-2015: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Associate Professor
  • 2011-2014: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Assistant
  • 2010-2012: Kurchatov Institute, NBIC Centre, Head of Laboratory (half-time)
  • 2007-2011: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Researcher
  • Summers of 2005 and 2006: University of Wales, Bangor, intern
  • 2000-2007: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, Mathematician (half-time)

Awards and Commendations

  • Acknowledgement from HSE University (March 2024)
  • Letter of Appreciation from the First Vice Rector of HSE University (February 2023)
  • Honorary title "Honorary Worker of Education of the Russian Federation" (November 2022)
  • HSE University Certificate of Merit (February 2022)
  • HSE University Certificate of Merit (December 2015)
  • Gold Medal of the Russian Branch of the European Academy for a series of research papers on Bayesian regularization and inference in graphical models (December 2012)
  • Scholarship of the President of the Russian Federation for leading young scientists (June 2012)
  • Bonus for publications making a special contribution to the international academic reputation of HSE University (2022–2025, 2021–2024)
  • Bonus for a publication in an international peer-reviewed academic journal (2019–2021, 2017–2019)
  • Bonus for an article in a foreign peer-reviewed journal (2015–2017)
  • Best Teacher: 2019–2020, 2019

Grants and Projects

  • For the degree of Candidate of Sciences

Conferences (1)

  • 2016: Advances in Neural Information Processing Systems 2016 (Barcelona). Talk: PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions

Publications (86)

Multi-utility Learning: Structured-Output Learning with Multiple Annotation-Specific Loss Functions

2015 · ARTICLE · en

Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for inferring the loss functions most suitable for quantifying the consistency of solutions with the given weak annotation. We demonstrate the effectiveness of our framework on the challenging semantic image segmentation problem for which a wide variety of annotations can be used. For instance, the popular training datasets for semantic segmentation are composed of images with hard-to-generate full pixel labellings, as well as images with easy-to-obtain weak annotations, such as bounding boxes around objects, or image-level labels that specify which object categories are present in an image. Experimental evaluation shows that the use of annotation-specific loss functions dramatically improves segmentation accuracy compared to the baseline system where only one type of weak annotation is used.

Submodular Relaxation for Inference in Markov Random Fields

2015 · ARTICLE · en

In this paper we address the problem of finding the most probable state of a discrete Markov random field (MRF), also known as the MRF energy minimization problem. The task is known to be NP-hard in general, and its practical importance motivates numerous approximate algorithms. We propose a submodular relaxation approach (SMR) based on a Lagrangian relaxation of the initial problem. Unlike the dual decomposition approach of Komodakis et al. (2011), SMR does not decompose the graph structure of the initial problem but constructs a submodular energy that is minimized within the Lagrangian relaxation. Our approach is applicable to both pairwise and high-order MRFs and can take into account global potentials of certain types. We study theoretical properties of the proposed approach and evaluate it experimentally.

Multiclass Shape Model with Latent Variables (Многоклассовая модель формы со скрытыми переменными)

2015 · ARTICLE · ru

This paper considers models of object shape in images: the binary and multiclass Boltzmann models. We propose a new algorithm for training the multiclass Boltzmann shape model that requires only partial data annotation, namely a binary labelling and seeds indicating the approximate location of object parts.

Breaking Sticks and Ambiguities with Adaptive Skip-gram

2015 · PREPRINT · en

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications have been proposed to overcome this limitation and learn multi-prototype word representations, they either require a known number of word meanings or learn them using greedy heuristic approaches. In this paper we propose the Adaptive Skip-gram model, a nonparametric Bayesian extension of Skip-gram capable of automatically learning the required number of representations for all words at a desired semantic resolution. We derive an efficient online variational learning algorithm for the model and empirically demonstrate its efficiency on the word-sense induction task.

Tensorizing neural networks

2015 · CHAPTER · en

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required by commonly used fully-connected layers, making it hard to use the models on low-end devices and stopping the further increase of the model size. In this paper we convert the dense weight matrices of the fully-connected layers to the Tensor Train format such that the number of parameters is reduced by a huge factor and at the same time the expressive power of the layer is preserved. In particular, for the Very Deep VGG networks we report the compression factor of the dense weight matrix of a fully-connected layer up to 200000 times leading to the compression factor of the whole network up to 7 times.

Variational Inference for Sequential Distance Dependent Chinese Restaurant Process

2014 · ARTICLE · en

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the extensively used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable, and so far only MCMC methods have been used for inference. Because of the very different nature of ddCRP, no prior developments in variational methods for Bayesian nonparametrics are applicable. In this paper we propose a novel variational inference method for the important sequential case of ddCRP (seqddCRP) by revealing its connection with the Laplacian of the random graph constructed by the process. We develop an efficient algorithm for optimizing the variational lower bound and demonstrate its efficiency compared to the Gibbs sampler. We also apply our variational approximation to the CRP-equivalent seqddCRP-mixture model, where it can be considered an alternative to the one based on the truncated stick-breaking representation. This allowed us to achieve a significantly better variational lower bound than the variational approximation based on truncated stick-breaking for the Dirichlet process.

Variational Inference for Sequential Distance Dependent Chinese Restaurant Process

2014 · CHAPTER · en

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the extensively used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable, and so far only MCMC methods have been used for inference. Because of the very different nature of ddCRP, no prior developments in variational methods for Bayesian nonparametrics are applicable. In this paper we propose a novel variational inference method for the important sequential case of ddCRP (seqddCRP) by revealing its connection with the Laplacian of the random graph constructed by the process. We develop an efficient algorithm for optimizing the variational lower bound and demonstrate its efficiency compared to the Gibbs sampler. We also apply our variational approximation to the CRP-equivalent seqddCRP-mixture model, where it can be considered an alternative to the one based on the truncated stick-breaking representation. This allowed us to achieve a significantly better variational lower bound than the variational approximation based on truncated stick-breaking for the Dirichlet process.

Putting MRFs on a Tensor Train

2014 · CHAPTER · en

In the paper we present a new framework for dealing with probabilistic graphical models. Our approach relies on the recently proposed Tensor Train format (TT-format) of a tensor that while being compact allows for efficient application of linear algebra operations. We present a way to convert the energy of a Markov random field to the TT-format and show how one can exploit the properties of the TT-format to attack the tasks of the partition function estimation and the MAP-inference. We provide theoretical guarantees on the accuracy of the proposed algorithm for estimating the partition function and compare our methods against several state-of-the-art algorithms.

Putting MRFs on a Tensor Train

2014 · ARTICLE · en

In the paper we present a new framework for dealing with probabilistic graphical models. Our approach relies on the recently proposed Tensor Train format (TT-format) of a tensor that while being compact allows for efficient application of linear algebra operations. We present a way to convert the energy of a Markov random field to the TT-format and show how one can exploit the properties of the TT-format to attack the tasks of the partition function estimation and the MAP-inference. We provide theoretical guarantees on the accuracy of the proposed algorithm for estimating the partition function and compare our methods against several state-of-the-art algorithms.

Courses (2)