Dmitry Petrovich Vetrov

Faculty of Computer Science

Tel.: +7 (495) 772-95-90 | 27252

Positions

Academic Supervisor: Faculty of Computer Science, Institute of Artificial Intelligence and Digital Sciences
Research Professor: Faculty of Computer Science, Department of Big Data and Information Retrieval

Bio

· Started working at HSE University in 2014.
· Research and teaching experience: 15 years.

Education

2007 · Candidate of Sciences (PhD) in Physics and Mathematics: Lomonosov Moscow State University, specialty 01.01.09 "Discrete Mathematics and Mathematical Cybernetics", thesis title: The Influence of the Stability of Classification Algorithms on Their Accuracy
2003 · Specialist degree: Lomonosov Moscow State University, specialty "Applied Mathematics and Computer Science", qualification "Mathematician. Systems Programmer"

Work experience

· 2017-present: Head of the Centre of Deep Learning and Bayesian Methods (HSE University, Faculty of Computer Science)
· 2018-2020: Head of the Samsung Laboratory (HSE University, Faculty of Computer Science)
· 2016-present: Research Professor (HSE University, Faculty of Computer Science)
· 2016-2018: Yandex, Lead Researcher (half-time)
· 2015-2016: Skoltech, Associate Professor
· 2014-2016: HSE University, Faculty of Computer Science, Associate Professor (part-time)
· 2014-2015: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Associate Professor
· 2011-2014: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Teaching Assistant
· 2010-2012: Kurchatov Institute, NBIC Centre, Head of Laboratory (half-time)
· 2007-2011: Moscow State University, Faculty of Computational Mathematics and Cybernetics, Researcher
· Summer 2005, 2006: University of Wales, Bangor, Intern
· 2000-2007: Dorodnitsyn Computing Centre of the Russian Academy of Sciences, Mathematician (half-time)

Awards and honours

· HSE University Acknowledgement (March 2024)
· Letter of appreciation from the First Vice Rector of HSE University (February 2023)
· Honorary title "Honorary Worker of Education of the Russian Federation" (November 2022)
· HSE University Certificate of Honour (February 2022)
· HSE University Certificate of Honour (December 2015)
· Gold medal of the Russian branch of the European Academy for a series of research papers on Bayesian regularization and inference in graphical models (December 2012)
· Scholarship of the President of the Russian Federation for leading young scientists (June 2012)
· Bonus for publications making a special contribution to the international academic reputation of HSE University (2022–2025, 2021–2024)
· Bonus for a publication in an international peer-reviewed academic periodical (2019–2021, 2017–2019)
· Bonus for an article in a foreign peer-reviewed journal (2015–2017)
· Best Teacher: 2019–2020, 2019

Grants and projects

· for the academic degree of Candidate of Sciences

Conferences (1)

· 2016: Advances in Neural Information Processing Systems 2016 (Barcelona). Talk: PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions

Researcher identifiers

ORCID: 0000-0001-6863-9028
ResearcherID: H-4870-2015
RSCI SPIN: 4339-7570
Google Scholar: https://scholar.google.ru/citations?user=7HU0UoUAAAAJ&hl=ru
Scopus AuthorID: 8382687000

Publications (86)

Structured Bayesian Pruning via Log-Normal Multiplicative Noise

2017 · CHAPTER · en

Dropout-based regularization methods can be regarded as injecting random noise with a pre-defined magnitude into different parts of the neural network during training. It was recently shown that the Bayesian dropout procedure not only improves generalization but also leads to extremely sparse neural architectures by automatically setting the individual noise magnitude per weight. However, this sparsity can hardly be used for acceleration since it is unstructured. In the paper, we propose a new Bayesian model that takes into account the computational structure of neural networks and provides structured sparsity, e.g. removes neurons and/or convolutional channels in CNNs. To do this we inject noise into the neurons' outputs while keeping the weights unregularized. We establish the probabilistic model with a proper truncated log-uniform prior over the noise and truncated log-normal variational approximation that ensures that the KL-term in the evidence lower bound is computed in closed-form. The model leads to structured sparsity by removing elements with a low SNR from the computation graph and provides significant acceleration on a number of deep neural architectures. The model is very easy to implement as it only corresponds to the addition of one dropout-like layer in the computation graph.

Variational Dropout Sparsifies Deep Neural Networks

2017 · CHAPTER · en

We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator and report first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse solutions both in fully-connected and convolutional layers. This effect is similar to automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease of accuracy.

Bayesian Sparsification of Recurrent Neural Networks

2017 · CHAPTER · en

Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. Recently proposed Sparse Variational Dropout (Molchanov et al., 2017) eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural networks. To account for recurrent specifics we also rely on Binary Variational Dropout for RNN (Gal & Ghahramani, 2016b). We report 99.5% sparsity level on sentiment analysis task without a quality drop and up to 87% sparsity level on language modeling task with slight loss of accuracy.

Breaking Sticks and Ambiguities with Adaptive Skip-gram

2016 · ARTICLE · en

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take into account word ambiguity and maintains only a single representation per word. Although a number of Skip-gram modifications were proposed to overcome this limitation and learn multi-prototype word representations, they either require a known number of word meanings or learn them using greedy heuristic approaches. In this paper we propose the Adaptive Skip-gram model, which is a nonparametric Bayesian extension of Skip-gram capable of automatically learning the required number of representations for all words at the desired semantic resolution. We derive an efficient online variational learning algorithm for the model and empirically demonstrate its efficiency on the word-sense induction task.

Robust Variational Inference

2016 · PREPRINT · en

Variational inference is a powerful tool for approximate inference. However, it mainly focuses on the evidence lower bound as variational objective and the development of other measures for variational inference is a promising area of research. This paper proposes a robust modification of evidence and a lower bound for the evidence, which is applicable when the majority of the training set samples are random noise objects. We provide experiments for variational autoencoders to show advantage of the objective over the evidence lower bound on synthetic datasets obtained by adding uninformative noise objects to MNIST and OMNIGLOT. Additionally, for the original MNIST and OMNIGLOT datasets we observe a small improvement over the non-robust evidence lower bound.

Spatially Adaptive Computation Time for Residual Networks

2016 · PREPRINT · en

This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation. We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets. Additionally, we evaluate the computation time maps on the visual saliency dataset cat2000 and find that they correlate surprisingly well with human eye fixation positions.

PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions

2016 · CHAPTER · en

We propose a novel approach to reduce the computational cost of evaluation of convolutional neural networks, a factor that has hindered their deployment in low-power devices such as mobile phones. Inspired by the loop perforation technique from source code optimization, we speed up the bottleneck convolutional layers by skipping their evaluation in some of the spatial positions. We propose and analyze several strategies of choosing these positions. We demonstrate that perforation can accelerate modern convolutional networks such as AlexNet and VGG-16 by a factor of 2x - 4x. Additionally, we show that perforation is complementary to the recently proposed acceleration method of Zhang et al.

Deep Part-Based Generative Shape Model with Latent Variables

2016 · CHAPTER · en

The Shape Boltzmann Machine (SBM) and its multilabel version MSBM have been recently introduced as deep generative models that capture the variations of an object shape. While being more flexible, MSBM requires datasets with labeled parts of the objects for training. In the paper we present an algorithm for training MSBM using binary masks of objects and the seeds which approximately correspond to the locations of object parts. The latter can be obtained from part-based detectors in an unsupervised manner. We derive a latent variable model and an EM-like training procedure for adjusting the weights of MSBM using a deep learning framework. We show that the model trained by our method outperforms SBM in the tasks related to binary shapes and is very close to the original MSBM in terms of quality of multilabel shapes.

A new approach for sparse Bayesian channel estimation in SCMA uplink systems

2016 · CHAPTER · en

The rapid growth of traffic and of the number of simultaneously available devices leads to new challenges in constructing fifth-generation (5G) wireless networks. To handle them, various schemes of non-orthogonal multiple access (NOMA) have been proposed. One of these schemes is Sparse Code Multiple Access (SCMA), which has been shown to achieve better link-level performance. To support SCMA signal decoding, channel estimation is needed, and the sparse Bayesian learning framework may be used to reduce the pilot overhead requirement. In this paper we propose a modification of the sparse Bayesian learning based channel estimation algorithm that is shown to achieve better accuracy of user detection and faster convergence in numerical simulations.

Breaking Sticks and Ambiguities with Adaptive Skip-gram

2016 · CHAPTER · en

Courses (2)

Bayesian Methods in Machine Learning · 5 times

2025/2026, 2024/2025, 2023/2024, 2022/2023, 2021/2022 · Bachelor's / Bachelor's, field of study: 38.03.01 Economics / Faculty-wide pool course / Master's / Mago-Lego · ru

Neurobayesian Methods in Machine Learning

2021/2022 · Master's · ru