Sergey Igorevich Nikolenko
Faculty of Computer Science
Professional interests
Positions
- Professor, Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Bio
- Joined HSE University in 2023.
Education
- 2009 · Candidate of Sciences in Physics and Mathematics: Saint Petersburg State University, specialty 01.01.06 "Mathematical Logic, Algebra and Number Theory", dissertation topic: New constructions of cryptographic primitives based on semigroups, groups, and linear algebra
- 2005 · Specialist degree: Saint Petersburg State University, specialty "Mathematics", qualification "Mathematician"
Work experience
- 2005-2008: PhD student, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg
- 2006-2010: Assistant, SPbSU ITMO, Saint Petersburg
- 2008-2010: Senior Researcher, Speech Technology Center, Saint Petersburg
- 2011-2012: Senior Researcher, Algorithmic Biology Laboratory, SPbAU RAS, Saint Petersburg
- 2011-2014: Director of Development, Surfingbird, Moscow
- 2008-...: Associate Professor, SPbAU RAS, Saint Petersburg
- 2008-...: Researcher, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg
Awards and honors
- Bonus for a publication in an international peer-reviewed scientific publication (2021–2022, 2020–2022, 2018–2020)
- Bonus for an article in a foreign peer-reviewed journal (2015–2017, 2013–2015)
- Best Teacher: 2020–2021, 2017
Grants and projects
- For the degree of Candidate of Sciences
Researcher identifiers
- ORCID: 0000-0001-7787-2251
- ResearcherID: I-7696-2013
- SPIN RSCI: 8186-1253
- Google Scholar: http://scholar.google.ru/citations?&user=_lk95cEAAAAJ
- Scopus AuthorID: 13608710100
Publications (89)
Stable Topic Modeling with Local Density Regularization
2016 · CHAPTER · en
Topic modeling has emerged over the last decade as a powerful tool for analyzing large text corpora, including Web-based user-generated texts. Topic stability, however, remains a concern: topic models have a very complex optimization landscape with many local maxima, and even different runs of the same model yield very different topics. Aiming to add stability to topic modeling, we propose an approach to topic modeling based on local density regularization, where words in a local context window of a given word have higher probabilities to get the same topic as that word. We compare several models with local density regularizers and show how they can improve topic stability while remaining on par with classical models in terms of quality metrics.
On demand elastic capacity planning for service auto-scaling
2016 · CHAPTER · en
Cloud computing allows on-demand elastic service scaling. The capability of a service to predict resource requirements for the next operational period defines how well it will exploit the elasticity of cloud computing in order to reduce operational costs. In this work, we consider a capacity planning process for service scale-out as an online pricing model. In particular, we study the impact of buffering service requests on revenues in various settings with allocation and maintenance costs. In addition, we analyze the incurred latency implied by buffering service requests. We believe that our insights will make it possible to significantly simplify predictions and mitigate the unknowns of future demands on resources.
How to represent IPv6 forwarding tables on IPv4 or MPLS dataplanes
2016 · CHAPTER · en
The Internet routing ecosystem is facing substantial scalability challenges on the data plane. Various “clean slate” architectures for representing forwarding tables (FIBs), such as IPv6, introduce additional constraints on efficient implementations from both lookup time and memory footprint perspectives due to significant classification width. In this work, we propose an abstraction layer able to represent IPv6 FIBs on existing IP and even MPLS infrastructure. Feasibility of the proposed representations is confirmed by an extensive simulation study on real IPv6 forwarding tables, including low-level experimental performance evaluation.
FIB Efficiency in Distributed Platforms
2016 · CHAPTER · en
The Internet routing ecosystem is facing substantial scalability challenges due to continuous, significant growth of the state represented in the data plane. Distributed switch architectures introduce additional constraints on efficient implementations from both lookup time and memory footprint perspectives. In this work we explore efficient FIB representations in common distributed switch architectures. Our approach introduces substantial savings in memory footprint transparently for existing hardware. Our results are supported by an extensive simulation study on real IPv4 and IPv6 FIBs.
ARTM vs. LDA: an SVD Extension Case Study
2016 · CHAPTER · en
In this work, we compare two extensions of two different topic models for the same problem of recommending full-text items: previously developed SVD-LDA and its counterpart SVD-ARTM based on additive regularization. We show that ARTM naturally leads to the inference algorithm that has to be painstakingly developed for LDA.
Pseudo-Bimodal Community Detection in Twitter-Based Networks
2016 · CHAPTER · en
We present a novel approach to clustering Twitter users and characterizing their preferences (political or otherwise) based on the features of communication networks extracted from their tweets. We make the assumption that central users in the network, the so-called “top”, or “power” users, set the agenda, while other, “regular” users often retweet and/or mention their tweets, and behavior towards “top” users differs from the behavior of “regular” users towards each other. We show that network clustering on Twitter can be observed more distinctively on unimodal projections of specially created bimodal networks (bipartite graphs), where top users in the networks are artificially separated into a second part according to node centrality measures. We evaluate our approach on Twitter-based datasets of mentions and retweets related to Russian political protests and a benchmark English-language Twitter dataset with distinctly polarized clusters; we compare various centrality measures and show that our algorithm yields high modularity in the resulting community structure.
Stable topic modeling for web science: Granulated LDA
2016 · CHAPTER · en
Topic modeling is a powerful tool for analyzing large collections of user-generated web content, but it still suffers from problems with topic stability, which are especially important for social sciences. We evaluate stability for different topic models and propose a new model, granulated LDA, that samples short sequences of neighboring words at once. We show that gLDA exhibits very stable results.
Constructing aspect-based sentiment lexicons with topic modeling
2016 · CHAPTER · en
We study topic models designed to be used for sentiment analysis, i.e., models that extract certain topics (aspects) from a corpus of documents and mine sentiment-related labels related to individual aspects. For both direct applications in sentiment analysis and other uses, it is desirable to have a good lexicon of sentiment words, preferably related to different aspects in the words. We have previously developed a modification for several popular sentiment-related LDA extensions that trains prior hyperparameters β for specific words. We continue this work and show how this approach leads to new aspect-specific lexicons of sentiment words based on a small set of “seed” sentiment words; the lexicons are useful by themselves and lead to improved sentiment classification.
Mapping Ethnic Discourse in the Russian Blogosphere of 2010s
2015 · CHAPTER · en
Courses (3)
- Machine Learning · 4 times
2025/2026, 2024/2025, 2023/2024, 2022/2023 · Master's / Mago-Lego · Russian
- Deep Generative Models
2022/2023 · Mago-Lego / Nizhny Novgorod · English
- 01.04.02. Applied Mathematics and Informatics
2022/2023 · Master's · Russian