
Sergey Igorevich Nikolenko

Faculty of Computer Science

Publications: 89
Languages: 2
Awards: 3
Conferences: 0

Professional interests

  • 20.00.00 Informatics
  • 27.00.00 Mathematics

Positions

  • Professor, Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence

Bio

  • Started working at HSE University in 2023.

Education

  • 2009 · Candidate of Sciences in Physics and Mathematics: Saint Petersburg State University, specialty 01.01.06 "Mathematical Logic, Algebra and Number Theory", dissertation topic: New Constructions of Cryptographic Primitives Based on Semigroups, Groups, and Linear Algebra
  • 2005 · Specialist degree: Saint Petersburg State University, specialty "Mathematics", qualification "Mathematician"

Work experience

  • 2005-2008: PhD student, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg
  • 2006-2010: assistant, SPbSU ITMO, Saint Petersburg
  • 2008-2010: senior researcher, Speech Technology Center, Saint Petersburg
  • 2011-2012: senior researcher, Laboratory of Algorithmic Biology, SPbAU RAS, Saint Petersburg
  • 2011-2014: director of development, Surfingbird, Moscow
  • 2008-...: associate professor, SPbAU RAS, Saint Petersburg
  • 2008-...: researcher, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg

Awards and commendations

  • Bonus for a publication in an international peer-reviewed scientific publication (2021–2022, 2020–2022, 2018–2020)
  • Bonus for an article in a foreign peer-reviewed journal (2015–2017, 2013–2015)
  • Best Teacher: 2020–2021, 2017

Grants and projects

  • for the degree of Candidate of Sciences

Researcher identifiers

Publications (89)

Discerning Depression Propensity Among Participants of Suicide and Depression-Related Groups of Vk.com

2015 · CHAPTER · en

In online social networks, high-level features of user behavior such as character traits can be predicted with data from user profiles and their connections. Recent publications use data from online social networks to detect people with depression propensity and diagnosis. In this study, we investigate the capabilities of previously published methods and metrics applied to the Russian online social network VKontakte. We gathered user profile data from the most popular communities about suicide and depression on VK.com and performed a comparative analysis between them and randomly sampled users. We have used not only standard user attributes like age, gender, or number of friends but also structural properties of their egocentric networks, with results similar to the study of suicide propensity in the Japanese social network Mixi.com. Our goal is to test the approach and models in this new setting and propose enhancements to the research design and analysis. We investigate the resulting classifiers to identify profile features that can indicate depression propensity of the users in order to provide tools for early depression detection. Finally, we discuss further work that might improve our analysis and transfer the results to practical applications.

Communications in Computer and Information Science, Vol. 542, Springer, 2015

2015 · BOOK · en

Priority Queueing with Multiple Packet Characteristics

2015 · CHAPTER · en

Modern network elements are increasingly required to deal with heterogeneous traffic. Recent works consider processing policies for buffers that hold packets with different processing requirements (the number of processing cycles needed before a packet can be transmitted out) but uniform value, aiming to maximize the throughput, i.e., the number of transmitted packets. Other developments deal with packets of varying value but uniform processing requirement (each packet requires one processing cycle); the objective here is to maximize the total transmitted value. In this work, we consider a more general problem, combining packets with both nonuniform processing and nonuniform values in the same queue. We study the properties of various processing orders in this setting. We show that in the general case natural processing policies have poor performance guarantees, with linear lower bounds on their competitive ratio. Moreover, we show an adversarial lower bound that holds for every online policy. On the positive side, in the special case when only two different values are allowed, 1 and V, we present a policy that achieves competitive ratio (1 + (W + 2)/V), where W is the maximal number of required processing cycles. We also consider copying costs during admission.

Single Cells within the Puerto Rico Trench Suggest Hadal Adaptation of Microbial Lineages

2015 · ARTICLE · en

Hadal ecosystems are found at a depth of 6,000 m below sea level and below, occupying less than 1% of the total area of the ocean. The microbial communities and metabolic potential in these ecosystems are largely uncharacterized. Here, we present four single amplified genomes (SAGs) obtained from 8,219 m below the sea surface within the hadal ecosystem of the Puerto Rico Trench (PRT). These SAGs are derived from members of deep-sea clades, including the Thaumarchaeota and SAR11 clade, and two are related to previously isolated piezophilic (high-pressure-adapted) microorganisms. In order to identify genes that might play a role in adaptation to deep-sea environments, comparative analyses were performed with genomes from closely related shallow-water microbes. The archaeal SAG possesses genes associated with mixotrophy, including lipoylation and the glycine cleavage pathway. The SAR11 SAG encodes glycolytic enzymes previously reported to be missing from this abundant and cosmopolitan group. The other SAGs, which are related to piezophilic isolates, possess genes that may supplement energy demands through the oxidation of hydrogen or the reduction of nitrous oxide. We found evidence for potential trench-specific gene distributions, as several SAG genes were observed only in a PRT metagenome and not in shallower deep-sea metagenomes. These results illustrate new ecotype features that might perform important roles in the adaptation of microorganisms to life in hadal environments.

Online Recommender System for Radio Station Hosting: Experimental Results Revisited

2014 · CHAPTER · en

We present a new recommender system developed for the Russian interactive radio network FMhost based on a previously proposed model. The underlying model combines a collaborative user-based approach with information from tags of listened tracks in order to match user and radio station profiles. It follows an adaptive online learning strategy based on the user history. We compare the proposed algorithms and an industry standard technique based on singular value decomposition (SVD) in terms of precision, recall, and NDCG measures; experiments show that in our case the fusion-based approach shows the best results.

SAX-PAC (Scalable And eXpressive PAcket Classification)

2014 · CHAPTER · en

Efficient packet classification is a core concern for network services. Traditional multi-field classification approaches, in both software and ternary content-addressable memory (TCAMs), entail tradeoffs between (memory) space and (lookup) time. TCAMs cannot efficiently represent range rules, a common class of classification rules confining values of packet fields to given ranges. The exponential space growth of TCAM entries relative to the number of fields is exacerbated when multiple fields contain ranges. In this work, we present a novel approach which identifies properties of many classifiers which can be implemented in linear space and with worst-case guaranteed logarithmic time and allows the addition of more fields including range constraints without impacting space and time complexities. On real-life classifiers from Cisco Systems and additional classifiers from ClassBench (with real parameters), 90-95% of rules are thus handled, and the other 5-10% of rules can be stored in TCAM to be processed in parallel.

Shared Memory Buffer Management for Heterogeneous Packet Processing

2014 · CHAPTER · en

Packet processing increasingly involves heterogeneous requirements. We consider the well-known model of a shared memory switch with bounded-size buffer and generalize it in two directions. First, we consider unit-sized packets labeled with an output port and a processing requirement (i.e., packets with heterogeneous processing), maximizing the number of transmitted packets. We analyze the performance of buffer management policies under various characteristics via competitive analysis that provides uniform guarantees across traffic patterns (Borodin and El-Yaniv, 1998). We propose the Longest-Work-Drop policy and show that it is at most 2-competitive and at least √2-competitive. Second, we consider another generalization, posed as an open problem in [10], where each unit-sized packet is labeled with an output port and intrinsic value, and the goal is to maximize the total value of transmitted packets. We show first results in this direction and define a scheduling policy that, as we conjecture, may achieve constant competitive ratio. We also present a comprehensive simulation study that validates our results.

Measuring Topic Quality in Latent Dirichlet Allocation

2014 · CHAPTER · en

Topic modeling is an important direction of study for modern text mining; unsupervised mining of collections of topics is intended to produce understanding and capture the essence of issues a dataset is devoted to. However, existing techniques of topic evaluation in topic models such as latent Dirichlet allocation (LDA) are still lacking in their ability to represent human interpretability and worth for qualitative studies. In this work, we propose a novel topic quality metric that more closely corresponds to human judgement than existing ones. We support this claim with the results of an experimental study where test subjects rate LDA topics on how interpretable they are.

Latent Dirichlet Allocation: Stability and Applications to Studies of User-Generated Content

2014 · CHAPTER · en

Topic modeling, in particular the Latent Dirichlet Allocation (LDA) model, has recently emerged as an important tool for understanding large datasets, in particular, user-generated datasets in social studies of the Web. In this work, we investigate the instability of LDA inference, propose a new metric of similarity between topics and a criterion of vocabulary reduction. We show the limitations of the LDA approach for the purposes of qualitative analysis in social science and sketch some ways for improvement.

Improving Quality Of Service For Radio Station Hosting: An Online Recommender System Based On Information Fusion

2014 · PREPRINT · en

We present a new recommender system developed for the Russian interactive radio network FMhost. The system aims to improve the quality of this service; it is designed specifically to deal with small datasets, overcoming the shortage of data on observed user behavior. The underlying model combines a collaborative user-based approach with information from tags of listened tracks in order to match user and radio station profiles. It follows an adaptive online learning strategy based on both user history and implicit feedback. We compare the proposed algorithms with industry standard methods based on Singular Value Decomposition (SVD) in terms of precision, recall, and Normalized Discounted Cumulative Gain (NDCG) measures; experiments show that in our case the fusion-based approach produces the best results.

Courses (3)