Sergey Igorevich Nikolenko
Faculty of Computer Science
Professional Interests
Positions
- Professor: Faculty of Computer Science, Department of Data Analysis and Artificial Intelligence
Bio
- Started working at HSE University in 2023.
Education
- 2009 · Candidate of Sciences (PhD) in Physics and Mathematics: Saint Petersburg State University, specialty 01.01.06 "Mathematical Logic, Algebra, and Number Theory", dissertation topic: New Constructions of Cryptographic Primitives Based on Semigroups, Groups, and Linear Algebra
- 2005 · Specialist degree: Saint Petersburg State University, specialty "Mathematics", qualification "Mathematician"
Work Experience
- 2005-2008: PhD student, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg
- 2006-2010: Assistant Lecturer, SPbSU ITMO, Saint Petersburg
- 2008-2010: Senior Researcher, Speech Technology Center, Saint Petersburg
- 2011-2012: Senior Researcher, Algorithmic Biology Laboratory, SPbAU RAS, Saint Petersburg
- 2011-2014: Director of Development, Surfingbird, Moscow
- 2008-...: Associate Professor, SPbAU RAS, Saint Petersburg
- 2008-...: Researcher, Laboratory of Mathematical Logic, PDMI RAS, Saint Petersburg
Awards and Honors
- Bonus for a publication in an international peer-reviewed academic outlet (2021–2022, 2020–2022, 2018–2020)
- Bonus for an article in a foreign peer-reviewed journal (2015–2017, 2013–2015)
- Best Teacher: 2020–2021, 2017
Grants and Projects
- For the degree of Candidate of Sciences
Researcher Identifiers
- ORCID: 0000-0001-7787-2251
- ResearcherID: I-7696-2013
- SPIN (RSCI): 8186-1253
- Google Scholar: http://scholar.google.ru/citations?&user=_lk95cEAAAAJ
- Scopus AuthorID: 13608710100
Publications (89)
Balancing Work and Size with Bounded Buffers
2014 · CHAPTER · en
We consider the fundamental problem of managing a bounded size queue buffer where traffic consists of packets of varying size, each packet requires several rounds of processing before it can be transmitted out, and the goal is to maximize the throughput, i.e., total size of successfully transmitted packets. Our work addresses the tension between two conflicting algorithmic approaches: favoring packets with fewer processing requirements as opposed to packets of larger size. We present a novel model for studying such systems and study the performance of online algorithms that aim to maximize throughput.
Single and Multiple Buffer Processing
2014 · CHAPTER · en
Buffer management policies are online algorithms that control a limited buffer of packets with homogeneous or heterogeneous characteristics, deciding whether to accept new packets when they arrive, which packets to process and transmit, and possibly whether to push out packets already residing in the buffer. Although settings differ, the problem is always to achieve the best possible competitive ratio, i.e., find a policy with good worst-case guarantees in comparison with an optimal offline clairvoyant algorithm.
Analysis and interpretation of imaging mass spectrometry data by clustering mass-to-charge images according to their spatial similarity
2013 · ARTICLE · en
Imaging mass spectrometry (imaging MS) has emerged in the past decade as a label-free, spatially resolved, and multipurpose bioanalytical technique for direct analysis of biological samples from animal tissue, plant tissue, biofilms, and polymer films. Imaging MS has been successfully incorporated into many biomedical pipelines where it is usually applied in the so-called untargeted mode, capturing the spatial localization of a multitude of ions from a wide mass range. An imaging MS data set usually comprises thousands of spectra and tens to hundreds of thousands of mass-to-charge (m/z) images and can be as large as several gigabytes. Unsupervised analysis of an imaging MS data set aims at finding hidden structures in the data with no a priori information used and is often exploited as the first step of imaging MS data analysis. We propose a novel, easy-to-use and easy-to-implement approach to answer one of the key questions of unsupervised analysis of imaging MS data: what do all m/z images look like? The key idea of the approach is to cluster all m/z images according to their spatial similarity so that each cluster contains spatially similar m/z images. We propose a visualization of both spatial and spectral information obtained using clustering that provides an easy way to understand what all m/z images look like. We evaluated the proposed approach on matrix-assisted laser desorption ionization imaging MS data sets of a rat brain coronal section and human larynx carcinoma and discussed several scenarios of data analysis.
Comment-Based Discussion Communities In The Russian LiveJournal And Their Topical Coherence
2013 · PREPRINT · en
This paper studies the structure of online discussions in order to identify latent communities that discuss socially significant issues. The study shows that discussion communities formed through mutual commenting in the Russian-language blogosphere are centered primarily around the authors of the commented posts as opinion leaders and, to a lesser extent, around the topics of the posts. The conclusions are based on a sample of 17,386 posts written by the top two thousand LiveJournal bloggers over one week, together with about 520 thousand comments that form about 4.5 million edges in the co-commenting network.
Interval Semi-supervised LDA: Classifying Needles in a Haystack
2013 · CHAPTER · en
An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to find the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments. We present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis.
Interval Semi-Supervised LDA: Classifying Needles in a Haystack
2013 · CHAPTER · en
An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to find the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments. We present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis.
BayesHammer: Bayesian clustering for error correction in single-cell sequencing
2013 · ARTICLE · en
Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic. We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BAYESHAMMER. While BAYESHAMMER was designed for single-cell sequencing, we demonstrate that it also improves on existing error correction tools for multi-cell sequencing data while working much faster on real-life datasets. We benchmark BAYESHAMMER on both k-mer counts and actual assembly results with the SPADES genome assembler.
Multi-Queued Network Processors for Packets with Heterogeneous Processing Requirements
2013, in press · CHAPTER · en
Modern network processors (NPs) increasingly deal with packets with heterogeneous processing requirements. In this work, we consider the fundamental problem of managing a bounded size buffer at the input queue of an NP. Incoming traffic consists of packets, each packet requiring several rounds of processing before it can be transmitted out of the queue. The objective is to maximize the total number of successfully transmitted packets. In such an environment, it is well known that Shortest-Remaining-Processing-Time (SRPT) first scheduling with push-out is optimal [1]. However, it is hard to implement both priority queueing (PQ) by remaining processing and the push-out mechanism simultaneously in an NP. We explore alternatives for this architecture, addressing the simplicity vs. performance system design tradeoffs. We design a simplified architecture and provide worst-case guarantees for its throughput performance in different settings. We also conduct a comprehensive simulation study that validates our results.
Efficient Demand Assignment in Multi-Connected Microgrids
2013, in press · CHAPTER · en
With the proliferation of distributed generation, an electrical load can be satisfied either by a centralized generator or by local/nearby distributed generators. Given a set of resource demands in a collection of geographically co-located microgrids that are connected to the central grid and also potentially to each other, each such demand characterized by a power level and a duration, we study algorithms that allocate generation resources to the set of demands by configuring switched paths from sources to loads.
Towards Efficient Implementation of Packet Classifiers in SDN/OpenFlow
2013, in press · CHAPTER · en
Traffic classification is a core problem underlying efficient implementation of network services. In this work we draw from our experience in classifier design for commercial systems to address this problem in SDN and OpenFlow. We identify methods from other fields of computer science and show research directions that can be applied for efficient design of packet classifiers. Proposed abstractions and design patterns can significantly reduce requirements on network elements and enable deployment of functionality that would be infeasible in a traditional way.
Courses (3)
- Machine Learning · 4 times
  2025/2026, 2024/2025, 2023/2024, 2022/2023 · Master's programme / Mago-Lego · Russian
- Deep Generative Models
  2022/2023 · Mago-Lego / Nizhny Novgorod · English
- 01.04.02. Applied Mathematics and Informatics
  2022/2023 · Master's programme · Russian