Talks
Let's start GraphQL: structure, behavior, and architecture. Berlin, 2020
In this talk, I describe how to get started with GraphQL in a company that already has experience with a Python stack and REST APIs. We go from the definition of GraphQL, via behavioral aspects and data management, to the most common architectural questions.
Exceeding Classical: Probabilistic Data Structures in Data-Intensive Applications, EuroSciPy 2019, Bilbao, Spain.
In this talk, I explain the five most important data processing problems that occur across different domains and can be solved efficiently with probabilistic data structures and algorithms. We cover membership querying, counting unique elements, frequency and rank estimation in data streams, and similarity. [Slides are here.]
Too Much Data? - Just Sample, Just Hash, ... , Pittsburgh Code & Supply, May 31, 2019.
Probabilistic Data Structure (PDS) concepts have been incorporated into Spark SQL. They are also used by Amazon Redshift and Google BigQuery. PDS is therefore not just an interesting academic topic.
An Introduction to Time Series Forecasting with Python, PyCon UA, April 28-29, 2018.
In this talk, we learn the basic theoretical concepts without going deep into the mathematics, study different models, and try them in practice using StatsModels, Prophet, scikit-learn, and Keras.
Load distribution with DNS Delegation
The talk is about the problem of balancing load without a single point of failure, with built-in support for users' geographic location.
Implementing a Fileserver with Nginx and Lua
Using the power of Nginx, it is easy to implement fairly complex file upload logic with metadata and authorization support, without the need for a heavy application server. In this article, you can find a basic implementation of such a fileserver using only Nginx and Lua.
Recurrent Neural Networks. Part 1: Theory
In this presentation, I cover the basic aspects of two popular RNN architectures: LSTM and GRU.
Ukrainian Food Traditions for beginners
You have heard about Salad Olivje, Vereniki, Pirogi and Bliny, but are still unsure what they are all about? This easy Pecha Kucha presentation can help you become an expert :).
Probabilistic data structures. Part 4. Similarity.
In this presentation, I describe popular algorithms that use the Locality Sensitive Hashing (LSH) approach to solve the similarity problem. I start with LSH in general, and then move on to algorithms such as MinHash (LSH for Jaccard similarity) and SimHash (LSH for cosine similarity). Each approach comes with the math behind it and simple examples that clarify the theory.
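As a rough illustration of the MinHash idea named in this abstract, here is a minimal Python sketch. It is not taken from the slides: the function names `minhash_signature` and `estimate_jaccard`, the seeded SHA-256 hashing, and the default of 128 hash functions are all assumptions made for the example. The fraction of signature positions on which two sets agree serves as an estimate of their Jaccard similarity.

```python
import hashlib


def _hash(seed, item):
    # Hypothetical helper: derive one of many hash functions by seeding SHA-256.
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=128):
    # For each hash function, keep the minimum hash value over the set.
    # Assumes `items` is a non-empty collection.
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    # The share of positions where two signatures agree estimates Jaccard similarity.
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


sig_a = minhash_signature({"apple", "banana", "cherry", "date"})
sig_b = minhash_signature({"banana", "cherry", "date", "elderberry"})
print(estimate_jaccard(sig_a, sig_b))  # an estimate of the true Jaccard similarity, 3/5
```

More hash functions give a tighter estimate at the cost of a longer signature.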
Probabilistic data structures. Part 3. Frequency.
In this presentation, I describe popular and very simple data structures and algorithms used to estimate the frequency of elements or to find the most frequent values in a data stream, such as Count-Min Sketch, the Majority Algorithm, and the Misra-Gries Algorithm. Each approach comes with the math behind it and simple examples that clarify the theory.
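To give a flavor of one of the structures named in this abstract, below is a minimal Count-Min Sketch in Python. It is a sketch under stated assumptions, not the version from the slides: the class name, the `width` and `depth` defaults, and the seeded SHA-256 hashing are all chosen for illustration. Each element increments one counter per row, and the estimate is the minimum of those counters, so the estimate can exceed the true count but never falls below it.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch; parameters are assumptions, not tuned values."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by seeding SHA-256 with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best estimate.
        return min(self.table[row][self._index(row, item)] for row in range(self.depth))


sketch = CountMinSketch()
for word in ["a", "b", "a", "c", "a"]:
    sketch.add(word)
print(sketch.estimate("a"))  # at least 3; exactly 3 unless hash collisions occur
```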
Probabilistic data structures. Part 2. Cardinality.
In this presentation, I describe common data structures and algorithms used to estimate the number of distinct elements in a set (cardinality), such as Linear Counting, HyperLogLog, and HyperLogLog++. Each approach comes with the math behind it and simple examples that clarify the theory.
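Linear Counting, the simplest of the algorithms named in this abstract, can be sketched in a few lines of Python. This is an illustrative version under assumptions of my own choosing (the function name `linear_counting`, SHA-256 as the hash, and a 1024-bit bitmap): every element sets one bit, and the cardinality is estimated as -m * ln(V), where m is the bitmap size and V is the fraction of bits still zero.

```python
import hashlib
import math


def linear_counting(items, num_bits=1024):
    # One byte per bit keeps the example simple; a real bitmap would pack bits.
    bitmap = bytearray(num_bits)
    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        bitmap[int.from_bytes(digest[:8], "big") % num_bits] = 1

    zeros = num_bits - sum(bitmap)
    if zeros == 0:
        raise ValueError("bitmap is saturated; use a larger num_bits")
    # Estimate n = -m * ln(V), with V the fraction of zero bits.
    return -num_bits * math.log(zeros / num_bits)


stream = [f"user-{i % 100}" for i in range(10_000)]  # 100 distinct values
print(round(linear_counting(stream)))  # an estimate close to 100
```

Duplicates hash to the same bit, so repeated elements do not change the estimate.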
Probabilistic data structures. Part 1. Membership.
In this presentation, I describe probabilistic data structures such as the Bloom Filter and the Quotient Filter. Each structure comes with simple examples that clarify the theory.
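For readers who have not met a Bloom Filter before, here is a minimal Python sketch of the idea. It is not the implementation from the slides: the class name, the 1024-bit size, the three hash functions, and the seeded SHA-256 hashing are assumptions made for the example. An added element always tests as present, while an element that was never added may occasionally test as present too (a false positive).

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes are assumptions, not tuned values."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive several hash functions by seeding SHA-256 with an index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


seen = BloomFilter()
seen.add("alice@example.com")
print("alice@example.com" in seen)  # True: added elements are never reported absent
```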