Artem is available for hire

Artem Ryzhikov

Verified Expert in Engineering

Machine Learning Developer

Location

Batumi, Adjara, Georgia

Toptal Member Since

February 22, 2022

Artem has a PhD in machine learning (ML), seven years of experience in data structures, six years in ML research, two years of working at tech startups, and four years of team management. As a senior ML engineer, he built a CV algorithm for a mobile app that reached second place in the App Store in a week with 1.5 million downloads. Artem has also worked on various other projects, including recommendation and search systems, text processing, and conventional data science.

Machine Learning, Deep Learning, Computer Vision, Time Series Analysis, Algorithms, Git, Python, Pandas, NumPy, PyTorch, Recommendation Systems, Natural Language Processing (NLP), Linux, TensorFlow, Spark, Anomaly Detection, Data Structures, Generative Adversarial Networks (GAN)

Portfolio

National Research University Higher School of Economics

Python, Computer Vision, Generative Adversarial Networks (GANs)...

Snapchat

Computer Vision, OpenCV, TensorFlow, Python, C++

Double Data

Scala, Python, Spark, PySpark, Machine Learning, Search, Social Networks...

Experience

Data Science - 7 years, Machine Learning - 7 years, Python - 7 years, Deep Learning - 5 years, PyTorch - 5 years, Computer Vision - 5 years, Spark - 2 years, Recommendation Systems - 2 years

Availability

Part-time

Preferred Environment

Linux, Vim Text Editor, Jupyter Notebook, Git

The most amazing...

...project I've developed is a mobile app that reached second place in the App Store in a week. It was sold to Snapchat along with a team.

Work Experience

Research Fellow

2017 - PRESENT

National Research University Higher School of Economics

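Learned how to conduct machine learning (ML) research in anomaly detection, time series, and domain adaptation algorithms. Designed and implemented new algorithms. Compared them with the existing methods.
Wrote scientific articles about my new ML algorithms. Published six impactful ML articles in top ML journals and became a member of the Large Hadron Collider beauty (LHCb) collaboration.
Conducted ML lectures and seminars. Co-authored several ML, deep learning (DL), and generative adversarial network (GAN) courses for Coursera and the university. Developed communication skills by giving public talks to broad audiences.
Designed and implemented a new model-agnostic anomaly augmentation technique for tabular and image data, reaching a ROC AUC boost of up to 0.08 higher than state-of-the-art methods in four out of six datasets. It was published in a top ML journal.
Designed a new anomaly detection algorithm for tabular and image data, reaching a ROC AUC boost of up to 0.1 higher than state-of-the-art methods in five out of six datasets. It was published in the Journal of Machine Learning Research.
Designed and implemented a new deep learning change point detection algorithm for multivariate data. On six benchmarks, the algorithm raised the change point score to as much as eight times that of existing state-of-the-art models.
Applied Bayesian sparsification of classification models, making them 16 times faster with no quality decrease. Deployed the C++ model to the LHCb pipeline. The paper was published in conference proceedings.
Designed and implemented a domain adaptation technique for fitting deep learning models effectively on synthetic data while avoiding overfitting. The results were published in conference proceedings.
Designed and implemented a domain adaptation technique for effectively fitting deep learning models on a small fraction of the domains present in the training dataset. The results were published in conference proceedings.
Managed three research projects with up to six people in a team. The research included designing a BERT-based semi-supervised topic modeling algorithm.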

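Technologies: Python, Computer Vision, Generative Adversarial Networks (GANs), Bayesian Inference & Modeling, Anomaly Detection, Machine Learning, Deep Learning, PyTorch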

Senior Machine Learning Engineer

2017 - 2017

Snapchat

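Designed and tested a semantic segmentation algorithm that separates the person from the background in selfies. Conducted experiments with postprocessing. The segmentation quality, measured by mean intersection over union (IoU), improved from 0.93 to 0.98.
Worked on neural networks speedup and quantization. The existing segmentation models were sped up nine times.
Implemented a real-time version of the algorithm that segments video at over 30 frames per second.
Applied hair coloring using generative models, along with other fancy filters and masks implemented in C++.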
Helped with the deployment of a neural network to mobile devices. The app reached second place in the App Store in a week with 1.5 million downloads. It was sold to Snapchat along with a team.

Technologies: Computer Vision, OpenCV, TensorFlow, Python, C++
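
For readers unfamiliar with the mean intersection over union (IoU) metric quoted for the segmentation work above, here is a minimal sketch of how such a score can be computed. It is an illustration only, not the code used in the project: the function names `iou` and `mean_iou`, the nested-list mask format, the per-image averaging, and the convention for two empty masks are all assumptions.

```python
def iou(pred, target):
    """IoU of two binary masks given as equally shaped nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a collection of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Hypothetical 2x2 masks: the first prediction over-segments by one pixel,
# the second matches its ground truth exactly.
pred_masks = [[[1, 1], [0, 0]], [[1, 0], [1, 0]]]
true_masks = [[[1, 0], [0, 0]], [[1, 0], [1, 0]]]
print(mean_iou(pred_masks, true_masks))  # 0.75
```

A score of 1.0 means the predicted and ground-truth masks coincide; each extra or missing pixel enlarges the union relative to the intersection and lowers the score.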

Data Scientist | Scala Developer

2016 - 2017

Double Data

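Implemented the first people search engine in three months within a small team of three developers. Reached search quality with 54.5% recall at 99. Gained extensive experience in test-driven development using JUnit and Mockito.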
Implemented and optimized Spark jobs for over 200TB data processing. Configured Spark for efficient batch processing.
Conducted data analytics tasks for the business. Learned to present and visualize the results.
Trained and evaluated ML models for credit scoring, reaching a 40% bad rate.

Technologies: Scala, Python, Spark, PySpark, Machine Learning, Search, Social Networks, Scikit-learn, Elasticsearch, PostgreSQL, Amazon Web Services (AWS), Flask

Data Scientist | Data Engineer

2015 - 2016

VK Group

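Optimized the existing collaborative filtering (CF) recommendation algorithm, making recommendations five times faster, which led to a 45% click-through rate (CTR) increase. The product was sold to VK Group along with a team.
Designed and implemented an effective user segmentation algorithm for the cold start problem, which led to a 70+% CTR increase, making it just 0.09% smaller than CF.
Implemented new effective recommendation, sentiment analysis, and topic modeling algorithms.
Made almost all recommendation algorithms personalized and real-time, so that they update after every user interaction.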
Refactored the entire legacy Scala code of the recommendation engine, which was around 4,000 lines long.
Applied data science analytics for the business and management.
Designed a distributed data storage architecture and data flow. Sped up data loading three times.
Managed a machine learning research team. Built a sandbox, A/B testing, and CI for testing recommendation algorithms.

Technologies: Python, Scala, Spark, Elasticsearch, Redis, Cassandra, PostgreSQL, Scikit-learn, Amazon Web Services (AWS), DataStax, RabbitMQ

Experience

PyTorch ARD

http://github.com/HolyBayes/pytorch_ard

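PyTorch Conv2d and Linear layers with built-in Bayesian sparsification that speed up any neural network by a factor of 300. The implementation automatically reduces the number of parameters during training.

The project is based on the paper "Variational Dropout Sparsifies Deep Neural Networks."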

PyTorch Implementation of TIRE

http://github.com/HolyBayes/TIRE_pytorch

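This is an unofficial PyTorch implementation of TIRE. TIRE is an autoencoder-based change point detection algorithm for time series data that uses a time-invariant representation.

More information can be found in the 2020 preprint "Change Point Detection in Time Series Data Using Autoencoders with a Time-Invariant Representation."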

PyTorch Implementation of KL-CPD

http://github.com/HolyBayes/klcpd

This is an unofficial PyTorch implementation of KL-CPD, an algorithm for time-series change point and anomaly detection.

More information can be found in the 2019 paper "Kernel Change-point Detection with Auxiliary Deep Generative Models."

EM Algorithm with Automatic Relevance Determination

http://github.com/HolyBayes/ard-em

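This is a Bayesian modification of the expectation-maximization (EM) algorithm that determines the number of components automatically. The conventional EM algorithm for reconstructing a mixture of normal distributions does not allow determining the number of mixture components. The ARD EM implementation proposes an algorithm, based on the relevance vector method, that determines the number of components automatically. The idea of the algorithm is to use, at the initial stage, a deliberately excessive number of components in the mixture and then identify the relevant components by maximizing the evidence.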

Education

2017 - 2021

PhD Degree in Computer Science

National Research University HSE - Moscow, Russia

2015 - 2017

Master's Degree in Computer Science

National Research University HSE - Moscow, Russia

2011 - 2015

Bachelor's Degree in Physics

Novosibirsk State University - Novosibirsk, Russia

Certifications

MAY 2017 - PRESENT

Yandex School of Data Analysis

Yandex

Skills

Libraries/APIs

PyTorch, Pandas, NumPy, Scikit-learn, OpenCV, TensorFlow, PySpark

Tools

Git, RabbitMQ, DataStax

Languages

Python, Scala, SQL, C++

Paradigms

Data Science, Anomaly Detection

Storage

Elasticsearch, PostgreSQL, Redis, Cassandra

Frameworks

Spark, Flask

Platforms

Linux, Amazon Web Services (AWS)

Other

Machine Learning, Deep Learning, Computer Vision, Experimental Design, Algorithms, Generative Adversarial Networks (GANs), Bayesian Inference & Modeling, Time Series Analysis, Data Structures, Recommendation Systems, Natural Language Processing (NLP), Domain Adaptation, Search, Social Networks, Generative Pre-trained Transformers (GPT)
