Nikolai Zakharov (罗一阳)
Machine Learning Engineer
Shanghai, China (currently in Moscow, Russia) | Tel: +86 136 2105 3870 | Email: n.d.zakharov@outlook.com | LinkedIn: https://www.linkedin.com/in/nikolai-zakharov-285841a7/ | GitHub: https://github.com/yiyang92
Professional Experience
VK AI, Experimental Research Team (Russia) — Machine Learning Engineer (LLM & Audio)
Research team focused on LLMs, speech, and audio understanding within VK’s AI organization. Jan 2026 – Present
- Building LLM-driven audio and speech understanding systems and evaluation pipelines.
- Researching multimodal modeling approaches that bridge text and audio modalities.
VK Video, VK (Russia) — Machine Learning Engineer
Largest social network and one of the biggest tech companies in Russia; VK Video serves tens of millions of users. Jan 2025 – Dec 2025
- Built a TV-series detection, normalization, and metadata extraction pipeline using regex bootstrapping and LLM-generated data; improved similar video recommendations, driving a 4.5–6.3% uplift in view counts and total watch time across short and full-length formats.
- Identified and fixed scroll session bugs (~6% of cases), improving session tracking and doc2doc relevance.
- Deployed geo-ban and music filters to ensure compliance and improve anchor-based music recommendations.
NIO (Shanghai, China) — Machine Learning Engineer
Leading premium EV manufacturer in China with global expansion. Jul 2023 – Jan 2025
- Developed a RAG-based assistant for driving log analysis, accelerating root-cause investigations.
- Built LLM-based classifiers to assess idea quality and provide automated feedback, increasing the acceptance rate by ~20%.
- Developed a C++ module for real-time autonomous driving event detection, deployed in live vehicles to identify edge-case scenarios.
- Reduced Apollo preprocessing time by 50% with a custom Python converter.
Tinkoff (T‑Bank) (Moscow, Russia) — Machine Learning Engineer
Major Russian digital bank / fintech with millions of retail clients. Sep 2021 – Jan 2023
- Deployed a Russian voice conversion model, achieving 95% similarity and improving voice assistant quality.
- Reduced real-time factor by 88% for a production HiFiGAN vocoder, improving response times.
Huawei Technologies (Shenzhen, China) — Machine Learning Engineer (CBG)
Global telecom and device manufacturer; one of the world’s largest ICT companies. Jul 2019 – Sep 2021
- Built a multilingual trie-based NLU engine and automatic model update service with 80%+ request coverage, cutting inference cost by 20%.
- Built a voice cloning model for cloud and on-device environments with fine-tuning support.
Huawei Technologies — Research Intern (Noah's Ark Lab)
Huawei’s research lab focusing on AI, CV, and NLP. 2018 – 2019
- Worked on AutoML (DARTS) and GAN-based solutions for CV and face recognition datasets.
Technical Skills
- Languages: Python (PyTorch, TensorFlow, HuggingFace), C++, Go, Java
- NLP/GenAI: GPT fine-tuning, RAG, semantic search, tokenization, voice modeling
- Infra: Docker, Kubernetes, KServe, Helm, Prometheus, Grafana, distributed inference
- Spoken Languages: English (Fluent), Chinese (Fluent), Russian (Native), Turkish (Basic)
Education
Tsinghua University, Beijing, China — M.S. in Computer Science (Machine Learning) Sep 2016 – Jul 2019
- Thesis: Diversified Image Captioning with Deep Learning (written in Chinese)
- Awarded a full Chinese Government Scholarship for academic excellence
Publications & Open Source
- Towards Controllable Image Descriptions with Semi-Supervised VAE (JVCIR, 2019)
  - Proposed SCVAE for controllable stylized image captioning using labeled and unlabeled data, improving diversity and style control.
  - DOI: https://doi.org/10.1016/j.jvcir.2019.102574
- GitHub: Open-source contributions at https://github.com/yiyang92