Nikolai Zakharov (罗一阳)
Senior Machine Learning Engineer (Multimodal / LLM)
Moscow, Russia (Open to China/Singapore/Global)
Tel: +86 136 2105 3870
Email: n.d.zakharov@outlook.com
LinkedIn: https://www.linkedin.com/in/nikolai-zakharov-285841a7/
Website: https://yiyang92.github.io/
Summary
Senior Machine Learning Engineer (Multimodal / LLM) with 8+ years of production AI at scale. Specialized in audio LLMs, multimodal systems, and inference optimization (vLLM, TensorRT). Delivered: captioning service for music recommendations (VK), RAG agents (+20%, NIO), voice synthesis (-88% vocoder RTF, Tinkoff). Tsinghua MSc; 10 years in the China tech ecosystem. Fluent in Russian, English, and Chinese.
Professional Experience
VK AI, Applied Research Team (Russia) — Senior Machine Learning Engineer
Jan 2026 – Present
- Built LLM-based caption generation service improving recommendation relevance for music videos; designed 3-stage pipeline with distillation to train metadata extractor and audio-text relevance predictor; deployed item2item recommendation quality dashboard as key metric for recsys team.
VK Video, VK (Russia) — Senior Machine Learning Engineer
Jan 2025 – Dec 2025
- Architected NER + metadata extraction pipeline (LLM + rules) for recommendation system; improved candidate generation quality, increased TVT by 25%, reduced duplicate results from 60% to 7%.
NIO (Shanghai, China) — Senior Machine Learning Engineer
Mar 2023 – Jan 2025
- Designed AI agent workflow (Golden Idea Platform): RAG retrieval + LLM assessment (+20%); built RAG-based LLM assistant; deployed C++ real-time autonomous driving (AD) event detection with on-car evaluation.
Tinkoff (T‑Bank) (Moscow, Russia) — Machine Learning Engineer
Sep 2021 – Feb 2023
- Trained and deployed Russian Voice Conversion model (95% speaker similarity) with TensorRT edge optimization; reduced HiFiGAN vocoder RTF by 88%.
Huawei Technologies (Shenzhen, China) — Machine Learning Engineer (CBG)
Jul 2019 – Sep 2021
- Built multilingual trie-based NLU engine serving large-scale voice assistant traffic (90%+ coverage); developed voice cloning system for cloud and edge.
Research Experience
Sber AI / Kandinsky Lab (Russia) — Research Collaboration
Dec 2025 – Jan 2026
- Built prompt optimization system for Kandinsky diffusion model; designed distributed LLM evaluation infrastructure with GPU cluster orchestration.
Huawei Technologies — Research Intern (Noah's Ark Lab)
2018 – 2019
- Developed AutoML (DARTS) and GAN-based identity-preserving face recognition augmentation.
Education
Tsinghua University, Beijing, China
- M.S. in Computer Science (Machine Learning), 2016 – 2019
- Full Scholarship
- Thesis: Diversified Image Captioning with Deep Learning (Chinese)
- Ph.D. in Computer Science, 2022 – 2023
- Completed coursework; returned to industry
Technical Expertise
- LLM Systems: RAG pipelines, agent frameworks, model distillation, inference optimization (vLLM, TensorRT, quantization)
- Multimodal: audio-text models, music understanding, captioning, diffusion models
- ML Infrastructure: distributed training (FSDP, DeepSpeed), Kubernetes, KServe, GPU clusters
- Languages: Python (PyTorch, HuggingFace), C++ (real-time inference), Go
Publications
- Towards Controllable Image Descriptions with Semi-Supervised VAE (JVCIR, 2019)
DOI: https://doi.org/10.1016/j.jvcir.2019.102574
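Chinese CV (English translation)
Summary
Senior Machine Learning Engineer (Multimodal / LLM) with 8+ years of production AI experience. Focused on audio LLMs, multimodal systems, and inference optimization (vLLM, TensorRT). Delivered: music recommendation captioning service (VK), RAG agents (+20%, NIO), real-time voice synthesis (-88%, Tinkoff). Tsinghua MSc; ten years in the China tech ecosystem. Fluent in Russian, English, and Chinese.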
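Professional Experience
VK AI / Applied Research Team — Senior Machine Learning Engineer
Jan 2026 – Present
- Built LLM-based caption generation service improving recommendation relevance for music videos; designed 3-stage pipeline (distillation to train metadata extractor and audio-text relevance predictor); deployed item2item recommendation quality dashboard as key metric for the recommendation team.
VK Video / Video Recommendation Team — Senior Machine Learning Engineer
Jan 2025 – Dec 2025
- Architected NER + metadata extraction pipeline (LLM + rules) for recommendation system; improved candidate generation quality, increased TVT by 25%, reduced duplicate result rate from 60% to 7%.
NIO / Autonomous Driving Large Model Team — Senior Machine Learning Engineer
Mar 2023 – Jan 2025
- Architected AI agent system (Golden Idea Platform): RAG retrieval + LLM assessment (+20%); built RAG-based LLM assistant; deployed C++ real-time AD event detection with support for on-car evaluation.
Tinkoff Bank / Speech Technology Team — Machine Learning Engineer
Sep 2021 – Feb 2023
- Trained and deployed voice conversion system (Transformer, 95% similarity) serving 10M+ monthly active users; optimized HiFiGAN vocoder (GPU operator fusion, RTF reduced by 88%); architected TensorRT edge deployment.
Huawei / Xiaoyi Voice Assistant — Machine Learning Engineer
Jul 2019 – Sep 2021
- Built multilingual trie-based NLU engine serving large-scale voice assistant traffic (90%+ coverage); developed voice cloning system for cloud and edge.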
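Research Experience
Sber AI / Kandinsky Lab — Research Collaboration
Dec 2025 – Jan 2026
- Built prompt optimization system for Kandinsky diffusion model; designed distributed LLM evaluation infrastructure with GPU cluster orchestration.
Huawei Noah's Ark Lab — Research Intern
2018 – 2019
- Developed AutoML (DARTS) and GAN-based identity-preserving face recognition data augmentation.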
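Education
Tsinghua University, Beijing, China
- M.S. (2016 – 2019), Computer Science and Technology (Machine Learning), Full Scholarship
- Thesis: Diversified Image Captioning with Deep Learning
- Ph.D. (2022 – 2023), completed coursework; returned to industry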
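Technical Expertise
- LLM Systems: RAG pipelines, agent frameworks, model distillation, inference optimization (vLLM, TensorRT, quantization)
- Multimodal: audio-text models, music understanding, captioning, diffusion models
- ML Infrastructure: distributed training (FSDP, DeepSpeed), Kubernetes, KServe, GPU clusters
- Languages: Python (PyTorch, HuggingFace), C++ (real-time inference), Go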
Publications
- Towards Controllable Image Descriptions with Semi-Supervised VAE (JVCIR, 2019)