
Nikolai Zakharov (罗一阳)
ML Engineer at VK AI, working with the Experimental Research Team on LLM + audio systems (since Jan 2026).
I build language models that understand audio and speech, bridging the gap between text and audio modalities. I'm based in Moscow after 10 years of living and studying in China, with industry experience across telecom, fintech, and EVs (Huawei, Tinkoff, NIO). Prior to VK AI, I worked on recommender systems at VK Video (2025).
Core expertise: Audio LLMs • Speech Recognition • Multilingual NLP • Generative AI
Quick Facts
- Languages: Russian (native), English (fluent), Chinese/Mandarin (fluent)
- Current Focus: Next-generation speech understanding systems at VK AI
- Experience: 10 years in China’s tech ecosystem
- Education: Tsinghua University (Computer Science)
- Location: Moscow, Russia
My Journey
My passion for machine learning began at the intersection of linguistics and technology. After realizing that mechanical engineering programs in China weren’t suitable for international students, I pivoted to Computer Science at Tsinghua University, where I discovered deep learning in its early days.
This decision led me through China’s tech scene and major Russian tech companies, spanning telecom, fintech, and EVs:
- Huawei — China-based global telecom and device manufacturer (one of the world’s largest ICT companies).
- Tinkoff (T‑Bank) — Russia-based digital bank/fintech with millions of retail clients.
- NIO (Shanghai) — China-based EV manufacturer and one of the leading premium EV brands.
What I Do Today
On VK AI’s Experimental Research Team, I work on:
- Audio understanding in Large Language Models
- Multilingual speech processing
- Building AI agents with voice capabilities
- Bridging text and audio modalities
Let’s Connect
I’m always interested in connecting with fellow ML engineers, researchers, and technology enthusiasts. Whether you want to discuss the latest in LLMs or audio technology, or share experiences about working in global tech markets, feel free to reach out.
Collaboration & Speaking
I’m open to:
- Speaking at conferences or meetups about AI, audio technology, or cross-cultural tech experiences
- Technical writing collaborations
- Open source contributions
- Advisory roles for ML startups
- Research collaborations in speech/audio AI
Response Times
- Telegram: Usually within a few hours
- Email: 24-48 hours on weekdays
- LinkedIn: 2-3 days
I read every message and try to respond to all genuine inquiries about AI, audio technology, or working across cultures.