Alexander Karpekov
PhD Student in Computer Science. Data Scientist.
Political Science and Economics student turned Big Tech guy turned Computer Science grad student. After almost ten years as a Data Scientist in industry (seven years at Google and three at a startup), I decided to go back to school to pursue a PhD in Computer Science with a focus on AI and Machine Learning.
I have a particular interest in the field of Explainable Artificial Intelligence, and the intersection of AI and social sciences. I am also always looking to explore how to use data visualizations for storytelling.
I am always open to new opportunities to collaborate on projects at the intersection of AI and other fields, so feel free to reach out!
Education
Georgia Institute of Technology
Atlanta, GA
PhD in Computer Science
Current research areas focus on Interactive Computing, Explainable AI, and Ubiquitous Computing.
Advisors: Sonia Chernova and Thomas Plötz.
Georgia Institute of Technology
Atlanta, GA
MSc in Computer Science | GPA: 3.9/4.0
Completed a second Master's degree remotely while working full-time at Google. Focused on Machine Learning and Artificial Intelligence.
University of California, San Diego
San Diego, CA
MA in Economics | GPA: 3.8/4.0
Worked as a Teaching Assistant for 3 graduate-level classes in Statistics and Econometrics, leading sessions for 120+ students. Received the best TA award. Regional focus was on China. Studied Mandarin Chinese.
MGIMO University
Moscow, Russia
BA in Political Science | GPA: 92/100
Studied Comparative Politics. Languages: English, French. Thesis on the history of migration in the United Kingdom.
Industry Experience
Senior Data Scientist (L5)
Worked as a Data Scientist in Google Search and YouTube Music, focusing on statistical data analysis and on A/B experiment design and evaluation to improve search quality and music recommendations. Was promoted twice, reaching L5. Presented my work and findings at regular director-, VP-, and executive-level meetings, including to YouTube CEO Susan Wojcicki. A few notable projects:
- Developed a pathfinding algorithm in song embedding space that improved music recommendations and led to a 3% boost in user engagement and music discovery rates. This work was presented at a Google-level Data Science Conference in 2023.
- Implemented a new methodology to cluster YouTube's multi-billion music corpus using text, sound, search, and co-watch embeddings, which led to a 30% reduction in harmful watchtime and a 0.5% increase in music revenue (hundreds of millions of dollars).
- Created a new counterfactual causal impact methodology to evaluate the effect of a new feature launch on user engagement and conversion. It established that the launch had no statistically significant long-term effect on key business metrics. The analysis was instrumental in the decision, made at the Engineering and Product VP level, to halt the global rollout.
Dataminr
London, UK & New York, NY
Data Analyst
Worked as a Data Analyst in the Data Science team, focusing on Twitter data analysis and news discovery algorithms.
- Built statistical models to automatically classify Twitter user handles.
- Conducted Twitter user clustering and unsupervised learning using network analysis methods to improve news discovery algorithms.
- Led a company-wide effort to automate reporting, replacing Excel with Python.
Publications
Transformer Explainer: Interactive Learning of Text-Generative Models
IEEE VIS, AAAI Demo Track
Aeree Cho, Grace Kim, Alexander Karpekov, et al.
An interactive visualization tool that helps users understand how transformer models work through hands-on experimentation and real-time feedback.
- Best Poster Award at IEEE VIS 2024
- Went Viral: 150K+ visitors in the first 3 months
Is Attention Truly All We Need?
Deep Learning for Text: Final Project
Alexander Karpekov, Sidney Miller
This project investigates using Transformer attention weights to derive feature importance in NLP tasks. It demonstrates that combining attention weights with gradient information improves explainability, and it provides an open-source GitHub tool for applying this method to any Transformer model.
Double-Relocation Policy Evaluation in Guangdong, China using Night Lights Data
ArcGIS: Final Project
Alexander Karpekov
This project examines Guangdong's shifting economic growth using Night Lights data from satellites, focusing on development beyond the Pearl River Delta and the impact of 2008 government policies.
Skills
Programming
- Python
- SQL
- TypeScript
- R
- Stata
- C
- Java
ML & DS
- PyTorch
- Hugging Face
- TensorFlow
- Keras
- Scikit-learn
- Statsmodels
- XGBoost
Data & Viz
- NumPy
- Pandas
- SciPy
- Jupyter
- Colab
- Matplotlib
- Altair
- Plotnine
Frontend
- Svelte
- D3
- HTML
- Tailwind CSS
- Figma
- Illustrator
Languages
- English
- Russian
- French
- German
- Mandarin
- Latin
- Ancient Greek