Nicole

Experience & Education

Speciality

Large Model Acceleration Evangelist who declares "GPU minutes = enterprise goldmines." Bridges the gap between R&D and production deployment.

Experience

1. Deployment Engineer, Hugging Face (3 years)
2. Slashed AI financial chatbot latency from 3.2s to 0.8s
3. Inventor of WhaleFlux Real-time Monitoring System

Education

1. MS Machine Learning, University of Toronto
2. BEng Computer Engineering, National University of Singapore

Posts

Text Generation Inference: Scaling LLM Deployment with Hugging Face and WhaleFlux

Nicole September 12, 2025

How to Split LLM Computation Across Different Computers: A Distributed Computing Guide

Nicole September 12, 2025

How to List and Manage Models on vLLM Server: A Complete Guide

Nicole September 11, 2025

How to Split and Serve Large Language Models Across GPUs: PowerInfer and Beyond

Nicole September 11, 2025

LLM Companies and Their Notable Large Language Models

Nicole August 28, 2025

How to Leverage LLM Tools to Enhance Your Professional Life

Nicole August 28, 2025

How LLMs Answer Questions in Different Languages

Nicole August 27, 2025

Token: The Hidden Currency Powering Large Language Models

Nicole August 25, 2025

How LLM Applications Are Making Daily Tasks Way Easier?

Nicole August 21, 2025