Now
2025
July
- Designed and implemented an LLM-based product ingestion service. Due to the latency of LLM inference, integrating this service with the existing product indexing pipeline would violate current SLA requirements.
- The new service is implemented in Golang (I originally wanted to use Python, but the application size could hurt service startup time). Some books helped me a lot:
Jun
-
On-device ML deployment: Upgrade our customized MediaPipe for the latest Unity (6.1) as well as the new Android and iOS SDKs. The in-house version is far too old and has many issues, so I will probably need to fork MediaPipe again and re-implement our SDKs.
-
Develop a product summarizer service using ChatGPT. A POC is easy; developing a production-ready service is not. The main challenge is that an LLM is not a “real-time” service, and to save cost we may need to use the Batch API. That requires a whole new service focused on batch processing and on handling race conditions between database updates and batch processing.
-
Read:
May
April
- Finished both the Cloud Computing and Text Mining courses. Both are practical and I quite like them, although I wish I could have spent more time on some of the topics taught in lectures.
- Worked on:
- Non-English product search: adding non-English support to the indexing backend and autocomplete services.
- Synonym set support: It’s more about allowing merchandisers to define their own product taxonomy. The trickiest part is adapting vector search to the users’ customized definitions w/o re-training models for each customer.
- Read:
Mar
- Spent considerable effort hunting down bad search cases reported by customers. Reproducing the issues is easy, but actually pinning down the root cause is a completely different story. Long story short, it came down to wrong expansion terms in the sparse models.
- Developed a new codebase for embedding fine-tuning. Loved uv after completely switching to it. I also reported some bugs to neural-cherche. Still struggled to get good performance with SPLADE; SparseEmbed gave me some hope, but not much.
- Reading
- Introduction to Information Retrieval
- Amazon Web Services in Action: Helped me complete assignments.
- AI-Powered Search: Engaging read, though some chapters were slightly disappointing.
- Traveled to Bangkok: What a journey. Much fun, best food in SEA (aside from VN), and got scammed.
- Study:
- Tried some event detection models and implemented event deduplication. I don’t like the idea of using an LLM for everything, but an LLM is actually the best out-of-the-box solution for event detection.
- Lots of coding for the other project: learned more about OpenAPI, FastAPI, MongoDB …
Feb
-
A lot of coding & hacking:
- Developing backend services for sports facility recommendation.
- Playing around with event detection and knowledge graph construction.
-
Reading
Jan
-
Working on. DNF:
- Spell checker: I’ll write a post about the modern design of spell checkers for product search.
- Introduce personalization to rerank.
- Try LoRA on language models.
-
study@NUS: this semester I’ll take Cloud Computing and Text Mining courses.
-
Reading
- Cloud Computing
- Introduction to Information Retrieval
- Programming Pearls: Will go through all exercises in the book.
-
Done:
- DevOps, DataOps, MLOps: 2/5 - NOT Recommended.
- GNU Make Book: Review.
2024
Dec
- Reading The Silmarillion.
- Working on threshold optimization.
- Accuracy is maintained while performance improves by 50-80%. The first feature ships at the start of 2025.
- Improving multi-lingual search quality by finetuning BGE-M3.
- It was a huge pain 😠 😠. I stopped working on it: the training is unstable, with random gradient spikes that completely destroy the run.
- Studying: MLOps, AWS.
- Overhauled my homepage. I like the simplicity of the new theme.
- Also added vale to my workflow.
- Decoupled the deployment of lora from the homepage.
- Upgraded to 11ty v3.
- Added GitHub comments.
- (VN) Reviewed Dune.