Now

by Le Tan Dang Khoa

2025

Google's new requirement for apps published to the store, i.e., supporting a 16 KB page size, took me several days to figure out.
During PR review, I needed to revise the current C4 diagram and sequence diagrams of the new service. I previously used Mermaid for drawing, but later found PlantText to be much better.
Studied spectral theory and its applications in ML such as spectral clustering, Graph SVM, and recommendation.
Studied queueing theory for computer system performance analysis.
Read:

The new service for LLM-based production is about 80% done. Also spent time reading about queue systems, mostly SQS.
Got my hands dirty with the AWS SDK, Redis, and Docker for the brand-new service. Also learned a lot about writing production-ready Golang code.

Designed and implemented an LLM-based product ingestion service. Due to the latency of LLM inference, integrating this service with the existing product indexing pipeline would violate current SLA requirements.
The new service is implemented in Golang (I originally wanted to use Python, but app size may affect service startup time). Some books helped me a lot:

On-device ML deployment: upgrade our customized MediaPipe for the latest Unity (6.1) and the new Android and iOS SDKs. The in-house version is far too old and has many issues, so I will probably need to fork MediaPipe again and re-implement our SDKs.
Develop a product summarizer service using ChatGPT. A POC is easy; developing a production-ready service is not. The main challenge is that an LLM is not a “real-time” service, and to save cost we may need to use the Batch API. That requires a whole new service focused on batch processing and on handling race conditions between database updates and batch processing.
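One common way to handle that kind of race is an optimistic version check: record each product's version when the batch is submitted, and apply a result only if the version is unchanged when the batch returns. The sketch below is a minimal illustration under that assumption, not the actual service design. `Store`, `Product`, and all method names are hypothetical, and an in-memory dict stands in for the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Product:
    description: str
    version: int = 0               # bumped on every description update
    summary: Optional[str] = None  # latest accepted summary, if any


@dataclass
class Store:
    """In-memory stand-in for the product database (hypothetical)."""
    products: Dict[str, Product] = field(default_factory=dict)

    def update_description(self, product_id: str, description: str) -> None:
        p = self.products[product_id]
        p.description = description
        p.version += 1

    def snapshot_for_batch(self, ids: List[str]) -> List[Tuple[str, int, str]]:
        """Capture (id, version, text) at batch submission time."""
        return [(i, self.products[i].version, self.products[i].description)
                for i in ids]

    def apply_summary(self, product_id: str, version: int, summary: str) -> bool:
        """Write the summary only if the product is unchanged since submission."""
        p = self.products[product_id]
        if p.version != version:
            return False  # stale result: the caller should re-enqueue this product
        p.summary = summary
        return True


store = Store({"p1": Product("red shoes"), "p2": Product("blue hat")})
batch = store.snapshot_for_batch(["p1", "p2"])

# A database update lands while the batch is still being processed.
store.update_description("p1", "red running shoes")

# Pretend the batch came back; the summaries are built from the old snapshot.
results = [(pid, ver, "summary of: " + text) for pid, ver, text in batch]
stale = [pid for pid, ver, s in results if not store.apply_summary(pid, ver, s)]
```

In a real database the compare-and-write in `apply_summary` would need to be a single atomic conditional update; here it is safe only because the example is single-threaded.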
Read:

Finished both the Cloud Computing and Text Mining courses. Both are practical and I quite liked them, although I wish I could have spent more time on some of the topics taught in lectures.
Worked on:
- Non-English product search: adding non-English support to the indexing backend and autocomplete services.
- Synonym set support: it’s more about allowing merchandisers to define their own product taxonomy. The trickiest part is adapting vector search to the users' customized definitions without re-training models for each customer.
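One way to honor merchant-defined synonyms without touching the model is query-time expansion: generate query variants from the synonym sets, embed each with the same frozen encoder, and keep the best score per document. The sketch below only illustrates that idea and is not necessarily the approach used here. `toy_embed` is a bag-of-words stand-in for a real embedding model, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def expand_query(query, synonym_sets):
    """Return every variant of the query obtained by swapping synonyms in."""
    lookup = {}
    for syn_set in synonym_sets:
        for term in syn_set:
            lookup[term] = sorted(syn_set)
    options = [lookup.get(tok, [tok]) for tok in query.lower().split()]
    return [" ".join(combo) for combo in product(*options)]


def toy_embed(text):
    # Stand-in for a frozen embedding model (assumption): bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query, docs, synonym_sets):
    """Rank docs by their best similarity over all expanded query variants."""
    variants = expand_query(query, synonym_sets)
    scored = []
    for doc in docs:
        doc_vec = toy_embed(doc)
        best = max(cosine(toy_embed(v), doc_vec) for v in variants)
        scored.append((best, doc))
    return [doc for _, doc in sorted(scored, key=lambda pair: -pair[0])]


synonyms = [{"sneakers", "trainers"}]
variants = expand_query("red sneakers", synonyms)
ranked = search("red sneakers", ["red hat", "red trainers"], synonyms)
```

The number of variants grows multiplicatively with the number of expandable tokens, so a real system would likely cap the expansion or merge variants before scoring.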
Read:
- AI-Powered Search.
- Algorithmic Thinking.

Spent considerable effort hunting down bad search cases reported by customers. Reproducing the issues is easy, but pinning down the root cause is a completely different story. Long story short, it was due to wrong expansion terms in the sparse models.
Developed a new codebase for embedding finetuning. Loved uv after completely switching to it. I also reported some bugs to neural-cherche. Still struggled to get good performance with SPLADE; SparseEmbed gave me some hope, but not much.
Reading
- Introduction to Information Retrieval
- Amazon Web Services in Action: Assisted in completing assignments.
- AI-Powered Search: Engaging read, though some chapters were slightly disappointing.
Traveled to Bangkok: What a journey. Much fun, best food in SEA (aside from VN), and got scammed.
Study:
- Tried some event detection models and implemented event deduplication. I don’t like the idea of using an LLM for everything, but an LLM actually is the best out-of-the-box solution for event detection.
- Lots of coding for the other project: learned more about OpenAPI, FastAPI, MongoDB …

A lot of coding & hacking:
- Developing backend services for sport facility recommendation.
- Playing around with event detection and knowledge graph construction.
Reading
- Speech and Language Processing
- DNF Mastering Regular Expressions. My short review

~~Working on~~ DNF:
- Spell checker - I’ll write a post about modern design of spell checker for product search.
- Introduce personalization to rerank.
- Try LoRA on language models.
study@NUS: this semester I’ll take Cloud Computing and Text Mining courses.
Reading
- Cloud Computing
- Introduction to Information Retrieval
- Programming Pearls: Will go through all exercises in the book.
Done:
- DevOps, DataOps, MLOps: 2/5 - NOT Recommended.
- GNU Make Book: Review.

Reading The Silmarillion.
Working on threshold optimization.
- Accuracy is maintained while performance improves by 50-80%; the first feature is released at the start of 2025.
Improving multilingual search quality by finetuning BGE-M3.
- It was a huge pain 😠 😠. I stopped working on it: training is unstable, with random gradient spikes that completely destroy the run.
Studying: MLOps, AWS.
Overhauled my homepage. I like the simplicity of the new theme.
- Also added vale to my workflow.
- Decoupled the deployment of lora from the homepage.
- Upgraded to 11ty v3.
- Added GitHub comments.
(VN) Reviewed Dune.