Now

2026

Jan

  • Played with OpenCode.
  • Reading Recursion via Pascal.
  • Reading Make Time.
  • Brought many old posts back to life.
  • Improved the stability of my LLM-based ingestion pipeline: better message queue configuration, a retry mechanism, and fallback logic (a minimal retry/fallback sketch follows this list).
  • In the process of moving to Linux; Windows has become unbearable for daily use.
  • Developed a new product understanding model for general products. It's a multi-step agent that leverages a predefined taxonomy to extract product attributes, generate product descriptions, and categorize products (sketched after this list).
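
A minimal sketch of the retry-plus-fallback pattern mentioned above. Everything here is illustrative; the function and handler names are hypothetical, not the actual pipeline code:

    import time

    def with_retry_and_fallback(primary, fallback, max_attempts=3, base_delay=1.0):
        """Call `primary`; retry with exponential backoff, then fall back."""
        for attempt in range(max_attempts):
            try:
                return primary()
            except Exception:
                if attempt < max_attempts - 1:
                    time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        return fallback()  # e.g. a rule-based extractor when the LLM keeps failing

    # Usage (hypothetical names): fall back to rules when the LLM is down.
    # result = with_retry_and_fallback(lambda: llm_enrich(msg), lambda: rule_based_enrich(msg))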
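
And roughly the shape of the taxonomy-driven multi-step agent. The taxonomy, the prompts, and the `complete` callable are placeholders standing in for any chat-completion call, not the real model:

    TAXONOMY = {
        "Apparel": ["color", "size", "material"],
        "Electronics": ["brand", "model", "voltage"],
    }

    def understand_product(title: str, raw_text: str, complete) -> dict:
        # Step 1: categorize against the predefined taxonomy.
        category = complete(
            f"Pick one category from {list(TAXONOMY)} for: {title}").strip()
        # Step 2: extract only the attributes defined for that category.
        attrs = complete(
            f"Extract {TAXONOMY.get(category, [])} as JSON from: {raw_text}")
        # Step 3: generate a description grounded in the extracted attributes.
        description = complete(
            f"Write a short product description for {title} using: {attrs}")
        return {"category": category, "attributes": attrs, "description": description}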

2025

Dec

  • Completed my studies at NUS. What a journey it has been!
  • My proposed LLM-enhanced ingestion pipeline has gone live. Some unexpected issues popped up during the initial launch.
  • Developing LLM agents for the general product ingestion pipeline. Thought about using a knowledge graph to enhance model capacity, but I did not have enough time to explore this direction.
  • Played with Raku.
  • Reading:

Nov

Oct

Sep

Aug

Jul

  • Designed and implemented an LLM-based product ingestion service. Due to the latency of LLM inference, integrating this service with the existing product indexing pipeline would violate current SLA requirements.
  • The new service is implemented in Golang (I originally wanted to use Python, but app size may affect service startup time). Some books helped me a lot:

Jun

  • On-device ML deployment: Upgraded our customized Mediapipe for the latest Unity (6.1) as well as the new SDKs from Android & iOS. The in-house version is way too old and has many issues; I will probably need to fork Mediapipe again and re-implement our SDKs.

  • Developed a product summarizer service using ChatGPT. The POC was easy; developing a production-ready service is not. The main challenge is that an LLM is not a "real-time" service, and to save cost we may need to use the Batch API. For that, a whole new service needs to be developed, focused on batch processing and on handling race conditions between database updates and batch processing (a write-back sketch follows).
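
The Batch API flow itself is straightforward (upload a JSONL file of requests, create a batch against it); the write-back is the tricky part, since a product can be re-updated while its batch is in flight. A minimal sketch, assuming an `updated_at` field on each product (MongoDB-style, illustrative names, not the real service):

    from datetime import datetime
    from openai import OpenAI

    client = OpenAI()

    # Submit: requests.jsonl holds one /v1/chat/completions request per product.
    batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    submitted_at = datetime.utcnow()  # naive UTC, matching pymongo's default datetimes

    def apply_summary(db, product_id, summary):
        """Write a batch result back only if the product hasn't changed since submit."""
        product = db.products.find_one({"_id": product_id})
        if product["updated_at"] > submitted_at:
            return False  # changed mid-batch; re-enqueue instead of overwriting
        db.products.update_one({"_id": product_id}, {"$set": {"summary": summary}})
        return True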

  • Read:

May

Apr

  • Finished both Cloud Computing & Text Mining courses. Both are practical and I really liked them, although I wish I could have spent more time on some topics taught in lectures.
  • Worked on:
    • Non-English product search: adding non-English support to the indexing backend and autocomplete services.
    • Synonym set support: it's more about allowing merchandisers to define their own product taxonomy; the trickiest part is how to adapt vector search to users' customized definitions without re-training models for each customer (a query-expansion sketch follows this list).
  • Read:
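
One way to handle the merchant-defined synonym sets above without retraining is to expand the query with the synonyms before embedding it, so the model itself never changes. A sketch with placeholder data and a placeholder `embed` function:

    SYNONYMS = {"sofa": ["couch", "settee"]}  # merchant-defined, per customer

    def embed_query(query: str, embed):
        terms = query.lower().split()
        expansions = [s for t in terms for s in SYNONYMS.get(t, [])]
        expanded = " ".join(terms + expansions)
        return embed(expanded)  # one vector for the synonym-expanded query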

Mar

  • Spent considerable effort hunting down bad search cases reported by customers. Reproducing the issues is easy, but actually pinning down the root cause is a completely different story. Long story short, it was due to wrong expansion terms from the sparse models (see the inspection sketch after this list).
  • Developed a new codebase for embedding finetuning. Loved uv after completely switching to it. I also reported some bugs to neural-cherche. Still struggled to get good performance with Splade; SparseEmbed gave me some hope, but not much.
  • Reading
  • Traveled to Bangkok: What a journey. Much fun, best food in SEA (aside from VN), and got scammed.
  • Study:
    • Tried some event detection models and implemented event deduplication. I don't like the idea of using an LLM for everything, but an LLM actually is the best out-of-the-box solution for event detection (see the detection sketch after this list).
    • Lots of coding for the other project: learned more about OpenAPI, FastAPI, MongoDB …
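
On the sparse-model bad cases: expansion terms can be inspected by mapping a query's nonzero vocabulary weights back to tokens. A sketch against a public SPLADE checkpoint (the checkpoint here is just an example, not necessarily the model in question):

    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    name = "naver/splade-cocondenser-ensembledistil"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name)

    def top_expansions(query: str, k: int = 10):
        logits = model(**tok(query, return_tensors="pt")).logits  # (1, seq, vocab)
        # SPLADE term weights: log(1 + relu(logits)), max-pooled over positions.
        weights = torch.log1p(torch.relu(logits)).max(dim=1).values.squeeze(0)
        top = torch.topk(weights, k)
        return [(tok.decode([int(i)]), round(float(w), 2))
                for i, w in zip(top.indices, top.values)]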
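
And the out-of-the-box LLM event detection is essentially one prompt. The model name and prompt below are illustrative:

    from openai import OpenAI

    client = OpenAI()

    def detect_event(text: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "user",
                "content": "Does this text describe an event? If yes, return JSON "
                           f"with name, date, and location; otherwise return null.\n\n{text}",
            }],
        )
        return resp.choices[0].message.content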

Feb

Jan

2024

Dec

  • Reading The Silmarillion.
  • Working on threshold optimization.
    • Accuracy is maintained while performance improves by 50–80%; the first feature will be released at the start of Y25 (the threshold search is sketched after this list).
  • Improving multi-lingual search quality by finetuning BGE-M3.
    • It was a huge pain 😠 😠. I stopped working on it; the training is not stable, with random gradient spikes that completely destroy the run.
  • Studying: MLOps, AWS.
  • Overhauled my homepage. I like the simplicity of the new theme.
    • Also added vale to my workflow.
    • Decoupled the deployment of lora from the homepage.
    • Upgraded to 11ty v3.
    • Added GitHub comments.
  • (VN) Reviewed Dune.
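
The threshold search above is roughly this shape: on a validation set, pick the highest score cutoff that still recalls almost all true positives, so everything below it can be skipped. A sketch with illustrative names, not the production code:

    def pick_threshold(scores, labels, min_recall=0.99):
        """Highest cutoff that keeps >= min_recall of the true positives."""
        positives = sorted(s for s, l in zip(scores, labels) if l)
        # We may drop at most a (1 - min_recall) fraction of the true positives.
        k = int(len(positives) * (1 - min_recall))
        return positives[k] if positives else 0.0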

2023

May