Now

2025

Nov

Oct

Sep

Aug

Jul

  • Designed and implemented an LLM-based product ingestion service. Because LLM inference is slow, integrating this service into the existing product indexing pipeline would violate current SLA requirements.
  • The new service is implemented in Go (I originally wanted to use Python, but app size may affect service startup time). Some books helped me a lot:

Jun

  • On-device ML deployment: Upgrade our customized MediaPipe for the latest Unity (6.1) as well as the new Android and iOS SDKs. The in-house version is way too old and has many issues, so I will probably need to fork MediaPipe again and re-implement our SDKs.

  • Develop a product summarizer service using ChatGPT. A POC is easy; a production-ready service is not. The main challenge is that an LLM is not a “real-time” service, and to save cost we may need to use the Batch API. That requires a whole new service focused on batch processing and on handling race conditions between database updates and batch processing.

  • Read:
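
One way to handle the race between database updates and batch processing mentioned above is an optimistic version check: record the row version when a request is submitted, and discard the result if the row has changed by the time the batch returns. The sketch below is only an illustration under that assumption, not the service's actual design. `Product`, `SummaryStore`, `submit`, and `apply_result` are hypothetical names, and an in-memory dict stands in for both the database and the Batch API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Product:
    description: str
    version: int = 0               # bumped on every description update
    summary: Optional[str] = None  # last summary that was accepted


class SummaryStore:
    """In-memory stand-in for the product database (hypothetical)."""

    def __init__(self) -> None:
        self.products: Dict[str, Product] = {}
        self.pending: Dict[str, int] = {}  # product_id -> version at submit time

    def upsert(self, product_id: str, description: str) -> None:
        """Create the product, or update it and bump its version."""
        current = self.products.get(product_id)
        if current is None:
            self.products[product_id] = Product(description)
        else:
            current.description = description
            current.version += 1

    def submit(self, product_id: str) -> Tuple[str, int]:
        """Snapshot the row for a batch request and remember its version."""
        product = self.products[product_id]
        self.pending[product_id] = product.version
        return product.description, product.version

    def apply_result(self, product_id: str, summary: str) -> bool:
        """Write a batch result only if the row is unchanged since submit."""
        submitted = self.pending.pop(product_id)
        product = self.products[product_id]
        if product.version != submitted:
            return False  # stale result: the caller should resubmit
        product.summary = summary
        return True


store = SummaryStore()
store.upsert("p1", "red shoes")
store.submit("p1")                 # batch request goes out
store.upsert("p1", "blue shoes")   # product changes while the batch runs
stale_ok = store.apply_result("p1", "summary of red shoes")
store.submit("p1")                 # resubmit against the new version
fresh_ok = store.apply_result("p1", "summary of blue shoes")
```

Here the first result is rejected because the description changed after submission, and the resubmitted one is accepted. In a real database the same check would typically be a conditional update on the version column.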

May

Apr

  • Finished both Cloud Computing & Text Mining courses. Both were practical and I really liked them, although I wish I could have spent more time on some topics taught in lectures.
  • Worked on:
    • Non-English product search: adding non-English support to the indexing backend and autocomplete services.
    • Synonym set support: It’s more about allowing merchandisers to define their own product taxonomy. The trickiest part is adapting vector search to the users' customized definitions without re-training models for each customer.
  • Read:

Mar

  • Spent considerable effort hunting down bad search cases reported by customers. Reproducing issues is easy, but actually pinning down the root cause is a completely different story. Long story short, it was due to wrong expansion terms in the sparse models.
  • Developed a new codebase for embedding finetuning. Loved uv after completely switching to it. I also reported some bugs to neural-cherche. Still struggled to get good performance on SPLADE; SparseEmbed gave me some hope, but not much.
  • Reading
  • Traveled to Bangkok: What a journey. Much fun, best food in SEA (aside from VN), and got scammed.
  • Study:
    • Tried some event detection models and implemented event deduplication. I don’t like the idea of using LLMs for everything, but an LLM is actually the best out-of-the-box solution for event detection.
    • Lots of coding for the other project: learned more about OpenAPI, FastAPI, MongoDB …

Feb

Jan

2024

Dec

  • Reading The Silmarillion.
  • Working on threshold optimization.
    • Accuracy is maintained while performance improves by 50 to 80%, with the first feature released at the start of 2025.
  • Improving multi-lingual search quality by finetuning BGE-M3.
    • It was a huge pain 😠 😠. I stopped working on it: the training is unstable, with random gradient spikes that completely destroy the run.
  • Studying: MLOps, AWS.
  • Overhauled my homepage. I like the simplicity of the new theme.
    • Also added vale to my workflow.
    • Decoupled the deployment of lora from the homepage.
    • Upgraded to 11ty v3.
    • Added GitHub comments.
  • (VN) Reviewed Dune.

2023

May