LTDK

Résumé

TLDR: shorter ver­sion


Experiences

Now

I’m cur­rently a Senior Machine Learning Engineer at Visenze work­ing on mak­ing prod­uct search bet­ter:

  • Redesigned Multisearch: crafted the new search logic and developed various features to enhance the end-user experience, including rerank & recommendation services, natural filtering, a boosting mechanism, adaptive thresholds, and an autocompletion service. I also lead the effort to improve search quality, from implementing the algorithm codebase and experimenting with and fine-tuning text embeddings to designing virtual agents for annotation tasks.

  • Vector search engine: conducted experiments on vector search algorithms, benchmarked the search quality of SIMD-based vector calculation[1], and tuned HNSW and replication settings.

  • Object Detection: overhauled annotation guidelines, upgraded the training framework from Detectron to PyTorch[2], and developed and released multiple detection models across all Visenze solutions (search, tagging, recommendation).

I also spent about 2 years at Visenze de­vel­op­ing on-de­vice ML so­lu­tions:

  1. Built a 3D object labelling solution. It comprises: (1) an AR mobile app for capturing objects and collecting 3D data, (2) a Unity tool for importing and refining raw data, and (3) a 3D object detection codebase for training, evaluation, and visualization.
  2. Delivered virtual try-on (a sneak peek), object tracking, and hand gesture recognition solutions on mobile devices. I developed DL models designed for low memory usage and fast inference to meet the 30 FPS constraint and integrated them into the existing Unity games via native plugins. The game has more than 10M downloads on the Play Store (Android) and ranks among the top-30 Education apps on the App Store (iOS).

Other projects I was involved in:

  • Re-designed the se­quen­tial rec­om­men­da­tion al­go­rithm.
  • Improved exact product search via a new re-rank logic & advanced data augmentations[3].

Past

Before joining Visenze, I spent 2.5 years as a research assistant in Prof. Ngai-Man Cheung's lab at SUTD, focusing on:

  • Hashing-based image retrieval: We were among the first to jointly learn hashing and vector embeddings in an end-to-end manner for retrieval tasks. I implemented most of the proposed methods in Caffe and Torch (not PyTorch)[4] and contributed to 10 publications.
  • Vision-based localization: We built a 3D model of Singapore from Google Maps to infer user location based on building photos. I was responsible for porting the inference framework (originally in MATLAB) to Android devices. The most challenging task was implementing fast SIFT features on Android devices using shader programs while maintaining accuracy.

Education

Now

Since 2023, I have been pursuing a Master of Computing (Artificial Intelligence Specialisation) through part-time coursework at NUS. This program allows me to delve into AI topics that I don’t encounter in my daily work. I’ve completed several projects:

  • Menu Recommendation using Deep Learning Critic-Actor Framework [Github, Report]: Thanks to a team member, we had access to the order history of an F&B franchise in Singapore. I led implementation and model training. It helped me realize that training RL models is painful 😭😭.
  • Predicting HDB Rental Price in Singapore [Github, Report]: We studied rental price increases in Singapore using 2021–2023 HDB data and developed several prediction models, achieving #1 on the private leaderboard of the competition.
  • Blood Cell Identification: I proposed a U-Net variant that uses segmentation masks to classify 5 types of leukocytes. It achieves 92.59% accuracy[5] on the CAM16 dataset.
  • LightningCat [Report]: a cloud-based SaaS platform that provides real-time updates on Singapore’s sporting facilities, including operational status, capacity, weather-related closures, and booking slots, through an intuitive map interface to optimize facility usage and foster community engagement.
  • TemporalLens [Github, Report]: an AI-driven pipeline that tracks emerging topics, constructs event timelines & knowledge graphs, and analyzes sentiment shifts.

Past

I com­pleted my un­der­grad­u­ate de­gree at HCMUS, where I had the op­por­tu­nity to en­gage in sev­eral in­tern­ships:


Skills

Programming Languages

  • Professions: Python, Go, C++, Java, C#.
  • Hobbies: Julia, D, Crystal, Raku, Clojure.

Tools

  • Professions: PyTorch, HuggingFace, MediaPipe, OpenCV, scikit-learn, Unity, Docker, LaTeX, XGBoost.
  • Hobbies: Raylib, ImGui.


  1. Changing the default distance computation for quantized vectors from SDC to ADC. It seems stupidly simple, but it was a huge pain since the engine API always assumed that both query vectors and index vectors are quantized. ↩︎

  2. The sweet old ancient times of DL. ↩︎

  3. While doing this, I participated in Meta’s competition and got #12 on the leaderboard. ↩︎

  4. The good old days when DL engineers coded models in Lua. ↩︎

  5. TBH, the dataset seems too easy. ↩︎

  6. My first pa­per ever. ↩︎