All Posts

Published on
January 12, 2022
Building open source projects - Lessons learned (2021)
open-source musings notes
A review of the things I have learned from building in the open over the past year. Thoughts and reflections on what it takes to grow a project and the difficulty of translating open-source success into commercial success.
Published on
December 26, 2021
Building a Responsible AI Solution - Principles into Practice
fairness ethics veritas responsible-ai data-science singapore
Translating responsible AI principles to create VerifyML. User feedback, design decisions and architecture choices in creating our responsible AI solution.
Published on
September 16, 2021
A Human-centric Approach to Fairness in AI
fairness ethics veritas philosophy responsible-ai data-science singapore
Fairness is messy and complicated. Attempts to distil it down to a single metric are unhelpful and counter-productive. As business owners and model developers, we should embrace the struggle in trying to apply fairness in artificial intelligence and data analytics models.
Published on
August 29, 2021
Creating a Rehype Syntax Highlighting Plugin
mdx javascript markdown notes
An exploration of markdown and HTML syntax trees. Documenting my experience creating rehype-prism-plus, a syntax highlighting plugin that creates pretty code blocks.
Published on
May 16, 2021
Automated Market Makers (AMM) Explained
defi amm crypto
This post introduces Automated Market Makers, a key protocol powering decentralized exchanges.
Published on
February 21, 2021
Growth Hacking Github - How to Get Github Stars
marketing musings notes
A good project is only one part of the puzzle. Getting stars is really all about marketing and promoting it. A guide on growth hacking a Github project.
Published on
February 7, 2021
TraceTogether and the Difficulty of Graph Anonymisation
graph-theory networks singapore privacy musings
An explanation of the challenges of graph anonymisation and the difficulty of striking a balance between usefulness and anonymity. Written as a response to Singapore's TraceTogether privacy saga
Published on
January 12, 2021
Introducing Tailwind Nextjs Starter Blog
next-js tailwind guide
Looking for a performant, out-of-the-box template with all the best in web technology to support your blogging needs? Check out the Tailwind Nextjs Starter Blog template.
Published on
August 1, 2020
Schelling's Segregation Model in Julia
julia learning-julia notes agent-based-models
Learn Julia by implementing Schelling's famous segregation model. You will see many similarities to Python (it's a dynamic language, so no types need to be specified) and pick up some nice syntactical properties of Julia along the way.
Published on
May 10, 2020
Benchmark of popular graph/network packages v2
benchmarks networks notes python r julia
A revised benchmark of graphs / network computation packages featuring an updated methodology and more comprehensive testing. Find out how Networkx, igraph, graph-tool, Networkit, SNAP and lightgraphs perform
Published on
March 29, 2020
Efficient Large Graph Propagation Algorithm
gcp networks notes python crypto
How we engineered a large scale label propagation algorithm at Cylynx
Published on
January 22, 2020
Serverless Machine Learning with R on Cloud Run
notes r visualization gcp serverless
The serverless way - using Google Cloud Platform to deploy simple machine learning models via Cloud Run. A fun weekend project that analyses the twitter-verse
Published on
December 17, 2019
Speeding up R Plotly web apps - R x Javascript
notes javascript r visualisation Dashboard
Tips and tricks to speed up R and plotly based web apps
Published on
May 5, 2019
Benchmark of popular graph/network packages
benchmarks networks notes python r julia
Benchmark of 5 popular graph/network packages - Networkx, igraph, graph-tool, Networkit and SNAP
Published on
February 11, 2019
Binance hackathon - 2nd place solution
javascript react visualisation networks notes crypto
Technical overview of our 2nd place solution and my experience at the Binance hackathon
Published on
January 5, 2019
Cleaning openstreetmap intersections in python
python spatial visualisation notes
In this post, I explore the problem of simplifying route intersections and document some Python code that can be used to clean and visualize OpenStreetMap data as a network representation
Published on
November 21, 2018
An Overview of the Singapore Hiring Landscape
r visualisation Singapore SG-Economy Web-Scraping
An exploratory analysis of a jobs posting dataset along with some tidbits on the Singapore hiring landscape
Published on
October 14, 2018
Visualising Networks in ASOIAF - Part II
r notes visualisation graph-theory networks
Part II in the network exploration of the Game of Thrones series. In this post, we combine the plots together and use gganimate to visualise relationships across all 5 books
Published on
September 9, 2018
Visualising Networks in ASOIAF
r notes visualisation graph-theory networks
A network exploration on the links between characters in the Game of Thrones series with the help of igraph and tidygraph
Published on
August 9, 2018
Applications of DAGs in Causal Inference
r dags notes musings causal-inference
Chains, Forks, Colliders, paths and d-separation - how DAGs can contribute to better causal inference
Published on
June 19, 2018
Feature Selection Using Feature Importance Score - Creating a PySpark Estimator
python spark big-data
Extending Pyspark's MLlib native feature selection function by using a feature importance score generated from a machine learning model and extracting the variables that are plausibly the most important
Published on
April 28, 2018
Statistical Musings
personal musings
Published on
April 8, 2018
Creating a Custom Cross-Validation Function in PySpark
python spark big-data
Custom cross-validation class written in PySpark with support for user-defined categories such as time, geography or consumer segments.
Published on
March 25, 2018
Uploading Jupyter Notebook Files to Blogdown
python blogdown
Simple guide to convert jupyter notebooks to markdown posts which can be published in your favourite static site generator
Published on
February 26, 2018
Notes on Regression - Approximation of the Conditional Expectation Function
regression ols notes
Deriving the OLS formula as a means of approximating the conditional expectation function
Published on
February 11, 2018
February Thoughts
personal musings
Published on
December 25, 2017
Notes on Graphs and Spectral Properties
graph-theory notes
A reference cheatsheet on adjacency matrix, incidence matrix, laplacian matrix and the basics of algebraic graph theory
Published on
November 23, 2017
Dashboard 2.0
Singapore SG-Economy Web-Scraping R Dashboard
SG Dashboard is now released and updated with Q3's economic results
Published on
November 18, 2017
Choosing a Control Group in a RCT with Multiple Treatment Periods
R notes simulation metrics
How should we choose the control group in a situation where we have multiple treatments and time periods? A simple statistical simulation exercise
Published on
November 5, 2017
November Reflections
personal musings
Published on
October 21, 2017
Notes on Regression - Singular Vector Decomposition
regression ols notes
Applying the SVD to the regression framework
Published on
October 11, 2017
Mapping SG - Shiny App
Singapore R spatial visualisation
An R shiny application with Leaflet
Published on
October 1, 2017
Comparing the Population and Group Level Regression
regression notes
To what extent do the coefficients obtained from a regression carried out at the group level correspond to the estimates at the individual level?
Published on
September 21, 2017
Notes on Regression - Maximum Likelihood
regression ols notes
Deriving the OLS estimator via the maximum likelihood approach
Published on
September 13, 2017
Using Leaflet in R - Tutorial
Singapore R spatial visualisation notes
A tutorial on using Leaflet in R for geospatial visualisation
Published on
September 10, 2017
Examining the Changes in Religious Beliefs - Part 2
Singapore R spatial sg-social visualisation
Exploring the changes in religious beliefs in Singapore between 2000 and 2015
Published on
August 31, 2017
Notes on Regression - Method of Moments
regression ols notes
Establishing the OLS formula via the method of moments approach
Published on
August 29, 2017
Mapping the Distribution of Religious Beliefs in Singapore
Singapore R spatial sg-social visualisation
Examining the spatial distribution of Singapore's population
Published on
August 23, 2017
Notes on Regression - Projection
regression ols notes
Deriving the OLS estimator - projection method
Published on
August 17, 2017
Thesis Thursday 7 - Conclusion
Thesis-Thursday R Stata
The last installment of the Thesis Thursday series - some miscellaneous thoughts and lessons learnt over the past few months
Published on
August 16, 2017
Notes on Regression - OLS
regression ols notes
This post is the first in a series of my study notes on regression techniques. It covers regression as a solution to the least squares minimisation problem
Published on
August 15, 2017
Update on the SG Economic Dashboard
Singapore SG-Economy R Dashboard
Updated the SG-Dashboard with 2Q 2017 numbers
Published on
July 20, 2017
Thesis Thursday 6 - The Final Stretch
Thesis-Thursday
I find a positive correlation between the foreign-born and consumption shares within U.S. counties but this result does not hold across Asian countries. In fact, an increase in foreign-born share led to a decline in consumption of Asian-related consumer packaged goods
Published on
July 1, 2017
Thesis Thursday 5 - From recipes to weights
Thesis-Thursday R
In the previous post, I provided an exploratory analysis of the allrecipe dataset. This post is a continuation and details the construction of product weights from the recipe corpus
Published on
June 24, 2017
Thesis Thursday 4 - Analysing Recipes
Thesis-Thursday Web-Scraping R
A first look at the recipe dataset scraped from allrecipes.com
Published on
June 23, 2017
Thesis Thursday 3 - Model and Methodology
Thesis-Thursday
A mathematical model of my Thesis Thursday project with some discussion of endogeneity
Published on
June 11, 2017
Binscatter for R
binscatter R
Binscatter for R - a convenient plot to observe the relationship between two variables, especially when working with large datasets
Published on
June 9, 2017
Thesis Thursday 2
Thesis-Thursday
Over the past week I made a few detours and explored other options that yielded little. On the positive side, I managed to merge and clean most of the datasets and started generating some descriptive statistics to get a better understanding of the data
Published on
June 2, 2017
Thesis Thursday - Introduction
Thesis-Thursday
I decided to document my progress on my master's thesis as a weekly Thursday special. Hopefully I will have enough material or progress to keep up the weekly posts, and this should also give me some motivation to work on it
Published on
May 15, 2017
Hello World
personal
Welcome to my blog