Is it possible to beat the lottery? In this post, we analyze data from Toto, a Singapore lottery with twice-weekly draws.
Mapping Global Cuisine with Word Embeddings
How can we systematically identify analogous food items across cultures? Learn how using word embeddings.
Random Forest Tutorial: Predicting Goals in Soccer
Learn how a random forest model can help us to predict the probability of a goal, with applications ranging from performance appraisal to match-fixing detection.
Kernel Density Plots
Visualizing soccer data, we identify hot spots where most shots occur, compare shot preferences across players, and contrast teams such as Liverpool and Manchester United.
Self-Organizing Maps Tutorial
Visualize large datasets and identify potential clusters with this special breed of neural networks that uses neurons to learn the intrinsic shape of your data.
Layman’s Guide to A/B Testing
A/B tests help you decide between two options, A and B. Read this step-by-step guide on conducting your own A/B test to make the right decisions.
Time Series Analysis with Generalized Additive Models
Whenever you spot a trend plotted against time, you are looking at a time series. The de facto choice for studying financial market performance and weather forecasts, time series analysis is one of the most pervasive techniques because of its inextricable relation to time: we are always interested in foretelling the future.
Artificial Neural Networks Introduction (Part II)
In the 2nd part of our tutorial on artificial neural networks, we cover 3 techniques to improve prediction accuracy: distortion, mini-batch gradient descent and dropout.
k-Nearest Neighbors & Anomaly Detection Tutorial
Do you know what gives red and white wine their colors? Use k-NN to discover the chemical make-up that defines typical types of wines, as well as to detect atypical ones.
Random Forest Tutorial: Predicting Crime in San Francisco
Learn how random forests, an ensemble of decision trees, can help predict where and when a crime will happen in San Francisco, California.
Decision Trees Tutorial
Decision trees can be used to identify customer profiles or to predict who will resign. Using the Titanic dataset, learn about their advantages and pitfalls, as well as better alternatives.
Principal Component Analysis Tutorial
You are exploring the nutritional content of food. How can food items be differentiated? How might they be classified? PCA derives underlying variables that help you slice your data for these insights.
Build your own Deep Learning Box
Want to use deep learning for your analysis but don't know where to start? This tutorial teaches you how to build your own deep learning box, from hardware purchase to software installation.
Where Will Your Country Stand in World War III?
Using weapons trade data, we map out who's against who in the complex arena of international politics.
Association Rules and the Apriori Algorithm
You own a store. How do you discover purchasing patterns, such as which items tend to be bought together? Knowing this can improve your product placement and advertisement.
Artificial Neural Networks (ANN) Introduction
Modern smartphone apps allow you to recognize handwriting and convert it into typed text. We look at how we can train our own neural network algorithm to do this.
Regression & Correlation Tutorial
You have employees. But who should you pick to lead them? Learn how to predict leadership potential using multiple sources of personnel data, as well as pitfalls to watch out for.
Convolutional Neural Networks (CNN) Introduction
While an artificial neural network could learn to recognize a cat on the left, it would not recognize the same cat if it appeared on the right. To solve this problem, we introduce convolutional neural networks.
Multi-Arm Bandit & A/B Testing
You want to publish ads for your product. While you have 2 promising ad designs, you have a limited budget. How can you find out which ad is more effective, while maximizing the impact of all the ads you publish?
K-Means Clustering Tutorial
You have customers. But how should you categorize them to target sales? How many such categories exist? To answer these questions, we can use cluster analysis.
K-Nearest Neighbor (KNN) Tutorial: Anomaly Detection
Outliers can be detected by algorithms used for predictions. To illustrate, we use the k-nearest neighbor (kNN) algorithm.
Topic Modeling with LDA Introduction
Latent Dirichlet allocation (LDA) is a technique that automatically discovers the topics that a set of documents contains. It is used to analyze large volumes of text efficiently. To find out how it works, check out this tutorial.
Automated Biography for a Nation
Singapore turns 50 years old in 2015. While Singaporeans are proud of our progress from 3rd to 1st world status, one wonders how this progress has been portrayed through the lens of global media. By examining Singapore-related news, could we predict Singapore's growth trends? Could we examine how much an export-dependent economy like Singapore is affected by world events?
You Are Who You Like
Research has shown that we like people similar to ourselves. But does this rule of attraction apply to our liking for fictional characters? Analysis of Star Wars character fans suggests so. Personality scores of Facebook users who had 'liked' Star Wars character pages were aggregated and profiled in this post.