Creating a Development Environment for SQL

by Kevin bonds, at 21 August 2025, category : SQL

My past few positions have had me using a lot of Python, R, Docker, Airflow, etc., and not as much pure SQL for analysis. As a result, my SQL skills are a little rusty, so I decided to create a development environment for SQL to practice in. In this quick post I’ll outline how I set up a PostgreSQL environment in a Docker container. There are plenty of practice environments online, but I want certain data tables available so I can practice analyzing business data specifically (MAU, ARPU, revenue, etc.).


A/B/N Testing in Python

by Kevin bonds, at 07 June 2025, category : Ab testing Python Eda

The following case study illustrates how to analyze the results of an A/B/N test (or multitest). An A/B/N test is a type of A/B test in which multiple variants are tested at the same time.

We’ll compare two variants against a control, with the goal of increasing the purchase rate on a fictional website. Since testing multiple variants at once increases the chance of at least one false positive (the family-wise error rate, or FWER), we’ll use a correction when determining statistical significance.
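To make the multiple-comparison point concrete, here is a minimal sketch of how the FWER grows with the number of comparisons and how a Bonferroni correction pulls it back down. Bonferroni is my assumption for illustration only; the full post may use a different correction. The significance level of 0.05, the helper names, and the simplifying assumption that the comparisons are independent are also illustrative choices (two variants sharing one control are not strictly independent).

```python
def family_wise_error_rate(alpha: float, num_comparisons: int) -> float:
    """Probability of at least one false positive across the comparisons.

    Assumes the comparisons are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** num_comparisons


def bonferroni_alpha(alpha: float, num_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / num_comparisons


ALPHA = 0.05          # assumed overall significance level
NUM_COMPARISONS = 2   # two variants, each compared against the control

uncorrected = family_wise_error_rate(ALPHA, NUM_COMPARISONS)
per_test_alpha = bonferroni_alpha(ALPHA, NUM_COMPARISONS)
corrected = family_wise_error_rate(per_test_alpha, NUM_COMPARISONS)

print(f"Uncorrected FWER:          {uncorrected:.4f}")     # 0.0975
print(f"Bonferroni per-test alpha: {per_test_alpha:.4f}")  # 0.0250
print(f"Corrected FWER:            {corrected:.4f}")       # 0.0494
```

With two comparisons at 0.05 each, the chance of at least one false positive is nearly 10%. Testing each comparison at 0.025 instead keeps the overall rate just under 5%.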

Along the way, I’ll warn against some common mistakes in designing experiments and interpreting their results, and touch on the sticky subject of p-values and what they mean (and don’t mean). Hope you find it informative.


A/B Testing in R

by Kevin bonds, at 13 May 2023, category : Ab testing Eda

A stakeholder may ask whether a particular change to an application will make a user more likely to make a purchase (or more likely to make a larger purchase, etc.). These types of questions are excellent candidates for a controlled experiment, known as A/B testing. To answer them, a data scientist must apply good testing methods and understand certain statistical concepts well enough to evaluate the experiment effectively. A/B testing can be tricky to conduct without bias and difficult to evaluate. And like all hypothesis testing, it carries a certain amount of inherent uncertainty. It’s this uncertainty that the data scientist attempts to quantify and explain.
