Creating a Development Environment for SQL

by Kevin bonds, at 21 August 2025, category : SQL

My past few positions have had me using a lot of Python, R, Docker, Airflow etc., and not as much pure SQL for analysis. As a result, my SQL skills are a little rusty. So I decided to create a development environment for SQL in order to practice. In this quick post I’ll outline how I setup a PostgreSQL environment in a docker container. There are plenty of practice environments online, but I want to have certain data tables available to practice analyzing business data specifically (MAU, ARPU, Revenue, etc.).

Once finished We’ll be able to connect, to the db, and query like below:

library(DBI)
library(RPostgres)

con <- dbConnect(RPostgres::Postgres(), host="localhost", 
                 port=5432, user="anton", password="gilfoyle", dbname="piper")
select * from orders limit 5
Table 1: 5 records
order_date user_id order_id meal_id order_quantity
2018-06-01 0 0 14 2
2018-06-01 0 0 15 1
2018-06-01 0 0 17 1
2018-06-01 1 1 16 2
2018-06-01 1 1 17 1

Setting up a Docker Container with Docker Compose

Let’s get started with a simple docker container using Docker Compose. Create a docker-compose.yml file with the below.

version: '3.8'
services:
  db:
    image: postgres:16-alpine # Or a specific version like postgres:14-alpine
    restart: always
    environment:
      POSTGRES_DB: piper
      POSTGRES_USER: anton
      POSTGRES_PASSWORD: gilfoyle
    ports:
      - "5432:5432" # Maps host port 5432 to container port 5432
    volumes:
      - db_data:/var/lib/postgresql/data # Persists data in a named volume
      - ./csv:/csv

volumes:
  db_data: # Define the named volume

This installs postgres16, creates the environment and mounts a few volumes.

We can start the container by:

docker-compose up --build

To build and stand-up the image from a terminal window.

make sure to have postgres and libpq installed using your preferred method (here homebrew):

brew install postgresql 
brew install libpq 

Connect to the Database with psql

Then open an new terminal in our working directory and issue:

psql -h localhost -U anton -d piper 

This will open psql and allow us to execute sql commands. You should see piper=# as a prompt.

As a Hello World we can create a simple table. Paste in the code below:

Create Example MTCARS Table

CREATE TABLE mtcars (
    car_name VARCHAR(50) PRIMARY KEY,
    mpg NUMERIC,
    cyl INT,
    disp NUMERIC,
    hp INT,
    drat NUMERIC,
    wt NUMERIC,
    qsec NUMERIC,
    vs INT,
    am INT,
    gear INT,
    carb INT
);

INSERT INTO mtcars
  (car_name, mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb)
VALUES
  ('Mazda RX4', 21, 6, 160, 110, 3.9, 2.62, 16.46, 0, 1, 4, 4),
  ('Mazda RX4 Wag', 21, 6, 160, 110, 3.9, 2.875, 17.02, 0, 1, 4, 4),
  ('Datsun 710', 22.8, 4, 108, 93, 3.85, 2.32, 18.61, 1, 1, 4, 1),
  ('Hornet 4 Drive', 21.4, 6, 258, 110, 3.08, 3.215, 19.44, 1, 0, 3, 1),
  ('Hornet Sportabout', 18.7, 8, 360, 175, 3.15, 3.44, 17.02, 0, 0, 3, 2),
  ('Valiant', 18.1, 6, 225, 105, 2.76, 3.46, 20.22, 1, 0, 3, 1),
  ('Duster 360', 14.3, 8, 360, 245, 3.21, 3.57, 15.84, 0, 0, 3, 4),
  ('Merc 240D', 24.4, 4, 146.7, 62, 3.69, 3.19, 20, 1, 0, 4, 2),
  ('Merc 230', 22.8, 4, 140.8, 95, 3.92, 3.15, 22.9, 1, 0, 4, 2),
  ('Merc 280', 19.2, 6, 167.6, 123, 3.92, 3.44, 18.3, 1, 0, 4, 4),
  ('Merc 280C', 17.8, 6, 167.6, 123, 3.92, 3.44, 18.9, 1, 0, 4, 4),
  ('Merc 450SE', 16.4, 8, 275.8, 180, 3.07, 4.07, 17.4, 0, 0, 3, 3),
  ('Merc 450SL', 17.3, 8, 275.8, 180, 3.07, 3.73, 17.6, 0, 0, 3, 3),
  ('Merc 450SLC', 15.2, 8, 275.8, 180, 3.07, 3.78, 18, 0, 0, 3, 3),
  ('Cadillac Fleetwood', 10.4, 8, 472, 205, 2.93, 5.25, 17.98, 0, 0, 3, 4),
  ('Lincoln Continental', 10.4, 8, 460, 215, 3, 5.424, 17.82, 0, 0, 3, 4),
  ('Chrysler Imperial', 14.7, 8, 440, 230, 3.23, 5.345, 17.42, 0, 0, 3, 4),
  ('Fiat 128', 32.4, 4, 78.7, 66, 4.08, 2.2, 19.47, 1, 1, 4, 1),
  ('Honda Civic', 30.4, 4, 75.7, 52, 4.93, 1.615, 18.52, 1, 1, 4, 2),
  ('Toyota Corolla', 33.9, 4, 71.1, 65, 4.22, 1.835, 19.9, 1, 1, 4, 1),
  ('Toyota Corona', 21.5, 4, 120.1, 97, 3.7, 2.465, 20.01, 1, 0, 3, 1),
  ('Dodge Challenger', 15.5, 8, 318, 150, 2.76, 3.52, 16.87, 0, 0, 3, 2),
  ('AMC Javelin', 15.2, 8, 304, 150, 3.15, 3.435, 17.3, 0, 0, 3, 2),
  ('Camaro Z28', 13.3, 8, 350, 245, 3.73, 3.84, 15.41, 0, 0, 3, 4),
  ('Pontiac Firebird', 19.2, 8, 400, 175, 3.08, 3.845, 17.05, 0, 0, 3, 2),
  ('Fiat X1-9', 27.3, 4, 79, 66, 4.08, 1.935, 18.9, 1, 1, 4, 1),
  ('Porsche 914-2', 26, 4, 120.3, 91, 4.43, 2.14, 16.7, 0, 1, 5, 2),
  ('Lotus Europa', 30.4, 4, 95.1, 113, 3.77, 1.513, 16.9, 1, 1, 5, 2),
  ('Ford Pantera L', 15.8, 8, 351, 264, 4.22, 3.17, 14.5, 0, 1, 5, 4),
  ('Ferrari Dino', 19.7, 6, 145, 175, 3.62, 2.77, 15.5, 0, 1, 5, 6),
  ('Maserati Bora', 15, 8, 301, 335, 3.54, 3.57, 14.6, 0, 1, 5, 8),
  ('Volvo 142E', 21.4, 4, 121, 109, 4.11, 2.78, 18.6, 1, 1, 4, 2)
;

MTCARS is a famous sample data set. Now we can query it.

select * from mtcars limit 5
Table 2: 5 records
car_name mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2

Loading csv Files as Tables

I won’t go into great detail on how to load all the tables but here is a quick example.

A good way to populate some sample data is to curl some files exec int the container, copy the files. From the terminal:

curl "https://assets.datacamp.com/production/repositories/4016/datasets/606e6e9165c25477db078996fa7e0a3e994b93d3/orders.csv" > orders.csv

docker exec -it <container-id> bash

cp orders.csv var/lib/postgresql/.

Now create an orders table in the container using psql like:

DROP TABLE IF EXISTS "orders";
CREATE TABLE orders (
  order_date DATE,
  user_id INT,
  order_id INT,
  meal_id INT,
  order_quantity INT
);

Now use the psql COPY command to copy the csv into the orders table

COPY orders FROM '../orders.csv' DELIMITER ',' CSV HEADER;

Now you have a nice little development environment to practice SQL based data analysis. Enjoy!

Some Analysis of Customer Order Data

Let’s say you want to take a look at the change in frequent vs infrequent customers over time. We’ll user data from a fictional food delivery service. We’ll define “frequent” users as those who place 4 or more orders a month and infrequent as those who placed less. Let’s look at the sum of orders for both groups to see how many orders each place in total each month (i.e. How many total orders each group contributes to the total).

-- frequent vs infrequent

with order_count_by_user as (
    select
        user_id,
        DATE_TRUNC('month', order_date)::DATE as delivr_month,
        COUNT(distinct order_id) as order_count
    from orders
    group by delivr_month, user_id
),

frequent_users as (
    select
        user_id,
        delivr_month,
        order_count
    from order_count_by_user
    where order_count >= 4
    order by user_id
),

sum_frequent as (
    select
        frequent_users.delivr_month,
        SUM(frequent_users.order_count) as order_sum_frequent
    from frequent_users
    group by frequent_users.delivr_month
    order by frequent_users.delivr_month
),

infrequent_users as (
    select
        user_id,
        delivr_month,
        order_count
    from order_count_by_user
    where order_count < 4
    order by user_id
),

sum_infrequent as (
    select
        infrequent_users.delivr_month,
        SUM(infrequent_users.order_count) as order_sum_infrequent
    from infrequent_users
    group by infrequent_users.delivr_month
    order by infrequent_users.delivr_month
)

select
    sum_frequent.delivr_month,
    order_sum_frequent,
    order_sum_infrequent
from sum_frequent
inner join sum_infrequent
    on sum_frequent.delivr_month = sum_infrequent.delivr_month;
Table 3: 7 records
delivr_month order_sum_frequent order_sum_infrequent
2018-06-01 74 208
2018-07-01 95 350
2018-08-01 122 548
2018-09-01 214 791
2018-10-01 505 1044
2018-11-01 1198 1386
2018-12-01 3472 1344

Now let’s calculate the percent of users that are frequent vs infrequent, by count of users, to see how this is trending.

-- percent frequent users by count

with order_count_by_user as (
    select
        user_id,
        DATE_TRUNC('month', order_date)::DATE as delivr_month,
        COUNT(distinct order_id) as order_count
    from orders
    group by delivr_month, user_id
),

frequent_users as (
    select
        user_id,
        delivr_month,
        order_count
    from order_count_by_user
    where order_count >= 4
    order by user_id
),

count_frequent as (
    select
        frequent_users.delivr_month,
        COUNT(distinct frequent_users.user_id) as user_count_frequent
    from frequent_users
    group by frequent_users.delivr_month
    order by frequent_users.delivr_month
),

infrequent_users as (
    select
        user_id,
        delivr_month,
        order_count
    from order_count_by_user
    where order_count < 4
    order by user_id
),

count_infrequent as (
    select
        infrequent_users.delivr_month,
        COUNT(distinct infrequent_users.user_id) as user_count_infrequent
    from infrequent_users
    group by infrequent_users.delivr_month
    order by infrequent_users.delivr_month
),

all_users as (
    select
        user_id,
        delivr_month,
        order_count
    from order_count_by_user
    order by user_id
),

count_all_users as (
    select
        all_users.delivr_month,
        COUNT(distinct all_users.user_id) as user_count_total
    from all_users
    group by all_users.delivr_month
    order by all_users.delivr_month
)

select
    count_frequent.delivr_month,
    user_count_frequent,
    user_count_infrequent,
    user_count_total,
    ROUND(user_count_frequent / user_count_total::NUMERIC, 3)
        as perc_users_frequent
from count_frequent
inner join count_infrequent
    on count_frequent.delivr_month = count_infrequent.delivr_month
inner join
    count_all_users
    on count_frequent.delivr_month = count_all_users.delivr_month;
Table 4: 7 records
delivr_month user_count_frequent user_count_infrequent user_count_total perc_users_frequent
2018-06-01 17 106 123 0.138
2018-07-01 22 204 226 0.097
2018-08-01 28 309 337 0.083
2018-09-01 48 441 489 0.098
2018-10-01 113 576 689 0.164
2018-11-01 257 687 944 0.272
2018-12-01 668 599 1267 0.527

The ratio of frequent to infrequent customers is increasing over time. As well, total orders and customer counts are also growing. These are good signs of a healthy service.

… to be continued