Admissions Open 2026
Skip to content

Build Your First End-to-End Pipeline

Apr 28, 2026·12 min read

In this tutorial, we'll build a complete data pipeline that extracts data from a public API, transforms it using Python, loads it into a local SQLite database, and visualises the results with a simple dashboard. No cloud services required — everything runs on your laptop.

Step 1: Extract

We'll use the OpenWeatherMap API to fetch weather data. Start by signing up for a free API key. Then write a Python script that requests current weather data for your city and saves the raw JSON response.

import requests
import json

API_KEY = "your_key"
CITY = "Silchar"
url = f"https://api.openweathermap.org/data/2.5/weather?q={CITY}&appid={API_KEY}"

response = requests.get(url)
data = response.json()

with open("raw_weather.json", "w") as f:
    json.dump(data, f)

Step 2: Transform

Read the raw JSON and extract only the fields we care about: temperature, humidity, pressure, and weather description.

import json

with open("raw_weather.json") as f:
    data = json.load(f)

transformed = {
    "city": data["name"],
    "temp_c": round(data["main"]["temp"] - 273.15, 1),
    "humidity": data["main"]["humidity"],
    "pressure": data["main"]["pressure"],
    "description": data["weather"][0]["description"],
    "timestamp": data["dt"]
}

with open("weather_clean.json", "w") as f:
    json.dump(transformed, f)

Step 3: Load

Create a SQLite database and a table, then insert the cleaned data.

import sqlite3, json

conn = sqlite3.connect("weather.db")
conn.execute("CREATE TABLE IF NOT EXISTS weather (city TEXT, temp_c REAL, humidity INT, pressure INT, description TEXT, timestamp INT)")

with open("weather_clean.json") as f:
    row = json.load(f)

conn.execute("INSERT INTO weather VALUES (?, ?, ?, ?, ?, ?)",
             (row["city"], row["temp_c"], row["humidity"], row["pressure"], row["description"], row["timestamp"]))
conn.commit()

Step 4: Visualise

Use Streamlit to build a quick dashboard that reads from the database and plots temperature trends over time.

import streamlit as st
import sqlite3
import pandas as pd

conn = sqlite3.connect("weather.db")
df = pd.read_sql("SELECT * FROM weather ORDER BY timestamp", conn)

st.line_chart(df.set_index("timestamp")["temp_c"])
st.write(f"Average temperature: {df['temp_c'].mean():.1f}°C")

Automate It

Wrap the entire pipeline in a single Python script and schedule it with cron (Linux/macOS) or Task Scheduler (Windows). Congratulations — you've just built a production-ready data pipeline!