WeatherData

Title: Weather Data Portfolio Project, Full Data Pipeline.

To see the full codebase for this project: Link to my GitHub account

Description:

A project that builds a full data pipeline for rich hourly weather observation metrics by location, from 2005 to the present day. The goal is a fully updatable Postgres data warehouse that can be mined for exploration and analytics on a regular basis.
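
One way to keep such a warehouse updatable is to load each batch with an upsert keyed on location and hour, so re-running a load overwrites existing hours instead of duplicating them. The sketch below is only an illustration under stated assumptions: the table name `hourly_observations`, its columns, the station IDs, and the hPa unit are all hypothetical, and an in-memory SQLite database stands in for Postgres so the example runs on its own. The `INSERT ... ON CONFLICT ... DO UPDATE` statement is written in a form that Postgres also accepts, apart from the `?` placeholders.

```python
import sqlite3

# Hypothetical schema: one row per station per hour.
# SQLite is used here only as a stand-in for the Postgres warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE hourly_observations (
        station_id   TEXT NOT NULL,
        observed_at  TEXT NOT NULL,   -- ISO-8601 timestamp, one per hour
        pressure_hpa REAL,
        PRIMARY KEY (station_id, observed_at)
    )
    """
)

# Insert new hours; if the (station, hour) key already exists,
# replace the stored pressure with the incoming value.
UPSERT = """
    INSERT INTO hourly_observations (station_id, observed_at, pressure_hpa)
    VALUES (?, ?, ?)
    ON CONFLICT (station_id, observed_at)
    DO UPDATE SET pressure_hpa = excluded.pressure_hpa
"""


def load_batch(connection, rows):
    """Load a batch of (station_id, observed_at, pressure_hpa) tuples."""
    with connection:  # commits on success, rolls back on error
        connection.executemany(UPSERT, rows)


# First load, then a later load that overlaps by one hour.
load_batch(conn, [
    ("STN_A", "2005-01-01T00:00", 1013.2),
    ("STN_A", "2005-01-01T01:00", 1012.8),
])
load_batch(conn, [
    ("STN_A", "2005-01-01T01:00", 1012.5),  # corrected value for an existing hour
    ("STN_A", "2005-01-01T02:00", 1011.9),  # new hour
])

rows = conn.execute(
    "SELECT observed_at, pressure_hpa FROM hourly_observations ORDER BY observed_at"
).fetchall()
print(rows)
```

After both loads the table holds three rows, not four, because the overlapping hour was updated in place.
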
Purpose:

The ultimate purpose of this project is to produce a clean, historical data warehouse of observational weather data, by location and by hour, in order to analyze the impact of barometric pressure changes over time and across locations. The data warehouse serves as the main repository of information for exploration and analysis.

pressure_days

Data Pipeline Process:
Technologies:
  1. Python and various standard library modules.
  2. The pandas and NumPy third-party packages.
  3. SQLAlchemy and SQLAlchemy ORM.
  4. Postgres database.
  5. Knowledge of data cleaning and tidying.
  6. Advanced SQL techniques, including CTEs, window functions and CASE statements, for data analysis and aggregation.
  7. Command Line and Bash Scripting.
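
As a rough illustration of how the SQL techniques in item 6 fit together for pressure analysis, the sketch below uses a CTE, the `LAG` window function and a `CASE` expression to label the hour-over-hour pressure change at each location. It is not the project's actual query: the table `hourly_observations`, its columns, the sample readings and the hPa unit are assumptions made for the example, and SQLite (which needs version 3.25 or later for window functions) stands in for Postgres so the code is self-contained.

```python
import sqlite3

# Hypothetical table and sample data; SQLite stands in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hourly_observations "
    "(station_id TEXT, observed_at TEXT, pressure_hpa REAL)"
)
conn.executemany(
    "INSERT INTO hourly_observations VALUES (?, ?, ?)",
    [
        ("STN_A", "2005-01-01T00:00", 1013.0),
        ("STN_A", "2005-01-01T01:00", 1011.5),
        ("STN_A", "2005-01-01T02:00", 1011.5),
        ("STN_B", "2005-01-01T00:00", 1009.0),
        ("STN_B", "2005-01-01T01:00", 1010.0),
    ],
)

QUERY = """
WITH changes AS (
    -- Window function: difference from the previous hour at the same station.
    SELECT station_id,
           observed_at,
           pressure_hpa - LAG(pressure_hpa) OVER (
               PARTITION BY station_id ORDER BY observed_at
           ) AS delta_hpa
    FROM hourly_observations
)
SELECT station_id,
       observed_at,
       delta_hpa,
       -- CASE expression: turn the numeric change into a readable label.
       CASE
           WHEN delta_hpa IS NULL THEN 'first reading'
           WHEN delta_hpa < 0 THEN 'falling'
           WHEN delta_hpa > 0 THEN 'rising'
           ELSE 'steady'
       END AS trend
FROM changes
ORDER BY station_id, observed_at
"""

results = conn.execute(QUERY).fetchall()
for row in results:
    print(row)
```

Partitioning by station keeps each location's series separate, so the first hour at every station has no previous value and is labeled accordingly rather than being compared against another location.
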
Folder Structure:

Main Level: Includes the Python scripts, Jupyter notebook and Bash scripts, as well as the folders for the following:

Running the Bash Script:

Not produced yet.

Collaborators:

Thank you to the National Oceanic and Atmospheric Administration for making all of your rich data available to the masses.

Licen