E-Commerce - Data Engineering
Tech Stack: Python, AWS, PySpark, MySQL, Tableau, ETL
- Extracted an e-commerce dataset, loaded it into a database, transformed it according to business requirements, and stored the result in a data warehouse connected to Tableau to visualize historical trends.
- Set up an RDS MySQL instance and optimized data ingestion from S3 to RDS using a Lambda function.
- Automated the ETL job (Glue with PySpark) to distribute data processing across a potentially larger dataset (400K rows).
- Used S3 with Athena as a staging layer during the ETL job and loaded the transformed data into a Redshift warehouse.
- Connected Redshift to Tableau and translated business requirements into actionable reports.
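
A minimal sketch of the S3-to-RDS ingestion step described above, not the project's actual code. It assumes the Lambda is triggered by an S3 put notification and that the file is a CSV with a header row. The function names, the `orders` table, and the batch size are hypothetical. Fetching the object and executing the statements are left to the caller, for example with boto3 and a MySQL DB-API driver.

```python
import csv
import io
from urllib.parse import unquote_plus


def s3_objects_from_event(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs


def batched_inserts(csv_text, table, batch_size=1000):
    """Yield (sql, rows) pairs suitable for a DB-API executemany call.

    Batching keeps each round trip to the database bounded in size.
    The table and column names are assumed to come from a trusted source.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return  # empty file: nothing to insert
    columns = ", ".join(f"`{c}`" for c in header)
    placeholders = ", ".join(["%s"] * len(header))
    sql = f"INSERT INTO `{table}` ({columns}) VALUES ({placeholders})"
    batch = []
    for row in reader:
        batch.append(tuple(row))
        if len(batch) == batch_size:
            yield sql, batch
            batch = []
    if batch:
        yield sql, batch
```

In a handler, each `(sql, rows)` pair would typically be passed to `cursor.executemany(sql, rows)` and followed by a commit, so that a large file is written in a few bounded batches rather than one row at a time.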