Samuel Epstein

Data analyst

After 10 years of training as a laboratory researcher, I greatly respect the power of data to provide us with information about the world around us. I aim to use my programming skills in R, Python, and SQL, together with my creativity and curiosity, to explore deep questions that can be answered with data analysis. This portfolio features projects focused on political elections and scientific topics to display my coding experience and capabilities. My primary goal is to continue to learn new tools to answer these questions more effectively.


Interests: R/Python/SQL, Data visualization, Problem solving, Big data, Politics, Science


Projects

Benefits of being a popular second choice: Analysis of the 2023 Chicago mayoral election. 2023-04

Precinct-level analysis of Chicago's two-round mayoral election results provides insight into Brandon Johnson's path to victory in the runoff election.

RStudio R geojson R/ggplot2 ETL (Extract, Transform, Load) Integration data visualization Geographic information systems

Scraping and analysis of Stop and Shop receipts to study the impact of inflation on food prices. 2023-02

A Python script scraped five years of my friend's online Stop and Shop receipts to collect grocery prices over time and measure the impact of inflation. Plots and analysis in Python, R, and Tableau address the question: how serious is the impact of inflation?

Python R Web Scraping Tableau Data Visualization

Design of a web application to map NYPL library locations for any book in the online catalog. 2023-01

A Python-based web scraper, deployed as a Flask app, gathers information from the NYPL catalog to determine which libraries in the system stock a specified book. HTML, CSS, and JavaScript render an interactive map that locates those libraries and shows the availability status of the item at each location.

Python Flask Google Cloud Platform Google Maps API HTML CSS Javascript

Voter analysis of NY-3 and NY-4 congressional districts shows reduced Democratic turnout in 2022 elections. 2023-01

Large voter files are aggregated with pandas and mapped using GIS tools in R, integrating data from pre- and post-2022 congressional district boundaries to study voter turnout in the 2022 midterm elections.

RStudio R Python shapefile Python/pandas R/rgdal R/sf R/ggplot2 R/dplyr ETL (Extract, Transform, Load) Integration data visualization Geographic information systems


See all 5 projects

Essays

From experiments to algorithms: How my research experience has shaped my approach to data analysis

29 Oct 2022

Through my decade of experience working in research labs, I have cultivated a set of skills that are highly valuable and transferable to careers in...

Science Research Data Analysis

Exploring data and encouraging creativity as a data analyst or programmer

16 Feb 2022

As a data analyst or programmer, my curiosity often drives me to explore data and look for interesting trends and patterns. This curiosity is an essential part of my...

Creativity Curiosity Data Analysis R/Python