Skills

Languages:
- Spanish
- Python
- R
- Spark

I am proficient in Python and use it extensively in my projects. I am also familiar with R. In computer science, I have a particularly good understanding of data structures and of searching and sorting algorithms.

Data Cleaning & Wrangling:
- Pandas
- NumPy

During a data science project, I spend 70% of my time cleaning and organizing data, and I use Pandas and NumPy for those steps. The same tools are handy for exploratory data analysis (EDA), which helps me discover patterns, test hypotheses, and check assumptions with the help of summary statistics and graphical representations.
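
A minimal sketch of the kind of cleaning and summary step described above. The table, its column names, and its values are made up for illustration and do not come from a real project; filling missing prices with the median is just one possible choice.

```python
import pandas as pd

# Hypothetical raw data with typical problems: a duplicate row,
# a missing category, and numbers stored as text.
raw = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston", "Denver", None],
        "price": ["10.5", "10.5", "n/a", "7.25", "3.0"],
    }
)

clean = (
    raw.drop_duplicates()                # remove exact duplicate rows
    .dropna(subset=["city"])             # drop rows with no city
    .assign(price=lambda df: pd.to_numeric(df["price"], errors="coerce"))
)

# Fill unparseable prices with the median of the remaining values.
clean["price"] = clean["price"].fillna(clean["price"].median())

# Summary statistics are a common first step of EDA.
summary = clean["price"].describe()
print(summary)
```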

Data Visualization:
- Matplotlib
- Seaborn
- Plotly
- ggplot2
- Holoviews

Data visualization is one of my favorite parts of the project life cycle because it is where the data comes to life. I have experience with Seaborn, Matplotlib, and Plotly, and I enjoy working with Geoplot, Folium, and Holoviews. When I code in R, ggplot2 is my preferred library.

Machine Learning:
- Scikit-Learn
- Supervised ML
- Unsupervised ML

For machine learning models, I prefer working with Scikit-Learn, but I am open to learning other tools. I have used Scikit-Learn in previous projects and enjoy it because I have a good understanding of most of its algorithms. I am familiar with Natural Language Processing (NLP), Decision Trees, Random Forests, Logistic and Linear Regression, Support Vector Machines (SVM), Naïve Bayes (NB), Stochastic Gradient Descent (SGD), and K-Nearest Neighbors (KNN).

Databases:
- SQL
- MongoDB
- NoSQL

For databases, I am most skilled in SQL and MongoDB. In the past, I worked with PostgreSQL.

Others:
- Tableau
- Jupyter Notebook
- Databricks
- Google Colab

When I begin a new project, I plan the work and organize it carefully. Version control is essential for that, so I use Git and GitHub on a regular basis.

Recent Projects

World Bank Debt Statistics Analysis

SQL queries that answer interesting questions about international debt using data from the World Bank.

Golden Age of Video Games Analysis

I used SQL joins and set theory to discover the best years for video games. In this project, I analyzed critic scores, user scores, and sales data for the top 400 video games released since 1977. I searched for a golden age of video games by identifying the release years that users and critics liked best, and I explored the business side of gaming by looking at game sales data. My search involved joining datasets and comparing results with set theory. I also filtered, grouped, and ordered data.
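
A minimal sketch of the join and set-theory approach, run through Python's built-in sqlite3 module. The table names, column names, rows, and the 8.5 score threshold are all hypothetical and are not taken from the project's dataset.

```python
import sqlite3

# In-memory database with made-up miniature tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE game_sales (game TEXT PRIMARY KEY, year INTEGER, games_sold REAL);
    CREATE TABLE reviews (game TEXT PRIMARY KEY, critic_score REAL, user_score REAL);
    INSERT INTO game_sales VALUES
        ('Game A', 1998, 10.0), ('Game B', 1998, 8.0),
        ('Game C', 2005, 12.0), ('Game D', 2010, 9.0);
    INSERT INTO reviews VALUES
        ('Game A', 9.5, 9.0), ('Game B', 9.1, 8.8),
        ('Game C', 9.2, 7.0), ('Game D', 7.5, 9.3);
    """
)

# Join sales to reviews, group by release year, and keep the years
# whose average score is above the (assumed) threshold.
YEARS_LIKED_BY = """
    SELECT s.year AS year
    FROM game_sales AS s
    INNER JOIN reviews AS r ON r.game = s.game
    GROUP BY s.year
    HAVING AVG(r.{score}) > 8.5
"""
critics = YEARS_LIKED_BY.format(score="critic_score")
users = YEARS_LIKED_BY.format(score="user_score")

# INTERSECT: years both groups liked. EXCEPT: years only critics liked.
golden_years = [row[0] for row in conn.execute(f"{critics} INTERSECT {users} ORDER BY year")]
critics_only = [row[0] for row in conn.execute(f"{critics} EXCEPT {users} ORDER BY year")]

print(golden_years, critics_only)
```

With this sample data, the intersection contains the one year that both critics and users rated highly, while the difference isolates a year that critics liked but users did not.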

Advent of Code 2022

Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels.
- Day 1 Calorie Counting
- Day 2 Rock Paper Scissors
- Day 3 Rucksack Reorganization
- Day 4 Camp Cleanup

The Kronos Incident: Geospatial

Geospatial-temporal patterns-of-life analysis using Holoviews and Pandas. I combined various types of data in sensible ways to describe common daily routines for GASTech employees and to identify up to twelve unusual events or patterns in the data.

Tableau Taco Analysis

Do good tacos mean less crime? Let us find out.

Ugly Christmas Sweater Data Analysis

Retail data analysis using machine learning techniques in Python with Scikit-Learn to help Ugly Christmas Party determine how many sweaters to order for each ugly Christmas sweater design.