Portfolio

This portfolio is a compilation of personal and hosted Data Science and Business Intelligence projects that I have carried out. Projects are grouped by category.


Business Intelligence (BI) Projects

Adventure Works BI Report

report

Working with data from Microsoft’s popular Adventure Works database, I transformed the raw data, built a relational data model, added calculated fields using DAX, and delivered a professional-quality, end-to-end business intelligence report in Microsoft’s Power BI.


Time-Series Projects

Avocado Prices: Analysis and Predictions

notebook github

Analysing and predicting trends in the average price of avocados across multiple US markets, using data from Kaggle.


Machine Learning and AI Projects

Classify Song Genres from Audio Data

notebook github

Using a dataset of songs from two music genres (Hip-Hop and Rock), I trained a classifier to distinguish between the two genres based only on track information derived from The Echo Nest.

Restaurant Recommendation using Azure Matchbox Recommender

Azure Gallery

A cloud-based project using sample data from Microsoft’s Machine Learning Studio. A Matchbox recommender (a hybrid of collaborative and content-based filtering) was used to suggest restaurants to customers based on prior customer and restaurant data. The model achieved a Normalized Discounted Cumulative Gain (NDCG) of 91%.
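For readers unfamiliar with the metric, here is a minimal sketch of how Normalized Discounted Cumulative Gain (NDCG) can be computed for one ranked list. It uses one common definition (linear gain with a log2 position discount); the Azure evaluation module may use a different variant, and the ratings below are made up for illustration, not taken from the project.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each relevance score is divided by
    log2(rank + 1), so items placed lower in the list count for less."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG of the given ordering divided by the DCG of the ideal
    (descending) ordering. Returns 0.0 when every score is zero."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical true ratings of five restaurants, listed in the order
# a recommender ranked them for one customer.
ranked_ratings = [3, 2, 3, 0, 1]
print(f"NDCG: {ndcg(ranked_ratings):.3f}")
```

A score of 1.0 means the recommended order matches the ideal order exactly; lower values mean highly rated items were ranked too far down the list.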

Predicting Credit Card Approvals

notebook github

Using data from the Credit Card Approval dataset to build a machine learning model with scikit-learn that predicts whether a credit card application will be approved.

Predicting User Engagement via Ads

notebook github

Working with a synthetic advertising dataset, I used scikit-learn to predict whether a particular internet user clicked on an ad, based on certain features of each user.


Data Analysis Projects

Analyse Baseball Data

notebook github

Using MLB’s Statcast data to perform an Exploratory Data Analysis on Aaron Judge and Giancarlo Stanton.

Keeping Up With The Kardashians (and Jenners)

notebook github

Performing Exploratory Data Analysis on Google Trends data to find out who the most popular Kardashian is... or Jenner.

Super Bowl

notebook github

Analysing Super Bowl data over the years to find out how viewership, TV ratings, and ad costs have evolved, who the most prolific musicians are in terms of halftime show performances, and how the game affects television viewership.


Natural Language Processing and Data Mining Projects

Word Frequency in Moby Dick

notebook github

Using the Python libraries Beautiful Soup to scrape data off the web and the Natural Language Toolkit (NLTK) to process it, we find the most frequent words in Herman Melville’s novel, Moby Dick, and how often these words occur.
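The counting step can be illustrated with a minimal sketch that uses only the Python standard library in place of Beautiful Soup and NLTK. The stop-word list, the `top_words` helper, and the sample sentence are all assumptions made for this illustration; the notebook itself works on the scraped novel text.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch;
# a real analysis would use a fuller list, such as the one NLTK provides).
STOP_WORDS = {"the", "a", "and", "of", "to", "in", "is", "it"}


def top_words(text, n=3):
    """Lowercase the text, split it into alphabetic tokens, drop stop
    words, and return the n most common words with their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up sample text standing in for the scraped novel.
sample = "Call me Ishmael. The whale, the white whale, is the whale of the sea."
print(top_words(sample, 2))
```

Filtering out stop words keeps very common function words such as "the" from dominating the result, which is why the most frequent remaining word in the sample is "whale".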


Design Projects

A/B Test on Mobile Game

notebook github

Analyzing an A/B test from a popular mobile puzzle game called ‘Cookie Cats’.