I am a Data Analyst who has been working at a technology company for over a year, with prior experience applying AI models to industry use cases.
As the company's only Data Analyst from the time I joined until recently, I have usually handled data requests from other departments. Sometimes I simply provide the requested data, along with extra metrics I find useful for their use cases; for more complicated tasks, I plan which metrics to track or analyze, and at the end I deliver reports or dashboards together with analysis results and recommendations. In short, these tasks draw on skills in data analysis, data visualization, machine learning, deep learning, and deployment. I have also written articles for Medium publications to share some of the things I have learnt. Scroll down to find the portfolio projects I have done on my own; company projects are confidential and not shown here.
A case study done in the Google Data Analytics Certificate program. Smartwatch data was analyzed to provide high-level recommendations to the company, applying problem-solving and data analysis skills to a real-world business scenario. The six phases of the data analysis life cycle (Ask, Prepare, Process, Analyze, Share, and Act) are incorporated into the case study, beginning with ascertaining the business task at hand and ending with analysis results and recommendations for the company.
Photo by Andres Urena on Unsplash
A recommendation system based on collaborative filtering, using embeddings learned from user ratings, was built with TensorFlow and Keras.
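The core idea behind this kind of model, learning a low-dimensional embedding per user and per item so that their dot product approximates the rating, can be sketched without TensorFlow. The toy ratings below are invented for illustration, not the project's actual data.

```python
import random

random.seed(0)

# Toy (user, item, rating) triples -- illustrative data only.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

n_users, n_items, dim = 3, 3, 4

# Randomly initialized embedding tables, one vector per user and per item.
U = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
V = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]

def predict(u, i):
    # Predicted rating = dot product of user and item embeddings.
    return sum(a * b for a, b in zip(U[u], V[i]))

def mse():
    return sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings)

lr = 0.05
for _ in range(2000):                 # plain SGD over all observed ratings
    for u, i, r in ratings:
        err = r - predict(u, i)
        for k in range(dim):
            gu = -2 * err * V[i][k]   # gradient w.r.t. user embedding
            gv = -2 * err * U[u][k]   # gradient w.r.t. item embedding
            U[u][k] -= lr * gu
            V[i][k] -= lr * gv
```

In Keras, the same structure is usually expressed with two `Embedding` layers and a `Dot` layer, trained with a mean-squared-error loss.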
Then, a website built with Flask was containerized and served behind an Nginx server using Docker and Docker Compose. The website was then deployed to a Google Compute Engine VM instance on Google Cloud Platform.
NOTE: The website is only accessible until Sept 2021 due to the free trial.
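A typical Compose file for this kind of Flask-behind-Nginx setup looks roughly like the following. The service names, ports, and file paths here are illustrative, not the project's actual configuration.

```yaml
version: "3"
services:
  web:                       # Flask app, e.g. run by a WSGI server such as gunicorn
    build: .
    expose:
      - "8000"
  nginx:                     # reverse proxy in front of the Flask container
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - web
```

Nginx forwards incoming requests on port 80 to the `web` service over the internal Docker network, so the Flask container never needs to be exposed directly.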
This project consists of three parts. The first part used SQL queries to explore and analyze worldwide COVID-19 data, and the second part transferred some of the query results to Tableau for visualizations. The third part consists of SQL queries used for data cleaning, including but not limited to handling missing data and removing duplicates.
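As an illustration of those cleaning steps, the snippet below runs the same kind of duplicate-removal and missing-value queries against an in-memory SQLite database. The table and column names are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical table with a duplicated row and a missing value.
cur.execute("CREATE TABLE cases (country TEXT, report_date TEXT, new_cases INTEGER)")
cur.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [("Malaysia", "2021-07-01", 6988),
     ("Malaysia", "2021-07-01", 6988),   # exact duplicate
     ("Malaysia", "2021-07-02", None)],  # missing value
)

# Remove duplicates: keep only the lowest rowid within each group of identical rows.
cur.execute("""
    DELETE FROM cases
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM cases
        GROUP BY country, report_date, new_cases
    )
""")

# Handle missing data: here, fill NULL case counts with 0.
cur.execute("UPDATE cases SET new_cases = 0 WHERE new_cases IS NULL")

rows = cur.execute("SELECT * FROM cases ORDER BY report_date").fetchall()
```

The `rowid NOT IN (SELECT MIN(rowid) ...)` pattern is SQLite-specific; other databases typically use `ROW_NUMBER()` in a CTE for the same job.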
Created interactive visualizations of COVID-19 cases in Malaysia, including a choropleth map, in Python, and deployed them online with Streamlit.
Predicting positive or negative sentiment from IMDB movie reviews.
A PyTorch implementation using a BERT model was also tested.
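A BERT model is too heavy to sketch here, but the task itself can be illustrated with a much simpler swapped-in technique: a bag-of-words logistic regression trained by gradient descent. The tiny review set below is made up; the real project used the IMDB dataset.

```python
import math

# Invented training reviews: 1 = positive sentiment, 0 = negative.
train = [("a great and moving film", 1),
         ("wonderful acting great plot", 1),
         ("boring and terrible film", 0),
         ("awful plot terrible acting", 0)]

vocab = sorted({w for text, _ in train for w in text.split()})
idx = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    # Bag-of-words count vector over the training vocabulary.
    x = [0.0] * len(vocab)
    for w in text.split():
        if w in idx:
            x[idx[w]] += 1.0
    return x

w = [0.0] * len(vocab)
b = 0.0

def prob(x):
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

lr = 0.5
for _ in range(200):                  # gradient descent on the log loss
    for text, y in train:
        x = featurize(text)
        g = prob(x) - y               # gradient of log loss w.r.t. the logit
        b -= lr * g
        for i, xi in enumerate(x):
            w[i] -= lr * g * xi

pred = 1 if prob(featurize("great film")) > 0.5 else 0
```

BERT replaces the bag-of-words features with contextual embeddings, but the classification head on top is conceptually the same kind of logistic layer.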
NOTE: The Web App takes a while to load due to Heroku's limitations.
Photo by Jakob Owens on Unsplash
Predicting used car prices based on the features of each car. The dataset contains a lot of missing and inaccurate data because it was scraped from Craigslist, which made the modelling process more difficult.
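Scraped listing data like this usually needs cleaning before modelling. A minimal sketch of two common steps, filtering implausible values and median-imputing missing fields, is shown below; the records and thresholds are invented for illustration.

```python
import statistics

# Invented listing records; the real data was scraped from Craigslist.
listings = [
    {"price": 7500, "odometer": 90000},
    {"price": 12000, "odometer": None},   # missing odometer reading
    {"price": 3000, "odometer": 210000},
    {"price": 9999999, "odometer": 5},    # implausible scraped values
]

# Drop rows whose price falls outside a plausible range.
cleaned = [r for r in listings if 500 <= r["price"] <= 100000]

# Impute missing odometer values with the column median.
known = [r["odometer"] for r in cleaned if r["odometer"] is not None]
median_odo = statistics.median(known)
for r in cleaned:
    if r["odometer"] is None:
        r["odometer"] = median_odo
```

Median imputation is preferred over the mean here because scraped numeric columns tend to contain extreme outliers.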
Predicting Facebook's stock price using time series analysis, with algorithms including ARIMA, Prophet, and LSTM.
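The AR part of ARIMA can be sketched without statsmodels: fit an AR(1) model, y_t = a * y_{t-1} + c, by ordinary least squares on lagged pairs. The series below is synthetic; the real project used Facebook stock prices.

```python
import random

random.seed(1)

# Synthetic AR(1) series with true coefficient 0.8 and intercept 20.
series = [100.0]
for _ in range(199):
    series.append(0.8 * series[-1] + 20 + random.gauss(0, 1))

x = series[:-1]          # lagged values y_{t-1}
y = series[1:]           # current values y_t
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Ordinary least squares slope and intercept.
a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)
c = my - a * mx

forecast = a * series[-1] + c        # one-step-ahead prediction
```

Full ARIMA adds differencing (I) and moving-average (MA) terms on top of this autoregressive fit, and libraries such as statsmodels estimate all of them jointly.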
Predicting whether a person has suffered a stroke, based on an imbalanced dataset.
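A common way to handle such class imbalance is random oversampling of the minority class before training. A minimal sketch with invented features and labels:

```python
import random

random.seed(0)

# Invented imbalanced dataset: 95 negatives, 5 positives (stroke cases).
data = [([random.random()], 0) for _ in range(95)] + \
       [([random.random()], 1) for _ in range(5)]

minority = [s for s in data if s[1] == 1]
majority = [s for s in data if s[1] == 0]

# Resample the minority class with replacement until the classes balance.
oversampled = majority + [random.choice(minority) for _ in range(len(majority))]
random.shuffle(oversampled)
```

Alternatives include class weights in the loss function or synthetic oversampling such as SMOTE; plain random oversampling is just the simplest baseline.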