Projects
Retrieval-Based QA System with News Articles
December 2024 - January 2025
Objective: Develop a Streamlit-based web application to extract, process, and analyze news articles by building embeddings using OpenAI models. The project involves creating a vector store with FAISS for efficient retrieval-based question answering (QA), allowing users to input news article URLs and interactively ask questions and get answers with sources displayed in the results.
Skills/Tools: Streamlit, Python, FAISS, LangChain, OpenAI API, Embeddings, Text Processing, Data Extraction, Vector Databases
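The retrieval step behind this kind of QA system can be illustrated with a minimal, standard-library sketch. It is not the project's code: the bag-of-words `embed` function stands in for an OpenAI embedding model, the exhaustive cosine-similarity search stands in for a FAISS index, and the chunk texts and example.com URLs are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in (assumption) for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question, best match first."""
    q = embed(question)
    scored = [(cosine(q, embed(c["text"])), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


# Hypothetical article chunks, each carrying the URL it came from so the
# answer can be displayed together with its source.
chunks = [
    {"source": "https://example.com/markets",
     "text": "The central bank raised interest rates by a quarter point"},
    {"source": "https://example.com/sports",
     "text": "The home team won the championship game last night"},
    {"source": "https://example.com/tech",
     "text": "A new phone model was announced with a faster chip"},
]

top = retrieve("did the bank change interest rates", chunks, k=2)
for chunk in top:
    print(chunk["source"], "|", chunk["text"])
```

In a full pipeline, the retrieved chunks and their source URLs would be passed to a language model to compose the answer; that step is omitted here.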

Exploring Vital Health Factors in Heart Disease Risk Assessment
September 2024 - December 2024
Objective: Analyze heart disease risk factors using the Heart Disease UCI dataset. The study involves hypothesis testing in R to evaluate the individual contributions of age, gender, and key health factors, such as cholesterol levels, blood sugar levels, and resting blood pressure, to the occurrence of heart disease.
Skills/Tools: RStudio, R, Logistic Regression, Statistical Analysis, Hypothesis Testing, Correlation
Credit Risk: Analyzing Financial and Behavioral Factors
September 2024 - December 2024
Objective: Develop and evaluate predictive models using the South German Credit dataset to classify individuals as “good” or “bad” credit risks. By analyzing demographic, financial, and account-related attributes, the project aims to identify key predictors of creditworthiness and compare the performance of models like Logistic Regression, KNN, Random Forest, Gradient Boosting, and Support Vector Classifier for accurate credit risk prediction.
Skills/Tools: Python, Machine Learning, Exploratory Data Analysis, Feature Engineering, Data Cleaning, Logistic Regression, Random Forest, Gradient Boosting, Support Vector Classifier, K-Nearest Neighbors Classifier

Exploring Berkshire County: An Interactive Platform for Trails, Points of Interest, and Weather Insights
February 2024 - May 2024
Objective: Develop an interactive platform for Berkshire County, Massachusetts, that integrates geographic and weather data, including town layouts, trails, points of interest, and maximum temperature data, to help outdoor enthusiasts plan their adventures. By leveraging ArcGIS and NOAA datasets, the project aims to provide a comprehensive, user-friendly resource for exploring the natural and cultural highlights of the region.
Skills/Tools: ArcGIS, ArcGIS Pro, Data Pipelines, ArcGIS ModelBuilder, Data Preprocessing

King County House Sales Visualization
January 2024 - February 2024
Objective: Develop an interactive Tableau dashboard to provide real-time insights into the King County housing market using visualizations like line charts, histograms, maps, and heat maps. With filtering, sorting, and drill-down capabilities, it empowers real estate professionals, buyers, and investors to make data-driven decisions.
Skills/Tools: Tableau, Bar Chart, Histograms, Calendar, Line Chart, Maps, Dynamic Dashboard, Data Visualization

Nashville Houses Data Cleaning Project
December 2023
Objective: Streamline and enhance data quality by performing advanced SQL-based data cleaning and transformation. This includes filling missing values using self-joins, deduplicating data with Common Table Expressions (CTEs) and ROW_NUMBER(), and reorganizing columns to deliver a structured, ready-to-analyze dataset.
Skills/Tools: MS SQL Server, MySQL, SQL, Self Join, CTE, Update, Alter
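The two cleaning steps named above (filling missing values from matching rows of the same table, then deduplicating with a CTE and ROW_NUMBER()) can be sketched as follows. This is an illustration under stated assumptions, not the project's actual script: it runs on SQLite through Python's built-in sqlite3 module rather than MS SQL Server, the `housing` table, its columns, and its rows are hypothetical, and the self-join is written as a correlated subquery because SQLite's UPDATE syntax differs from SQL Server's.

```python
import sqlite3


def clean_housing(conn):
    """Fill missing addresses, then remove exact duplicate sales."""
    # Step 1: fill a NULL address from another row with the same parcel_id
    # (the self-join idea, expressed as a correlated subquery).
    conn.execute("""
        UPDATE housing
        SET property_address = (
            SELECT b.property_address
            FROM housing AS b
            WHERE b.parcel_id = housing.parcel_id
              AND b.unique_id <> housing.unique_id
              AND b.property_address IS NOT NULL
            ORDER BY b.unique_id
            LIMIT 1
        )
        WHERE property_address IS NULL
    """)
    # Step 2: number the rows within each group of identical sales and
    # delete every row after the first one in its group.
    conn.execute("""
        WITH ranked AS (
            SELECT unique_id,
                   ROW_NUMBER() OVER (
                       PARTITION BY parcel_id, property_address,
                                    sale_date, sale_price
                       ORDER BY unique_id
                   ) AS rn
            FROM housing
        )
        DELETE FROM housing
        WHERE unique_id IN (SELECT unique_id FROM ranked WHERE rn > 1)
    """)


conn = sqlite3.connect(":memory:")
# Hypothetical sample data: row 2 lacks an address, row 4 duplicates row 3.
conn.executescript("""
    CREATE TABLE housing (
        unique_id INTEGER PRIMARY KEY,
        parcel_id TEXT,
        property_address TEXT,
        sale_date TEXT,
        sale_price INTEGER
    );
    INSERT INTO housing VALUES
        (1, 'P-100', '12 Oak St', '2015-03-01', 250000),
        (2, 'P-100', NULL,        '2016-07-15', 270000),
        (3, 'P-200', '8 Elm Ave', '2014-05-20', 180000),
        (4, 'P-200', '8 Elm Ave', '2014-05-20', 180000);
""")

clean_housing(conn)
rows = conn.execute(
    "SELECT unique_id, parcel_id, property_address "
    "FROM housing ORDER BY unique_id"
).fetchall()
for row in rows:
    print(row)
```

Window functions such as ROW_NUMBER() require SQLite 3.25 or later, which ships with recent Python releases.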
