Pedestrian and Bicyclist Safety, Children's Road Safety, Disability, Injury Prevention, Mode Choice, Transport Psychology, Survey Data Collection, and Quantitative Social Research.
Python, R, LaTeX, MySQL, and MongoDB.
Applied Data Science
Data Science Pipeline (Data collection, Exploratory analysis, Statistical analysis, Modelling, Deployment, and Reporting), Statistics (Experimental design, Exploratory and Confirmatory data analysis, Hypothesis testing, and Bayesian A/B testing), Geospatial data analysis and Visualisation, Time series forecasting, Big data analytics, OOP, Git and GitHub, Google Sheets, and Excel.
Classification (binary, multinomial and ordinal logit models), Regression (multiple linear, lasso and ridge regression), Count models (Poisson, negative binomial, and zero-inflated regression), Clustering (k-means and hierarchical), Survival analysis (Kaplan-Meier estimate, Cox PH and AFT models), Mixed Effects Models (random intercepts and slopes), Time Series Forecasting, Association rule mining, Decision trees, Ensemble models (bagging and boosting), and Deep Learning.
Stata, Tableau, QGIS, LaTeX, MS Office Suite, AWS Elastic Beanstalk, Linux CLI, Docker, GitHub Actions (CI/CD), Notebooks (Jupyter, Google Colab, etc.), and IDEs (VS Code and PyCharm).
Data Manipulation (pandas, dplyr and dfply), Visualization (ggplot2, matplotlib, seaborn, plotnine, altair and plotly), Feature Engineering and Selection (scikit-learn and feature-engine), Geospatial Analysis and Visualisation (folium and geopandas), Dashboards (Plotly Dash and Tableau), Text Analysis (regex), Time Series Forecasting (prophet and statsmodels), Machine Learning (scikit-learn, statsmodels, tidymodels, pycaret, keras, tensorflow and H2O), ML Lifecycle Management (MLflow), Big Data Analytics (PySpark), and Web Applications (streamlit and flask).