About me

Welcome to my portfolio! I'm Julian, a Polish-American with a multidisciplinary academic and professional background. I am completing an MSc in data science at Tilburg University (pending graduation) and hold an MSc in psychology, a bachelor's in biology, and a minor in mathematics.

I specialize in deep learning, machine learning, Bayesian modeling, and spatiotemporal analysis, working primarily in Python and R. I have additional experience in SQL and visualization tools like Tableau and Power BI. I also hold a Microsoft Azure Fundamentals (AZ-900) certification.

My professional experience is diverse, ranging from managing program operations and conducting exposure therapy at the McLean Hospital OCD Institute to building a candidate database at Big Other Productions. Through these roles, I’ve strengthened the communication, organizational, and administrative skills necessary to bridge the gap between technical analysis and effective decision-making.

In this portfolio you'll find projects that reflect both the breadth and depth of my experience working with complex, real-world data.

Master's Thesis

Evaluating the Utility of Synthetic T1c Brain MRI Scans Generated by GAN and Diffusion Models

My MSc Data Science & Society thesis at Tilburg University investigated whether state-of-the-art generative models can synthesize high-fidelity T1-weighted post-contrast (T1c) brain MRI scans as substitutes for missing scans in downstream glioma segmentation tasks. This is a clinically significant problem because T1c scans are frequently absent from datasets due to cost and contraindications.

Four generative models were evaluated: Pix2pixRAD and SynDiff (conditional models), StyleGAN2-ADA (unconditional GAN), and a custom diffusion model, with segmentation performed using nnU-Net. Synthetic T1c from the two conditional models improved segmentation over missing T1c baselines, though enhancing tumor regions remained the most challenging to synthesize across all models. Training was conducted on NVIDIA A40 GPUs via Tilburg University's GPU4EDU program.

Airbnb Revenue Prediction with XGBoost

A machine learning pipeline predicting Airbnb revenue using XGBoost. The pipeline includes preprocessing of numerical and categorical features (e.g., imputation and one-hot encoding), regression modeling, feature importance analysis, and learning curve visualization. An alternative pipeline using HistGradientBoostingRegressor is also provided for comparison.
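
As a rough illustration of the two preprocessing steps named above (median imputation and one-hot encoding), here is a minimal standard-library sketch. It is not the project's code: the column names and toy values are hypothetical, and the choice of the median as the fill value is an assumption.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Encode a categorical column as 0/1 rows, one column per sorted category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical toy listing data (not taken from the project).
nightly_price = [120.0, None, 80.0, 100.0]
room_type = ["entire", "private", "entire", "shared"]

prices = impute_median(nightly_price)
categories, encoded = one_hot(room_type)
print(prices)
print(categories, encoded)
```

In a real pipeline these steps would typically be handled by library transformers so that the fill value and category list are learned on the training split only.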


Bayesian Multilevel Analysis of Cardiovascular Disease Mortality

A Bayesian multilevel analysis of cardiovascular disease mortality among European women aged 65-69, implemented in R with the brms library. The pipeline covers preprocessing, Beta regression modeling with hierarchical structures for region and country, posterior predictive checks, k-fold cross-validation, and sensitivity analyses for prior specifications to ensure model robustness.


Forecasting Emergency Room Visits Using Time Series Analysis

A time series forecasting project of emergency room visits at an Iowa hospital (Jan 2014–Aug 2017). The analyses include time series decomposition (STL), autocorrelation (ACF/PACF), ARIMA, exponential smoothing, and machine learning approaches such as KNN and XGBoost.
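
To show what the autocorrelation (ACF) step measures, the sketch below computes the sample autocorrelation at a given lag in plain Python. The series is a hypothetical toy example, not the hospital data, and the function name is illustrative only.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (the lag-0 autocovariance times n).
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between the series and a copy of itself shifted by `lag`.
    num = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return num / denom


# Hypothetical series that alternates every step (period 2).
visits = [10, 20, 10, 20, 10, 20, 10, 20]

for lag in range(3):
    print(lag, autocorrelation(visits, lag))
```

A strongly negative value at lag 1 and a positive value at lag 2 reflect the alternating pattern; in a real analysis, spikes at seasonal lags guide the choice of ARIMA orders.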


Analysis of 5 Largest Green ETFs

A comparative performance analysis of the five largest sustainability-focused ETFs against the Vanguard Total Stock Market ETF (VTI). The analysis pipeline includes moving average visualization, closing price correlation visualization via heat maps and KDE, and price distribution simulation using the Monte Carlo method.
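
The Monte Carlo step can be sketched as follows. This is a minimal illustration that assumes prices follow geometric Brownian motion with 252 trading days per year; the project description does not state which price model was used, and the drift, volatility, and starting price below are hypothetical.

```python
import math
import random


def simulate_final_prices(start_price, mu, sigma, days, n_paths, seed=0):
    """Simulate end-of-horizon prices under geometric Brownian motion (assumed model)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    dt = 1 / 252  # one trading day, as a fraction of a year
    finals = []
    for _ in range(n_paths):
        price = start_price
        for _ in range(days):
            shock = rng.gauss(0.0, 1.0)
            # Exact GBM update over one step of length dt.
            price *= math.exp(
                (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * shock
            )
        finals.append(price)
    return finals


# Hypothetical inputs: 5% annual drift, 20% annual volatility, one year ahead.
finals = simulate_final_prices(100.0, mu=0.05, sigma=0.2, days=252, n_paths=1000)
mean_final = sum(finals) / len(finals)
print(round(mean_final, 2), round(min(finals), 2), round(max(finals), 2))
```

The list of simulated final prices is what would be plotted as a price distribution, for example as a histogram.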


Central Park Temperature Time-Series Analysis and Prediction

An analysis of average monthly temperatures in Central Park, NY from 1870 to 2023 using seasonal decomposition, weighted moving average, and exponential smoothing. Additionally, temperature predictions for the remainder of 2024 were made using SARIMA modeling.
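
For readers unfamiliar with the two smoothing techniques mentioned above, here is a minimal plain-Python sketch. The weights, the smoothing factor, and the temperature values are hypothetical; the project's actual parameters are not given in this description.

```python
def weighted_moving_average(series, weights):
    """Weighted average over each window of len(weights); last weight applies to the newest value."""
    k = len(weights)
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, series[i:i + k])) / total
        for i in range(len(series) - k + 1)
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]  # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly temperatures.
temps = [10, 20, 30]
print(exponential_smoothing(temps, alpha=0.5))
print(weighted_moving_average([1, 2, 3, 4], weights=[1, 2, 3]))
```

A larger `alpha` tracks recent observations more closely, while a smaller one produces a smoother curve.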


Geospatial Analysis of Voting Patterns in the Netherlands (2023)

A geospatial machine learning analysis predicting the percentage of votes for the Party for Freedom (PVV) per municipality in the Dutch 2023 elections. The pipeline includes spatial autocorrelation (Global Moran’s I), spatial weighting scheme evaluation (contiguity vs. KNN-based), autoregressive modeling (GM Lag), and regression modeling with Random Forest and Gradient Boosting. Nested group-wise cross-validation was used to assess accuracy, with results visualized via choropleth maps displaying municipal-level prediction errors.
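
As an illustration of the spatial autocorrelation step, the sketch below computes Global Moran's I directly from its definition for a toy set of four units arranged in a line. The weights matrix and values are hypothetical, and a real analysis would normally use a spatial statistics library rather than this hand-rolled version.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of units.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / w_sum) * (cross / denom)


# Hypothetical contiguity weights: four units in a row, neighbours share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [1, 1, 5, 5]    # similar values sit next to each other
alternating = [1, 5, 1, 5]  # neighbours always differ

print(morans_i(clustered, chain))
print(morans_i(alternating, chain))
```

Positive values indicate that neighbouring units tend to have similar values, and negative values indicate that neighbours tend to differ.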

Global Progress Analysis

A three-part SQL analysis exploring global progress across electric vehicle adoption, worldwide education and literacy, and economic inequality. Conducted in MySQL Workbench, the project examines metrics such as EV sales trends by country, youth literacy rates, GDP per capita, and the Gini coefficient. Each component is complemented by an interactive Tableau dashboard visualizing the key findings.
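
For context on the inequality metric, the sketch below shows how a Gini coefficient can be computed from a list of incomes. It is only an illustration of what the metric measures: the project may well have taken the coefficient directly from a published dataset, and the incomes here are hypothetical.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted sum with 1-based ranks over the sorted incomes.
    ranked = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * ranked) / (n * total) - (n + 1) / n


# Hypothetical income lists.
print(gini([5, 5, 5, 5]))   # everyone earns the same
print(gini([0, 0, 0, 10]))  # one person earns everything
```

With a finite population of size n, the maximum value of this estimator is (n - 1) / n, which approaches 1 as n grows.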


U.S. Heart Disease Statistics

This Excel dashboard explores the effects of lifestyle and various health conditions on rates of heart disease among Americans. Specifically, it compares these effects between men and women and across different races and age groups.


NYC Affordable Housing Visualization

This Excel dashboard organizes the statistics on the construction of affordable housing in New York City from 2014 to 2018 according to borough, start/finish year, and level of income.

NYC Affordable Housing Construction

An SQL data cleaning pipeline for NYC affordable housing construction data (2014-2018), which includes column renaming, type casting, deduplication, and reformatting.
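
The cleaning steps listed above can be sketched with Python's built-in sqlite3 module. This is an illustration only: SQLite is swapped in here so the example runs without a database server, and the table name, column names, and rows are hypothetical rather than taken from the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw table: awkward column names, numbers stored as text, a duplicate row.
conn.execute(
    'CREATE TABLE raw_housing ("Project Name" TEXT, "Boro" TEXT, "Total Units" TEXT)'
)
conn.executemany(
    "INSERT INTO raw_housing VALUES (?, ?, ?)",
    [
        ("Project A", "Bronx", "120"),
        ("Project A", "Bronx", "120"),  # exact duplicate
        ("Project B", "Queens", "45"),
    ],
)

# Rename columns, cast the unit count to an integer, and drop exact duplicates.
conn.execute(
    """
    CREATE TABLE housing AS
    SELECT DISTINCT
        "Project Name" AS project_name,
        "Boro" AS borough,
        CAST("Total Units" AS INTEGER) AS total_units
    FROM raw_housing
    """
)

rows = conn.execute(
    "SELECT project_name, borough, total_units FROM housing ORDER BY project_name"
).fetchall()
print(rows)
```

Building a cleaned table from a SELECT keeps the raw data untouched, which makes the cleaning steps easy to rerun or audit.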


Smaller Projects

A collection of smaller scripts demonstrating core programming skills.
