jvparidon@gmail.com
github.com/jvparidon
linkedin.com/in/jp-van-paridon
jvparidon.io/resume
I am a computational cognitive scientist and data scientist with a background in language sciences. I use online and in-person experiments and large public datasets to inform statistical and computational models of human behavior and cognition.
For this project, we developed a sequential testing framework based on Bayesian multilevel regression models, which let us sample participants efficiently without the inflated false-positive rates that optional stopping causes under standard frequentist testing. I also integrated a Python interface for a MIDI drumkit into our behavioral experiment protocol as a cheap and effective way to track motor activity in participants' hands and feet.
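A minimal sketch of what such a MIDI interface can look like in Python, assuming the mido library; the port name, pad-to-limb mapping, and polling loop are illustrative stand-ins, not the actual experiment code:

```python
# Log MIDI drumkit hits as timestamped motor responses.
# Assumes mido (pip install mido python-rtmidi); the port name and
# note-to-limb mapping below are hypothetical examples.
import time
import mido

# Hypothetical mapping from MIDI note numbers to drum pads/limbs
PAD_TO_LIMB = {36: "right foot", 38: "left hand", 44: "left foot", 48: "right hand"}

def log_hits(port_name="MIDI Drumkit", duration=60.0):
    """Record (timestamp, limb, velocity) tuples for `duration` seconds."""
    hits = []
    start = time.monotonic()
    with mido.open_input(port_name) as inport:
        while time.monotonic() - start < duration:
            for msg in inport.iter_pending():
                # 'note_on' messages with nonzero velocity signal a pad strike
                if msg.type == "note_on" and msg.velocity > 0:
                    limb = PAD_TO_LIMB.get(msg.note, f"pad {msg.note}")
                    hits.append((time.monotonic() - start, limb, msg.velocity))
            time.sleep(0.001)  # avoid busy-waiting between polls
    return hits
```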
For this project, I used a variety of NLP and statistical techniques to demonstrate that color knowledge in both blind and sighted people can be predicted from word embeddings. I also reworked the original word2vec algorithm to gain access to word embeddings during model training and track how specific sentences in the training corpus affect the final state of the embedding model.
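The prediction side of this project follows a general pattern worth sketching: map each word to its embedding, then fit a cross-validated regularized regression from embedding dimensions to human ratings. The sketch below is illustrative (file and column names are hypothetical placeholders), not the paper's exact pipeline:

```python
# Predict perceptual (e.g., color) ratings from word embeddings via
# cross-validated ridge regression. Paths and the rating column are
# hypothetical placeholders.
import numpy as np
import pandas as pd
from gensim.models import KeyedVectors
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

vectors = KeyedVectors.load_word2vec_format("embeddings.vec")  # placeholder path
ratings = pd.read_csv("color_ratings.csv")  # columns: word, rating (hypothetical)

# Keep only words that have an embedding
ratings = ratings[ratings["word"].isin(set(vectors.key_to_index))]
X = np.stack([vectors[word] for word in ratings["word"]])
y = ratings["rating"].to_numpy()

# Cross-validated R^2 indicates how much variance in the ratings is
# linearly recoverable from the embedding space
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=10, scoring="r2")
print(f"mean cross-validated R^2: {scores.mean():.2f}")
```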
In this project, I used Bayesian multilevel regression models to analyze behavioral data from illiterate participants, showing that acquiring literacy does not degrade performance on other visual tasks (a claim that had been made repeatedly in the prior literature).
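A minimal sketch of this kind of Bayesian multilevel model, written with the bambi library; the variable names and data file are hypothetical stand-ins for the actual study design:

```python
# Bayesian multilevel regression sketch using bambi; assumes a tidy
# dataset with (hypothetical) columns accuracy, literacy (numeric/binary),
# participant, and item.
import arviz as az
import bambi as bmb
import pandas as pd

data = pd.read_csv("visual_task_data.csv")  # hypothetical placeholder

# Literacy as a population-level effect, with varying intercepts by
# participant and by item to respect the repeated-measures structure
model = bmb.Model("accuracy ~ literacy + (1|participant) + (1|item)", data)
results = model.fit(draws=2000, chains=4)

# If literacy degraded visual task performance, the posterior for the
# literacy coefficient would sit clearly below zero; a posterior
# concentrated near zero supports the opposite conclusion
print(az.summary(results, var_names=["literacy"]))
```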
In this project, we produced a novel set of word embeddings in 55 languages, trained with the fastText algorithm on a large archive of film and television subtitles. Using several classic evaluation metrics (predicting similarity ratings from cosine distances; solving lexical analogies) and a novel lexical norm prediction task (implemented using ridge regression), we demonstrated that embeddings trained on subtitles contain information that is not well represented in embeddings trained on, for example, Wikipedia text, such as how offensive a given word is. The subs2vec package I developed alongside this project also provides a lightweight framework for working with word embeddings in Python.
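Two of the classic evaluations are easy to sketch with gensim (the lexical norm prediction follows the same ridge-regression pattern sketched earlier); file and column names are placeholders, not the subs2vec package's actual API:

```python
# (1) Similarity benchmark: correlate embedding cosine similarities with
#     human similarity ratings. (2) Analogy benchmark: vector arithmetic.
# Assumes all benchmark words are in the embedding vocabulary.
import pandas as pd
from gensim.models import KeyedVectors
from scipy.stats import spearmanr

vectors = KeyedVectors.load_word2vec_format("subs.en.vec")  # placeholder path

# Similarity: how well do cosine similarities track human ratings?
pairs = pd.read_csv("similarity_ratings.csv")  # columns: word1, word2, rating
cosines = [vectors.similarity(w1, w2)
           for w1, w2 in zip(pairs["word1"], pairs["word2"])]
print("Spearman r:", spearmanr(cosines, pairs["rating"]).correlation)

# Analogy: king - man + woman should land near "queen"
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```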
In this project, I built a computational model that simulates the temporal coordination of speaking and listening during simultaneous interpreting, demonstrating that a significant limit on speech rate in this task is imposed by the need to access lexical networks for both speech production and speech comprehension.
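To give a flavor of the modeling logic (this is a toy illustration, not the published model): if production and comprehension must take turns accessing a single lexical network, the share of time comprehension occupies the lexicon directly caps the achievable speech rate.

```python
# Toy illustration: lexical access as a shared, exclusive resource.
# Production must wait whenever comprehension holds the lexicon.
import random

random.seed(1)

def simulate(minutes=1.0, comprehension_load=0.5, access_ms=200):
    """Count words produced per minute under a given comprehension load.

    comprehension_load: fraction of time comprehension occupies the lexicon.
    access_ms: time production needs the lexicon for each word.
    """
    t, words = 0.0, 0
    total_ms = minutes * 60_000
    while t < total_ms:
        if random.random() < comprehension_load:
            t += access_ms  # lexicon busy with comprehension; production waits
        else:
            t += access_ms  # production retrieves and articulates a word
            words += 1
    return words / minutes  # words per minute

for load in (0.0, 0.25, 0.5, 0.75):
    print(f"comprehension load {load:.2f}: ~{simulate(comprehension_load=load):.0f} wpm")
```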
In this paper, we make recommendations for improving EEG analyses by using robust models that account for outliers without requiring data to be rejected at arbitrary thresholds. We provide example analyses in both frequentist (using a robust estimator) and Bayesian (using a heavy-tailed likelihood) frameworks.
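Both flavors can be sketched on simulated data with outliers: a Huber-type robust estimator via statsmodels, and a Student-t likelihood via PyMC. This is illustrative only, not the paper's exact analyses:

```python
# Robust regression two ways on simulated data with injected outliers.
import numpy as np
import statsmodels.api as sm
import pymc as pm

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)
y[:5] += 15  # inject outliers

# Frequentist: iteratively reweighted least squares with a Huber norm
X = sm.add_constant(x)
huber_fit = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()
print("robust slope estimate:", huber_fit.params[1])

# Bayesian: a Student-t likelihood downweights outliers via its heavy tails
with pm.Model():
    intercept = pm.Normal("intercept", 0, 5)
    slope = pm.Normal("slope", 0, 5)
    sigma = pm.HalfNormal("sigma", 5)
    nu = pm.Exponential("nu", 1 / 30)  # degrees of freedom of the t-likelihood
    pm.StudentT("y", nu=nu, mu=intercept + slope * x, sigma=sigma, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2)
```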
In this paper, we explain that the various measures of word frequency and transitional probability used in psycholinguistic linear regression models are highly multicollinear, and that their coefficients are therefore often misinterpreted. It is hard to make blanket recommendations for analytical decisions, but we argue that in many cases a theoretically motivated choice of a single predictor leads to the most interpretable outcome.
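A small simulated demonstration of the problem: when two predictors are nearly collinear, as frequency and transitional probability measures typically are, ordinary regression splits their shared variance unpredictably, and variance inflation factors flag the issue.

```python
# Demonstrate coefficient instability under near-collinearity (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
log_freq = rng.normal(size=500)
trans_prob = log_freq + rng.normal(scale=0.1, size=500)  # nearly collinear
rt = 1.0 - 0.5 * log_freq + rng.normal(scale=0.5, size=500)

X = sm.add_constant(pd.DataFrame({"log_freq": log_freq, "trans_prob": trans_prob}))
fit = sm.OLS(rt, X).fit()
print(fit.params)  # the two coefficients split the shared variance unpredictably

# Variance inflation factors far above ~10 flag severe collinearity
for i, name in enumerate(X.columns[1:], start=1):
    print(name, variance_inflation_factor(X.values, i))
```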
In this project, my role was to set up an analysis pipeline of cross-validated SVM classifiers: trained on fMRI images from an eye-saccade task, these classifiers predicted the spatial orientation of words the same participants processed in a separate task, using the fMRI images recorded during that task.
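In outline, the pipeline looks like the sketch below; the arrays are placeholders for the preprocessed fMRI feature matrices, and the actual pipeline differs in detail:

```python
# Cross-task decoding sketch: cross-validate an SVM within the saccade task,
# then train on all saccade data and predict word orientation in the word task.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: n_trials x n_voxels feature matrices and labels
X_saccade, y_saccade = np.load("saccade_X.npy"), np.load("saccade_y.npy")
X_words = np.load("words_X.npy")

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# Within-task cross-validation establishes that orientation is decodable
print("CV accuracy:", cross_val_score(clf, X_saccade, y_saccade, cv=5).mean())

# Cross-task generalization: train on saccades, predict word orientations
clf.fit(X_saccade, y_saccade)
predicted_orientations = clf.predict(X_words)
```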