I'm a Graduate Research Assistant (GRA) majoring in Computer Science, with a focus (and minor) in Statistics and Statistical Modeling. I'm interested in Machine Learning (ML) and Artificial Intelligence (AI). My research interests are in statistical learning theory, unsupervised learning, and building software for scientific computing and reproducible research. Topics that interest me include Clustering, Manifold Learning, Topology Theory and Topological Data Analysis, approaches to Density Estimation, Reinforcement Learning, and Generative Adversarial learning. I have supplemental research interests in the fields of Network Science, Bayesian Statistics, Computational Geometry, and Parallel Computing.
I previously worked part time at the Air Force Institute of Technology in the Low Orbitals Radar and Electromagnetism research group (starting in 2013), either 1) doing research for an independent project the group received, under the supervision of Dr. Andrew Terzuoli, or 2) supporting the research efforts of the group's graduate students. I worked with the group until early 2017.
In 2015, I started working for the Web and Complex Systems Lab as an undergraduate research assistant, shortly after being introduced to Data Science in an elective class I took, CS 3250: Computational Tools and Techniques for Data Analysis, taught by Derek Doran. I received a graduate research assistant position in the same lab shortly after, when I began working towards an M.S. in Computer Science.
I'm interested in the intersection between Machine Learning and Statistical Learning Theory. Broadly, the topics I've studied include: Bayesian Networks, unsupervised types of Deep Learning (e.g. SOMs and GANs), nonparametric density estimation, clustering techniques, Social Network Models, Information Theory, Bayesian inference techniques, and Markov Chain Monte Carlo (MCMC) methods. The specific research I've done, along with several of the class or personal projects I've worked on and presentations I've given, is detailed further below.
My computational experience varies with what I'm doing. I use the R Project for Statistical Computing for nearly everything I do in ML. In my undergraduate years, I extensively used C++ (primarily C++11) for scientific computing projects, a few of which are listed below. Some of the projects required plain ANSI C (C89/C90). I spent about two years doing research into computational geometry and parallel computing with the Compute Unified Device Architecture (CUDA), followed by ports to OpenCL. These efforts led to a few publications. I'm moderately proficient with Java, and I've had a number of class or personal projects requiring the use of other languages, e.g. Python.
|Wright State University||Master of Science in Computer Science||(In Progress)|
Expected Fall 2017 or Spring 2018
|Wright State University||Bachelor of Science in Computer Science|
Minor in Statistics
CEG 7900: Network Science
CS 7830: Machine Learning
CS 3250: Computational Tools and Techniques for Data Analysis
STT 7020: Applied Stochastic Processes
CS 7230: Information Theory
CS 4850: Foundations of Artificial Intelligence
STT 3600/3610: Applied Statistics I & II
STT 4610: Theoretical Statistics I
CS 7200: Algorithm Design and Analysis
Density-based clustering algorithms, Discrete and continuous-time Markov Chains, Poisson Process Modeling, Brownian Motion, Adaptive Markov Chain Monte Carlo (MCMC) optimization techniques, [Dynamic] Bayesian Network modeling, Bayesian inference, parameter estimation techniques (EM/MAP), Random Graph Modeling (ERGMs, ER Model, etc.), Bayesian Linear and Logistic Regression, (simple) Artificial Neural Networks, internal cluster validation measures, non-parametric density estimation techniques, information theory
I submitted a successful funding proposal under the Google Summer of Code (GSOC) Initiative to the R Project for Statistical Computing to explore, develop, and unify recent developments related to the theory of density-based clustering. This involved a mixture of code development, which culminated in an R package, and deep research to further understand the theory and utility of the cluster tree. There was also a WSU newsroom piece that describes the proposal in a non-technical way. Project Link
As I read more into the theoretical foundations of density-based clustering, my research began to intersect with Topology Theory and Manifold Learning. During this time, I started working in a minor capacity with a local research group studying how to combine techniques from the fields of topology and machine learning for the purpose of data analysis. Primarily, I researched theoretical extensions to the Mapper framework, a common algorithm used for performing Topological Data Analysis (TDA).
Branch-and-bound spatial indexing data structures (kd-trees, cover trees, locality sensitive hashing), the k-nearest neighbor problem, finite mixture modeling, general parameter estimation techniques (Expectation Maximization/MAP estimates), Dirichlet Process Modeling
Various random graph models such as Erdős–Rényi models and Exponential Random Graph Models (ERGMs), entropy measures over networks, density-based clustering techniques (DBSCAN and OPTICS), time series models (ARMA and ARIMA)
Gauss–Newton Method, approximation algorithms for unsplittable flow problems, graph theory (by extension), relational (Oracle/PostgreSQL/SQLite) and document-based (MongoDB) database interaction, natural language processing techniques for SEO (PageRank), asynchronous vs. synchronous client-server communication strategies with AJAX and Node.js/PHP servers, XML Schema and XML technologies [XLink, XPath, etc.]