I work on large-scale natural language processing and data mining of technical literature, primarily in mathematics, using a variety of supervised and unsupervised learning techniques, and I build scalable data processing and retrieval pipelines with tools such as Java, Python, and SQL.

I have also built neural-network classifiers for textual data that predict characteristics such as character encoding or mathematical subject classification.