Efficient and exhaustive search for conserved short motifs in genomic datasets, using C++
Automated metadata generation for archiving from inconsistent/incomplete record keeping, using Python scripts
Custom tools to identify disk space usage by file type and compression format, in C++ and Bash
Recreated bedtools, a fast, flexible Linux command-line toolset for genomic arithmetic, in C++; 60x faster than previous versions in key benchmarking cases
Structural variation detection and classification using long-read technology