- Efficient and exhaustive search for conserved short motifs in genomic datasets, using C++
- Automated metadata generation for archiving from inconsistent/incomplete record keeping, using Python scripts
- Custom tools to identify disk space usage by file types and compression formats, in C++ and bash
- Recreated bedtools, a fast, flexible Linux command-line toolset for genomic arithmetic, in C++; 60x faster than previous versions in key benchmarking cases
- Structural variation detection and classification using long-read technology