Bioinformatics: The Missing Semester

After pivoting from an academic computational genomics environment to commercial data science and now engineering, I noticed gaps between the skills academia teaches and the expectations of industry.
PhD students weren’t being taught skills that would make them attractive, or even competitive, in commercial spaces. What’s more, the “perfectionist” academic mindset didn’t scale; dissertations do not build or market effective consumer-facing products.
That’s when Bioinformatics: The Missing Semester was born.
The Missing Semester is a Substack series intended to teach the fundamentals of bioinformatics. Covering topics from modality-specific data analysis to containerization and foundation models in genomics, the series builds from individual analyses to R&D-team scale, equipping learners with practical toolsets for starting and navigating bioinformatics in an industry setting.
The series is ideal for students pursuing careers in industry, existing bio-professionals looking for introductory material to up-skill, and folks looking to pivot to biotech and bioscience from adjacent fields.
There is an onslaught of introductory bioinformatics material on the web. This series distinguishes itself by presenting material on a “need-to-know” basis and building conceptually from boilerplate genomics-specific data analysis, to containerization and parallel computing, and finally to AI in genomics.
All code referenced in the series can be found here, complete with package installation and environment setup instructions so you can reproduce everything yourself.
Happy coding.