/pandas_sheet_validation

Proof of concept for validating bioinformatics spreadsheet data with Python and pandas

Primary language: Jupyter Notebook

Bioinformatics spreadsheet validation with pandas

Once upon a time I was asked to validate some bioinformatics data in a spreadsheet manually, as in just by eyeballing it, to spot rows that did not conform to a given set of rules. Yes, that's a silly thing to ask someone to do. It's a long story; don't ask.

Anyway, I'd been looking for opportunities to play around with data processing in pandas, and decided that taking this request to the next level would be a great way to do that. It's just a first pass, and I'll discuss things I would have done differently at the end, but I'm happy enough with the results to put them up here.

Enjoy!