This is a sample technical coding test for entry-level bioinformatics Engineer posted on the Medium.
The goal of this question is to assess the candidate’s familiarity with sequence algorithms and Python programming skills. Accordingly, there are several levels to the solution to this problem.
Reading in a fastq file and extract the sequences
Computing the spacer counts
| spacer | count | |
|---|---|---|
| 0 | AAGTGAGCTCTTACGGGAAT | 29 |
| 1 | GAATGTAGTTTTAGCCCTCC | 16 |
| 2 | GGGATTCTATTTAGCCCGCC | 17 |
| 3 | TCTGATGATACCCATGCCAC | 25 |
| 4 | AGATTCACAGAGGCAACCTG | 28 |
Arranging the results into specific formats (csv, text, excel, sql)