Structure of the code
- mirtop/bam
- bam.py
read_bam: reads BAM files with pysam and stores them in a key-value object
- filter.py
tune: if the option --clean is on, filter according to generic rules
clean_hits: get the top hits
- mirtop/gff
- init.py wraps the conversion process to GFF3
- body.py
create: creates the line according to the established GFF format.
read_gff_line: used inside a for loop to read each line of the file. It returns a structured key:value dictionary for each column.
- header.py generates the header and reads the header section.
- check.py checks that the header and single lines are valid according to the GFF format (NOT IMPLEMENTED)
- stats.py GFF stats: counts the number of isomiRs, their total and average expression
- query.py accepts SQLite queries after the option -q “”
- convert.py
create_counts: table of counts
- allow filtering by attribute
- allow collapse by miRNA/isomiR type
- filter.py, parse from query (NOT IMPLEMENTED)
- mirtop/mirna
- fasta.py:
read_precursor: fasta file to key-value object
- realign.py:
hits: class that defines hits
isomir: class that defines each sequence
cigar_correction: function that uses the CIGAR to make the sequence-to-miRNA alignment
read_id and make_id: shorter ID for sequences
make_cigar: given an alignment, return its CIGAR
reverse_complement: return the reverse complement of a sequence
align: uses Biopython to align two sequences of the same size
expand_cigar: from a 12M to MMMMMMMMMMMM
cigar2snp: from CIGAR code to a list of changes with position and reference and target nts
- mapper.py:
read_gtf: from the GTF file, map the genomic miRNA position to the precursor position; it therefore needs the genomic position of both the miRNA and the precursor. The return value would look like {mirna: [start, end]}
- annotate.py:
annotate: read isomiRs and populate all attributes related to isomiRs
- mirtop/importer:
- seqbuster.py
- prost.py
- srnabench.py
- isomirsea.py
- mirtop/exporter:
- isomirs.py: export file to match isomiRs BioC package.
- data/examples/
- check gff files: example of correct, invalid, warning GFF files
- check BAM file
- check mapping from genome position to precursor position, example of +/- strand. Using mirtop/mirna/map.read_gtf.
- check clean option: sequence mapping to multiple precursors/mirna, get the best score. Using mirtop/bam/filter.clean_hits.
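Two of the helpers listed under realign.py are small enough to illustrate directly. The following is a minimal sketch of what expand_cigar and reverse_complement are described as doing. It is a re-implementation written for this page, not the code in mirtop/mirna/realign.py, and the signatures and the handling of U and N bases are assumptions.

```python
import re

# Illustrative stand-ins only: the real functions live in
# mirtop/mirna/realign.py and may differ in signature and edge cases.

# Assumed complement table (treats U like T and keeps N as N).
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def expand_cigar(cigar):
    """Expand a run-length CIGAR such as '12M' into 'MMMMMMMMMMMM'."""
    # Each (count, operation) pair becomes the operation repeated count times.
    pairs = re.findall(r"(\d+)([A-Za-z=])", cigar)
    return "".join(op * int(count) for count, op in pairs)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]
```

For example, expand_cigar("3M1I2M") gives "MMMIMM", and reverse_complement("AAGC") gives "GCTT".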
To add new sub-commands, modify the following:
- mirtop/lib/parse.py
- query: TODO
- transform: TODO
- create: TODO
- check: TODO
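As a rough illustration of what registering a new sub-command can look like, the sketch below uses the standard library argparse sub-parser mechanism. It assumes that mirtop/lib/parse.py follows this general pattern, which this page does not confirm; the program name mirtop-sketch, the sub-command mycommand, and its --input option are all hypothetical.

```python
import argparse


def build_parser():
    """Build a top-level parser with one hypothetical sub-command."""
    # 'mirtop-sketch' is a placeholder program name, not the real CLI.
    parser = argparse.ArgumentParser(prog="mirtop-sketch")
    subparsers = parser.add_subparsers(dest="command")

    # Hypothetical sub-command: each new sub-command gets its own parser
    # and its own set of options.
    mycommand = subparsers.add_parser(
        "mycommand", help="hypothetical example sub-command")
    mycommand.add_argument(
        "--input", required=True, help="hypothetical input file")
    return parser
```

Calling build_parser().parse_args(["mycommand", "--input", "a.gff"]) yields a namespace whose command attribute names the chosen sub-command, which the caller can use to dispatch to the matching function.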