Structure of the code

mirtop/bam
- bam.py
  - read_bam: reads BAM files with pysam and stores the alignments in a key - value object
- filter.py
  - tune: if the --clean option is on, filters hits according to generic rules
  - clean_hits: get the top hits
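As a rough illustration of what "get the top hits" can mean, the sketch below keeps only the best-scoring hits for each sequence. The function name `keep_top_hits`, the `{sequence: [(precursor, score), ...]}` layout, and the rule that a higher score is better are all assumptions made for this example; they are not taken from mirtop/bam/filter.py.

```python
def keep_top_hits(hits):
    """Keep, for each sequence, only the hits that share the best score.

    `hits` maps a read sequence to a list of (precursor_name, score)
    tuples. This layout is hypothetical and only used for illustration.
    """
    top = {}
    for seq, candidates in hits.items():
        if not candidates:
            continue  # nothing to keep for this sequence
        best = max(score for _, score in candidates)  # assumes higher is better
        top[seq] = [(name, score) for name, score in candidates if score == best]
    return top


example_hits = {
    "TGAGGTAGTAGGTTGTATAGTT": [
        ("hsa-let-7a-1", 22),
        ("hsa-let-7a-2", 22),
        ("hsa-let-7f-1", 19),
    ],
}
top_hits = keep_top_hits(example_hits)
print(top_hits)
```

Ties are kept on purpose, so a sequence that maps equally well to two precursors still reports both.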
mirtop/gff
- init.py: wraps the conversion process to GFF3
- body.py: creates each line according to the established GFF format.
  - read_gff_line: called inside a for loop to read each line of the file. It returns a structured key:value dictionary with one entry per column.
- header.py: generates the header and reads the header section.
- check.py: checks that the header and single lines are valid according to the GFF format (NOT IMPLEMENTED)
- stats.py: GFF stats, counting the number of isomiRs and their total and average expression
- query.py: accepts SQLite queries passed as a quoted string after the -q option
- convert.py
  - create_counts: creates the table of counts
  - allow filtering by attribute
  - allow collapse by miRNA/isomiR type
- filter.py: parse from query (NOT IMPLEMENTED)
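The line-by-line reading described for read_gff_line can be sketched as follows. This is a minimal example that assumes the nine standard GFF3 columns and `key=value` attributes separated by `;`, as in the generic GFF3 specification; the attribute layout actually written by mirtop may differ, and `parse_gff_line` is a hypothetical name, not the function in body.py.

```python
# Standard GFF3 column names, in order.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]


def parse_gff_line(line):
    """Return a key:value dictionary for one tab-separated GFF3 line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF_COLUMNS):
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    record = dict(zip(GFF_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Split the attributes column into its own dictionary
    # (assumes generic GFF3 `key=value;key=value` syntax).
    attributes = {}
    for item in record["attributes"].split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        attributes[key] = value
    record["attributes"] = attributes
    return record


gff_lines = [
    "##gff-version 3\n",
    "hsa-let-7a-1\tmiRBase\tisomiR\t4\t25\t.\t+\t.\tName=hsa-let-7a-5p;Expression=10\n",
]
# Header lines start with '#', so they are skipped inside the loop.
records = [parse_gff_line(line) for line in gff_lines if not line.startswith("#")]
print(records[0]["attributes"])
```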
mirtop/mirna
- fasta.py:
  - read_precursor: reads the precursor FASTA file into a key - value object
- realign.py:
  - hits: class that defines hits
  - isomir: class that defines each sequence
  - cigar_correction: function that uses the CIGAR to build the sequence-to-miRNA alignment
  - read_id and make_id: shorter ID for sequences
  - make_cigar: given an alignment, returns its CIGAR
  - reverse_complement: return the reverse complement of a sequence
  - align: uses biopython to align two sequences of the same size
  - expand_cigar: from a 12M to MMMMMMMMMMMM
  - cigar2snp: from CIGAR code to list of changes with position and reference and target nts
- mapper.py:
  - read_gtf: reads the GTF file and maps the genomic miRNA position to the precursor position, so it needs the genomic position of both the miRNA and the precursor. The return value looks like {mirna: [start, end]}
- annotate.py:
  - annotate: read isomiRs and populate all attributes related to isomiRs
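Two of the helpers listed under realign.py are simple enough to sketch. The example below shows one possible way to expand a CIGAR string (12M becomes MMMMMMMMMMMM) and to reverse-complement a sequence. It assumes standard SAM-style CIGAR tokens (a count followed by an operation letter); any extended CIGAR notation used by mirtop is not handled, and these are not the functions from realign.py itself.

```python
import re

# One CIGAR token: a count followed by a SAM operation letter.
_CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")

# Complement table; U is treated like T, so the output is always DNA.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def expand_cigar(cigar):
    """Expand a compact CIGAR such as '12M' into one letter per position."""
    tokens = _CIGAR_TOKEN.findall(cigar)
    # Reject strings that contain anything other than count+operation tokens.
    if "".join(count + op for count, op in tokens) != cigar:
        raise ValueError("unsupported CIGAR string: %r" % cigar)
    return "".join(op * int(count) for count, op in tokens)


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(expand_cigar("12M"))
print(expand_cigar("3M1I2M"))
print(reverse_complement("AACG"))
```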
mirtop/importer:
- seqbuster.py
- prost.py
- srnabench.py
- isomirsea.py
mirtop/exporter:
- isomirs.py: exports a file compatible with the isomiRs Bioconductor package.
data/examples/
- check gff files: example of correct, invalid, warning GFF files
- check BAM file
- check mapping from genome position to precursor position, with examples on the + and - strands. Uses mirtop/mirna/mapper.read_gtf.
- check clean option: for a sequence mapping to multiple precursors/miRNAs, keep the best score. Uses mirtop/bam/filter.clean_hits.
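The genome-to-precursor mapping checked by the +/- strand example can be sketched as below. The function name `genomic_to_precursor`, the tuple inputs, and the choice of 0-based offsets measured from the 5' end of the precursor are assumptions for illustration only; the coordinate convention used by read_gtf may differ.

```python
def genomic_to_precursor(precursor, mirna):
    """Convert genomic miRNA coordinates to offsets inside the precursor.

    precursor: (start, end, strand) in genomic coordinates, inclusive.
    mirna:     (start, end) in genomic coordinates, inclusive.
    Returns [start, end] as 0-based offsets from the precursor 5' end
    (hypothetical convention chosen for this sketch).
    """
    p_start, p_end, strand = precursor
    m_start, m_end = mirna
    if not (p_start <= m_start <= m_end <= p_end):
        raise ValueError("miRNA must lie inside the precursor")
    if strand == "+":
        return [m_start - p_start, m_end - p_start]
    if strand == "-":
        # On the minus strand the 5' end is the largest genomic coordinate.
        return [p_end - m_end, p_end - m_start]
    raise ValueError("strand must be '+' or '-'")


plus = genomic_to_precursor((100, 180, "+"), (105, 126))
minus = genomic_to_precursor((100, 180, "-"), (154, 175))
print({"mirna-plus": plus, "mirna-minus": minus})
```

Both calls yield the same offsets because the two miRNAs sit at the same distance from the 5' end of their precursor once strand is taken into account.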

To add new sub-commands, modify the following:

mirtop/lib/parse.py
- query: TODO
- transform: TODO
- create: TODO
- check: TODO
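For orientation, sub-commands like these are commonly registered with argparse sub-parsers. The sketch below is a generic pattern, not the content of mirtop/lib/parse.py: the program name `mirtop-sketch`, the `--gff` argument, and the help strings are hypothetical, and only the sub-command names and the -q option are taken from this page.

```python
import argparse


def build_parser():
    """Build a parser with one sub-parser per sub-command (illustrative only)."""
    parser = argparse.ArgumentParser(prog="mirtop-sketch")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # fail early if no sub-command is given

    # Hypothetical arguments: each sub-command gets its own options.
    query = subparsers.add_parser("query", help="run a query on a GFF file")
    query.add_argument("-q", "--query", required=True, help="query string")

    check = subparsers.add_parser("check", help="validate a GFF file")
    check.add_argument("--gff", required=True, help="path to the GFF file")

    return parser


args = build_parser().parse_args(["query", "-q", "select * from data"])
print(args.command, args.query)
```

Adding a new sub-command in this pattern means one more `add_parser` call plus its arguments, and then dispatching on `args.command`.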
