Parses a FASTA file to extract the sequences and header information, if any.
fasta_file : FASTA file to be parsed.
>>> import sys
>>> input_file = sys.argv[1]
>>> out = FastaParser(input_file)
>>> seqDict = out.sequenceDict()
>>> print(len(seqDict))
Masks the sequences in the FASTA file based on the given intervals.
intervals : A list of tuples containing the start and end positions for the masking.
toLower : If True, the sequence in the interval is converted to lower case bases. Default is False.
maskingChar : Masking character. Default is 'N'.
Masked sequences.
Masks the sequence based on the given interval.
name : Name/header of the sequence.
interval : A tuple containing the start and end positions for the masking.
toLower : If True, the sequence in the interval is converted to lower case bases. Default is False.
maskingChar : Masking character. Default is 'N'.
Masked sequence.
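The masking behaviour described above can be illustrated with a minimal sketch. The function name `mask_interval` is hypothetical and not part of the documented API, and the interval is assumed to be 0-based and end-exclusive, which the documentation does not state.

```python
def mask_interval(sequence, interval, to_lower=False, masking_char="N"):
    """Mask one interval of a sequence (hypothetical helper, not the library's code)."""
    start, end = interval  # assumption: 0-based start, end-exclusive
    if not 0 <= start <= end <= len(sequence):
        raise ValueError("interval out of range")
    region = sequence[start:end]
    # Either soft-mask (lower case) or hard-mask (replace with masking_char).
    masked = region.lower() if to_lower else masking_char * len(region)
    return sequence[:start] + masked + sequence[end:]


print(mask_interval("ACGTACGT", (2, 5)))                 # ACNNNCGT
print(mask_interval("ACGTACGT", (2, 5), to_lower=True))  # ACgtaCGT
```

Masking several intervals amounts to applying the same step once per tuple, since masking never changes the sequence length.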
Reads and parses the FASTA file.
fastaFile : A FASTA file.
Generator object containing sequences.
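As a rough illustration of a generator-based FASTA reader, the sketch below yields one record at a time. The name `read_fasta` and the `(header, sequence)` tuple layout are assumptions made for the example. Lines that appear before the first `>` header are skipped here, which may differ from how the library treats files without headers.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle (illustrative only)."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)  # emit the last record


example = io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
records = list(read_fasta(example))
print(records)  # [('seq1', 'ACGTTTGA'), ('seq2', 'GGCC')]
```

A generator keeps only one record in memory at a time, which matters for large files.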
Compute the reverse complement of a given sequence.
sequence: Name of the sequence whose reverse complement is to be computed.
The reverse complement of the input sequence.
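For reference, a reverse complement can be computed in a few lines. This sketch takes the sequence string itself rather than a sequence name, and the function name `reverse_complement` and the handling of `N` and lower-case bases are assumptions for the example.

```python
# Translation table mapping each base to its complement; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Complement every base, then reverse the result (illustrative helper)."""
    return sequence.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCCN"))  # NGGCAT
```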
Compute the reverse complements of all the sequences in the given FASTA file.
sequence: Name of the sequence whose reverse complement is to be computed.
Prints the reverse complements.
Extract the sequence corresponding to the given name.
name : Name of the sequence to be retrieved.
Sequence corresponding to the input name.
Names/Headers of all the sequences.
A list of names of all the sequences in the FASTA file.
Creates a dictionary of sequences with their header.
A dictionary of sequences.
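A header-to-sequence dictionary of this kind could be built along the following lines. The function name `build_sequence_dict` is hypothetical, and raising an error on duplicate headers is an assumption; the documentation does not say how the library handles them.

```python
def build_sequence_dict(records):
    """Map each header to its sequence (illustrative helper)."""
    seq_dict = {}
    for header, sequence in records:
        if header in seq_dict:
            # Assumption: duplicates are treated as an error rather than overwritten.
            raise ValueError(f"duplicate header: {header}")
        seq_dict[header] = sequence
    return seq_dict


seq_dict = build_sequence_dict([("seq1", "ACGT"), ("seq2", "GGCC")])
print(sorted(seq_dict))   # ['seq1', 'seq2']
print(seq_dict["seq2"])   # GGCC
```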
Parses a FASTQ file to extract the sequences and the base qualities.
fasta_file : FASTQ file to be parsed.
>>> import sys
>>> input_file = sys.argv[1]
>>> out = FastqParser(input_file)
>>> seqDict = out.sequenceDict()
>>> print(len(seqDict))
Creates a dictionary of base qualities of the sequences.
A dictionary of base qualities.
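Base qualities in a FASTQ file are stored as one ASCII character per base. The sketch below decodes such a string into integer scores, assuming the common Phred+33 encoding; the documentation does not say which encoding the parser expects, and the name `decode_qualities` is hypothetical.

```python
def decode_qualities(quality_string, offset=33):
    """Convert an ASCII quality string to a list of Phred scores.

    Assumption: Phred+33 encoding (offset=33); older files may use offset 64.
    """
    return [ord(char) - offset for char in quality_string]


print(decode_qualities("II5!"))  # [40, 40, 20, 0]
```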
Masks the sequences in the FASTQ file based on the given intervals.
intervals : A list of tuples containing the start and end positions for the masking.
toLower : If True, the sequence in the interval is converted to lower case bases. Default is False.
maskingChar : Masking character. Default is 'N'.
Masked sequences.
Masks the sequence based on the given interval.
name : Name/header of the sequence.
interval : A tuple containing the start and end positions for the masking.
toLower : If True, the sequence in the interval is converted to lower case bases. Default is False.
maskingChar : Masking character. Default is 'N'.
Masked sequence.
Reads and parses the FASTQ file.
fastqFile : A FASTQ file.
Generator object containing sequences.
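A FASTQ reader can follow the same generator pattern. The sketch below assumes the common four-line record layout (`@name`, sequence, `+`, quality string) with no line wrapping; the name `read_fastq` and the returned tuple layout are assumptions made for the example.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch for: " + header)
        yield header[1:], sequence, quality


example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n")
reads = list(read_fastq(example))
print(reads)  # [('r1', 'ACGT', 'IIII'), ('r2', 'GG', '!!')]
```

Checking that the sequence and quality strings have equal length catches truncated files early.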
Compute the reverse complement of a given sequence.
sequence: Name of the sequence whose reverse complement is to be computed.
The reverse complement of the input sequence.
Compute the reverse complements of all the sequences in the given FASTQ file.
sequence: Name of the sequence whose reverse complement is to be computed.
Prints the reverse complements.
Names/Headers of all the sequences.
A list of names of all the sequences in the FASTQ file.
Creates a dictionary of sequences with their header.
A dictionary of sequences.
Trims all the sequences in the FASTQ file from both sides based on the given intervals.
interval : A list of tuples containing the number of bp to be trimmed from the left and right sides respectively.
Trimmed sequences.
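Interval-based trimming of a single sequence can be sketched as follows. The name `trim_by_interval` is hypothetical, and returning an empty string when the requested trim exceeds the sequence length is an assumption rather than documented behaviour.

```python
def trim_by_interval(sequence, interval):
    """Remove `left` bases from the start and `right` bases from the end."""
    left, right = interval
    if left < 0 or right < 0:
        raise ValueError("trim lengths must be non-negative")
    end = len(sequence) - right
    # max() guards against over-trimming, which yields an empty string.
    return sequence[left:max(end, left)]


print(trim_by_interval("ACGTACGT", (2, 3)))  # GTA
```

For FASTQ records the same slice would also need to be applied to the quality values so that bases and qualities stay aligned.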
Trims the sequence.
name : Name/header of the sequence to be trimmed.
qualityCutOff : Quality threshold used when trimming the sequence by removing low-quality bases.
byInterval : If True, the sequence is trimmed by removing bases according to the given interval.
interval : The interval containing the number of bp to be trimmed from the left and right sides respectively. Requires byInterval to be True.
mott : If True, the sequence is trimmed according to Mott's algorithm.
limitValue : Numerical value of the limit to be used in Mott's algorithm. Requires mott to be True.
Trimmed sequence.
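To give a feel for Mott-style trimming, the sketch below scores each base as the limit minus its error probability, 10^(-Q/10), keeps a running sum that is reset whenever it drops to zero or below, and returns the highest-scoring contiguous segment. This is one formulation in the spirit of the modified Mott algorithm; implementations differ in detail, and the library's exact variant is not documented here. The name `mott_trim` and the default limit of 0.05 are assumptions.

```python
def mott_trim(sequence, qualities, limit=0.05):
    """Keep the contiguous segment with the highest cumulative (limit - error) score."""
    best_score, best_start, best_end = 0.0, 0, 0
    running, start = 0.0, 0
    for i, q in enumerate(qualities):
        running += limit - 10 ** (-q / 10.0)  # error probability from Phred score
        if running <= 0:
            running, start = 0.0, i + 1  # restart the candidate segment
        elif running > best_score:
            best_score, best_start, best_end = running, start, i + 1
    return sequence[best_start:best_end], qualities[best_start:best_end]


print(mott_trim("NNACGNN", [2, 2, 30, 30, 30, 2, 2]))  # ('ACG', [30, 30, 30])
```

A larger limit tolerates more low-quality bases inside the retained segment; a read with no base above the limit comes back empty.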
Edits/Modifies a DNA sequence.
input_sequence : A nucleotide sequence.
Masks the sequence based on the interval.
interval : A tuple containing the start and end positions for the masking.
toLower : If True, the sequence in the interval is converted to lower case bases. Default is False.
maskingChar : Masking character. Default is 'N'.
Masked sequence.
Trims a FASTQ sequence.
sequence : The sequence to be trimmed.
qualities : Base qualities of the bases in the sequence.
Trims a sequence by removing low-quality bases.
Trimmed sequence.
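One simple reading of quality-based trimming is to strip bases from both ends until a base at or above the cutoff is reached. The sketch below takes that interpretation, which is an assumption; the documentation does not specify whether low-quality bases inside the read are also removed. The name `trim_low_quality_ends` and the default cutoff of 20 are hypothetical.

```python
def trim_low_quality_ends(sequence, qualities, cutoff=20):
    """Strip bases with quality below `cutoff` from both ends of a read."""
    start, end = 0, len(sequence)
    while start < end and qualities[start] < cutoff:
        start += 1  # advance past low-quality bases on the left
    while end > start and qualities[end - 1] < cutoff:
        end -= 1    # retreat past low-quality bases on the right
    return sequence[start:end], qualities[start:end]


print(trim_low_quality_ends("ACGTAC", [5, 10, 30, 35, 12, 3]))  # ('GT', [30, 35])
```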