python - Match nucleotide position to sequence from fasta file -

September 15, 2011

I have a list of positions:

chr1 1000
chr2 2000
chr3 4000

and I would like to be able to transform each position into the nucleotide found there in the sequence, given a custom FASTA file, such as:

chr1 1000
chr2 2000 t
chr3 4000 g

Is there an already-written tool in Python that can do this job?

Given a FASTA file chromosomes.fasta:

>chr1
gattaca
>chr2
attacga
>chr3
gccaacg

and a positions file positions.txt:

chr1 3
chr2 4
chr3 5

you can use the following code (it requires Biopython):

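from Bio import SeqIO

# Load every record in the FASTA file into a dict keyed by record ID.
record_dict = SeqIO.to_dict(SeqIO.parse('chromosomes.fasta', 'fasta'))

# Read the chromosome/position pairs, skipping blank lines.
chromosome_positions = {}
with open('positions.txt') as f:
    for line in f.read().splitlines():
        if line:
            chromosome, position = line.split()
            chromosome_positions[chromosome] = int(position)

# Look up the base at each (zero-based) position.
for chromosome in chromosome_positions:
    seq = record_dict[chromosome]
    position = chromosome_positions[chromosome]
    base = seq[position]
    print(chromosome, position, base)

which will output (in file order on Python 3.7 and later; older versions may print the lines in a different order):

chr1 3 t
chr2 4 c
chr3 5 c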

Note that Python uses zero-based indexing, so position 5 in positions.txt will give the sixth base in the corresponding sequence.
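If your positions are one-based instead (as genomic coordinates often are), you would need to subtract 1 before indexing. Here is a minimal sketch of that idea using only the standard library, so it runs without Biopython. The helper names read_fasta and base_at and the one_based flag are hypothetical, made up for this illustration, and the tiny parser assumes a simple, well-formed FASTA input:

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dict."""
    parts_by_name = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            header = line[1:].split()
            name = header[0] if header else ""
            parts_by_name[name] = []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            parts_by_name[name].append(line)
    return {key: "".join(parts) for key, parts in parts_by_name.items()}


def base_at(sequences, chromosome, position, one_based=True):
    """Return the base at `position`, converting from one-based if asked."""
    sequence = sequences[chromosome]
    index = position - 1 if one_based else position
    if not 0 <= index < len(sequence):
        raise IndexError(
            "position %d is outside %s (length %d)"
            % (position, chromosome, len(sequence))
        )
    return sequence[index]


# Same toy data as in the answer above, given inline instead of read from files.
fasta_lines = [">chr1", "gattaca", ">chr2", "attacga", ">chr3", "gccaacg"]
sequences = read_fasta(fasta_lines)

for chromosome, position in [("chr1", 3), ("chr2", 4), ("chr3", 5)]:
    print(chromosome, position, base_at(sequences, chromosome, position))
```

With one_based=True, position 3 on chr1 means the third base rather than the fourth. The explicit range check is there because a negative index would otherwise silently count from the end of the sequence.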
