#53 Improved SAM support for colour space

open
bowtie (72)
5
2010-06-24
2010-06-24
Anonymous
No

I would like to see improved SAM support for colour-space input, in particular the use of the CS and CQ tags, which I believe are required by the SAM specification (http://samtools.sourceforge.net/SAM1.pdf) for colour-space alignments.

In addition, I would like the option of using lower-case letters to mark changed bases in the query sequence of matched reads, and an option (on by default) to output ambiguous colours as '.' instead of 'N' when using colour-space.

So for example, a .csfasta read:
>1_418_546_F3
T12321333312002203220011312001130102213030222023322

with quality:
>1_418_546_F3
28 31 31 32 32 27 28 28 29 31 29 28 20 30 31 29 22 22 28 31 26 24 25 29 29 26 26 18 27 28 25 26 23 24 25 22 11 0 0 19 25 24 18 25 25 18 22 20 28 24

which matches GATCATATACTTTCTTAGAAACATGAAACATTGGAGTAATTCTCCTATTC in the reference (with a single colour difference) would be output as (using --col-keepends):
1_418_546_F3 0 DQ379370 314 255 50M * 0 0 GATCATATACTTTCTTAGAAACATGAAACATTGGAGTAATTCTCCTATtC @_`a\XYZ]]ZQS^]TMS\ZSRW[XUMNXVTRPRPB,!4MRKLSLIK!!9 XA:i:0 MD:Z:50 NM:i:0 CM:i:1 CS:Z:T12321333312002203220011312001130102213030222023322 CQ:Z:=@@AA<==>@>=5?@>77=@;9:>>;;3<=:;89:7,!!4:93::375=9

instead of:
1_418_546_F3 0 DQ379370 314 255 50M * 0 0 GATCATATACTTTCTTAGAAACATGAAACATTGGAGTAATTCTCCTATTC @_`a\XYZ]]ZQS^]TMS\ZSRW[XUMNXVTRPRPB,!4MRKLSLIK!!9 XA:i:0 MD:Z:50 NM:i:0 CM:i:1
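
As a rough illustration of the request, the CS and CQ fields in the first example can be derived directly from the .csfasta read and its quality line. The sketch below is Python, not bowtie code; `encode_cq` and `colour_tags` are hypothetical names, and it assumes CQ uses the same Phred+33 encoding as the SAM QUAL field, with negative qualities clamped to 0 (which is what the '!' characters in the examples of this request suggest).

```python
def encode_cq(qualities, offset=33):
    """Encode integer qualities as a Phred+33 string.

    Assumption: negative values (e.g. -1 for an ambiguous colour)
    are clamped to 0, which encodes as '!'.
    """
    return "".join(chr(max(q, 0) + offset) for q in qualities)


def colour_tags(cs_read, qual_line):
    """Build the CS and CQ optional fields for one read (hypothetical helper)."""
    quals = [int(tok) for tok in qual_line.split()]
    return ["CS:Z:" + cs_read, "CQ:Z:" + encode_cq(quals)]


# Read 1_418_546_F3 from the example above.
read = "T12321333312002203220011312001130102213030222023322"
qual = ("28 31 31 32 32 27 28 28 29 31 29 28 20 30 31 29 22 22 28 31 "
        "26 24 25 29 29 26 26 18 27 28 25 26 23 24 25 22 11 0 0 19 "
        "25 24 18 25 25 18 22 20 28 24")

# SAM optional fields are tab-separated.
print("\t".join(colour_tags(read, qual)))
```

The two fields printed here are the ones appended to the SAM record in the requested output; the rest of the record is unchanged.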

An unmatched colour-space read:
>1_9_25_F3
T01.0010...200.003..000..1000.001..011.......0.1..0

with quality:
>1_9_25_F3
5 5 -1 5 5 2 5 -1 -1 -1 3 10 11 -1 2 4 11 -1 -1 3 5 18 -1 -1 3 5 7 5 -1 5 5 3 -1 -1 5 3 3 -1 -1 -1 -1 -1 -1 -1 5 -1 3 -1 -1 5

would be output as (using --col-keepends):
1_9_25_F3 4 * 0 0 * * 0 0 C.AACA...GAA.AAT..AAA..CAAA.AAC..ACC.......A.C..A * XM:i:0 CS:Z:T01.0010...200.003..000..1000.001..011.......0.1..0 CQ:Z:&&!&&#&!!!$+,!#%,!!$&3!!$&(&!&&$!!&$$!!!!!!!&!$!!&

instead of:
1_9_25_F3 4 * 0 0 * * 0 0 CNAACANNNGAANAATNNAAANNCAAANAACNNACCNNNNNNNANCNNA &!&&#&!!!$+,!#%,!!$&3!!$&(&!&&$!!&$$!!!!!!!&!$!!& XM:i:0
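
The difference between the two unmatched-read outputs above comes down to which character stands in for an ambiguous colour. Judging only from these two lines, the sequence field appears to be the read with the primer base and first colour dropped and the colours 0, 1, 2, 3 written as A, C, G, T. The Python sketch below reproduces that mapping under this assumption; `colours_as_letters` is a hypothetical name, not a bowtie function.

```python
# Assumed mapping, inferred from the example output in this request.
COLOUR_TO_LETTER = {"0": "A", "1": "C", "2": "G", "3": "T"}


def colours_as_letters(cs_read, ambiguous="."):
    """Write a colour-space read as letters, one per colour.

    Drops the primer base and the first colour (as the example
    output appears to do) and replaces any character that is not
    0-3 with `ambiguous`.
    """
    colours = cs_read[2:]
    return "".join(COLOUR_TO_LETTER.get(c, ambiguous) for c in colours)


# Read 1_9_25_F3 from the example above.
read = "T01.0010...200.003..000..1000.001..011.......0.1..0"

print(colours_as_letters(read))                  # requested default: '.'
print(colours_as_letters(read, ambiguous="N"))   # current behaviour: 'N'
```

Keeping '.' in the sequence field means it uses the same ambiguity character as the CS tag and the original .csfasta input.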

Thanks.
