Writing fasta files using R package seqinr?

When I use write.fasta in seqinr, the file that it outputs looks like this:

>Sequence name 1

>Sequence name 2

>Sequence name 3
...etc

Sequence 1 Sequence 2 Sequence 3 ...etc

In other words, the sequence names are all at the beginning of the file, and then the sequences are output together at the end of the file.

What I'd like to do is this:

>Sequence name 1
Sequence 1
>Sequence name 2
Sequence 2
>Sequence name 3
Sequence 3
...etc

Is that possible with write.fasta?

Argumentative answered 6/8, 2012 at 0:0 Comment(1)
Could you please post a reproducible example? For example, post the code that you use to call write.fasta, and use dput to show what you pass to it? -- Tholos

I had a similar problem. Converting the vector that contained the sequences to a list fixed it.

e.g., write.fasta(as.list(seq),names,file)
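The conversion seems to matter because write.fasta appears to treat anything that is not a list as one single sequence, which would explain why all the names come first and the sequences are run together afterwards. Below is a minimal sketch of the difference, assuming seqinr is installed (the call is skipped otherwise). The sequences, names, and temporary file are made up for illustration.

```r
# Hypothetical input: three sequences held in a plain character vector.
seqs <- c("agctgtagtc", "agtctctctt", "atgtataaaa")
seq_names <- c("seq1", "seq2", "seq3")

is.list(seqs)           # FALSE: a character vector is not a list
seq_list <- as.list(seqs)
is.list(seq_list)       # TRUE: one list element per sequence

out_file <- tempfile(fileext = ".fasta")
fasta_lines <- character(0)

# Only call seqinr if it is installed, so the sketch runs either way.
if (requireNamespace("seqinr", quietly = TRUE)) {
  seqinr::write.fasta(sequences = seq_list, names = seq_names,
                      file.out = out_file)
  fasta_lines <- readLines(out_file)
  print(fasta_lines)    # each header should be followed by its own sequence
}
```

Passing seqs instead of seq_list in the same call should reproduce the layout described in the question.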

Crossbar answered 9/10, 2012 at 3:48 Comment(2)
Thanks for this -- it would be great if the package just threw an error when it was given a vector of sequences instead of making the weird output! -- React
They should also change the documentation. -- Raincoat

A minimal example:

library(seqinr)
# as.list() gives write.fasta one list element per sequence
write.fasta(as.list(c("AAA", "CCC")), names = c("a", "b"),
            as.string = FALSE, file.out = "foo.fa")
Sixfooter answered 8/1, 2019 at 18:11 Comment(0)

I got stuck on this and got some help from a friend. You need to pass the sequences as a list. Here is an example where the input is a MaxQuant output CSV with a sequence column called Sequence and a name column called 'Leading razor protein':

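library(tidyverse)  # provides read_csv() and dplyr::pull()
library(seqinr)     # provides write.fasta()
MU = read_csv('data.csv')
# write.fasta needs a list with one element per sequence
seqs = as.list(dplyr::pull(MU, Sequence))
names = dplyr::pull(MU, `Leading razor protein`)
write.fasta(seqs, names, "MU.fasta",
            open = "w", as.string = FALSE)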
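If the sequences are already complete strings, the interleaved layout can also be produced with base R alone. This is only a sketch of an alternative, not a description of what seqinr does internally. The data frame is a hypothetical stand-in for the MaxQuant CSV, and write_fasta_base is a made-up helper name.

```r
# Hypothetical stand-in for the CSV; check.names = FALSE keeps the spaces
# in the "Leading razor protein" column name.
MU <- data.frame(
  Sequence = c("AAAK", "CCCR"),
  `Leading razor protein` = c("P1", "P2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Made-up helper: interleave ">name" header lines with sequence lines.
write_fasta_base <- function(sequences, seq_names, path) {
  stopifnot(length(sequences) == length(seq_names))
  # rbind() stacks headers over sequences; as.vector() reads the matrix
  # column by column, giving header 1, sequence 1, header 2, sequence 2, ...
  lines <- as.vector(rbind(paste0(">", seq_names), sequences))
  writeLines(lines, path)
  invisible(lines)
}

out_file <- tempfile(fileext = ".fasta")
written <- write_fasta_base(MU$Sequence, MU[["Leading razor protein"]], out_file)
```

This sketch writes each sequence on a single line and does not wrap long sequences.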
Radiochemical answered 27/9, 2019 at 13:41 Comment(0)

Actually, I always get the correct output and have never run into a problem like yours. Try this.

Copy the text below:

>seq1
agctgtagtc
>seq2
agtctctctt
>seq3
atgtataaaa

Save it as "test.fasta". Then, in R, do the following:

library(seqinr)
my.dna <- read.fasta("test.fasta")
write.fasta(sequences = my.dna, names = names(my.dna), file.out = "write.my.dna.fasta")

If you open "write.my.dna.fasta" you will then obtain the following:

>seq1
agctgtagtc
>seq2
agtctctctt
>seq3
atgtataaaa
Jeffereyjefferies answered 3/10, 2012 at 15:0 Comment(1)
My own answer was wrong; please disregard it. It worked only because I had already read the file with seqinr, which reads sequences as lists. The thing to do is what mayela suggested. -- Jeffereyjefferies
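The point about reading with seqinr can be checked directly. The sketch below, which assumes seqinr is installed (the check is skipped otherwise), writes the three example records to a temporary file and inspects what read.fasta returns. The variable names fasta_path and read_is_list are made up for illustration.

```r
# Recreate the example file from the answer in a temporary location.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "agctgtagtc",
             ">seq2", "agtctctctt",
             ">seq3", "atgtataaaa"), fasta_path)

read_is_list <- NA  # stays NA if seqinr is not installed

if (requireNamespace("seqinr", quietly = TRUE)) {
  my.dna <- seqinr::read.fasta(fasta_path)
  read_is_list <- is.list(my.dna)  # is the result already a list?
  print(read_is_list)
  print(names(my.dna))             # names taken from the ">" header lines
}
```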