Revision as of 12:24, 16 June 2022, Hansoft (Added uBasic/4tH version)
Revision as of 16:31, 4 July 2022, Rdm m (→‎{{header|J}})
Line 906:
   parseFasta Fafile
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</lang>

Nowadays, most machines have gigabytes of memory. However, if it's necessary to process FASTA content on a system with inadequate memory, we can use files to hold intermediate results. For example:

<lang J>bs=: 2
chunkFasta=: {{
  r=. EMPTY
  bad=. a.-.a.{~;48 65 97(+i.)each 10 26 26
  dir=. x,'/'
  off=. 0
  siz=. fsize y
  block=. dest=. ''
  while. off < siz do.
    block=. block,fread y;off([, [ -~ siz<.+)bs
    off=. off+bs
    while. LF e. block do.
      line=. LF taketo block
      select. {.line
        case. ';' do.
        case. '>' do.
          start=. }.line-.CR
          r=.r,(head=. name,'.head');<name=. dir,start -. bad
          start fwrite head
          '' fwrite name
        case. do.
          (line-.bad) fappend name
      end.
      block=. LF takeafter block
    end.
  end.
  r
}}</lang>

Here, we're using a block size of 2 bytes, to illustrate correctness. If speed matters, we should use something significantly larger.

The left argument to <code>chunkFasta</code> names the directory used to hold content extracted from the FASTA file. The right argument names that FASTA file. The result identifies the extracted headers and contents.

Thus, if '~/fasta.txt' contains the example file for this task and we want to store intermediate results in the '~temp' directory, we could use:

<lang J>   fasta=: '~temp' chunkFasta '~/fasta.txt'</lang>

And, to complete the task:

<lang J>   ;(,': ',,&LF)each/"1 fread each fasta
Rosetta_Example_1: THERECANBENOSPACE
Rosetta_Example_2: THERECANBESEVERALLINESBUTTHEYALLMUSTBECONCATENATED</lang>
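
Two expressions inside <code>chunkFasta</code> may be easier to follow when tried on their own in a J session. The sketch below is illustrative and not part of the revision. It assumes the standard library is loaded (for <code>each</code>), and the literal 5 is a hypothetical file size standing in for <code>siz</code>. The first expression shows that <code>bad</code> holds every character that is not a digit or an ASCII letter, so the underscore is dropped from the file name (the full header is kept in the <code>.head</code> file). The second shows how the (offset, length) pair passed to <code>fread</code> is clipped at the end of the file:

<lang J>   NB. bad: all characters except 0-9, A-Z and a-z
   bad=: a.-.a.{~;48 65 97(+i.)each 10 26 26
   'Rosetta_Example_1' -. bad
RosettaExample1
   NB. offset 0, block size 2, assumed file size 5: read 2 bytes
   0 ([, [ -~ 5<.+) 2
0 2
   NB. offset 4, block size 2, assumed file size 5: only 1 byte remains
   4 ([, [ -~ 5<.+) 2
4 1</lang>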

FASTA format: Difference between revisions
