Bioinformatics/Sequence mutation: Difference between revisions

Line 15:

* Give more information on the individual mutations applied.

* Allow mutations to be weighted and/or chosen.

=={{header|11l}}==

<syntaxhighlight lang=11l>UInt32 seed = 0

<syntaxhighlight lang="11l">UInt32 seed = 0

F nonrandom(n)

:seed = 1664525 * :seed + 1013904223

Line 113:

Line 112:

TOT= 249

</pre>

=={{header|Ada}}==

<syntaxhighlight lang=Ada>with Ada.Containers.Vectors;

<syntaxhighlight lang="ada">with Ada.Containers.Vectors;

with Ada.Numerics.Discrete_Random;

with Ada.Text_Io;

Line 264:

Line 262:

Count of G is 56

Count of T is 51</pre>

=={{header|Arturo}}==

<syntaxhighlight lang=rebol>bases: ["A" "T" "G" "C"]

<syntaxhighlight lang="rebol">bases: ["A" "T" "G" "C"]

dna: map 1..200 => [sample bases]

Line 370:

Line 367:

200 : CC

Total count => A: 46 T: 47 G: 55 C: 54</pre>

=={{header|C}}==

Adenine (A) is always swapped for Thymine (T) and vice versa. Similarly, Cytosine (C) is always swapped for Guanine (G) and vice versa.
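The pairing rule described above can be sketched as a fixed lookup table. This is only an illustration in Python, not part of the C entry; the names <code>COMPLEMENT</code> and <code>complement</code> are hypothetical.

<syntaxhighlight lang="python"># Illustrative sketch only: the names below are assumptions, not from the C entry.
# Each base maps to its fixed partner: A <-> T and C <-> G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna):
    """Return dna with every base replaced by its partner base."""
    return dna.translate(COMPLEMENT)


if __name__ == "__main__":
    print(complement("AACG"))
</syntaxhighlight>

Because the mapping is its own inverse, applying <code>complement</code> twice gives back the original sequence.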

#include<stdlib.h>

#include<stdio.h>

Line 672:

Line 668:

Total:513

</pre>

=={{header|C++}}==

<syntaxhighlight lang=cpp>#include <array>

<syntaxhighlight lang="cpp">#include <array>

#include <iomanip>

#include <iostream>

Line 829:

Line 824:

A: 65, C: 66, G: 64, T: 56, Total: 251

</pre>

=={{header|Common Lisp}}==

Usage :

Line 836:

Line 830:

:: :genome <Genome Sequence>)

All keys are optional. <Genome length> is discarded when :genome is set.

(defun random_base ()

(random 4))

Line 979:

Line 973:

T : 137 G : 119

</pre>

=={{header|Factor}}==

<syntaxhighlight lang=factor>USING: assocs combinators.random formatting grouping io kernel

<syntaxhighlight lang="factor">USING: assocs combinators.random formatting grouping io kernel

macros math math.statistics namespaces prettyprint quotations

random sequences sorting ;

Line 1,101:

Line 1,094:

TOTAL: 204

</pre>

=={{header|Go}}==

<syntaxhighlight lang=go>package main

<syntaxhighlight lang="go">package main

import (

Line 1,257:

Line 1,249:

</pre>

=={{header|Haskell}}==

<syntaxhighlight lang=haskell>import Data.List (group, sort)

<syntaxhighlight lang="haskell">import Data.List (group, sort)

import Data.List.Split (chunksOf)

import System.Random (Random, randomR, random, newStdGen, randoms, getStdRandom)

Line 1,377:

Line 1,369:

------

Σ: 203</pre>

=={{header|J}}==

<syntaxhighlight lang=J>ACGT=: 'ACGT'

<syntaxhighlight lang="j">ACGT=: 'ACGT'

MUTS=: ;: 'del ins mut'

Line 1,506:

Line 1,497:

│ │ 200│GGC │

└─────┴────┴──────────────────────────────────────────────────┘</pre>

=={{header|Java}}==

<syntaxhighlight lang=java>import java.util.Arrays;

<syntaxhighlight lang="java">import java.util.Arrays;

import java.util.Random;

Line 1,652:

Line 1,642:

A: 71, C: 62, G: 58, T: 61, Total: 252

</pre>

=={{header|JavaScript}}==

<syntaxhighlight lang=javascript>// Basic set-up

<syntaxhighlight lang="javascript">// Basic set-up

const numBases = 250

const numMutations = 30

Line 1,876:

Line 1,865:

Σ: 261

</pre>

=={{header|Julia}}==

<syntaxhighlight lang=julia>dnabases = ['A', 'C', 'G', 'T']

<syntaxhighlight lang="julia">dnabases = ['A', 'C', 'G', 'T']

randpos(seq) = rand(1:length(seq)) # 1

mutateat(pos, seq) = (s = seq[:]; s[pos] = rand(dnabases); s) # 2-1

Line 1,991:

Line 1,979:

Total 502

</pre>

=={{header|Lua}}==

Using the <code>prettyprint()</code> function from [[Bioinformatics/base_count#Lua]] (not replicated here)

<syntaxhighlight lang=lua>math.randomseed(os.time())

<syntaxhighlight lang="lua">math.randomseed(os.time())

bases = {"A","C","T","G"}

function randbase() return bases[math.random(#bases)] end

Line 2,053:

Line 2,040:

121 gcatagagtg gattccttta acctaggaga aacgcccttc cggttcagca tggcgagtgc

181 gtacaacgat gacccagat</pre>

=={{header|Mathematica}} / {{header|Wolfram Language}}==

BioSequence is a fundamental data type in Mathematica:

<syntaxhighlight lang=Mathematica>SeedRandom[13122345];

<syntaxhighlight lang="mathematica">SeedRandom[13122345];

seq = BioSequence["DNA", "ATAAACGTACGTTTTTAGGCT"];

randompos = RandomInteger[seq["SequenceLength"]];

Line 2,104:

Line 2,090:

201-246: ACTTTGGTCCAAGATAGTTAGATATCAATCCGTATAATGTAGGCTT

{{"T", 60}, {"A", 70}, {"G", 67}, {"C", 49}}</pre>

=={{header|Nim}}==

<syntaxhighlight lang=Nim>import random

<syntaxhighlight lang="nim">import random

import strformat

import strutils

Line 2,258:

Line 2,243:

TCGTGACTGC CAGTCGAC 198

//</pre>

=={{header|Perl}}==

<syntaxhighlight lang=perl>use strict;

<syntaxhighlight lang="perl">use strict;

use warnings;

use feature 'say';

Line 2,330:

Line 2,314:

G: 51

T: 51</pre>

=={{header|Phix}}==

string dna = repeat(' ',200+rand(300))

for i=1 to length(dna) do dna[i] = "ACGT"[rand(4)] end for

Line 2,408:

Line 2,391:

Base counts: A:128, C:110, G:119, T:123, total:480

</pre>

=={{header|PureBasic}}==

<syntaxhighlight lang=PureBasic>#BASE$="ACGT"

<syntaxhighlight lang="purebasic">#BASE$="ACGT"

#SEQLEN=200

#PROTOCOL=#True

Line 2,494:

Line 2,476:

Base counts:

A: 51, C: 55, G: 43, T: 53, Total: 202</pre>

=={{header|Python}}==

In function <code>seq_mutate</code>, the argument <code>kinds</code> selects between the three kinds of mutation. It is a string of the characters I, D, and S (insert, delete, substitute); one character is drawn at random from it to decide which mutation to perform, so the more often a character appears, the more often that kind of mutation is applied.

Similarly, the parameter <code>choice</code> is a string from which the base for a substitution or insertion is drawn: the more often a base appears in it, the more likely that base is to be chosen.
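The weighting idea described above can be sketched on its own as follows. This is a minimal illustration under assumptions, not the entry's actual code: the function name <code>mutate_sketch</code>, the default strings, the <code>rng</code> parameter and the log format are all hypothetical.

<syntaxhighlight lang="python">import random


def mutate_sketch(dna, count=1, kinds="IDSSSS", choice="ACGT", rng=random):
    """Apply count random mutations to dna; return (new_dna, log).

    Hypothetical sketch: repeating a character in kinds (or choice)
    raises the chance that it is drawn, which acts as a simple weight.
    """
    seq = list(dna)
    log = []  # one (kind, index, base) tuple per mutation applied
    for _ in range(count):
        kind = rng.choice(kinds)
        if kind == "I":
            index = rng.randint(0, len(seq))  # may insert at the very end
            base = rng.choice(choice)
            seq.insert(index, base)
            log.append((kind, index, base))
        elif seq:  # delete and substitute need a non-empty sequence
            index = rng.randint(0, len(seq) - 1)
            if kind == "D":
                log.append((kind, index, seq.pop(index)))
            else:  # "S": substitute the base at index
                base = rng.choice(choice)
                seq[index] = base
                log.append((kind, index, base))
    return "".join(seq), log


if __name__ == "__main__":
    mutated, applied = mutate_sketch("ACGTACGTAC", count=5)
    print(mutated)
    for entry in applied:
        print(entry)
</syntaxhighlight>

With the assumed default <code>kinds="IDSSSS"</code>, a substitution is four times as likely as an insertion or a deletion; passing <code>choice="AAACGT"</code> would likewise make A three times as likely as any other base.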

<syntaxhighlight lang=python>import random

<syntaxhighlight lang="python">import random

from collections import Counter

Line 2,587:

Line 2,568:

T: 72

TOT= 251</pre>

=={{header|Quackery}}==

<code>prettyprint</code> and <code>tallybases</code> are defined at [[Bioinformatics/base count#Quackery]].

<syntaxhighlight lang=Quackery> [ $ "ACGT" 4 random peek ] is randomgene ( --> c )

<syntaxhighlight lang="quackery"> [ $ "ACGT" 4 random peek ] is randomgene ( --> c )

[ $ "" swap times

Line 2,640:

Line 2,620:

total 201

</pre>

=={{header|Racket}}==

<syntaxhighlight lang=racket>#lang racket

<syntaxhighlight lang="racket">#lang racket

(define current-S-weight (make-parameter 1))

Line 2,784:

Line 2,763:

T : 42

TOTAL: 193</pre>

=={{header|Raku}}==

(formerly Perl 6)

Line 2,791:

Line 2,769:

<syntaxhighlight lang=raku line>my @bases = <A C G T>;

<syntaxhighlight lang="raku" line>my @bases = <A C G T>;

# The DNA strand

Line 2,847:

Line 2,825:

G 43

T 53</pre>

=={{header|Ring}}==

row = 0

dnaList = []

Line 3,008:

Line 2,985:

A: 83, T: 32, C: 36, G: 49, Total: 200

</pre>

=={{header|Ruby}}==

<syntaxhighlight lang=ruby>class DNA_Seq

<syntaxhighlight lang="ruby">class DNA_Seq

attr_accessor :seq

Line 3,064:

Line 3,040:

Total 199: {:A=>52, :C=>50, :G=>49, :T=>48}

</pre>

=={{header|Rust}}==

<syntaxhighlight lang="rust">use rand::prelude::*;

use std::collections::HashMap;

use std::fmt::{Display, Formatter, Error};

pub struct Seq<'a> {

alphabet: Vec<&'a str>,

distr: rand::distributions::Uniform<usize>,

pos_distr: rand::distributions::Uniform<usize>,

seq: Vec<&'a str>,

}

impl Display for Seq<'_> {

fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {

let pretty: String = self.seq

.iter()

.enumerate()

.map(|(i, nt)| if (i + 1) % 60 == 0 { format!("{}\n", nt) } else { nt.to_string() })

.collect();

let counts_hm = self.seq

.iter()

.fold(HashMap::<&str, usize>::new(), |mut m, nt| {

*m.entry(nt).or_default() += 1;

m

});

let mut counts_vec: Vec<(&str, usize)> = counts_hm.into_iter().collect();

counts_vec.sort_by(|a, b| a.0.cmp(&b.0));

let counts_string = counts_vec

.iter()

.fold(String::new(), |mut counts_string, (nt, count)| {

counts_string += &format!("{} = {}\n", nt, count);

counts_string

});

write!(f, "Seq:\n{}\n\nLength: {}\n\nCounts:\n{}", pretty, self.seq.len(), counts_string)

}

}

impl Seq<'_> {

pub fn new(alphabet: Vec<&str>, len: usize) -> Seq {

let distr = rand::distributions::Uniform::new_inclusive(0, alphabet.len() - 1);

let pos_distr = rand::distributions::Uniform::new_inclusive(0, len - 1);

let seq: Vec<&str> = (0..len)

.map(|_| {

alphabet[thread_rng().sample(distr)]

})

.collect();

Seq { alphabet, distr, pos_distr, seq }

}

pub fn insert(&mut self) {

let pos = thread_rng().sample(self.pos_distr);

let nt = self.alphabet[thread_rng().sample(self.distr)];

println!("Inserting {} at position {}", nt, pos);

self.seq.insert(pos, nt);

}

pub fn delete(&mut self) {

let pos = thread_rng().sample(self.pos_distr);

println!("Deleting {} at position {}", self.seq[pos], pos);

self.seq.remove(pos);

}

pub fn swap(&mut self) {

let pos = thread_rng().sample(self.pos_distr);

let cur_nt = self.seq[pos];

let new_nt = self.alphabet[thread_rng().sample(self.distr)];

println!("Replacing {} at position {} with {}", cur_nt, pos, new_nt);

self.seq[pos] = new_nt;

}

}

fn main() {

let mut seq = Seq::new(vec!["A", "C", "T", "G"], 200);

println!("Initial sequence:\n{}", seq);

let mut_distr = rand::distributions::Uniform::new_inclusive(0, 2);

for _ in 0..10 {

let mutation = thread_rng().sample(mut_distr);

if mutation == 0 {

seq.insert()

} else if mutation == 1 {

seq.delete()

} else {

seq.swap()

}

}

println!("\nMutated sequence:\n{}", seq);

}

</syntaxhighlight>

<pre>

Initial sequence:

Seq:

TAAGTTTAGTCTGTTTACGAGATCTAGAGGAGGACACCGTGTAGAGGGGATTTGTCAGGA

CACATGCATGGCACCCTAGTCAAATAGTGCCGAGAACAGGCTCTCCTGAGAAAGTTAGGT

CTGCCGAAGTGACGAAGTGCACGTTATAGCTCTATTAAGTATGTTCGTTAACAGGTATTA

ATGCTCTTAGCCAAGACCGT

Length: 200

Counts:

A = 56

C = 38

G = 53

T = 53

Deleting C at position 197

Inserting T at position 157

Replacing C at position 149 with G

Replacing A at position 171 with G

Replacing T at position 182 with G

Deleting C at position 124

Inserting T at position 128

Replacing G at position 175 with C

Deleting A at position 35

Replacing A at position 193 with G

Mutated sequence:

Seq:

TAAGTTTAGTCTGTTTACGAGATCTAGAGGAGGACCCGTGTAGAGGGGATTTGTCAGGAC

ACATGCATGGCACCCTAGTCAAATAGTGCCGAGAACAGGCTCTCCTGAGAAAGTTAGGTC

TGCGAAGTTGACGAAGTGCACGTTATAGGTCTATTATAGTATGTTCGTTAGCAGCTATTA

AGGCTCTTAGCCAGGACGT

Length: 199

Counts:

A = 53

C = 36

G = 56

T = 54</pre>

=={{header|Swift}}==

<syntaxhighlight lang=swift>let bases: [Character] = ["A", "C", "G", "T"]

<syntaxhighlight lang="swift">let bases: [Character] = ["A", "C", "G", "T"]

enum Action: CaseIterable {

Line 3,151:

Line 3,267:

G: 56

T: 45</pre>

=={{header|Vlang}}==

<syntaxhighlight lang=vlang>import rand

<syntaxhighlight lang="vlang">import rand

import rand.seed

Line 3,301:

Line 3,416:

======

</pre>

=={{header|Wren}}==

<syntaxhighlight lang=ecmascript>import "random" for Random

<syntaxhighlight lang="ecmascript">import "random" for Random

import "/fmt" for Fmt

import "/sort" for Sort

Line 3,438:

Line 3,552:

======

</pre>

=={{header|Yabasic}}==

<syntaxhighlight lang=Yabasic>// Rosetta Code problem: http://rosettacode.org/wiki/Sequence_mutation

<syntaxhighlight lang="yabasic">// Rosetta Code problem: http://rosettacode.org/wiki/Sequence_mutation

// by Galileo, 07/2022

Line 3,526:

Line 3,639:

Base counts: A: 71, C: 84, G: 75, T: 82, total: 312

---Program done, press RETURN---</pre>

=={{header|zkl}}==

<syntaxhighlight lang=zkl>var [const] bases="ACGT", lbases=bases.toLower();

<syntaxhighlight lang="zkl">var [const] bases="ACGT", lbases=bases.toLower();

dna:=(190).pump(Data().howza(3),(0).random.fp(0,4),bases.get); // bucket of bytes

Line 3,588:

Line 3,700:

Base Counts: 191 : A(49) C(45) G(57) T(40)

</pre>
