Struct rustml::datasets::Mixture

pub struct Mixture {
    // some fields omitted
}

Generates random multi-dimensional data points (a population) from different normally distributed sources (subpopulations).

Example

use rustml::datasets::*;

let seed = [2, 3, 5, 7];
let m = 
    mixture_builder()
        .add(100, normal_builder(seed).add(1.0, 1.2).add(2.0, 1.2))
        .add(100, normal_builder(seed).add(5.0, 1.5).add(6.0, 1.5))
        .add(100, normal_builder(seed).add(6.0, 1.5).add(0.0, 1.5))
        .as_matrix();
assert_eq!(m.rows(), 300);
assert_eq!(m.cols(), 3);
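
As a rough, library-independent illustration of what such a mixture contains, the sketch below builds rows of the form [label, x1, x2, ...] using only the standard library. It is not rustml's implementation: the XorShift64 generator, the Source struct, and the sample_mixture function are hypothetical names introduced here, the Box-Muller transform is swapped in as the sampling method, and the sketch assumes that the two arguments passed to add on a normal builder are a mean and a standard deviation.

```rust
/// Minimal xorshift64 generator; a hypothetical stand-in for a real RNG.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must be non-zero, so replace a zero seed.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Returns a uniform value in the half-open interval (0, 1].
    fn next_uniform(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Keep the top 53 bits and add 1 so that 0.0 is never returned.
        ((x >> 11) + 1) as f64 / (1u64 << 53) as f64
    }

    /// Draws one value from a normal distribution (Box-Muller transform).
    fn next_normal(&mut self, mean: f64, std_dev: f64) -> f64 {
        let u1 = self.next_uniform();
        let u2 = self.next_uniform();
        let z = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos();
        mean + std_dev * z
    }
}

/// Hypothetical description of one subpopulation: the number of samples and
/// one (mean, standard deviation) pair per dimension (an assumption).
struct Source {
    n: usize,
    dims: Vec<(f64, f64)>,
}

/// Builds one row per sample; column 0 is the index of the source,
/// the remaining columns are the sampled values.
fn sample_mixture(sources: &[Source], seed: u64) -> Vec<Vec<f64>> {
    let mut rng = XorShift64::new(seed);
    let mut rows = Vec::new();
    for (label, src) in sources.iter().enumerate() {
        for _ in 0..src.n {
            let mut row = Vec::with_capacity(src.dims.len() + 1);
            row.push(label as f64);
            for &(mean, std_dev) in &src.dims {
                row.push(rng.next_normal(mean, std_dev));
            }
            rows.push(row);
        }
    }
    rows
}
```

Calling sample_mixture with three sources of 100 samples and two dimensions each yields 300 rows of 3 values, which matches the shape asserted in the example above.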

Methods

impl Mixture

fn add(&self, n: usize, src: NormalData) -> Mixture

Adds a normally distributed data source (subpopulation) that generates n samples.

fn as_matrix(&mut self) -> Matrix<f64>

Returns a matrix containing the whole population, i.e. the samples of all subpopulations that have been added.

Each row of the matrix represents a sample generated from one of the subpopulations.

The value in the first column of a row identifies the subpopulation from which the sample was generated. If the value is 0, the sample was created from the first subpopulation (i.e. the first data source that was added with the add method). If the value is 1, the sample was created from the second subpopulation, and so on.

The remaining columns hold the sample's values, one column per dimension of the data sources.
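
Because the label is stored in the first column, the samples can be regrouped by subpopulation after the fact. The sketch below shows one way to do this on plain Vec<Vec<f64>> rows rather than on rustml's Matrix type; split_by_label is a hypothetical helper introduced here, not part of the library.

```rust
/// Groups rows of the form `[label, x1, x2, ...]` by their label.
///
/// The result holds, at index `k`, the value columns of every row whose
/// first column equals `k`. Rows that are empty or whose label is negative,
/// non-finite, or not below `sources` are skipped.
fn split_by_label(rows: &[Vec<f64>], sources: usize) -> Vec<Vec<Vec<f64>>> {
    let mut groups: Vec<Vec<Vec<f64>>> = vec![Vec::new(); sources];
    for row in rows {
        // Separate the label column from the remaining value columns.
        if let Some((label, values)) = row.split_first() {
            if label.is_finite() && *label >= 0.0 {
                let k = *label as usize;
                if k < sources {
                    groups[k].push(values.to_vec());
                }
            }
        }
    }
    groups
}
```

For a population built from two data sources, split_by_label(&rows, 2) returns two groups: the samples of the first source at index 0 and those of the second at index 1.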

Example

Let's assume you create a mixture with two data sources, each with two dimensions, as follows:

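use rustml::datasets::*;

// Same seed as in the struct-level example above.
let seed = [2, 3, 5, 7];
let m =
    mixture_builder()
        .add(100, normal_builder(seed).add(1.0, 0.2).add(2.0, 0.2))
        .add(100, normal_builder(seed).add(5.0, 0.2).add(6.0, 0.2))
        .as_matrix();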

Then, the matrix could look like: