Chemical synthesis (reagent ratios/conditions/catalysts)

Brewing

Synthetic Biology

Computational problems

Finding the optimal assembly (according to some metric)

Training detection algorithms (e.g. for motifs, genes, etc.)

Optimising clustering methods

Really any optimisation problem (especially non-convex)

Specific examples

Given a protein of interest, which coding sequence will maximise expression? (e.g. 900 nt = 300 codons ≈ 2^{300} possible synonymous sequences)
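The size of this search space can be checked with a few lines of arithmetic. The sketch below counts synonymous coding sequences from the codon degeneracy of the standard genetic code; the function name and the example peptide are hypothetical, chosen only for illustration. The 2^{300} figure corresponds to assuming roughly two synonymous codons per residue.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 20 values sum to the 61 sense codons).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def log2_synonymous_sequences(protein):
    """Return log2 of the number of coding sequences that encode `protein`."""
    return sum(math.log2(DEGENERACY[aa]) for aa in protein)


# Hypothetical 10-residue peptide, for illustration only.
peptide = "MKTAYIAKQR"
print(f"{peptide}: about 2^{log2_synonymous_sequences(peptide):.1f} sequences")

# A 300-residue protein made only of two-codon residues gives exactly 2^300.
print(log2_synonymous_sequences("K" * 300))
```

Working in log2 keeps the numbers readable; residues with four or six codons push real proteins above the two-codon estimate.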

Given a set of sequencing data, which preprocessing/assembly parameters will produce the most likely assembly?

Naive experimental design (Grid Search)
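A minimal sketch of why grid search scales badly: every extra parameter multiplies the number of experiments. The parameter names and levels below are hypothetical stand-ins for assembly settings, not values from any particular tool.

```python
import itertools

# Hypothetical parameter grid: five levels for each of three parameters.
grid = {
    "kmer_size": [21, 31, 41, 51, 61],
    "min_quality": [10, 15, 20, 25, 30],
    "min_coverage": [1, 2, 3, 5, 10],
}

# A grid search evaluates every combination of levels.
combinations = list(itertools.product(*grid.values()))
print(len(combinations), "experiments for", len(grid), "parameters")


def grid_size(levels_per_parameter, n_parameters):
    """Number of experiments in a full grid."""
    return levels_per_parameter ** n_parameters


# The cost grows exponentially with the number of parameters.
for d in (1, 2, 3, 6, 10):
    print(d, "parameters:", grid_size(5, d), "experiments")
```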

So how would you choose your datapoints more efficiently?

Probabilistically!

Step-through of optimisation
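The loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the method used by any tool named in these slides: it assumes a zero-mean Gaussian process surrogate with an RBF kernel, an expected-improvement acquisition function, and a toy one-dimensional `objective` standing in for an expensive experiment. All function names are hypothetical.

```python
import math


def rbf(a, b, length=0.15):
    """RBF (squared-exponential) kernel; the length scale is an assumption."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    w = solve(K, k_star)  # w = K^-1 k_star
    mean = sum(wi * yi for wi, yi in zip(w, ys))
    var = rbf(x_new, x_new) - sum(wi * ki for wi, ki in zip(w, k_star))
    return mean, math.sqrt(max(var, 1e-12))


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximisation)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def objective(x):
    """Toy stand-in for an expensive experiment (hypothetical)."""
    return math.sin(3 * x) * (1 - x) + 0.5 * x


def bayes_opt(n_iter=8):
    candidates = [i / 200 for i in range(201)]  # search space: [0, 1]
    xs = [candidates[20], candidates[100], candidates[180]]  # initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        scored = []
        for c in candidates:
            if c in xs:
                continue  # never repeat an experiment
            mean, std = gp_posterior(xs, ys, c)
            scored.append((expected_improvement(mean, std, best), c))
        _, x_next = max(scored)  # most promising untested point
        xs.append(x_next)
        ys.append(objective(x_next))  # run the "experiment"
    return xs, ys


xs, ys = bayes_opt()
best_y, best_x = max(zip(ys, xs))
print(f"best x = {best_x:.3f}, objective = {best_y:.4f}, trials = {len(xs)}")
```

Each iteration fits the surrogate to all results so far, scores every untested candidate by how much improvement it is expected to give, and runs only the single most promising experiment. In practice a library would replace the hand-written linear algebra.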

Results

SpearSeq found the optimal assembly parameters for a test assembly in 4-5 trials

Bayesian optimisation of synthetic construct expression found that 5' UTR free fold energy and 5' UTR length are the two most important features in the expression of the synthetic gene (Gonzalez, 2015)

Conclusion

Never just use a grid search; even a naive random search is better (Bergstra and Bengio, 2012)
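One intuition for this result: when only some parameters matter, a grid wastes trials by repeating the same few values of the important parameter, while random sampling tries a new value on every trial. The sketch below illustrates that with two hypothetical parameters on the unit square and the same budget of nine trials; the function names are illustrative only.

```python
import itertools
import random


def grid_trials(levels):
    """Full grid over two parameters, `levels` evenly spaced values each."""
    values = [i / (levels - 1) for i in range(levels)]
    return list(itertools.product(values, repeat=2))


def random_trials(n, seed=0):
    """n points drawn uniformly at random from the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def distinct_values(trials, axis):
    """How many different values of one parameter were actually tested."""
    return len({t[axis] for t in trials})


grid = grid_trials(3)            # 3 x 3 = 9 trials
rand = random_trials(len(grid))  # same budget of 9 trials

print("grid:  ", len(grid), "trials,", distinct_values(grid, 0), "distinct values of parameter 0")
print("random:", len(rand), "trials,", distinct_values(rand, 0), "distinct values of parameter 0")
```

If only parameter 0 affects the outcome, the grid has effectively run three experiments three times each.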

Bayesian Optimisation for more efficient experimental design

Not limited to a single parameter (or objective)
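With more than one objective there is usually no single best trial, only a set of trade-offs. A common way to summarise them, shown here as a minimal sketch, is to keep the trials that no other trial beats on every objective (the Pareto front). The scores below are made-up illustrative numbers, and both objectives are assumed to be maximised.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective_1, objective_2) scores for five trials.
trials = [(0.9, 0.2), (0.7, 0.7), (0.4, 0.9), (0.5, 0.5), (0.3, 0.3)]

for point in pareto_front(trials):
    print(point)
```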

Several recent, relatively easy-to-use libraries and implementations are available (e.g. GPyOpt, Spearmint)

Tell me about experimental ideas that you think this might be applied to