Machine Learning on noisy genome data. Scikit-learn python


4 hours ago by



I want to classify data using three dimensions; let's call them A, B, and C.

B and C are almost always positively correlated, and B+C is usually negatively correlated with A. However, C is usually an “all or none” statistic: we observe it for some samples, and for others it is absent altogether.

With this in mind, I chose to classify the data using Linear Discriminant Analysis (LDA) from the scikit-learn Python library.

I’m not entirely married to LDA but my PI would like to keep a linear model.

I would like to train the model on this data while applying a per-feature weight, as expressed in this pseudo-code:

   # Pseudo-code only: `train` and `weights` are the interface I wish existed
   lda = LDA()
   lda.train(trainX, trainY, weights=("None", "None", "all_or_none"))
   # "all_or_none" means: when C is absent, do NOT penalize the prediction
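To make the intent concrete, here is a minimal runnable sketch of one possible workaround, not a built-in scikit-learn feature. As far as I can tell, `LinearDiscriminantAnalysis.fit` takes only `X` and `y` and has no per-feature weights argument, so the sketch fits two linear models instead: one on A, B, and C using only the samples where C is present, and one on A and B using all samples. Samples with C absent are then predicted by the A/B model, so a missing C cannot count against them. The helper names `fit_split_lda` and `predict_split_lda` are hypothetical, the data is synthetic and made up purely for illustration, and the sketch assumes an absent C is encoded as exactly 0.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def fit_split_lda(X, y, c_col=2):
    """Fit one LDA on all features (rows with C present) and one without C (all rows).

    Assumption: an absent C is encoded as exactly 0 in column `c_col`.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    present = X[:, c_col] != 0
    other_cols = [i for i in range(X.shape[1]) if i != c_col]
    lda_full = LinearDiscriminantAnalysis().fit(X[present], y[present])
    lda_no_c = LinearDiscriminantAnalysis().fit(X[:, other_cols], y)
    return lda_full, lda_no_c, other_cols


def predict_split_lda(models, X, c_col=2):
    """Route each sample to the model that matches whether C was observed."""
    lda_full, lda_no_c, other_cols = models
    X = np.asarray(X, dtype=float)
    present = X[:, c_col] != 0
    pred = np.empty(len(X), dtype=lda_full.classes_.dtype)
    if present.any():
        pred[present] = lda_full.predict(X[present])
    if (~present).any():
        pred[~present] = lda_no_c.predict(X[~present][:, other_cols])
    return pred


# Synthetic stand-in data: B and C positively correlated, A negatively
# correlated with B (and hence with B+C), C observed for about half the samples.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
B = rng.normal(loc=np.where(y == 1, 3.0, 1.0), scale=0.5)
C_signal = B + rng.normal(0.0, 0.3, size=n)
A = 5.0 - B + rng.normal(0.0, 0.5, size=n)
c_present = rng.random(n) < 0.5
C = np.where(c_present, C_signal, 0.0)  # absent C encoded as 0 (assumption)
X = np.column_stack([A, B, C])

models = fit_split_lda(X, y)
pred = predict_split_lda(models, X)
accuracy = float(np.mean(pred == y))  # training accuracy, for a sanity check only
```

Both pieces remain linear models, which keeps with the constraint of staying linear. An alternative under the same assumption would be to keep a single LDA and add a 0/1 "C observed" indicator column, though that still lets the zero-filled C value influence the fit.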

I’m a little naive in machine learning; maybe there’s another way to do this in scikit-learn?


