java.lang.Object

Stats.LogRegWeka

public class LogRegWeka extends Object

Performs a multinomial logistic regression using the WEKA library.

Author: mo55

Constructor Summary

Constructors

Constructor

Description

LogRegWeka()

Main constructor for a multinomial logistic regression model built on the WEKA library. The constructor takes no arguments; training data is passed in through setTrainingData.
Method Summary

Modifier and Type

Method

Description

Double

getAttFromProb(double p)

Important: this method ONLY works for binomial (2-class) datasets with a single attribute x, and will return null if that is not true

double[][]

getCoefficients()

Return the coefficients of the classifier

double[][]

getCoeffUncertainty()

double[]

getDistribution(double[] x)

Passes the input array into the current regression model and returns a double array containing the percentage values for each of the possible output classifications.

String

getModelError()

Return exception string from model if it failed to fit.

Double

getPrediction(double[] x)

Get a prediction (output variable) based on the passed input array.

static void

main(String[] args)

Test model using data from https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/

boolean

setTrainingData(double[][] x, double[] y)

Take the training data passed in, convert to something that WEKA understands and then create a logistic regression model

boolean

setTrainingData(double[] xVar, double[] yVar)

Logistic regression where y variable is passed in as a single array (e.g. array of ranges) and y variable may not be 0's or 1's.

boolean

setTrainingData(double[] xVar, int[] yVar)

Logistic regression where y variable is passed in as a single array (e.g. array of ranges) and y variable may not be 0's or 1's.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- LogRegWeka
  
  public LogRegWeka()
  
  Main constructor for a multinomial logistic regression model built on the WEKA library. The constructor takes no arguments; training data is passed in through setTrainingData.
Method Details
- setTrainingData
  
  public boolean setTrainingData(double[] xVar, int[] yVar)
  
  Logistic regression where y variable is passed in as a single array (e.g. array of ranges) and y variable may not be 0's or 1's.
  
  Parameters:
  
  xVar -
  
  yVar -
  
  Returns:
- setTrainingData
  
  public boolean setTrainingData(double[] xVar, double[] yVar)
  
  Logistic regression where y variable is passed in as a single array (e.g. array of ranges) and y variable may not be 0's or 1's.
  
  Parameters:
  
  xVar -
  
  yVar -
  
  Returns:
- setTrainingData
  
  public boolean setTrainingData(double[][] x, double[] y)
  
  Take the training data passed in, convert to something that WEKA understands and then create a logistic regression model
  
  Parameters:
  
  x - a 2d array of training data. Columns are any number of input variables (x1, x2, x3... aka attributes) and rows are data points
  
  y - the output variable. The length of the array should match the number of rows in the x parameter. Since this is a logistic regression, the output is considered 'nominal' and not numeric: a distinct classification, not a continuous variable. It's odd that nominal values should be passed as doubles, but that's what WEKA wants. For best results, use consecutive integers starting at 0, e.g. 0, 1, 2, 3 etc.
  Also, there can't be any gaps in the output values of the training dataset. You can't have 0, 1, 2, 4; WEKA will throw an error.
  Keep track in your own code of what each value represents (e.g. for a binomial problem, 0=yes and 1=no; for a weather problem, 0=cold, 1=warm, 2=hot, etc).
  
  Returns:
  
  true=successful, false=unsuccessful
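
The label bookkeeping described for the y parameter could be handled along the following lines. This is only a sketch: `LabelEncoder`, `encode`, `encodeAll` and `decode` are hypothetical names invented for illustration and are not part of LogRegWeka or WEKA. It assigns codes 0, 1, 2, ... in order of first appearance, so the resulting array has no gaps.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of LogRegWeka): maps labels to 0, 1, 2, ... with no gaps. */
class LabelEncoder {
    // Insertion-ordered so codes follow the order in which labels are first seen.
    private final Map<String, Integer> codes = new LinkedHashMap<>();

    /** Returns the code for a label, assigning the next unused integer on first sight. */
    int encode(String label) {
        Integer code = codes.get(label);
        if (code == null) {
            code = codes.size();
            codes.put(label, code);
        }
        return code;
    }

    /** Encodes a whole label column into a double[] of class codes. */
    double[] encodeAll(String[] labels) {
        double[] y = new double[labels.length];
        for (int i = 0; i < labels.length; i++) {
            y[i] = encode(labels[i]);
        }
        return y;
    }

    /** Reverse lookup: the label behind a class value, or null if the value is unknown. */
    String decode(double classValue) {
        int wanted = (int) Math.round(classValue);
        for (Map.Entry<String, Integer> e : codes.entrySet()) {
            if (e.getValue() == wanted) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LabelEncoder enc = new LabelEncoder();
        double[] y = enc.encodeAll(new String[] {"cold", "hot", "warm", "cold"});
        System.out.println(Arrays.toString(y));
        System.out.println(enc.decode(1.0));
    }
}
```

An array produced this way could then be passed as the y argument, with `decode` translating a predicted class value back into its label.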
- getPrediction
  
  public Double getPrediction(double[] x)
  
  Get a prediction (output variable) based on the passed input array. The order of the elements in the x array must match the order that was used in the training data. The Double output references the unique values that were used in the training data (0, 1, 2, etc).
  
  Parameters:
  
  x - an array containing the input variables to use in the regression
  
  Returns:
  
  the predicted class value (0, 1, 2, etc.) as a Double
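
One common way to turn a class distribution, such as the array returned by getDistribution, into a single predicted class value is to pick the index with the highest probability. Whether getPrediction does exactly this internally is an assumption; `ArgMaxSketch` and `mostLikelyClass` are hypothetical names used only for illustration.

```java
/** Hypothetical sketch (not the LogRegWeka implementation): picks the most probable class. */
class ArgMaxSketch {
    /**
     * Returns the index of the largest entry as a Double, or null for an empty or null array.
     * Ties are resolved in favour of the lowest index.
     */
    static Double mostLikelyClass(double[] distribution) {
        if (distribution == null || distribution.length == 0) {
            return null;
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return Double.valueOf((double) best);
    }

    public static void main(String[] args) {
        // Made-up distribution over three classes (0, 1 and 2).
        double[] dist = {0.1, 0.7, 0.2};
        System.out.println(mostLikelyClass(dist));
    }
}
```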
- getDistribution
  
  public double[] getDistribution(double[] x)
  
  Passes the input array into the current regression model and returns a double array containing the percentage values for each of the possible output classifications. Thus, if there are 3 potential classes (0, 1 and 2), the method returns a 3-element array in which each index holds the probability of the input falling into the corresponding category.
  
  Parameters:
  
  x - an array containing the input variables to use in the regression
  
  Returns:
  
  a double array with one entry per possible output class
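
For readers who want to see how a per-class distribution arises in a multinomial logistic model, the textbook calculation can be sketched as below. This is not the LogRegWeka or WEKA code. The coefficient layout (one row per non-reference class, intercept first, last class used as the reference with a logit of 0) is an assumption chosen for readability and may not match what getCoefficients returns. `SoftmaxSketch` and `distribution` are hypothetical names.

```java
/** Hypothetical sketch of the textbook multinomial logistic distribution. */
class SoftmaxSketch {
    /**
     * @param coeffs assumed layout: coeffs[c] = {intercept, b1, b2, ...} for each of the
     *               K-1 non-reference classes; the last class is the reference (logit 0)
     * @param x      input attributes, in the same order as the slope coefficients
     * @return K probabilities that sum to 1
     */
    static double[] distribution(double[][] coeffs, double[] x) {
        int numClasses = coeffs.length + 1;
        double[] logits = new double[numClasses]; // reference class keeps logit 0
        double max = 0.0;                          // includes the reference logit
        for (int c = 0; c < coeffs.length; c++) {
            double z = coeffs[c][0];
            for (int j = 0; j < x.length; j++) {
                z += coeffs[c][j + 1] * x[j];
            }
            logits[c] = z;
            max = Math.max(max, z);
        }
        // Subtract the maximum logit before exponentiating to avoid overflow.
        double[] probs = new double[numClasses];
        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            probs[c] = Math.exp(logits[c] - max);
            sum += probs[c];
        }
        for (int c = 0; c < numClasses; c++) {
            probs[c] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        // Made-up coefficients for a 3-class problem with one attribute.
        double[][] coeffs = {{1.0, -2.0}, {0.5, 0.3}};
        double[] p = distribution(coeffs, new double[] {1.5});
        System.out.println(java.util.Arrays.toString(p));
    }
}
```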
- getCoefficients
  
  public double[][] getCoefficients()
  
  Return the coefficients of the classifier
  
  Returns:
- getCoeffUncertainty
  
  public double[][] getCoeffUncertainty()
- getAttFromProb
  
  public Double getAttFromProb(double p) throws ArithmeticException
  
  Important: this method ONLY works for binomial (2-class) datasets with a single attribute x, and will return null if that is not true
  
  Given the probability p of classification as the second class, this method returns the attribute x required. If interested in the probability of classification as the first class, pass the value 1-p instead.
  
  The equation solved is P = 1 / (1 + e^{-(b0 + b1*x)}), where P is the probability desired for the second class, b0 and b1 are the coefficients, and x is the value that is solved for.
  
  For example, suppose there are 2 possible classes, undetected (y=0) and detected (y=1), and a single independent attribute 'range'. To find the range required for a 70% probability of classification as detected (y=1), call this method and pass it 0.7. To find the range required for a 70% probability of classification as undetected (y=0), call this method and pass it 0.3.
  
  Parameters:
  
  p - the probability desired
  
  Returns:
  
  a Double value for the attribute, or null if this method fails
  
  Throws:
  
  ArithmeticException - thrown if the attribute calculation returns infinity or NaN
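
Rearranging the equation above gives x = (ln(P / (1 - P)) - b0) / b1. The sketch below applies that rearrangement directly. It is a stand-alone illustration, not the LogRegWeka implementation: `InverseLogit`, `attributeForProbability` and `probability` are hypothetical names, and the coefficients in `main` are made up.

```java
/** Hypothetical stand-alone sketch of inverting the binomial logistic equation. */
class InverseLogit {
    /** Solves P = 1 / (1 + e^{-(b0 + b1*x)}) for x. */
    static double attributeForProbability(double p, double b0, double b1) {
        double x = (Math.log(p / (1.0 - p)) - b0) / b1;
        // p = 0, p = 1 or b1 = 0 lead to an infinite or NaN result.
        if (Double.isNaN(x) || Double.isInfinite(x)) {
            throw new ArithmeticException("No finite attribute value for p=" + p);
        }
        return x;
    }

    /** Forward model, used here to check the round trip. */
    static double probability(double x, double b0, double b1) {
        return 1.0 / (1.0 + Math.exp(-(b0 + b1 * x)));
    }

    public static void main(String[] args) {
        double b0 = 4.0;   // made-up intercept
        double b1 = -0.5;  // made-up slope: detection probability falls with range
        double x = attributeForProbability(0.7, b0, b1);
        System.out.println("x = " + x + ", P(x) = " + probability(x, b0, b1));
    }
}
```

Passing 1 - p instead of p gives the attribute value for the first class, matching the description above.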
- main
  
  public static void main(String[] args)
  
  Test model using data from https://machinelearningmastery.com/logistic-regression-tutorial-for-machine-learning/
  
  Parameters:
  
  args -
- getModelError
  
  public String getModelError()
  
  Return exception string from model if it failed to fit.
  
  Returns:
  
  the modelError
