postgresql - How to use non-numeric independent variables while training a Linear Regression Model with MADlib-postgre?


My table contains a character field and two numeric fields:

create table lr_source (
    char01      varchar(250),
    plnumeric01 numeric,
    plnumeric02 numeric
);

I want to train a linear regression model with char01 and plnumeric01 as the independent variables and plnumeric02 as the dependent variable.

select madlib.linregr_train(
    'lr_source',                   -- source table
    'lr_model',                    -- model table
    'plnumeric02',                 -- dependent variable
    'array[plnumeric01, char01]'   -- independent variables
);

When I run the above query, it fails with the following error:

ERROR:  spiexceptions.DatatypeMismatch: ARRAY types numeric and character varying cannot be matched

How can I use non-numeric fields as independent variables?

I suggest you encode the categorical variables as per http://madlib.apache.org/docs/master/group__grp__encode__categorical.html to make them numeric, and then you can pass them to linear regression.

Also, you may want to add an explicit intercept, as in the user doc examples:

select madlib.linregr_train(
    'houses',
    'houses_linregr_bedroom',
    'price',
    'array[1, tax, bath, size]',
    'bedroom'
);
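Putting the two suggestions together for the lr_source table from the question, a minimal sketch might look like the following. It assumes a MADlib version that ships encode_categorical_variables (the function described at the link above). The output table name lr_source_encoded and the category values 'a', 'b' and 'c' are made up for illustration, so check the generated column names in your own data before training.

```sql
-- Step 1: one-hot encode char01 into 0/1 indicator columns.
SELECT madlib.encode_categorical_variables(
    'lr_source',          -- source table
    'lr_source_encoded',  -- output table (hypothetical name)
    'char01'              -- categorical column(s) to encode
);

-- Step 2: inspect the output to see which indicator columns were generated.
SELECT * FROM lr_source_encoded LIMIT 5;

-- Step 3: train on the numeric column plus the indicator columns.
-- ASSUMPTION: char01 takes the values 'a', 'b' and 'c', so the encoded
-- table has columns "char01_a", "char01_b" and "char01_c".
-- One level ("char01_c") is left out as the reference category so the
-- indicators are not perfectly collinear with the explicit intercept (the 1).
SELECT madlib.linregr_train(
    'lr_source_encoded',                             -- source table
    'lr_model',                                      -- model table (must not already exist)
    'plnumeric02',                                   -- dependent variable
    'ARRAY[1, plnumeric01, "char01_a", "char01_b"]'  -- independent variables
);
```

Every element of the array is now numeric, which avoids the type mismatch from the original query. If char01 has many distinct values, the number of indicator columns grows accordingly, so it may be worth limiting or grouping the levels first.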
