postgresql - How to use non-numeric independent variables while training a linear regression model with MADlib on PostgreSQL?
My table contains one character field and two numeric fields:
create table lr_source (char01 varchar(250), plnumeric01 numeric, plnumeric02 numeric);
I want to train a linear regression model with char01 and plnumeric01 as the independent variables and plnumeric02 as the dependent variable.
select madlib.linregr_train(
    'lr_source',                    -- source table
    'lr_model',                     -- model table
    'plnumeric02',                  -- dependent variable
    'array[plnumeric01, char01]'    -- independent variables
);
When I run the above query, it fails with the following error:
ERROR: spiexceptions.DatatypeMismatch: ARRAY types numeric and character varying cannot be matched
How can I use non-numeric fields as independent variables?
I suggest you encode the categorical variables as per http://madlib.apache.org/docs/master/group__grp__encode__categorical.html to make them numeric, and then you can pass them to linear regression.
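To illustrate what the encoding amounts to, here is a minimal hand-written sketch in plain SQL. It uses CASE expressions instead of the MADlib encoding function linked above, which is meant to automate this kind of step. It assumes char01 takes the hypothetical values 'a', 'b', and 'c'; the table name lr_source_encoded and the indicator column names are made up for this example.

```sql
-- Build one 0/1 indicator column per category value of char01.
-- The value 'a' is left out as the reference level so that the
-- indicators are not collinear with an intercept term.
-- Rows where char01 is null get 0 in both indicator columns.
create table lr_source_encoded as
select
    plnumeric01,
    plnumeric02,
    case when char01 = 'b' then 1 else 0 end as char01_b,
    case when char01 = 'c' then 1 else 0 end as char01_c
from lr_source;
```

Every column in the resulting table is numeric, so the columns can be combined in a single array expression without the type mismatch from the question.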
Also, you may want to add an explicit intercept, as in the user doc examples:
select madlib.linregr_train( 'houses', 'houses_linregr_bedroom', 'price', 'array[1, tax, bath, size]', 'bedroom' );
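Applied to the table from the question, the same pattern might look like the sketch below. It assumes the categorical column has already been encoded into a table named lr_source_encoded with numeric indicator columns char01_b and char01_c; those names are hypothetical and not part of the original question.

```sql
-- Train on numeric columns only; the leading 1 is the explicit intercept.
-- Assumes no table named lr_model exists yet.
select madlib.linregr_train(
    'lr_source_encoded',                          -- source table (hypothetical)
    'lr_model',                                   -- model table
    'plnumeric02',                                -- dependent variable
    'array[1, plnumeric01, char01_b, char01_c]'   -- intercept + independent variables
);

-- Inspect the fitted coefficients and the R-squared value.
select coef, r2 from lr_model;
```

The coefficients in coef come back in the same order as the entries of the array expression, so the first one corresponds to the intercept.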