How to only select the columns which can be cast to numeric, else ignore the column, in Spark SQL?
I have tried a lot of things but could not find a proper solution for how to select the columns that can be cast to double or numeric when all columns are of type string, and ignore the rest of the columns.
Suppose I have 100 columns of type string. I want to check which columns can be cast to numeric, select them, and ignore the other columns.
For example:
```
StructType(
  StructField(snapshotDate, StringType, true),
  StructField(country, StringType, true),
  StructField(region, StringType, true),
  StructField(probability, StringType, true),
  StructField(bookingAmount, StringType, true),
  StructField(revenueAmount, StringType, true)
)
```
Here I want to select revenueAmount, probability, and bookingAmount. The other columns are either plain strings or dates; ignore them.
Is there a way to do that?
Thanks in advance.
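For reference, a toy DataFrame with this schema (hypothetical values, just for experimenting) could be built like this:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
import spark.implicits._

// Hypothetical sample rows; only the schema matters here.
val df = Seq(
  ("2019-01-01", "US", "East", "0.8", "1000", "950"),
  ("2019-01-02", "DE", "West", "0.5", "2000", "1800")
).toDF("snapshotDate", "country", "region", "probability", "bookingAmount", "revenueAmount")

df.printSchema() // all six columns come back as string
```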
You can use the schema method on a DataFrame: it returns a StructType describing the schema of the DataFrame. A StructType is a set of StructFields composing the DataFrame schema, and you can get a Seq[StructField] from it (the StructType returned by schema) using the seq method.
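For instance, assuming a DataFrame df, you can list each field's name and type like this (a small illustration, not from the original answer):

```scala
// StructType is itself a Seq[StructField], so it can be iterated directly.
df.schema.foreach(field => println(s"${field.name}: ${field.dataType.simpleString}"))
```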
Now you can iterate over the schema, keeping the desired columns, using the filter and map methods.
```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

val df: DataFrame = ??? // your DataFrame definition
val schema = df.schema
val desiredColumns = schema.fields
  .filter(field => field.dataType.simpleString == "double") // put your condition here
  .map(_.name)
val newDf = df.select(desiredColumns.map(col): _*)
```
Notes:
- The `: _*` notation is used in Scala to tell the compiler to pass the values inside the array one by one, rather than the whole array as a single parameter.
- The `simpleString` method returns a String representation of the DataType, for example StringType => "string", IntegerType => "int". Note that the representation starts with a lower-case letter instead of upper case.
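One caveat: in the question all of the columns are declared StringType, so a condition on the declared type alone will not separate the numeric ones; the data itself has to be probed. Here is a minimal sketch of one way to do that (my own addition, not part of the answer above): cast each column to double and keep the columns where every non-null value casts cleanly.

```scala
import org.apache.spark.sql.functions.{col, count, when}

// Keep only the string columns whose non-null values all parse as doubles.
val numericColumns = df.columns.filter { name =>
  val row = df.select(
    count(when(col(name).cast("double").isNotNull, 1)).as("castable"),
    count(col(name)).as("nonNull")
  ).head()
  row.getLong(0) == row.getLong(1) // every non-null value casts cleanly
}

val numericDf = df.select(numericColumns.map(col): _*)
```

On the example schema this keeps probability, bookingAmount, and revenueAmount, since values like "2019-01-01" or "US" cast to null rather than to a double.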