r - Splitting then plotting uneven vector lengths to a single graph


I'm using data in the format shown below; the actual data set is longer. The column labels are: date | variable 1 | variable 2 | failed?

I'm sorting the data into date order. Some dates may be missing, but an ordering function should sort that out. From there, I'm trying to split the data into new sets, with the end of each set denoted by the far right column registering a 1. I'm then trying to plot these sets on a single graph, with the number of days passed on the x-axis. I've looked into using the ggplot function, but that seems to require data frames where the length of each vector is known. I tried creating a matrix with a length based on the maximum number of days passed across the sets and filling the spare cells with NaN values so it could be plotted, but that took ages because my data set is quite large. I was wondering whether there is a more elegant way of plotting the values against days passed for all the sets on a single graph, and then iterating the process for additional variables.

Any help would be appreciated.
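A minimal sketch of the sorting step, assuming the dates are day/month/year strings as in the example data. The data frame `d` and its `value` column are hypothetical stand-ins, not part of my real data set.

```r
# Toy data: the dates are deliberately out of order and 04/03/1997 is missing.
d <- data.frame(
  date  = c("03/03/1997", "01/03/1997", "05/03/1997", "02/03/1997"),
  value = c(0.70, 0.52, 0.87, 0.64),
  stringsAsFactors = FALSE
)

# Parse as day/month/year so the ordering is chronological, not alphabetical.
d$date <- as.Date(d$date, format = "%d/%m/%Y")
d <- d[order(d$date), ]

# Days elapsed since the first observation; gaps in the dates are preserved.
d$days_passed <- as.numeric(d$date - d$date[1])
d
```

Working from real `Date` values rather than row counts means a missing day shows up as a gap on the x-axis instead of silently shifting later points.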

The code for a reproducible example is included here:

test <- matrix(c(
  "01/03/1997", 0.521583294, 0.315170092, 0,
  "02/03/1997", 0.63946859,  0.270870821, 0,
  "03/03/1997", 0.698687101, 0.253495021, 0,
  "04/03/1997", 0.828754157, 0.233024574, 0,
  "05/03/1997", 0.87078867,  0.214507537, 0,
  "06/03/1997", 0.883279874, 0.212268627, 0,
  "07/03/1997", 0.952083969, 0.062663598, 0,
  "08/03/1997", 0.991100195, 0.054875256, 0,
  "09/03/1997", 0.992490126, 0.026610776, 1,
  "10/03/1997", 0.020707391, 0.866874513, 0,
  "11/03/1997", 0.32405139,  0.778696984, 0,
  "12/03/1997", 0.32665243,  0.703234151, 0,
  "13/03/1997", 0.603941956, 0.362869647, 0,
  "14/03/1997", 0.944046386, 0.026992527, 1,
  "15/03/1997", 0.108246142, 0.939363715, 0,
  "16/03/1997", 0.152195386, 0.907458966, 0,
  "17/03/1997", 0.285748169, 0.765212667, 0),
  ncol = 4, byrow = TRUE)
colnames(test) <- c("date", "variable 1", "variable 2", "failed")
test <- as.table(test)
test

I've managed to hash together a solution, but it looks messy. I'm convinced there is a far more elegant way of solving this.

z = as.data.frame.matrix(test)
attach(z)
x = as.numeric(as.character(failed))
x = cumsum(x) # variable names are recycled

A corrected cumulative failure sum puts the data into sets by the number of preceding failures:

z <- within(z, acc_sum <- x)
attach(z)
z$acc_sum <- as.numeric(as.character(z$acc_sum)) - as.numeric(as.character(z$failed))
attach(z)
z = data.frame(z, group_index = ave(acc_sum == acc_sum, acc_sum, FUN = cumsum))

An extra column (group_index) is created that holds the number of days passed since the start of each set. I find the code easier to read if I keep new variable names rather than keep indexing directly.
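To check the grouping logic in isolation, here is a small sketch on a hypothetical `failed` vector. It uses `seq_along` inside `ave` in place of the `acc_sum == acc_sum` trick from my code, and the names `set_id` and `day_in_set` are made up for illustration.

```r
# Hypothetical failure flags: a 1 marks the last row of a set.
failed <- c(0, 0, 1, 0, 1, 0, 0)

# cumsum() counts failures up to and including each row; subtracting `failed`
# keeps the failing row in the set it ends rather than in the next one.
set_id <- cumsum(failed) - failed

# Within each set, number the rows 1, 2, 3, ... (the counter restarts per set).
day_in_set <- ave(seq_along(set_id), set_id, FUN = seq_along)

data.frame(failed, set_id, day_in_set)
```

Note that this counter numbers rows, so it only equals the number of days passed when no dates are missing within a set.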

attach(z)
x = (max(acc_sum) + 1) # this is the number of sets of variable results

The current columns read: date | variable.1 | variable.2 | failed | acc_sum | group_index

library(ggplot2)
n = data.frame(acc_sum, group_index)

This initialises the data frame, which should make things faster because group_index and acc_sum aren't read in each time.

for (j in 1:(ncol(z) - 4)) {
  # This iterates through the variables to generate a new set of lists.
  # The -4 removes date, failed, group_index and acc_sum from the count.
  n$variable <- z[, (j + 1)] # reads in the new variable's data; requires the variables to be next to each other
  n[] <- lapply(n, function(x) as.numeric(as.character(x))) # ensures the values are numeric for plotting
  plot <- ggplot(n, aes(x = group_index, y = variable, colour = acc_sum)) +
    theme_bw() +
    geom_line(aes(group = acc_sum)) # linetype = "dotted"
  print(plot) # ensures the graph is presented in every iteration
  cat("press [enter] to continue") # waits for user input before moving on to the next variable
  line <- readline()
}

One of the outputs is shown here. The graph could be improved by having the actual variable name change to match the variable being plotted. This could be done by including a ylab in the for loop.
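A rough sketch of that y-label idea, assuming a long-format data frame with one row per observation. The data frame `toy` and the vector `vars` are hypothetical; the column names follow the ones listed earlier. The two sets have different lengths (three rows and two rows), and no NaN padding is needed because each row carries its own set label.

```r
# Hypothetical toy data: two sets of unequal length, two variables.
toy <- data.frame(
  acc_sum     = c(0, 0, 0, 1, 1),
  group_index = c(1, 2, 3, 1, 2),
  variable.1  = c(0.52, 0.64, 0.70, 0.02, 0.32),
  variable.2  = c(0.32, 0.27, 0.25, 0.87, 0.78)
)

# Every column that is not a grouping or index column is treated as a variable.
vars <- setdiff(names(toy), c("acc_sum", "group_index"))

# Only plot when ggplot2 is installed, so the sketch still runs without it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  for (v in vars) {
    p <- ggplot(toy, aes(x = group_index, y = .data[[v]],
                         colour = acc_sum, group = acc_sum)) +
      theme_bw() +
      geom_line() +
      ylab(v) # label the y-axis with the name of the variable being plotted
    print(p)
  }
}
```

Selecting columns by name with `.data[[v]]` also removes the requirement that the variables sit next to each other in the data frame.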

