Figures (3)  Tables (4)
    • Figure 1. 

      Proposed pipeline. Top: derivation of network property features from the network properties; Bottom left: cross-validation (CV) tuning and permutation-assisted tuning; Bottom right: summary of the feature selection models (Lasso and Group Lasso).

    • Figure 2. 

      Results for the lung cancer dataset. (a) and (b) Networks for two representative patients. Within each network, each node represents a TCR, and two nodes are connected if the distance between them is less than or equal to 1. (c) ROC curves for each of the approaches. (d) Boxplots of the composite scores.

    • Figure 3. 

      Performance evaluation based on simulation. (a) Sensitivity. (b) FDR. (c) F-1. (d) Stability. In each panel, the x-axis shows the simulation scenarios listed in Table 3, and the colors distinguish the approaches.
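
The first three metrics in Figure 3 can be illustrated with a minimal base R sketch that compares a selected feature set against a known causal set. The function name `selection_metrics` and the feature names are hypothetical, and the formulas are the standard definitions of sensitivity, false discovery rate, and F-1, assumed here rather than taken from the paper. Stability is omitted because this section does not define it.

```r
# Hypothetical helper: standard selection metrics for one simulation run.
selection_metrics <- function(selected, causal) {
  tp <- length(intersect(selected, causal))  # causal features that were selected
  fp <- length(setdiff(selected, causal))    # selected features that are not causal
  fn <- length(setdiff(causal, selected))    # causal features that were missed

  sensitivity <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  fdr <- if (tp + fp > 0) fp / (tp + fp) else 0  # no selections: no false discoveries
  f1 <- if (2 * tp + fp + fn > 0) 2 * tp / (2 * tp + fp + fn) else NA_real_

  c(sensitivity = sensitivity, FDR = fdr, F1 = f1)
}

# Toy example with made-up feature names (not the paper's results).
causal <- c("count_max", "diameter_max", "eigen_max", "central_eigen_max")
selected <- c("count_max", "diameter_max", "density_mean")
print(selection_metrics(selected, causal))
```

In this toy case two of the four causal features are recovered and one of the three selected features is a false discovery.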

    • Network properties | Description | Function
      Count at baseline | Summation of the TCR counts of a given membership at baseline. | sum(input_data[input_data$membership == membership_id, Count_baseline])
      Count post treatment | Summation of the TCR counts of a given membership post treatment. | sum(input_data[input_data$membership == membership_id, Count_post_treatment])
      Cluster size | The number of nodes in a given membership. | table(input_data$membership)
      Diameter length | The length of the longest geodesic. | get_diameter(net, directed = F)
      Assortativity | The assortativity coefficient, which is positive if similar vertices (based on some external property) tend to connect to each other, and negative otherwise. | assortativity_degree(net, directed = F)
      Transitivity | The probability that the adjacent vertices of a vertex are connected. | transitivity(net, type = "global")
      Density | The ratio of the number of edges to the number of possible edges. | edge_density(net, loops = F)
      Degree centrality | Graph centralization based on the degrees of vertices. | centr_degree(net, mode = "in", normalized = T)$centralization
      Closeness centrality | Graph centralization based on the closeness of vertices. | centr_clo(net, mode = "all", normalized = T)$centralization
      Eigenvector centrality | Graph centralization based on the eigenvector centralities of positions v within it. | eigen_centrality(net, directed = F, weights = NA)$value
      Central Eigenvector | Graph centralization based on the eigenvector centrality of vertices. | centr_eigen(net, directed = F, normalized = T)$centralization

      Table 1. 

      Parameters used for feature selection methods.
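
To make the density definition in Table 1 and the "distance less than or equal to 1" edge rule from the Figure 2 caption concrete, here is a minimal base R sketch. It assumes the distance is the Levenshtein edit distance between sequences (the section does not name the metric), and the sequences and the helper names `build_adjacency` and `density_from_adjacency` are made up for illustration. It does not use igraph, so it is not the authors' implementation.

```r
# Assumption: edges join sequences whose Levenshtein distance is <= max_dist.
build_adjacency <- function(seqs, max_dist = 1) {
  d <- utils::adist(seqs)            # pairwise generalized Levenshtein distances
  adj <- d <= max_dist               # logical adjacency matrix
  diag(adj) <- FALSE                 # no self-loops
  dimnames(adj) <- list(seqs, seqs)
  adj
}

# Density: number of edges divided by the number of possible edges.
density_from_adjacency <- function(adj) {
  n_edges <- sum(adj[upper.tri(adj)])  # count each undirected edge once
  n_edges / choose(nrow(adj), 2)
}

# Toy sequences (hypothetical, not from the lung cancer dataset).
seqs <- c("CASSL", "CASSV", "CASTV", "CQQQQ")
adj <- build_adjacency(seqs)
print(rowSums(adj))                  # node degrees
print(density_from_adjacency(adj))   # 2 edges out of 6 possible
```

For an undirected graph without loops, this ratio should agree with what `edge_density(net, loops = F)` reports on the equivalent igraph object.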

    • Network properties | Network property features
      Node count | Min, Q1, Median, Mean, Q3, Max
      *Count pre infusion | Min, Q1, Median, Mean, Q3, Max
      Count dose 2 | Min, Q1, Median, Mean, Q3, Max
      *Diameter length | Min, Q1, Median, Mean, Q3, Max
      Assortativity | prob(NA), Min, Q1, Median, Mean, Q3, Max
      Transitivity | prob(NA), Min, Q1, Median, Mean, Q3, Max
      Density | Min, Q1, Median, Mean, Q3, Max
      Degree centrality | Min, Q1, Median, Mean, Q3, Max
      Closeness centrality | prob(NA), Min, Q1, Median, Mean, Q3, Max
      *Eigenvector centrality | Min, Q1, Median, Mean, Q3, Max
      *Central Eigen | prob(NA), Min, Q1, Median, Mean, Q3, Max
      * Properties/property features marked with an asterisk (bold font in the original table) are considered the causal properties/features in the simulation studies.

      Table 2. 

      TCR network properties and derived network property features.
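
The mapping in Table 2 from a network property to its features can be sketched in base R as follows. This is an illustration under two assumptions: that each property is measured once per cluster within a patient's network, and that prob(NA) is the proportion of clusters for which the property is undefined. The function name `summarize_property` and the input values are hypothetical.

```r
# Hypothetical helper: summarize one network property across clusters.
summarize_property <- function(x) {
  # Quartiles with R's default quantile algorithm (type 7), ignoring NA values.
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  c(
    prob_NA = mean(is.na(x)),        # assumed meaning of prob(NA)
    Min     = min(x, na.rm = TRUE),
    Q1      = q[1],
    Median  = q[2],
    Mean    = mean(x, na.rm = TRUE),
    Q3      = q[3],
    Max     = max(x, na.rm = TRUE)
  )
}

# Made-up per-cluster transitivity values; NA marks an undefined value.
transitivity_by_cluster <- c(0.2, 0.4, NA, 0.6, 0.8)
print(summarize_property(transitivity_by_cluster))
```

Properties that are always defined would simply drop the `prob_NA` entry, which matches the rows of Table 2 that list only six features.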

    • Scenario # | *n | **P: Causal | **P: Non-causal | **P: Additional | ***Balanced data | ****Linear/non-linear
      n200_p270_baln_lin | 200 | 25 | 45 | 200 | Balanced | Linear
      n200_p270_baln_nonlin | 200 | 25 | 45 | 200 | Balanced | Non-linear
      n200_p70_baln_lin | 200 | 25 | 45 | NA | Balanced | Linear
      n200_p70_baln_nonlin | 200 | 25 | 45 | NA | Balanced | Non-linear
      n300_p270_baln_lin | 300 | 25 | 45 | 200 | Balanced | Linear
      n300_p270_baln_nonlin | 300 | 25 | 45 | 200 | Balanced | Non-linear
      n300_p270_unbaln_lin | 300 | 25 | 45 | 200 | Imbalanced | Linear
      n300_p270_unbaln_nonlin | 300 | 25 | 45 | 200 | Imbalanced | Non-linear
      n300_p70_baln_lin | 300 | 25 | 45 | NA | Balanced | Linear
      n300_p70_baln_nonlin | 300 | 25 | 45 | NA | Balanced | Non-linear
      n300_p70_unbaln_lin | 300 | 25 | 45 | NA | Imbalanced | Linear
      n300_p70_unbaln_nonlin | 300 | 25 | 45 | NA | Imbalanced | Non-linear
      * n: number of simulated patients.
      ** P: number of simulated signals (causal TCR network property features, non-causal TCR network property features, and additional correlated multivariate noise signals).
      *** Balanced data: the simulated dataset has a balanced proportion of shorter and longer survival groups.
      **** Linear: the simulated response variable Y consists of a linear combination of the causal variables. Non-linear: the simulated response variable Y consists of linear and interaction terms generated from the causal variables.

      Table 3. 

      Simulation scenarios.
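
The linear versus non-linear and balanced versus imbalanced distinctions in Table 3 can be illustrated with a rough base R sketch. Everything numeric here is an assumption made for illustration: the standard normal causal features, the effect sizes in `beta`, the single interaction term, and the median and 70% cut points are not taken from the paper.

```r
set.seed(1)
n <- 200         # simulated patients, as in the n200 scenarios
n_causal <- 25   # causal features, as in Table 3

# Assumed: independent standard normal causal features.
X <- matrix(rnorm(n * n_causal), nrow = n)
beta <- rep(0.5, n_causal)                  # assumed effect sizes

# Linear scenario: Y is a linear combination of the causal variables.
y_lin <- drop(X %*% beta)

# Non-linear scenario: linear terms plus one assumed interaction term.
y_nonlin <- y_lin + 0.5 * X[, 1] * X[, 2]

# Balanced groups: split at the median (assumed rule).
group_balanced <- as.integer(y_lin > median(y_lin))

# Imbalanced groups: split at the 70th percentile (assumed rule).
group_imbalanced <- as.integer(y_lin > quantile(y_lin, 0.7))

print(table(group_balanced))
print(table(group_imbalanced))
```

The non-causal and additional noise signals from Table 3 would be appended as extra columns that do not enter Y; they are left out here to keep the sketch short.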

    • Network properties CV-Lasso* P-Lasso CV-Group Lasso** P-Group Lasso**
      Count pre infusion Max Max All All
      Count dose 2 All All
      Node count All
      Diameter length Max Max
      Assortativity
      Transitivity
      Density
      Degree centrality
      Closeness centrality
      Eigenvector centrality Max Max All All
      Central Eigen Max
      * The network property features extracted using the CV-Lasso model are used as the causal variables for the simulation study.
      ** The value 'All' represents the entire set of descriptive summary statistics derived from the TCR network property.

      Table 4. 

      Results from real data analysis. The table lists the network properties and corresponding network property features selected by each approach.