Figures (7)  Tables (2)
    • Figure 1. 

      Huber loss function (a) and Berhu penalty function (b); the 2D contours of the Huber loss function (c) and the Berhu penalty function (d).

    • Figure 2. 

      Estimation picture for the Huber-Berhu regression (a), with least absolute shrinkage and selection operator (LASSO) (b) and ridge (c) regressions shown for comparison.

    • Figure 3. 

      Comparison of running time for Algorithm 1 and CVX. $p$ is the number of independent variables in the TF matrix ($X$).

    • Figure 4. 

      The implementation of Huber-Berhu-Partial Least Squares (HB-PLS) to identify candidate regulatory genes controlling the lignin biosynthesis pathway. (a) HB-PLS; (b) SPLS. Green nodes (inside the circles) represent lignin biosynthesis genes. Coral nodes represent positive lignin pathway regulators supported by existing literature, and light purple nodes represent other predicted transcription factors that are not supported by currently available literature. (c) The lignin biosynthesis pathway.

    • Figure 5. 

      The implementation of Huber-Berhu-Partial Least Squares (HB-PLS) to identify candidate regulatory genes (purple and coral nodes) controlling photosynthesis and related pathway genes. HB-PLS (a) was compared with the sparse partial least squares (SPLS) method (b) in identifying regulators that affect maize photosynthesis light reaction and Calvin cycle pathway genes. The green and yellow nodes within the circles represent photosynthesis light reaction pathway genes and Calvin cycle pathway genes, respectively. Coral nodes represent positive predicted biological process or pathway regulators that are supported by existing literature, and light purple nodes represent other predicted TFs that currently lack experimentally validated supporting evidence.

    • Figure 6. 

      The receiver operating characteristic (ROC) curves of Huber-Berhu-partial least squares (HB-PLS) and sparse partial least squares (SPLS) methods for identifying pathway regulators in Arabidopsis thaliana. (a) Lignin biosynthesis pathway; (b) a merged pathway of light reaction pathway and Calvin cycle pathway.

    • Figure 7. 

      An integrative framework for identifying biological process and pathway regulators from high-throughput gene expression data by integration of statistics, machine learning and convex optimization. PLS: Partial least squares.

    • Algorithm 1: Accelerated proximal gradient descent method to minimize $f(\boldsymbol{\beta})$ in equation (7) with respect to $\beta_0$ and $\boldsymbol{\beta}$
      Input: predictor matrix ($X$), dependent vector ($y$), and penalty constant ($\lambda$)
      Output: regression coefficient ($\boldsymbol{\beta}$)
      1   Initialize $\boldsymbol{\beta}=\mathbf{0}$, $t=1$, $\boldsymbol{\beta}_{\text{prev}}=\mathbf{0}$
      2   For $k$ in 1, …, MAX_ITER
      3       $\boldsymbol{v}=\boldsymbol{\beta}+\left(k/(k+3)\right)\left(\boldsymbol{\beta}-\boldsymbol{\beta}_{\text{prev}}\right)$
      4       compute the gradient of the Huber loss at $\boldsymbol{v}$ using (5), denoted as $\boldsymbol{G}_{\boldsymbol{v}}$
      5       while TRUE
      6           compute $\boldsymbol{p}_1=\operatorname{Prox}_{t,\lambda\left|\cdot\right|}\left(\boldsymbol{v}\right)$ using (10)
      7           compute $\boldsymbol{p}_2=\operatorname{Prox}_{t,\lambda u}\left(\boldsymbol{p}_1\right)$ using (9)
      8           if $\sum_{i=1}^{n}H_M\left(y_i-\beta_0-\boldsymbol{x}_i^{T}\boldsymbol{p}_2\right)\le\sum_{i=1}^{n}H_M\left(y_i-\beta_0-\boldsymbol{x}_i^{T}\boldsymbol{v}\right)+\boldsymbol{G}_{\boldsymbol{v}}'\left(\boldsymbol{p}_2-\boldsymbol{v}\right)+\frac{1}{2t}\left\|\boldsymbol{p}_2-\boldsymbol{v}\right\|_2^2$
      9               break
      10          else $t=0.5\,t$
      11      $\boldsymbol{\beta}_{\text{prev}}=\boldsymbol{\beta}$, $\boldsymbol{\beta}=\boldsymbol{p}_2$
      12      if converged
      13          break
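The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: equations (5), (9) and (10) are not reproduced in this section, so the sketch assumes a standard Huber loss with threshold $M$, a decomposition of the Berhu penalty into $|z|$ plus a quadratic tail $u(z)=\max(|z|-M,0)^2/(2M)$, and a closed-form proximal map for each part. It also assumes centered data (no intercept $\beta_0$) and takes the gradient step $v-tG_v$ inside the first proximal map, as in standard proximal gradient methods. The names `hb_regression`, `prox_l1` and `prox_berhu_tail`, the default `M=1.345`, the stopping rule and the step-size floor are hypothetical choices.

```python
import numpy as np

def huber(r, M=1.345):
    """Elementwise Huber loss (assumed form): quadratic for |r| <= M, linear beyond."""
    a = np.abs(r)
    return np.where(a <= M, 0.5 * r**2, M * a - 0.5 * M**2)

def huber_grad(X, y, beta, M=1.345):
    """Gradient of sum_i huber(y_i - x_i' beta) with respect to beta."""
    r = y - X @ beta
    return -X.T @ np.clip(r, -M, M)

def prox_l1(z, tau):
    """Soft-thresholding: proximal map of tau * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_berhu_tail(z, tau, M=1.345):
    """Proximal map of tau * u, with u(z) = max(|z| - M, 0)^2 / (2M) (assumed form)."""
    a = np.abs(z)
    shrunk = (a + tau) / (1.0 + tau / M)
    return np.where(a > M, np.sign(z) * shrunk, z)

def hb_regression(X, y, lam, M=1.345, max_iter=500, tol=1e-8):
    """Accelerated proximal gradient with backtracking, following Algorithm 1."""
    p = X.shape[1]
    beta = np.zeros(p)
    beta_prev = np.zeros(p)
    t = 1.0
    for k in range(1, max_iter + 1):
        # Step 3: momentum (extrapolation) point.
        v = beta + (k / (k + 3)) * (beta - beta_prev)
        # Step 4: gradient and value of the Huber loss at v.
        g = huber_grad(X, y, v, M)
        f_v = huber(y - X @ v, M).sum()
        while True:
            # Steps 6-7: the two proximal maps applied in sequence.
            p1 = prox_l1(v - t * g, t * lam)
            p2 = prox_berhu_tail(p1, t * lam, M)
            d = p2 - v
            # Step 8: sufficient-decrease test; the floor on t is a safety guard.
            if huber(y - X @ p2, M).sum() <= f_v + g @ d + d @ d / (2 * t) or t < 1e-12:
                break
            t *= 0.5  # Step 10: halve the step size.
        # Step 11: shift the iterates.
        beta_prev, beta = beta, p2
        # Steps 12-13: stop when the iterate no longer moves (assumed criterion).
        if np.linalg.norm(beta - beta_prev) <= tol * max(1.0, np.linalg.norm(beta)):
            break
    return beta
```

Because the loss here is a sum over samples rather than a mean, a useful `lam` scales with the number of rows of `X`.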
    • Algorithm 2: Finding the solution of the Huber-Berhu PLS regression
      Input: TF matrix ($ X $), pathway matrix ($ Y $), penalty constant (${\boldsymbol{\lambda}}$), and number of components ($ K $)
      Output: regression coefficient matrix ($ A $)
      1   $\boldsymbol{X}_0=\boldsymbol{X}$, $\boldsymbol{Y}_0=\boldsymbol{Y}$, $\boldsymbol{cF}=\boldsymbol{I}$, $\boldsymbol{A}=\mathbf{0}$
      2   For $k$ in 1, …, $K$
      3       set $\boldsymbol{M}_{k-1}=\boldsymbol{X}_{k-1}'\boldsymbol{Y}_{k-1}$
      4       initialize $\boldsymbol{u}$ to be the first left singular vector of $\boldsymbol{M}_{k-1}$ and $\boldsymbol{v}$ to be the product of the first right singular vector and the first singular value
      5       repeat until convergence of $\boldsymbol{u}$ and $\boldsymbol{v}$
      6           update $\boldsymbol{u}$ using (16)
      7           update $\boldsymbol{v}$ using (17)
      8       extract component $\xi=\boldsymbol{X}_{k-1}\boldsymbol{u}$
      9       compute regression coefficients in (8): $\boldsymbol{c}=\boldsymbol{X}_{k-1}'\xi/(\xi'\xi)$, $\boldsymbol{d}=\boldsymbol{Y}_{k-1}'\xi/(\xi'\xi)$
      10      update $\boldsymbol{A}=\boldsymbol{A}+\boldsymbol{cF}\cdot\boldsymbol{u}\cdot\boldsymbol{d}'$
      11      update $\boldsymbol{cF}=\boldsymbol{cF}\cdot(\boldsymbol{I}-\boldsymbol{u}\cdot\boldsymbol{c}')$
      12      compute residuals for $X$ and $Y$: $\boldsymbol{X}_k=\boldsymbol{X}_{k-1}-\xi\boldsymbol{c}'$, $\boldsymbol{Y}_k=\boldsymbol{Y}_{k-1}-\xi\boldsymbol{d}'$
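The deflation bookkeeping in Algorithm 2 (steps 1 to 3 and 8 to 12) can be checked with a short Python sketch. Equations (16) and (17) are not reproduced in this section, so the inner Huber-Berhu updates of $u$ and $v$ are replaced here by a plain SVD stand-in; the result is ordinary, non-sparse PLS rather than HB-PLS. The function names `svd_direction` and `pls_deflation`, and the `direction` hook where a Huber-Berhu update could be plugged in, are hypothetical.

```python
import numpy as np

def svd_direction(M):
    """Stand-in for steps 4-7: first left singular vector of M (no sparsity)."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, 0]

def pls_deflation(X, Y, K, direction=svd_direction):
    """Outer loop of Algorithm 2 with a pluggable direction step.

    Returns the coefficient matrix A (p x q) such that X @ A approximates Y.
    """
    X_k = np.array(X, dtype=float)  # working copies, deflated in place
    Y_k = np.array(Y, dtype=float)
    p, q = X_k.shape[1], Y_k.shape[1]
    cF = np.eye(p)                  # maps directions back to original X coordinates
    A = np.zeros((p, q))
    for _ in range(K):
        M = X_k.T @ Y_k             # step 3: cross-product matrix
        u = direction(M)            # steps 4-7 (stand-in)
        xi = X_k @ u                # step 8: latent component
        denom = xi @ xi
        if denom == 0.0:            # nothing left to extract
            break
        c = X_k.T @ xi / denom      # step 9: loadings for X
        d = Y_k.T @ xi / denom      #         and for Y
        A += cF @ np.outer(u, d)    # step 10
        cF = cF @ (np.eye(p) - np.outer(u, c))  # step 11
        X_k -= np.outer(xi, c)      # step 12: deflate X
        Y_k -= np.outer(xi, d)      #          and Y
    return A
```

With $K=p$ components and a full-column-rank $X$, the fitted values $XA$ from this stand-in coincide with the ordinary least squares fit, which is a convenient sanity check on the $cF$ and $A$ updates.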