Figures (8)  Tables (4)
    • Figure 1. 

      Architecture of the SPTIoT network.

    • Figure 2. 

      The framework of the MADDPG algorithm. The blue arrows represent the interaction among the agents and the environment. The black arrows are used to update the parameters of the networks.

    • Figure 3. 

      The deployment results of LSCs for a given instance obtained by the MLCCA algorithm: (a) m = 1, (b) m = 2, (c) m = 3, (d) m = 4.

    • Figure 4. 

      The convergence performance of the MADDPG approach. (a) The convergence evaluation under different learning rates. (b) Comparison between MADDPG and DDPG.

    • Figure 5. 

      Performance comparison by changing the number of IoTDs n.

    • Figure 6. 

      Performance comparison by changing the charging radius Rc.

    • Figure 7. 

      Performance comparison by varying the maximum battery capacity of IoTDs EM.

    • Figure 8. 

      Performance comparison between SPTIoT and the general IoT network by varying (a) the number of IoTDs n, (b) the charging radius Rc, (c) the source power of LSCs Pc, and (d) the working period Tw.

    • Input: The number of LSCs m, the set C of m LSCs, the set S of n IoTDs, the time slot length $ l $, and the set $ \mathcal{T} $ of $ \Gamma $ time slots;
      Output: Tc;
      1: Initialize Tc = 0;
      2: For each $ {s}_{i}\in S $, compute $ {\mathcal{N}}_{i}^{c}=\{{c}_{j}|{d}_{i,j}^{c}\le {R}_{c}\} $;
      3: For each $ {s}_{i}\in S $, compute $ l{y}_{i} $ using Dijkstra's algorithm;
      4: for $ \tau $ from $ 1 $ to $ \mathrm{\Gamma } $ do
      5: For each $ {s}_{i}\in S $, let $ {E}_{i}^{r}(\tau )={E}_{i}^{r}(\tau -1)-{\delta }_{i}l $.
      6: Sort all IoTDs and obtain $ {S}_{o}=\{{s}_{{\rho }_{1}},\cdots ,{s}_{{\rho }_{n}}\} $;
      7: for i from 1 to n do
      8: ${\rm{Let}}\; {C}_{{\rho }_{i}}=\{{c}_{j}|{c}_{j}\in {\mathcal{N}}_{{\rho }_{i}}^{c}\wedge {\mathrm{\Sigma }}_{q=1}^{n}{b}_{q}^{j}(\tau )=0\} ;$
      9: $ {{\rm{Let}}\, {S}_{{\rho }_{i}}^{u} = \{{s}_{k}|{s}_{k}\in {\mathcal{N}}_{{\rho }_{i}}^{s} \wedge l{y}_{k} = l{y}_{{\rho }_{i}} - 1\wedge {\mathrm{\Sigma }}_{q=1}^{n}{a}_{q}^{k}(\tau ) = 0 \wedge {E}_{k}^{r}(\tau ) - {P}_{s}l > {E}_{{\rho }_{i}}^{r}(\tau)\} ;}$
      10: ${\rm{Let}}\; {S}_{{\rho }_{i}}^{e}=\{{s}_{k}|{s}_{k}\in {\mathcal{N}}_{{\rho }_{i}}^{s}\wedge l{y}_{k}=l{y}_{{\rho }_{i}}\wedge {\mathrm{\Sigma }}_{q=1}^{n}{a}_{q}^{k}(\tau )=0\wedge {E}_{k}^{r}(\tau )-{P}_{s}l > {E}_{{\rho }_{i}}^{r}(\tau )\} ;$
      11: if $ {C}_{{\rho }_{i}}\ne \mathrm{\varnothing } $ then
      12: $ {c}_{v}=\arg\min \{{d}_{{\rho }_{i},v}^{c}|{c}_{v}\in {C}_{{\rho }_{i}}\} $;
      13: $ {b}_{{\rho }_{i}}^{v}(\tau )=1 $, $ {E}_{{\rho }_{i}}^{r}(\tau )=min{\{}{E}_{M},{E}_{{\rho }_{i}}^{r}(\tau )+{P}_{c}^{v,{\rho }_{i}}l{\}} $;
      14: else if $ {S}_{{\rho }_{i}}^{u}\ne \mathrm{\varnothing } $ then
      15: $ {s}_{v}=\arg\max \{{E}_{v}^{r}(\tau )|{s}_{v}\in {S}_{{\rho }_{i}}^{u}\} $;
      16: $ {E}_{{\rho }_{i}}^{r}(\tau )={E}_{{\rho }_{i}}^{r}(\tau )+{P}_{s}^{v,{\rho }_{i}}l $;
      17: $ {a}_{{\rho }_{i}}^{v}(\tau )=1,{E}_{v}^{r}(\tau )={E}_{v}^{r}(\tau )-{P}_{s}l $;
      18: else if $ {S}_{{\rho }_{i}}^{e}\ne \mathrm{\varnothing } $ then
      19: $ {s}_{v}=\arg\max \{{E}_{v}^{r}(\tau )|{s}_{v}\in {S}_{{\rho }_{i}}^{e}\} $;
      20: $ {E}_{{\rho }_{i}}^{r}(\tau )={E}_{{\rho }_{i}}^{r}(\tau )+{P}_{s}^{v,{\rho }_{i}}l $;
      21: $ {a}_{{\rho }_{i}}^{v}(\tau )=1,{E}_{v}^{r}(\tau )={E}_{v}^{r}(\tau )-{P}_{s}l $;
      22: end if
      23: Let $ {S}_{o}={S}_{o}{\setminus }\{{s}_{{\rho }_{i}}\} $ and re-sort $ {S}_{o}=\{{s}_{{\rho }_{i+1}},{s}_{{\rho }_{i+2}},\cdots ,{s}_{{\rho }_{n}}\} $;
      24: end for
      25: if $ min{\{}{E}_{i}^{r}(\tau )|1\le i\le n{\}} < {E}_{T} $ then
      26: break;
      27: end if
      28: $ {T}_{c}=l\tau $;
      29: end for
      30: return Tc;

      Table 1. 

      Layered charging scheduling strategy.
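
Algorithm 1 can be condensed into a short simulation: in each slot every IoTD first consumes energy, then (in increasing order of residual energy) tries the nearest idle LSC within the charging radius, and otherwise borrows from a richer neighbour, preferring one in the next-lower layer. The sketch below is a simplified Python rendering of this strategy; the data layout (`iotds` dict, precomputed `layers`) and the loss-free device-to-device transfer are assumptions made for brevity, not the paper's exact model.

```python
import math

def layered_charging_lifetime(iotds, chargers, R_c, R_s, layers,
                              l=1.0, P_c=5.0, P_s=2.0,
                              E_M=100.0, E_T=5.0, Gamma=1000):
    """Sketch of the layered charging scheduling strategy (Table 1).

    iotds:    dict id -> {'pos': (x, y), 'E': residual energy, 'delta': rate}
    chargers: list of (x, y) LSC positions
    layers:   dict id -> hop count to the sink (precomputed, e.g. Dijkstra)
    Returns the achieved charging lifetime T_c.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    T_c = 0.0
    for tau in range(1, Gamma + 1):
        # Step 5: every IoTD consumes energy during this slot.
        for s in iotds.values():
            s['E'] -= s['delta'] * l
        busy, donors, receivers = set(), set(), set()
        # Step 6: schedule IoTDs in increasing order of residual energy.
        for sid in sorted(iotds, key=lambda i: iotds[i]['E']):
            s = iotds[sid]
            # Steps 11-13: nearest idle LSC within the charging radius.
            cands = [(dist(s['pos'], chargers[j]), j)
                     for j in range(len(chargers))
                     if j not in busy and dist(s['pos'], chargers[j]) <= R_c]
            if cands:
                _, j = min(cands)
                busy.add(j)
                s['E'] = min(E_M, s['E'] + P_c * l)
                continue
            # Steps 14-21: otherwise borrow from an unused neighbour with
            # more energy, upper layer first, then the same layer.
            for want in (layers[sid] - 1, layers[sid]):
                nbrs = [k for k in iotds
                        if k != sid and k not in donors and k not in receivers
                        and dist(s['pos'], iotds[k]['pos']) <= R_s
                        and layers[k] == want
                        and iotds[k]['E'] - P_s * l > s['E']]
                if nbrs:
                    k = max(nbrs, key=lambda q: iotds[q]['E'])
                    iotds[k]['E'] -= P_s * l
                    s['E'] += P_s * l  # transfer loss ignored in this sketch
                    donors.add(k); receivers.add(sid)
                    break
        # Steps 25-28: stop once any IoTD drops below the threshold E_T.
        if min(s['E'] for s in iotds.values()) < E_T:
            break
        T_c = l * tau
    return T_c
```

For example, a single IoTD sitting next to an LSC survives the whole horizon, while an isolated, low-energy IoTD ends the simulation early.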

    • Input: The number of episodes Ne, time steps T, a small constant $ \epsilon $, the number of LSCs m, the set S of IoTDs, the number of grids L × L;
      Output: Positions of m LSCs, Tc;
      1: for each $ {c}_{j}\in C $ do
      2: Initialize parameters of actor and target network $ {\theta }_{j}^{\mu } $ and $ {\theta }_{j}^{{\mu }'} $;
      3: Initialize parameters of critic and target network $ {\theta }_{j}^{Q} $ and $ {\theta }_{j}^{{Q}'} $;
      4: end for
      5: Initially place the m agents at the center of the middle grid cell;
      6: Obtain $ t{f}^{0} $ by calling Algorithm 1;
      7: Clear out the replay buffer $ D $;
      8: for episode from 1 to Ne do
      9: Reset the positions of agents;
      10: Initialize the exploration noise $ \mathcal{N} $;
      11: for time step t from 1 to T do
      12: for agent j from 1 to m do
      13: Get the observation $ {o}_{j}^{t} $;
      14: Choose action $ {a}_{j}^{t}={\mu }_{j}({o}_{j}^{t};{\theta }_{j}^{\mu })+\mathcal{N} $;
      15: end for
      16: Execute joint actions $ {\mathit{a}}^{\mathit{t}}= \{{a}_{1}^{t},... ,{a}_{m}^{t}\} $ of all agents;
      17: Execute Algorithm 1 to obtain $ t{f}^{t} $;
      18: Obtain reward $ {\mathit{r}}^{\mathit{t}} $ and next observations $ {\mathit{o}}^{\mathit{t}+1} $;
      19: Store $ ({\mathbf{o}}^{\mathbf{t}},{\mathbf{a}}^{\mathbf{t}},{\mathbf{r}}^{\mathbf{t}},{\mathbf{o}}^{\mathbf{t}+1}) $ in replay buffer $ D $;
      20: Select random transitions $ \mathcal{K} $ from $ D $;
      21: for agent j from 1 to m do
      22: Update $ {\theta }_{j}^{\mu } $ and $ {\theta }_{j}^{Q} $ by equations (11) and (12);
      23: Update $ {\theta }_{j}^{{\mu '}} $ and $ {\theta }_{j}^{{Q'}} $ by equations (13) and (14);
      24: end for
      25: end for
      26: end for
      27: Reset the positions of agents;
      28: for time step t from 1 to T do
      29: for agent j from 1 to m do
      30: Get the observation $ {o_j^t} $;
      31: Obtain the action $ {a}_j^t={\mu }_{j}({o}_{j}^{t};{\theta }_{j}^{\mu }) $;
      32: Execute the action $ {a}_{j}^{t} $;
      33: end for
      34: end for
      35: Execute Algorithm 1 to obtain Tc;
      36: return positions of m agents and Tc;

      Table 2. 

      Deploy chargers based on MADDPG.
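
The control flow of Table 2 (per-agent actor/target initialization, noisy exploration, replay sampling, soft target updates, then a noise-free rollout) can be sketched as a minimal skeleton. To stay self-contained, the example replaces the neural networks with toy linear "actors" and leaves the gradient of equations (11)-(14) as a placeholder, and `evaluate_tc` stands in for calling Algorithm 1 as the reward signal; none of these simplifications are part of the paper's method.

```python
import random

class Agent:
    """Toy stand-in for one MADDPG agent: a linear actor over a 2-D
    observation plus Polyak-averaged target parameters."""
    def __init__(self, lr=0.01, tau=0.005):
        self.theta = [random.uniform(-0.1, 0.1) for _ in range(2)]
        self.theta_target = list(self.theta)
        self.lr, self.tau = lr, tau

    def act(self, obs, noise=0.0):
        # mu_j(o; theta) + exploration noise, clipped to a unit step
        a = sum(w * o for w, o in zip(self.theta, obs)) + noise
        return max(-1.0, min(1.0, a))

    def update(self, grad):
        # placeholder for the actor/critic step (equations (11)-(12))
        self.theta = [w - self.lr * g for w, g in zip(self.theta, grad)]
        # soft target update (equations (13)-(14))
        self.theta_target = [self.tau * w + (1 - self.tau) * t
                             for w, t in zip(self.theta, self.theta_target)]

def deploy_chargers(m, evaluate_tc, L=20, episodes=50, steps=10):
    """Skeleton of Table 2: move m charger-agents on an L x L grid,
    using evaluate_tc(positions) in place of Algorithm 1's lifetime."""
    agents = [Agent() for _ in range(m)]
    replay = []
    center = (L - 1) / 2.0
    for _ in range(episodes):
        pos = [[center, center] for _ in range(m)]        # reset agents
        for _ in range(steps):
            obs = [tuple(p) for p in pos]
            acts = [ag.act(o, noise=random.gauss(0, 0.1))
                    for ag, o in zip(agents, obs)]
            for p, a in zip(pos, acts):                   # 1-D move for brevity
                p[0] = min(L - 1, max(0.0, p[0] + a))
            r = evaluate_tc([tuple(p) for p in pos])
            replay.append((obs, acts, r))
            batch = random.sample(replay, min(32, len(replay)))
            for ag in agents:
                ag.update([0.0, 0.0])  # real gradients come from the critic
    # greedy rollout with the trained, noise-free policies
    pos = [[center, center] for _ in range(m)]
    for _ in range(steps):
        for p, ag in zip(pos, agents):
            p[0] = min(L - 1, max(0.0, p[0] + ag.act(tuple(p))))
    final = [tuple(p) for p in pos]
    return final, evaluate_tc(final)
```

The key structural point the skeleton preserves is that exploration noise is only added during training; the final deployment (steps 27-34) replays the learned deterministic policies.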

    • Input: The set of IoTDs S;
      Output: m;
      1: Divide A evenly into L × L grids;
      2: for m from 1 to n do
      3: Execute Algorithm 2 to deploy m LSCs and obtain Tc;
      4: if Tc == Tw then
      5: break;
      6: end if
      7: end for
      8: return m;

      Table 3. 

      MLCCA.
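
MLCCA itself reduces to a linear search over the number of chargers: deploy m LSCs with Algorithm 2, check whether the achieved lifetime reaches the working period Tw (it can never exceed it), and stop at the first m that does. A minimal sketch, where `deploy(m)` is a hypothetical stand-in for Algorithm 2:

```python
def mlcca(n, deploy, T_w):
    """Sketch of MLCCA (Table 3): find the smallest number of LSCs m
    whose deployment keeps the network alive for the whole period T_w.
    `deploy(m)` stands in for Algorithm 2 and returns the lifetime T_c."""
    for m in range(1, n + 1):
        # T_c <= T_w by construction, so equality means full coverage.
        if deploy(m) == T_w:
            return m
    return n  # fall back to one charger per IoTD
```

Since the achievable lifetime is non-decreasing in m, a binary search over m would also work and would cut the number of (expensive) Algorithm 2 calls from O(n) to O(log n).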

    • Parameter | Value | Parameter | Value
      A | [200 m, 200 m]² | Ne | 6,000
      L | 20 | T | 20
      ET | 200 J | $ \left|\mathcal{D}\right| $ | 25,600
      $ l $ | 1 s | $ \left|\mathcal{K}\right| $ | 256
      $ {\delta }_{s} $ | $ 10^{-5} $ | $ \epsilon $ | 0.005
      $ {\eta }_{el} $ | 0.3 | $ \gamma $ | 0.95
      $ \omega {A}_{c}\chi $ | 0.004 m² | $ \mathcal{N} $ | 0.1
      $ \alpha $ | $ 10^{-6} $ m | $ {P}_{b} $ | 50
      $ D $ | 0.1 m | Layer type | Fully connected
      $ {\mathrm{\Delta }}_{\theta } $ | $ 3.4\times 10^{5} $ | Optimizer | Adam
      $ {\delta }_{i} $ | [5 W, 20 W] | |

      Table 4. 

      Experimental parameters.