Figures (8)  Tables (4)
    • Figure 1. 

      Architecture of the SPTIoT network.

    • Figure 2. 

      The framework of the MADDPG algorithm. The blue arrows represent the interaction among the agents and the environment. The black arrows are used to update the parameters of the networks.

    • Figure 3. 

      The deployment results of LSCs for a given instance obtained by the MLCCA algorithm: (a) m = 1, (b) m = 2, (c) m = 3, (d) m = 4.

    • Figure 4. 

      The convergence performance of the MADDPG approach. (a) The convergence evaluation under different learning rates. (b) Comparison between MADDPG and DDPG.

    • Figure 5. 

      Performance comparison by changing the number of IoTDs n.

    • Figure 6. 

      Performance comparison by changing the charging radius Rc.

    • Figure 7. 

      Performance comparison by varying the maximum battery capacity of IoTDs EM.

    • Figure 8. 

      Performance comparison between SPTIoT and the general IoT network by varying (a) the number of IoTDs n, (b) the charging radius Rc, (c) the source power of LSCs Pc, and (d) the working period Tw.

    • Input: The number of LSCs m, the set C of m LSCs, the set S of n IoTDs, the time slot length $ l $, and the set $ \mathcal{T} $ of $ \Gamma $ time slots;
      Output: Tc;
      1: Initialize Tc = 0;
      2: For each $ {s}_{i}\in S $, compute $ {\mathcal{N}}_{i}^{c}=\{{c}_{j}|{d}_{i,j}^{c}\le {R}_{c}\} $;
      3: For each $ {s}_{i}\in S $, compute $ l{y}_{i} $ using Dijkstra's algorithm;
      4: for $ \tau $ from $ 1 $ to $ \mathrm{\Gamma } $ do
      5: For each $ {s}_{i}\in S $, let $ {E}_{i}^{r}(\tau )={E}_{i}^{r}(\tau -1)-{\delta }_{i}l $.
      6: Sort all IoTDs and obtain $ {S}_{o}=\{{s}_{{\rho }_{1}},\cdots ,{s}_{{\rho }_{n}}\} $;
      7: for i from 1 to n do
      8: ${\rm{Let}}\; {C}_{{\rho }_{i}}=\{{c}_{j}|{c}_{j}\in {\mathcal{N}}_{{\rho }_{i}}^{c}\wedge {\mathrm{\Sigma }}_{q=1}^{n}{b}_{q}^{j}(\tau )=0\} ;$
      9: $ {{\rm{Let}}\, {S}_{{\rho }_{i}}^{u} = \{{s}_{k}|{s}_{k}\in {\mathcal{N}}_{{\rho }_{i}}^{s} \wedge l{y}_{k} = l{y}_{{\rho }_{i}} - 1\wedge {\mathrm{\Sigma }}_{q=1}^{n}{a}_{q}^{k}(\tau ) = 0 \wedge {E}_{k}^{r}(\tau ) - {P}_{s}l > {E}_{{\rho }_{i}}^{r}(\tau)\} ;}$
      10: ${\rm{Let}}\; {S}_{{\rho }_{i}}^{e}=\{{s}_{k}|{s}_{k}\in {\mathcal{N}}_{{\rho }_{i}}^{s}\wedge l{y}_{k}=l{y}_{{\rho }_{i}}\wedge {\mathrm{\Sigma }}_{q=1}^{n}{a}_{q}^{k}(\tau )=0\wedge {E}_{k}^{r}(\tau )-{P}_{s}l > {E}_{{\rho }_{i}}^{r}(\tau )\} ;$
      11: if $ {C}_{{\rho }_{i}}\ne \mathrm{\varnothing } $ then
      12: $ {c}_{v}=\arg\min \{{d}_{{\rho }_{i},v}^{c}|{c}_{v}\in {C}_{{\rho }_{i}}\} $;
      13: $ {b}_{{\rho }_{i}}^{v}(\tau )=1 $, $ {E}_{{\rho }_{i}}^{r}(\tau )=min{\{}{E}_{M},{E}_{{\rho }_{i}}^{r}(\tau )+{P}_{c}^{v,{\rho }_{i}}l{\}} $;
      14: else if $ {S}_{{\rho }_{i}}^{u}\ne \mathrm{\varnothing } $ then
      15: $ {s}_{v}=\arg\max \{{E}_{v}^{r}(\tau )|{s}_{v}\in {S}_{{\rho }_{i}}^{u}\} $;
      16: $ {E}_{{\rho }_{i}}^{r}(\tau )={E}_{{\rho }_{i}}^{r}(\tau )+{P}_{s}^{v,{\rho }_{i}}l $;
      17: $ {a}_{{\rho }_{i}}^{v}(\tau )=1,{E}_{v}^{r}(\tau )={E}_{v}^{r}(\tau )-{P}_{s}l $;
      18: else if $ {S}_{{\rho }_{i}}^{e}\ne \mathrm{\varnothing } $ then
      19: $ {s}_{v}=\arg\max \{{E}_{v}^{r}(\tau )|{s}_{v}\in {S}_{{\rho }_{i}}^{e}\} $;
      20: $ {E}_{{\rho }_{i}}^{r}(\tau )={E}_{{\rho }_{i}}^{r}(\tau )+{P}_{s}^{v,{\rho }_{i}}l $;
      21: $ {a}_{{\rho }_{i}}^{v}(\tau )=1,{E}_{v}^{r}(\tau )={E}_{v}^{r}(\tau )-{P}_{s}l $;
      22: end if
      23: Let $ {S}_{o}={S}_{o}{\setminus }\{{s}_{{\rho }_{i}}\} $ and re-sort $ {S}_{o}=\{{s}_{{\rho }_{i+1}},{s}_{{\rho }_{i+2}},\cdots ,{s}_{{\rho }_{n}}\} $;
      24: end for
      25: if $ min{\{}{E}_{i}^{r}(\tau )|1\le i\le n{\}} < {E}_{T} $ then
      26: break;
      27: end if
      28: $ {T}_{c}=l\tau $;
      29: end for
      30: return Tc;

      Table 1. 

      Layered charging scheduling strategy.
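
Algorithm 1 can be condensed into a short simulation: in each slot every IoTD first consumes energy, then (in increasing order of residual energy) tries the nearest idle LSC within the charging radius, and otherwise borrows from a richer neighbour, preferring one in the next-lower layer. The sketch below is a simplified Python rendering of this strategy; the data layout (`iotds` dict, precomputed `layers`) and the loss-free device-to-device transfer are assumptions made for brevity, not the paper's exact model.

```python
import math

def layered_charging_lifetime(iotds, chargers, R_c, R_s, layers,
                              l=1.0, P_c=5.0, P_s=2.0,
                              E_M=100.0, E_T=5.0, Gamma=1000):
    """Sketch of the layered charging scheduling strategy (Table 1).

    iotds:    dict id -> {'pos': (x, y), 'E': residual energy, 'delta': rate}
    chargers: list of (x, y) LSC positions
    layers:   dict id -> hop count to the sink (precomputed, e.g. Dijkstra)
    Returns the achieved charging lifetime T_c.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    T_c = 0.0
    for tau in range(1, Gamma + 1):
        # Step 5: every IoTD consumes energy during this slot.
        for s in iotds.values():
            s['E'] -= s['delta'] * l
        busy, donors, receivers = set(), set(), set()
        # Step 6: schedule IoTDs in increasing order of residual energy.
        for sid in sorted(iotds, key=lambda i: iotds[i]['E']):
            s = iotds[sid]
            # Steps 11-13: nearest idle LSC within the charging radius.
            cands = [(dist(s['pos'], chargers[j]), j)
                     for j in range(len(chargers))
                     if j not in busy and dist(s['pos'], chargers[j]) <= R_c]
            if cands:
                _, j = min(cands)
                busy.add(j)
                s['E'] = min(E_M, s['E'] + P_c * l)
                continue
            # Steps 14-21: otherwise borrow from an unused neighbour with
            # more energy, upper layer first, then the same layer.
            for want in (layers[sid] - 1, layers[sid]):
                nbrs = [k for k in iotds
                        if k != sid and k not in donors and k not in receivers
                        and dist(s['pos'], iotds[k]['pos']) <= R_s
                        and layers[k] == want
                        and iotds[k]['E'] - P_s * l > s['E']]
                if nbrs:
                    k = max(nbrs, key=lambda q: iotds[q]['E'])
                    iotds[k]['E'] -= P_s * l
                    s['E'] += P_s * l  # transfer loss ignored in this sketch
                    donors.add(k); receivers.add(sid)
                    break
        # Steps 25-28: stop once any IoTD drops below the threshold E_T.
        if min(s['E'] for s in iotds.values()) < E_T:
            break
        T_c = l * tau
    return T_c
```

For example, a single IoTD sitting next to an LSC survives the whole horizon, while an isolated, low-energy IoTD ends the simulation early.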

    • Input: The number of episodes Ne, time steps T, a small constant $ \epsilon $, the number of LSCs m, the set S of IoTDs, the number of grids L × L;
      Output: Positions of m LSCs, Tc;
      1: for each $ {c}_{j}\in C $ do
      2: Initialize parameters of actor and target network $ {\theta }_{j}^{\mu } $ and $ {\theta }_{j}^{{\mu }'} $;
      3: Initialize parameters of critic and target network $ {\theta }_{j}^{Q} $ and $ {\theta }_{j}^{{Q}'} $;
      4: end for
      5: Initially place the m agents at the center of the middle grid cell;
      6: Obtain $ t{f}^{0} $ by calling Algorithm 1;
      7: Clear out the replay buffer $ D $;
      8: for episode from 1 to Ne do
      9: Reset the positions of agents;
      10: Initialize the exploration noise $ \mathcal{N} $;
      11: for time step t from 1 to T do
      12: for agent j from 1 to m do
      13: Get the observation $ {o}_{j}^{t} $;
      14: Choose action $ {a}_{j}^{t}={\mu }_{j}({o}_{j}^{t};{\theta }_{j}^{\mu })+\mathcal{N} $;
      15: end for
      16: Execute joint actions $ {\mathit{a}}^{\mathit{t}}= \{{a}_{1}^{t},... ,{a}_{m}^{t}\} $ of all agents;
      17: Execute Algorithm 1 to obtain $ t{f}^{t} $;
      18: Obtain reward $ {\mathit{r}}^{\mathit{t}} $ and next observations $ {\mathit{o}}^{\mathit{t}+1} $;
      19: Store $ ({\mathbf{o}}^{\mathbf{t}},{\mathbf{a}}^{\mathbf{t}},{\mathbf{r}}^{\mathbf{t}},{\mathbf{o}}^{\mathbf{t}+1}) $ in replay buffer $ D $;
      20: Select random transitions $ \mathcal{K} $ from $ D $;
      21: for agent j from 1 to m do
      22: Update $ {\theta }_{j}^{\mu } $ and $ {\theta }_{j}^{Q} $ by equations (11) and (12);
      23: Update $ {\theta }_{j}^{{\mu '}} $ and $ {\theta }_{j}^{{Q'}} $ by equations (13) and (14);
      24: end for
      25: end for
      26: end for
      27: Reset the positions of agents;
      28: for time step t from 1 to T do
      29: for agent j from 1 to m do
      30: Get the observation $ {o_j^t} $;
      31: Obtain the action $ {a}_j^t={\mu }_{j}({o}_{j}^{t};{\theta }_{j}^{\mu }) $;
      32: Execute the action $ {a}_{j}^{t} $;
      33: end for
      34: end for
      35: Execute Algorithm 1 to obtain Tc;
      36: return positions of m agents and Tc;

      Table 2. 

      Deploy chargers based on MADDPG.
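
The control flow of Table 2 (per-agent actor/target initialization, noisy exploration, replay sampling, soft target updates, then a noise-free rollout) can be sketched as a minimal skeleton. To stay self-contained, the example replaces the neural networks with toy linear "actors" and leaves the gradient of equations (11)-(14) as a placeholder, and `evaluate_tc` stands in for calling Algorithm 1 as the reward signal; none of these simplifications are part of the paper's method.

```python
import random

class Agent:
    """Toy stand-in for one MADDPG agent: a linear actor over a 2-D
    observation plus Polyak-averaged target parameters."""
    def __init__(self, lr=0.01, tau=0.005):
        self.theta = [random.uniform(-0.1, 0.1) for _ in range(2)]
        self.theta_target = list(self.theta)
        self.lr, self.tau = lr, tau

    def act(self, obs, noise=0.0):
        # mu_j(o; theta) + exploration noise, clipped to a unit step
        a = sum(w * o for w, o in zip(self.theta, obs)) + noise
        return max(-1.0, min(1.0, a))

    def update(self, grad):
        # placeholder for the actor/critic step (equations (11)-(12))
        self.theta = [w - self.lr * g for w, g in zip(self.theta, grad)]
        # soft target update (equations (13)-(14))
        self.theta_target = [self.tau * w + (1 - self.tau) * t
                             for w, t in zip(self.theta, self.theta_target)]

def deploy_chargers(m, evaluate_tc, L=20, episodes=50, steps=10):
    """Skeleton of Table 2: move m charger-agents on an L x L grid,
    using evaluate_tc(positions) in place of Algorithm 1's lifetime."""
    agents = [Agent() for _ in range(m)]
    replay = []
    center = (L - 1) / 2.0
    for _ in range(episodes):
        pos = [[center, center] for _ in range(m)]        # reset agents
        for _ in range(steps):
            obs = [tuple(p) for p in pos]
            acts = [ag.act(o, noise=random.gauss(0, 0.1))
                    for ag, o in zip(agents, obs)]
            for p, a in zip(pos, acts):                   # 1-D move for brevity
                p[0] = min(L - 1, max(0.0, p[0] + a))
            r = evaluate_tc([tuple(p) for p in pos])
            replay.append((obs, acts, r))
            batch = random.sample(replay, min(32, len(replay)))
            for ag in agents:
                ag.update([0.0, 0.0])  # real gradients come from the critic
    # greedy rollout with the trained, noise-free policies
    pos = [[center, center] for _ in range(m)]
    for _ in range(steps):
        for p, ag in zip(pos, agents):
            p[0] = min(L - 1, max(0.0, p[0] + ag.act(tuple(p))))
    final = [tuple(p) for p in pos]
    return final, evaluate_tc(final)
```

The key structural point the skeleton preserves is that exploration noise is only added during training; the final deployment (steps 27-34) replays the learned deterministic policies.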

    • Input: The set of IoTDs S;
      Output: m;
      1: Divide A evenly into L × L grids;
      2: for m from 1 to n do
      3: Execute Algorithm 2 to deploy m LSCs and obtain Tc;
      4: if Tc == Tw then
      5: break;
      6: end if
      7: end for
      8: return m;

      Table 3. 

      MLCCA.
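
MLCCA itself reduces to a linear search over the number of chargers: deploy m LSCs with Algorithm 2, check whether the achieved lifetime reaches the working period Tw (it can never exceed it), and stop at the first m that does. A minimal sketch, where `deploy(m)` is a hypothetical stand-in for Algorithm 2:

```python
def mlcca(n, deploy, T_w):
    """Sketch of MLCCA (Table 3): find the smallest number of LSCs m
    whose deployment keeps the network alive for the whole period T_w.
    `deploy(m)` stands in for Algorithm 2 and returns the lifetime T_c."""
    for m in range(1, n + 1):
        # T_c <= T_w by construction, so equality means full coverage.
        if deploy(m) == T_w:
            return m
    return n  # fall back to one charger per IoTD
```

Since the achievable lifetime is non-decreasing in m, a binary search over m would also work and would cut the number of (expensive) Algorithm 2 calls from O(n) to O(log n).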

    • Parameter | Value | Parameter | Value
      A | [200 m, 200 m]² | Ne | 6,000
      L | 20 | T | 20
      ET | 200 J | $ \left|\mathcal{D}\right| $ | 25,600
      $ l $ | 1 s | $ \left|\mathcal{K}\right| $ | 256
      $ {\delta }_{s} $ | $ 10^{-5} $ | $ \epsilon $ | 0.005
      $ {\eta }_{el} $ | 0.3 | $ \gamma $ | 0.95
      $ \omega {A}_{c}\chi $ | 0.004 m² | $ \mathcal{N} $ | 0.1
      $ \alpha $ | $ 10^{-6} $ m | $ {P}_{b} $ | 50
      $ D $ | 0.1 m | Layer type | Fully connected
      $ {\mathrm{\Delta }}_{\theta } $ | $ 3.4\times 10^{5} $ | Optimizer | Adam
      $ {\delta }_{i} $ | [5 W, 20 W] | |

      Table 4. 

      Experimental parameters.