Robust position sensing with wave fingerprints in dynamic complex propagation environments

The author demonstrates that wave fingerprints harnessing the extreme sensitivity of wave chaos to perturbations can be used to localize an object inside a complex scattering environment.


Philipp del Hougne

To cite this version:
Philipp del Hougne. Robust position sensing with wave fingerprints in dynamic complex propagation environments. Physical Review Research, American Physical Society, 2020, 2 (4), pp.043224. 10.1103/PhysRevResearch.2.043224. hal-03127270

In some cases, the position to be identified may not even be within the sensor's line of sight, but hidden around a corner. In such complex environments, a propagating wave front can get completely scrambled such that its angle or time of arrival cannot be used for position sensing with conventional ray-tracing analysis. Considerable research effort thus goes into overcoming the issues posed by multipath effects, for instance, using distributed sensor networks encircling the region of interest in combination with a statistical analysis of shadowing effects and/or geometry-based environment models to account for reflections as virtual anchors [1][2][3][4][5].
A completely different approach consists in embracing the complexity of the propagation medium as a virtue rather than an obstacle. An indoor environment is electrically large compared with the wavelength and can be characterized as ray chaotic: The separation of two rays launched from the same location in slightly different directions will increase exponentially in time. A wave-chaotic field is extremely sensitive to both source location and the enclosure's geometry. Inspired by the quantum-mechanical concept of fidelity loss [6], this sensitivity has been leveraged to distinguish nominally identical enclosures [7], to detect the presence or motion of small changes in the enclosure's geometry (without localizing them) [8,9], and to quantify volume-changing perturbations [10]. For the problem of position sensing, the wave-chaotic field's sensitivity implies that different positions are associated with distinguishable wave fields that can act like wave fingerprints (WFPs) for the positions.
Wave fingerprinting can be applied to the localization of cooperative objects (emitting a beacon signal or equipped with a tag) [11][12][13][14][15][16] as well as to noncooperative objects (no compliance with localization task) [17][18][19]. While the former leverages the sensitivity of ray chaos to the source location, the latter leverages its sensitivity to geometrical perturbations. From the wave's point of view, different object positions inevitably correspond to different geometries of the propagation environment. To ensure the distinguishability of WFPs, the chaotic wave field must be probed in a number of "independent" ways. Traditionally, this is achieved using spatial or spectral diversity with a network of sensors or broadband measurements, respectively. Given the hardware cost of radio-frequency chains, single-detector schemes that do not rely on spatial degrees of freedom are appealing. A more recent single-detector alternative to the reliance on spectral diversity is to use configurational diversity by reprogramming the propagation environment with a "reconfigurable intelligent surface" (RIS). Using a programmable metasurface as RIS, Ref. [19] leveraged configurational diversity to localize multiple noncooperative objects outside the line of sight with single-port single-frequency measurements.
With real-life applications in mind, a fundamental challenge for indoor localization with WFPs arises: How does one handle a dynamic evolution of the propagation environment independent of the objects of concern? Indeed, given the extreme sensitivity of the chaotic wave field to geometrical details, one could expect that a perturbation not related to the object to be localized alters the wave field to an extent that makes it unrecognizable in light of a previously established WFP dictionary.
Here, we systematically study the impact of perturbations of the propagation environment on the localization accuracy, considering a frequency-diverse model system both in simulation and experiment. We investigate an interpretation of the perturber as an effective source of noise and the extent to which the perturber affects the diversity of the WFP dictionary. We demonstrate that the reduction of the amount of information that can be obtained per measurement as the perturber size is increased can be compensated by taking more measurements, even in the regime where the perturber's scattering strength exceeds that of the object to be localized. Our results stress the importance of appreciating the information-theoretic encoding-decoding cycle of the sensing process in its entirety and reveal that machine-learning decoders outperform traditional decoding techniques especially in the regime of low signal-to-noise ratio (SNR).

II. EXPERIMENTAL SETUP AND WFP FORMALISM
Our experimental setup is shown in Fig. 1: An object is located on one of P = 5 possible predefined positions in an irregular metallic enclosure. N = 51 complex-valued transmission measurements between two simple monopole antennas are taken in the interval 1 GHz < f < 2.58 GHz with a software-defined radio (SDR, LimeSDR Mini). Note that the predefined object positions are clearly outside the line of sight of the antenna pair. Dynamic perturbations of the propagation environment are introduced in our experiment with a metallic object of variable size mounted on a stepper motor which can place the object in an arbitrary angular orientation.

FIG. 1. Experimental setup. The triangular object to be localized (base 9 × 9 cm², height 6.5 cm) is placed on one of P = 5 predefined positions (here, position 3) inside a complex scattering enclosure of dimensions 0.8 × 0.83 × 0.5 m³ (top wall removed to show interior). The transmission between two monopole antennas is measured with a LimeSDR Mini. The object to be localized is outside the antenna pair's line of sight. A dynamic perturber consists of a metallic object of variable size (here, the third largest) mounted on a stepper motor. The top inset shows the different considered sizes of the dynamic perturber in comparison to the object size. The three largest perturbers are obtained by mounting a U-shaped extension on a smaller perturber, similar to the spirit of Matryoshka dolls. The bottom inset illustrates the WFP multiplexing mechanism.
A measured transmission spectrum $S(f)$ can be decomposed into four contributions:

$$S(f) = S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f) + S_{\mathrm{pert}}(f) + \mathcal{N}(f). \tag{1}$$

Here, $S_{\mathrm{pert}}(f)$ accounts for rays that encountered the perturber, $S_{\mathrm{obj}}(f)$ accounts for rays that encountered the object but not the perturber, $S_{\mathrm{cav}}(f)$ accounts for rays that bounced around in the cavity without encountering either the object or the perturber, and $\mathcal{N}(f)$ denotes the measurement noise. Given the chaotic nature of the complex scattering enclosure, it is customary to assume that real and imaginary components of the entries of the first three terms are drawn from zero-mean Gaussian distributions. The measurement noise is typically also zero-mean Gaussian. The decomposition in Eq. (1) has several subtleties. First, we note that if the perturber size is increased, more rays will encounter the perturber such that not only will the elements of $S_{\mathrm{pert}}(f)$ be drawn from a distribution with larger standard deviation but also at the same time the standard deviation of the distributions of $S_{\mathrm{cav}}(f)$ and $S_{\mathrm{obj}}(f)$ will decrease. In other words, $S_{\mathrm{cav}}(f)$ and $S_{\mathrm{obj}}(f)$ are not independent of the perturbing object. Second, since all the terms are assumed to be drawn from zero-mean distributions, in principle one would expect that by averaging over an ensemble of realizations of the perturber one can estimate $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ and by additionally averaging over an ensemble of object positions one can identify $S_{\mathrm{cav}}(f)$. In practice, proper averaging requires a sufficient number of realizations, and P = 5 may be insufficient for averaging over an ensemble of object positions. In Eq. (1), only the term $S_{\mathrm{obj}}(f)$ encodes information about the object position. To determine a WFP in the presence of a perturber, we therefore average $S(f)$ over an ensemble of representative perturber realizations. Here, it is relatively easy to ensure that the ensemble is sufficiently large to estimate $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ properly.
We can then either define the WFP as being $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ or intend to approximate $S_{\mathrm{obj}}(f)$ with

$$S^{(2)}_{\mathrm{obj}}(f) = \langle S(f) \rangle_{\mathrm{pert}} - \big\langle \langle S(f) \rangle_{\mathrm{pert}} \big\rangle_{\mathrm{pos}},$$

where $\langle \cdot \rangle_{\mathrm{pert}}$ denotes the average over perturber realizations and $\langle \cdot \rangle_{\mathrm{pos}}$ the average over the P object positions. We will consider both options below and see that, counterintuitively, the former one can be advantageous in certain cases. Moreover, Eq. (1) naturally suggests that we interpret the perturber as an effective source of noise. We can quantify the scattering strength of the perturber relative to that of the object via an effective perturber-induced SNR $\rho_p$. Ideally, to that end, we would define $s_s$ and $s_n$ to be the standard deviation of the distributions from which the entries of $S_{\mathrm{obj}}(f)$ and $S_{\mathrm{pert}}(f)$, respectively, are drawn, to define $\rho_p = s_s^2 / s_n^2$. In practice, we do not know $S_{\mathrm{obj}}(f)$. Depending on whether we choose to use $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ or $S^{(2)}_{\mathrm{obj}}(f)$ as WFP, we can define $s^{(1)}_s$ and $s^{(2)}_s$ to be the respective standard deviations, yielding $\rho^{(1)}_p$ and $\rho^{(2)}_p$. These effective SNRs quantify to what extent the perturber acts as noise on our chosen WFP but do not directly reflect the ratio of scattering strengths of object and perturber.
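As a minimal numerical sketch of the averaging procedure described above, the following Python snippet draws synthetic spectra according to the four-term decomposition and estimates both WFP candidates together with their effective perturber-induced SNRs. All standard deviations, the random seed, and the helper names (`cgauss`, `snr_db`) are illustrative assumptions rather than values from the experiment. Note also that the residual used here as the noise term contains the measurement noise in addition to the perturber contribution.

```python
import numpy as np

rng = np.random.default_rng(0)
P, N, R = 5, 51, 150  # object positions, frequency points, perturber realizations


def cgauss(shape, std):
    """Complex samples whose real and imaginary parts are zero-mean Gaussian."""
    return std * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


# Hypothetical scattering strengths for each term of the decomposition.
s_cav = cgauss((1, 1, N), 1.0)     # identical for all positions and realizations
s_obj = cgauss((P, 1, N), 0.5)     # depends only on the object position
s_pert = cgauss((P, R, N), 0.8)    # changes with every perturber realization
noise = cgauss((P, R, N), 0.05)    # measurement noise
S = s_cav + s_obj + s_pert + noise  # shape (P, R, N)

# WFP option 1: average over perturber realizations, estimating S_cav + S_obj.
wfp1 = S.mean(axis=1)                             # shape (P, N)
# WFP option 2: additionally subtract the average over object positions.
wfp2 = wfp1 - wfp1.mean(axis=0, keepdims=True)    # shape (P, N)

# Whatever remains after removing the realization average acts as effective noise.
residual = S - wfp1[:, None, :]


def snr_db(wfp):
    """Ratio of WFP variance to residual variance, in dB."""
    return 10 * np.log10(np.var(wfp) / np.var(residual))


rho1_db, rho2_db = snr_db(wfp1), snr_db(wfp2)
print(f"rho_p^(1) = {rho1_db:.1f} dB, rho_p^(2) = {rho2_db:.1f} dB")
```

Because the first option retains the cavity contribution, its variance, and hence its effective SNR, is larger in this toy setting than that of the second option.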
The P × N WFP dictionary H merges the P WFPs (each WFP is an N-element vector) into a single matrix. The WFP approach can then also be framed as a multiplexing problem Y = HX + N , where X is a 1 × P vector identifying the object position, Y is the complex-valued 1 × N measurement vector, and N is a 1 × N noise vector.

III. INFORMATION-THEORETIC PERSPECTIVE
One prerequisite for successful wave fingerprinting is the diversity of H. In our case, the complexity of the propagation environment naturally provides this diversity. The lower the correlations between different WFPs are, the better they can be distinguished. To get a quantitative grasp of the diversity of H, it is instructive to consider its singular value (SV) decomposition: $H = U \Sigma V^{T}$, where $\Sigma$ is a diagonal matrix whose ith entry is the ith SV $\sigma_i$ of H. The flatter the SV spectrum is, the more diverse is H. A convenient metric of diversity is the effective rank of H, which is defined as [20]

$$R_{\mathrm{eff}} = \exp\!\left( -\sum_{i=1}^{n} \tilde{\sigma}_i \ln \tilde{\sigma}_i \right), \qquad \tilde{\sigma}_i = \frac{\sigma_i}{\sum_{j=1}^{n} \sigma_j},$$

with $n = \min(N, P)$. Note that only perfectly orthogonal channels with zero correlation yield $R_{\mathrm{eff}} = n$.
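The effective rank can be computed directly from the singular values. The sketch below assumes the entropy-based definition, $R_{\mathrm{eff}} = \exp(-\sum_i \tilde{\sigma}_i \ln \tilde{\sigma}_i)$ with normalized singular values $\tilde{\sigma}_i = \sigma_i / \sum_j \sigma_j$. The three test matrices are hypothetical stand-ins for a WFP dictionary, not measured data.

```python
import numpy as np


def effective_rank(H):
    """Entropy-based effective rank computed from the singular values of H."""
    sv = np.linalg.svd(H, compute_uv=False)
    p = sv / sv.sum()          # normalized singular values
    p = p[p > 0]               # the term 0 * ln(0) is taken as 0
    return float(np.exp(-np.sum(p * np.log(p))))


rng = np.random.default_rng(1)
P, N = 5, 51

H_orth = np.eye(P, N)                                   # orthogonal rows, flat SV spectrum
H_rand = rng.standard_normal((P, N)) + 1j * rng.standard_normal((P, N))
H_corr = H_rand[:1] + 0.1 * H_rand                      # rows share a dominant common part

for name, H in [("orthogonal", H_orth), ("random", H_rand), ("correlated", H_corr)]:
    print(name, round(effective_rank(H), 2))
```

Strongly correlated rows concentrate the singular-value spectrum in one component and push the effective rank towards 1, whereas a flat spectrum yields the maximum value min(N, P).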
Unfortunately, much of the compressed-sensing literature is exclusively focused on the diversity of H to understand the achievable performance. For instance, compression ratios are often provided without even indicating at what SNR they are valid. In principle, in the absence of any noise, the tiniest amount of diversity could be sufficient to ensure complete distinguishability even with N = 1. Here, we argue that the achievable performance depends on the amount of (useful) information that can be extracted per measurement. In the physical layer, besides diversity the SNR is a second crucial ingredient. Moreover, high diversity and high SNR only ensure good performance if the deployed decoding method in the digital layer is capable of extracting much of the relevant encoded information from the measurement. WFP-based sensing in its entirety as schematically summarized in Fig. 2 can be interpreted as a process consisting of physical encoding and digital decoding of information. Wave propagation through the complex scattering environment naturally (and inevitably) encodes information about the object position in measurements of the wave field. Data processing seeks to retrieve this information. Various decoding methods exist that we will compare later on:
(i) Correlation. Identify which row of H has the highest correlations with Y. If the WFPs rely on spectral diversity, this procedure can be interpreted as "virtual time reversal" [16].
(ii) Inversion. Compute an inverse of H, for instance, via Tikhonov regularization, and identify the entry of $H^{-1} Y$ with the largest magnitude.
(iv) Learning. Train an artificial neural network (ANN) to map Y to the corresponding object position. ANN-based approaches have not been studied in the multiplexing literature to date. Besides their potential for superior decoding performance, inference is extremely fast. One forward pass through an ANN requires only a few matrix multiplications but no correlations, matrix inversions, or nonlinear optimization routines.
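The correlation and inversion decoders can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used for the results: the dictionary is a random complex matrix, the noise level and regularization parameter `lam` are arbitrary, and the function names are hypothetical. With the shapes given above (H is P × N, X is 1 × P), the noiseless measurement is evaluated as the product X H.

```python
import numpy as np


def decode_correlation(H, y):
    """Index of the WFP (row of H) with the highest normalized correlation with y."""
    scores = np.abs(H.conj() @ y) / (np.linalg.norm(H, axis=1) * np.linalg.norm(y))
    return int(np.argmax(scores))


def decode_tikhonov(H, y, lam=1.0):
    """Tikhonov-regularized least-squares estimate of x from y = x H, then argmax."""
    A = H.conj() @ H.T                           # P x P normal-equation matrix
    x_hat = np.linalg.solve(A + lam * np.eye(A.shape[0]), H.conj() @ y)
    return int(np.argmax(np.abs(x_hat)))


rng = np.random.default_rng(2)
P, N = 5, 51
H = rng.standard_normal((P, N)) + 1j * rng.standard_normal((P, N))  # stand-in dictionary

true_pos = 3
x = np.eye(P)[true_pos]                          # one-hot position vector
noise = 0.3 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = x @ H + noise                                # simulated measurement

print(decode_correlation(H, y), decode_tikhonov(H, y))
```

Both decoders need the dictionary H explicitly, which is why errors in estimating H propagate directly into their decisions.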
From an information-theoretic perspective, it is important to understand fundamental bounds on the sensing performance. A simple bound to compute is the generalized Shannon capacity C, which has been mentioned on a few occasions in a sensing context [23,24]. Nonetheless, the meaningfulness of C for a specific sensing scheme is limited for two reasons. First, an ideal input distribution is assumed for X, but in reality all entries of X are zero except for one which is unity. Second, an ideal decoding method is assumed. Below we will see examples where a system with nominally lower C nonetheless yields a higher sensing accuracy for certain decoding methods. It is thus essential to appreciate the sensing process in its entirety, including both encoding and decoding as illustrated in Fig. 2.
Having introduced the notion of diversity and SNR, we can now briefly comment on how faithfully the metallic enclosure in our experiment represents real-life scenarios. Without a doubt, certain cases like the inside of a vessel or a bank vault are very well represented. Other environments like the inside of a building are less reverberant than a metallic enclosure. Essentially, the quality factor of these "cavities" is lower. This implies more correlations within a fixed frequency interval of the transmission spectrum, as well as a lower SNR due to more attenuation. Both result in a decrease of the information that can be extracted per measurement; this effect can be compensated by taking more measurements, for instance, with a wider bandwidth. Nonetheless, from a fundamental perspective, the physics of an indoor system is entirely captured by our metallic enclosure. From a practical point of view, we note that in scenarios with already existing wireless communication infrastructure, the beacon signals thereof could be used to implement position sensing with WFPs, saving energy and reducing the amount of electromagnetic radiation.

IV. SEMIANALYTICAL SIMULATIONS
To begin with, we consider a two-dimensional (2D) version of our experiment simulated as a 2D system of coupled dipoles [25] which contains all the essential physical ingredients to simulate wave propagation, reverberation, and scattering in our experiment. These simulations are not intended to directly approximate our experimental setup and reproduce the experimental data. Instead, they offer an ideal platform to identify the general effect of dynamic perturbations of the propagation environment on the sensing accuracy without any measurement noise or errors due to imperfect object positioning on the predefined positions, i.e., $\mathcal{N}(f) = 0$. As shown in Fig. 3(a), a perturber of variable size with arbitrary orientation and location (within a specified area) simulates dynamic changes of the environment. Our simulation setup evaluates the transmission between an antenna pair at 25 distinct frequencies. We use an ensemble of 150 random perturber realizations (random orientation and random location of its center within the allowed area) to estimate H, $R_{\mathrm{eff}}$, and $\rho_p$. The probability density functions (PDFs) of the real and imaginary parts of $S_{\mathrm{pert}}$ are seen in Fig. 3 to be zero-mean and single-peaked, and they tend towards a Gaussian distribution for larger perturbers.

A. Impact of perturbation on diversity and effective SNR
In Fig. 4 we contrast the use of $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ or $S^{(2)}_{\mathrm{obj}}(f)$ as WFP in terms of the resulting diversity ($R_{\mathrm{eff}}$), effective SNR ($\rho_p$), and sensing capacity (C). As we will see below, none of these quantities is a reliable predictor of the sensing accuracy, since they do not take the decoding method into account. For the case of using $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ as WFP, the observed trend is clear: As the perturber size increases, $R_{\mathrm{eff}}$, $\rho_p$, and C all decrease. While the impact on $\rho_p$ was clearly expected, the reduction of diversity is more subtle. It becomes intuitive by considering the extreme case in which the perturbation alters the entire enclosure. Then, averaging over realizations yields the result that would have been obtained in an anechoic environment such that no diversity thanks to wave chaos is left.
Using $S_{\mathrm{obj}}(f)$ as opposed to $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ would certainly improve the diversity by removing unnecessary correlations (possibly at the expense of a lower SNR, such that the overall effect on capacity is unclear), but this is not possible in practice. Our closest option to that effect is to use $S^{(2)}_{\mathrm{obj}}(f)$. Straightforward simulations with random Gaussian matrices show that the effective rank of $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$ may exceed that of $S^{(2)}_{\mathrm{obj}}(f)$ in cases where P is small (preventing proper averaging over realizations of the object position) and where the ratio of the standard deviations of the distributions of $S_{\mathrm{obj}}$ and $S_{\mathrm{cav}}$ is large. Nonetheless, in our semianalytical simulations, we observe in Fig. 4(a) a higher effective rank for $S^{(2)}_{\mathrm{obj}}(f)$ than for $S_{\mathrm{cav}}(f) + S_{\mathrm{obj}}(f)$. Yet, since $\rho^{(2)}_p$ is substantially lower than $\rho^{(1)}_p$, the effect of using $S^{(2)}_{\mathrm{obj}}(f)$ on the capacity is unfavorable.
Complex scattering enclosures are often seen as random-field generators [26]. $R_{\mathrm{eff}}$ is a measure of the number of independent samples, and for $N \gg P$ one expects $R_{\mathrm{eff}} \to P$. Yet, in our simulations, $R_{\mathrm{eff}}$ saturates below 4. This observation can be attributed to field correlations, here in the frequency domain, that prevent the field observables from being purely random variables [27]. Similar effects have been observed for the case of using configurational diversity in a complex scattering enclosure [28].

B. Dependence of sensing accuracy on perturber size, number of measurements, and decoding method
The general trend is clear: The larger the perturbation, the less information can be extracted per measurement, as reflected by the sensing-capacity values plotted in Fig. 4(b). However, this decrease in information per measurement can be compensated with more measurements. At first sight, one may expect that such a compensation is only feasible as long as the object's scattering signature is stronger than the perturber's effect, i.e., for $\rho_p > 0$ dB. Our findings in Fig. 5, however, reveal that there is no abrupt phase change in the relation between achievable accuracy and perturber size. Instead, using more measurements, successful position sensing is feasible at effective SNRs well below 0 dB.
FIG. 5. Localization accuracy in semianalytical simulations. The color scale goes from 0 (black) to 1 (white). For each choice of WFP definition (columns) and decoding method (rows), the accuracy is plotted as a function of perturber size (horizontal axis) and the number of frequency points used to ink the WFP (vertical axis). ANN results are averaged over 20 training runs with randomly initialized weights; the standard deviation is below 2%. The black contour line corresponds to 95% accuracy. To aid comparison, the red contour is the same on all panels.

We systematically compare the previously outlined decoding methods for both choices of WFP. For the learning-based approach, we train a simple ANN consisting of two fully connected layers; the first layer consists of 256 neurons and is followed by a rectified-linear-unit (ReLU) activation, and the second layer consists of P = 5 neurons and is followed by a SoftMax activation. Using more neurons or an additional layer does not appear to notably impact the results. We consider two possibilities to provide training data from which the ANN can learn to decode the measurements. The first option is to simply use the raw data from all the perturber realizations that we generated without a need for extracting H or other quantities. This brute-force method may prove particularly useful in cases where measurements are restricted to intensity-only information which prevents averaging as simple means to extract H, but this scenario is outside the scope of the present paper. Note that with this approach the WFPs are never explicitly evaluated, but only implicitly contained in the ANN weights. The second option is to synthesize training data with Y = HX + N using the estimated H and generating N with entries drawn from a complex Gaussian distribution whose standard deviations match those of the distribution of $S_{\mathrm{pert}}$ extracted from the data.
This second method relies on our hypothesis that $S_{\mathrm{pert}}$ is normally distributed and offers the possibility of generating a training data set of unlimited size. In both cases we normalize the data (zero mean, unit variance) and use the Adam method for stochastic optimization [29] (step size $10^{-3}$) to train the ANN weights.
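The synthetic-data variant of the learning-based decoder can be illustrated with a NumPy-only sketch. It follows the description above (256 ReLU units, P softmax outputs, normalized inputs, Adam with step size 10⁻³), but everything else is an assumption made for illustration: the dictionary `H` is a random stand-in, the noise level `sigma`, the training-set size, the number of iterations, full-batch updates, the weight initialization, and the stacking of real and imaginary parts as input features are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
P, N, n_train = 5, 25, 1000
H = rng.standard_normal((P, N)) + 1j * rng.standard_normal((P, N))  # stand-in dictionary
sigma = 1.0                                                          # assumed perturber noise std


def synthesize(n):
    """Draw random positions and noisy measurements (row of H plus complex Gaussian noise)."""
    labels = rng.integers(0, P, size=n)
    noise = sigma * (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N)))
    Y = H[labels] + noise
    return np.hstack([Y.real, Y.imag]), labels   # real-valued features


X_train, y_train = synthesize(n_train)
mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mu) / sd                    # zero mean, unit variance

params = {
    "W1": rng.standard_normal((2 * N, 256)) * np.sqrt(2.0 / (2 * N)),
    "b1": np.zeros(256),
    "W2": rng.standard_normal((256, P)) * np.sqrt(1.0 / 256),
    "b2": np.zeros(P),
}


def forward(X):
    """Return hidden ReLU activations and softmax class probabilities."""
    A1 = np.maximum(X @ params["W1"] + params["b1"], 0.0)
    Z2 = A1 @ params["W2"] + params["b2"]
    Z2 = Z2 - Z2.max(axis=1, keepdims=True)      # numerical stabilization
    probs = np.exp(Z2)
    return A1, probs / probs.sum(axis=1, keepdims=True)


m_state = {k: np.zeros_like(p) for k, p in params.items()}
v_state = {k: np.zeros_like(p) for k, p in params.items()}
lr, beta1, beta2, eps = 1e-3, 0.9, 0.999, 1e-8
onehot = np.eye(P)[y_train]
losses = []

for t in range(1, 301):                          # full-batch Adam iterations
    A1, probs = forward(X_train)
    losses.append(-np.mean(np.log(probs[np.arange(n_train), y_train] + 1e-12)))
    dZ2 = (probs - onehot) / n_train             # gradient of cross-entropy w.r.t. logits
    dZ1 = (dZ2 @ params["W2"].T) * (A1 > 0)      # backpropagate through ReLU
    grads = {"W2": A1.T @ dZ2, "b2": dZ2.sum(axis=0),
             "W1": X_train.T @ dZ1, "b1": dZ1.sum(axis=0)}
    for k in params:
        m_state[k] = beta1 * m_state[k] + (1 - beta1) * grads[k]
        v_state[k] = beta2 * v_state[k] + (1 - beta2) * grads[k] ** 2
        m_hat = m_state[k] / (1 - beta1 ** t)
        v_hat = v_state[k] / (1 - beta2 ** t)
        params[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)

X_test, y_test = synthesize(500)
accuracy = np.mean(forward((X_test - mu) / sd)[1].argmax(axis=1) == y_test)
print(f"final loss {losses[-1]:.3f}, test accuracy {accuracy:.2f}")
```

Once trained, inference amounts to the two matrix multiplications in `forward`, which is the speed advantage over inversion-based decoders mentioned earlier.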
In Fig. 5, we show how the achieved sensing accuracy depends on the perturber size and the number of measurements. We ensure that the spacing of the utilized frequency points is always the same and that they are always centered on the same frequency. For instance, for N = 7 measurements we pick the central frequency point out of the 25 available ones as well as its three closest neighbors to the left and right. Our results are thus for one specific system realization, which explains why the contours in Fig. 5 are not perfectly smooth. Several important observations and conclusions follow from Fig. 5:
(i) WFP dictionaries with very different nominal sensing capacities can yield the same accuracy. This is the case for both ANN-based methods, in which the accuracy is (almost) identical for WFP$^{(1)}$ and WFP$^{(2)}$.
(ii) The same WFP dictionary can yield very different accuracies depending on the decoding method. ANN-based decoders are seen to outperform correlation- and inversion-based decoders.
(iii) The choice of WFP definition is irrelevant for the optimized-inversion decoder as well as the ANN-based decoders. For correlation- and inversion-based decoders, however, using WFP$^{(1)}$ yields significantly better results.
(iv) Irrespective of the perturber size, we achieve an acceptable minimum accuracy (e.g., 95%). For larger perturbers, we need more measurements to compensate for the reduction in the amount of information that can be extracted per measurement. Future information-theoretic work should seek to model the contour for a given accuracy in order to understand how the need for additional measurements scales with $\rho_p$.
(v) At low effective noise levels, some decoders achieve compression ratios above unity; that is, they achieve accuracies of at least 95% when distinguishing P = 5 object positions with N < P measurements. For instance, the ANN (raw data) decoder with WFP$^{(2)}$ achieves 96% accuracy with N = 3 at the lowest considered perturber size. However, as in any compressed-sensing scenario, it is obvious that the compression ratio is heavily dependent on the noise level (here, the effective noise level due to the perturber size), the independence of different measurements (here, determined to a large extent by the interval between frequency points), and the decoding method (here, an ANN trained with raw data). Thus a general claim of achieving a compression ratio above unity is not presented as a key result of this work.
Overall, these results clearly demonstrate that it is fallacious to assume that the diversity or sensing capacity of H could be a reliable indicator of the sensing accuracy, hence the importance of considering the sensing process in its information-theoretic entirety as in Fig. 2.
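The frequency-point selection used for Fig. 5 (fixed spacing, always centered on the same frequency) can be sketched as follows. The function name is hypothetical, and the sketch assumes zero-based indexing and odd counts so that a unique central point exists.

```python
def centered_indices(n_available, n_used):
    """Indices of n_used adjacent points centered on the middle of n_available points."""
    if n_available % 2 == 0 or n_used % 2 == 0:
        raise ValueError("both counts must be odd so that a central point exists")
    if n_used > n_available:
        raise ValueError("cannot use more points than are available")
    center = n_available // 2      # zero-based index of the central frequency point
    half = n_used // 2             # number of neighbors taken on each side
    return list(range(center - half, center + half + 1))


# Central point of 25 plus its three closest neighbors on each side.
print(centered_indices(25, 7))     # [9, 10, 11, 12, 13, 14, 15]
```

Selecting the corresponding columns of the dictionary, e.g. `H[:, centered_indices(25, 7)]` for a NumPy array, then yields the reduced WFPs for a given number of measurements.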

V. EXPERIMENTAL RESULTS
Having established an understanding of the perturber's effect under idealized conditions in simulation, we now analyze the experimental data. In an attempt to approximate the sort of low-cost radio hardware that indoor geolocalization schemes may leverage, our measurements are performed with a low-cost and lightweight SDR as opposed to high-end bulky measurement equipment such as a vector network analyzer. Measurements with our SDR entail a few practical issues. First, there is a ±π uncertainty in measured phase values, originating from random phase jumps every time the phase-locked loop (PLL) is locked (e.g., to change the frequency). To obtain reliable data, we transform each measured complex value $z$ to $|z| \exp\{2i \, \mathrm{mod}[\arg(z), \pi]\}$; the factor 2 in the exponent ensures that the transformed variable's phase explores the entire 2π range. Second, the transmitted energy is clearly frequency dependent, which can be caused by the frequency-dependent coupling of the monopole antennas to the cavity and/or frequency-dependent SDR components. The strong frequency dependence means that we cannot simply model our variables as being drawn from a unique distribution; instead the distribution's standard deviation becomes frequency dependent. To keep the SDR's temperature constant throughout the experiment, we installed a simple CPU fan. We do not observe any significant amplitude or phase drifts over the course of the experiment.
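The phase transformation that removes the π ambiguity can be written in a few lines. The sketch below is an illustration with a hypothetical function name and an arbitrary test value; it shows that a measured value and its sign-flipped counterpart map to the same output.

```python
import numpy as np


def fold_phase(z):
    """Map z to |z| * exp(2i * mod(arg z, pi)); z and -z give the same result."""
    z = np.asarray(z, dtype=complex)
    return np.abs(z) * np.exp(2j * np.mod(np.angle(z), np.pi))


z = 0.7 * np.exp(0.4j)            # arbitrary example value
a = fold_phase(z)
b = fold_phase(-z)                # same value after a pi phase jump
print(np.allclose(a, b))
```

Since the exponential is 2π periodic, the transformation is equivalent to $z^2/|z|$ for nonzero $z$: the amplitude is preserved while the phase is doubled, which removes any sign ambiguity.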
We begin by quantifying two contributions to the $\mathcal{N}$ term in Eq. (1) that were not present in the simulations. First, we estimate the SNR due to measurement noise (by repeating the same measurement multiple times) as $\rho_1 = 25.5$ dB. Second, we estimate the SNR due to both measurement noise and imperfect positioning of the objects on the predefined locations (by repeating the same measurement multiple times after placing the object again on the same position) as $\rho_2 = 15.8$ dB.

A. Impact of perturbation on diversity and effective SNR
Based on 150 perturber realizations (random orientations) for each perturber size, in Figs. 6(a)-6(d) we plot the PDFs of real and imaginary parts of $S_{\mathrm{pert}}$ for the smallest and largest perturber considered in our experiment. The zero-mean single-peaked distributions are identical for real and imaginary components but thinner than a Gaussian distribution. In Fig. 6(e) we plot $R_{\mathrm{eff}}(H)$ versus the effective SNR. Since $\mathcal{N}(f) \neq 0$ in the experiment, we plot two curves: The blue one only accounts for perturber-induced effective noise, and the red one additionally accounts for measurement and positioning noise. The difference between these two curves is appreciable only for small perturber sizes since for larger perturbers $S_{\mathrm{pert}}$ dominates over $\mathcal{N}$. Unlike in Fig. 4(a), using $S^{(2)}_{\mathrm{obj}}$ lowers not only the effective SNR but also the effective rank. As in Fig. 4(b), we see in Fig. 6(f) that using $S^{(2)}_{\mathrm{obj}}$ is unfavorable in terms of the (normalized) sensing capacity. The impact of $\mathcal{N}$ on C is only noticeable for small perturbers.

FIG. 7. Localization accuracy in experiments. The color scale goes from 0 (black) to 1 (white). For each choice of WFP definition (columns) and decoding method (rows), the accuracy is plotted as a function of perturber size (horizontal axis) and the number of frequency points used to ink the WFP (vertical axis). ANN results are averaged over 20 training runs with randomly initialized weights; the standard deviation does not exceed 10% and 3% for ANNs trained with synthetic and raw data, respectively. The black contour line corresponds to 95% accuracy. To aid comparison, the red contour is the same on all panels.

B. Dependence of sensing accuracy on perturber size, number of measurements, and decoding method
In Fig. 7 we compare the achievable sensing accuracy in our experiment with the two considered definitions of the WFP and different decoding methods as a function of the perturber size and number of measured frequency points. Note that Figs. 5 and 7 should only be compared qualitatively since the simulations do not reproduce the exact experimental setup. The observations already made for the corresponding simulation results in Fig. 5 about the unsuitability of $R_{\mathrm{eff}}$ or C to predict the sensing accuracy are confirmed once again by Fig. 7. The most notable difference compared with Fig. 5 is that in Fig. 7, except for the ANN trained with raw data, all decoding methods fail to achieve at least 95% accuracy once the perturber's surface is larger than 200 cm². We attribute this to the ±π phase uncertainty of our SDR, which introduces errors in the estimation of H. Only the ANN trained with raw data does not rely on calculating H, so it is not affected by the phase issue and performs well even with the experimental data. Interestingly, we have thus a case in which it is better to feed the ANN raw data rather than to use physical insight to preprocess the ANN's training data. The ANN decoder trained with raw data is capable of achieving high sensing accuracies despite significant amounts of noise [the effective SNR is as low as −15 dB for the largest perturber; see Fig. 6(e)] and distorted data. Using the ANN decoder trained with raw data, we achieve 100% sensing accuracy with N = 3, i.e., a compression ratio of P/N = 5/3 > 1, for perturbers with a surface as large as 74 cm². Again, we stress that the compression ratio depends on the effective SNR, the measurement independence, and the decoding method.

VI. CONCLUSION AND OUTLOOK
From a practical point of view, our experiments, in combination with an ANN-based decoder, demonstrated the feasibility of precise position sensing with WFPs in dynamically evolving scattering enclosures using a low-cost and lightweight SDR. This capability is crucial to enable situational awareness in a plethora of emerging applications. Our technique does not rely on detailed knowledge about the environment's geometry and only requires a one-off calibration phase with multiple representative realizations of the dynamic perturbations that are expected during operation. From a conceptual point of view, our work paves the way for a thorough information-theoretic analysis of sensing with WFPs. The dynamic perturber's unfavorable effect on diversity and effective SNR of the WFP dictionary, resulting in the acquisition of less useful information per measurement, can be fully compensated by taking more measurements, even in the regime in which the perturber's scattering strength clearly exceeds that of the object to be localized. We saw that the common practice in compressed sensing to only consider the diversity or capacity of H is insufficient to anticipate the achievable sensing accuracy. Our results are of very general nature: They can be applied to other types of wave phenomena (sound, light, etc.) and are equally valid for WFPs established not based on spectral degrees of freedom (DoF) but with other means such as using spatial, polarization, or configurational DoF by employing a sensor network, a dual-polarized antenna, or a RIS [19], respectively.
Our work bears great conceptual resemblance to the reconstruction of optical images after propagation through a multimode fiber or multiply scattering medium [30]. In these cases, a camera conveniently offers easy access to many spatial DoF so there is no need to use spectral DoF as in our work, but the measured data are also not "human readable" and require (typically machine-learning-based) processing. "Imaging" is the process of retrieving a representation of the scene based on how it scatters waves; modern computational imaging protocols heavily rely on a priori knowledge such as sparsity of the scene in compressed sensing or knowing that the object belongs to one class out of a set of predefined classes in machine-learning-based optical image retrieval. In fact, the position-sensing task we consider in this paper can be framed as an imaging problem: Assuming everything about the scene is known a priori except for the object's position, the task of imaging the scene collapses to determining the object's position. It is hence interesting to ask if using a convolutional neural network (CNN), as is customary in the literature on optical image retrieval [30], may be beneficial for position sensing. However, recent work [31,32] suggests that simple feed-forward neural networks similar to the one we used in this paper perform at least as well as CNNs because relevant local features of the scene are scrambled and hence encoded in long-range spatial structures, whereas CNNs are designed to extract local features.
A key result of the present work, the importance of seeing the entirety of the information-theoretic cycle, points towards jointly optimizing encoding in a programmable propagation environment and machine-learning-based decoding, as in the recently proposed "learned sensing" paradigm [33,34]. In contrast to compressed sensing, which indiscriminately encodes all information, learned sensing seeks to encode only task-relevant information in the measurements. For position sensing, one could carefully select the frequencies at which measurements are taken (as opposed to linear spacing) and/or engineer the propagation environment with a RIS [35].
Looking ahead, it appears interesting to extend the present work (i) to scenarios with multiple objects to be localized, where neglected interobject scattering is an additional effective source of noise [19], (ii) to deeply subwavelength position sensing [17], and (iii) to more complex tasks such as image transmission [36]. Moreover, ANN-based decoders could be enhanced with more advanced machine-learning techniques.
Transfer learning may enable one to pretrain the ANN on simulated data and then fine-tune it to the experimental situation based on a very small experimental training data set. While accurate simulations of wave propagation in 3D electrically large irregularly shaped enclosures are computationally very expensive, it would be interesting to see if the ANN could learn useful knowledge about the problem's underlying physics even from the sort of 2D simulations we used here. Transfer learning may also be applied to an ANN previously employed for the same task in a slightly different setting, e.g., in a different room.
In this paper, the perturber was seen as an obstacle for our task to localize an object. In other contexts, the objective may be to characterize size and motion of a perturber. Diffuse wave spectroscopy [37][38][39][40] analyzes changes of the broadband impulse response over time to estimate the number or scattering cross section of objects moving through a complex medium. Our work has evidenced that the perturber's scattering strength can also be clearly related to the capacity of a multiplexing channel matrix averaged over different realizations of the perturber's position. Considering configuration-to-configuration multiplexing with two dynamic metasurface transceivers [41] may thus enable similar characterizations of a moving perturber with single-port single-frequency measurements [9].