which can be used for motion detection during monitoring, but this is beyond the scope of our work. We should also perform a re-synchronization in the next observation slot, especially when there is more than one user in the sensing range.
In the re-synchronization process, we match the candidate clusters in the current observation slot with those in the previous slots. We use the mean square of the difference between cluster centers to perform the matching, so that when users are moving, static users are first matched to their historical clusters and moving users are relocated to new positions.

D. Breath Rate Estimation

After we reconstruct the waveform of each observation slot, we estimate the number of breath periods for each user. Within each observation slot, we use a moving average filter to eliminate noise and false respiration peaks. Since the breath rate may vary from 0.1 Hz to 0.5 Hz, we cannot use a fixed empirical window size for the moving average filter. To adapt the length of the moving average filter, we extract the FFT energy within the range of human respiration frequency as the feature. We then use a Support Vector Machine (SVM) to select the filter length based on the FFT energy features.

After the smoothing process, we normalize the waveform through min-max normalization and perform peak detection. The detected peaks must satisfy two constraints. First, the interval between two adjacent peaks must be larger than 20 sampling points, because the breath frequency range in our system is from 0.1 Hz to 0.5 Hz and the smallest possible interval is 20 sampling points. Second, the prominence of the peaks must be larger than an empirical threshold Thr. Since the waveform is already smoothed and normalized, Thr is set to 0.05 to avoid false alarms. We then estimate the BPM and the breath interval time of each user based on the detected peaks.

V. IMPLEMENTATION AND EVALUATION

We implement RESPTRACKER on a Raspberry Pi 3B+ [18] and desktop computers using Python. The Raspberry Pi is equipped with a speaker and a 6-mic circular microphone array [19] to transmit and receive acoustic signals at a sample rate of 48 kHz.
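As an illustration of the smoothing, normalization, and peak-detection steps of Section D, the following is a minimal sketch and not the RESPTRACKER implementation. Several points are assumptions: the waveform rate `FS_HZ` of 10 Hz is inferred from the statement that 20 sampling points correspond to the shortest breath period (2 s at 0.5 Hz); the moving-average window is a fixed parameter here, whereas the paper selects it with an SVM; the spacing constraint is treated as "at least 20 samples"; and all function names are hypothetical.

```python
import math

FS_HZ = 10.0       # ASSUMED waveform rate: 20 samples <-> 2 s period at 0.5 Hz
MIN_DISTANCE = 20  # minimum spacing between peaks, in samples (from the text)
THR = 0.05         # prominence threshold Thr (from the text)


def moving_average(x, window):
    """Centered moving average; the window shrinks at the signal edges."""
    half = window // 2
    return [sum(x[max(0, i - half): i + half + 1])
            / len(x[max(0, i - half): i + half + 1]) for i in range(len(x))]


def min_max_normalize(x):
    """Scale the waveform linearly into [0, 1]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0.0] * len(x)
    return [(v - lo) / (hi - lo) for v in x]


def prominence(x, i):
    """Height of peak i above the higher of its left and right bases, where
    each base is the lowest point before a taller sample (or the edge)."""
    left_min, j = x[i], i - 1
    while j >= 0 and x[j] <= x[i]:
        left_min = min(left_min, x[j])
        j -= 1
    right_min, j = x[i], i + 1
    while j < len(x) and x[j] <= x[i]:
        right_min = min(right_min, x[j])
        j += 1
    return x[i] - max(left_min, right_min)


def detect_peaks(x, min_distance=MIN_DISTANCE, thr=THR):
    """Local maxima with prominence above thr, at least min_distance apart."""
    cands = [i for i in range(1, len(x) - 1)
             if x[i - 1] < x[i] >= x[i + 1] and prominence(x, i) > thr]
    kept = []
    # Enforce the spacing constraint greedily, tallest peaks first.
    for i in sorted(cands, key=lambda k: x[k], reverse=True):
        if all(abs(i - k) >= min_distance for k in kept):
            kept.append(i)
    return sorted(kept)


def breath_stats(peaks, fs=FS_HZ):
    """Return (BPM, breath intervals in seconds) from peak indices."""
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    if not intervals:
        return None, []
    return 60.0 / (sum(intervals) / len(intervals)), intervals


if __name__ == "__main__":
    # Synthetic 0.25 Hz breathing tone with a small 3 Hz ripple, 60 s long.
    raw = [math.sin(2 * math.pi * 0.25 * t / FS_HZ)
           + 0.1 * math.sin(2 * math.pi * 3.0 * t / FS_HZ)
           for t in range(int(60 * FS_HZ))]
    smooth = min_max_normalize(moving_average(raw, window=5))
    bpm, intervals = breath_stats(detect_peaks(smooth))
    print(round(bpm, 1), len(intervals))
```

On a synthetic 0.25 Hz tone such as the one above, the estimate should come out close to 15 BPM. Real respiration waveforms are far less regular, so the fixed window used here is only a stand-in for the adaptive filter length described in the text.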
The captured sound signal is sent to a PC through Wi-Fi in real time for further processing. The ground truth of the respiration signal is collected through a Vernier respiration belt [20] that measures the pressure of the chest. As the subjects inhale and exhale, the sensor records the pressure change of the belt caused by the chest. The devices used in our experiments are shown in Figure 5.

We use two key metrics to evaluate the performance of RESPTRACKER. The first metric is the BPM, which indicates the average frequency of the breath. The second metric is the breath interval, which gives more detailed information about each inhale and exhale and is vital for diagnosing chronic diseases. We recruit five volunteers in our evaluation, who are healthy graduate students aged 21 to 24. During the evaluation, all subjects are asked to breathe normally, and we find that the resulting BPM is in the range of 10 to 20, which is consistent with the breath rates of healthy people. For each experiment set, we repeat the processing 10 times and use the average error as the experimental error to reduce the impact of the random sampling process in our algorithm.

A. Experiments in the Single User Scenario

Effective range: To evaluate the breath detection range of RESPTRACKER, we conduct experiments in the hallway shown in Figure 6, at different distances from 0.5 m to 4.0 m. At each distance, we collect two minutes of breathing data in each of five repetitions for each subject. To compare with existing beamforming schemes, we implement a delay-and-sum algorithm to process and combine the same data. In the delay-and-sum scheme, we conduct the delay process in the frequency domain and reuse the intermediate data from the demodulation process to reduce the computational cost.
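The frequency-domain delay step of a delay-and-sum beamformer can be sketched as follows. This is a generic illustration under stated assumptions, not the baseline used in the evaluation: the per-microphone delays are passed in directly rather than derived from a distance, elevation, and azimuth; the reuse of demodulation data is not reproduced; a naive DFT stands in for an FFT library so that the example has no dependencies; and the names `dft`, `delay_and_sum`, and `circular_delay` are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; an FFT library would normally be used instead."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def delay_and_sum(channels, delays):
    """Align each channel with a frequency-domain phase shift, then average.

    channels: equal-length sample lists, one per microphone.
    delays:   arrival delay of each channel in samples (may be fractional).
    """
    n = len(channels[0])
    summed = [0j] * n
    for x, tau in zip(channels, delays):
        spectrum = dft(x)
        for k in range(n):
            # Signed frequency index, so bins above Nyquist are negative.
            f = k if k <= n // 2 else k - n
            # Multiplying by exp(+j*2*pi*f*tau/n) advances the channel by tau.
            summed[k] += spectrum[k] * cmath.exp(2j * math.pi * f * tau / n)
    aligned = dft(summed, inverse=True)
    return [v.real / len(channels) for v in aligned]


def circular_delay(x, d):
    """Delay x by d whole samples with wrap-around (for the demo only)."""
    return list(x) if d == 0 else x[-d:] + x[:-d]


if __name__ == "__main__":
    source = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
    delays = [0, 2, 5]
    mics = [circular_delay(source, d) for d in delays]
    combined = delay_and_sum(mics, delays)
    print(max(abs(a - b) for a, b in zip(combined, source)))
```

Because a delay is a linear phase term in the frequency domain, fractional-sample delays need no interpolation, which is one common reason to apply the delays to spectra rather than to time-domain samples.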
To find the proper elevation and azimuth, we first search with a stride of 10 degrees to find a coarse-grained elevation and azimuth, and then fix the elevation and search with a stride of 1 degree around the found azimuth to find the final parameters. For each combination of these parameters, we calculate the breath SNR for all the possible paths to find the combination that maximizes the SNR. After that, we get the combined path of the six microphones and then calculate the BPM and breath interval time in a similar way as in RESPTRACKER.

Figure 7 shows that RESPTRACKER achieves an acceptable measurement error of less than 1 BPM at a distance of 3.0 m. RESPTRACKER outperforms the traditional delay-and-sum method in most cases. This is because the delay-and-sum scheme can only combine the received data at one distance, elevation, and azimuth. When the reflected signal is quite weak, the single path set used in delay-and-sum is unstable. In comparison, RESPTRACKER combines multiple path sets and can therefore enhance the reflected signal multiple times: its mean absolute error within 3.0 m is less than 0.95 BPM, while the error of delay-and-sum at 3.0 m is 1.60 BPM. The measurement errors for the breath interval time are shown in Figure 7(b). The measurement error increases rapidly at long distances because ultrasound attenuates quickly in the indoor environment. Nevertheless, RESPTRACKER can still work reliably at a distance of three meters.

Figure 8 further shows the details of the reconstructed waveform. Note that the respiration belt can only detect the inhale, because the measured chest pressure is always non-negative, whereas the acoustic signal can detect both the inhale and the exhale movements.

Robustness: To evaluate the robustness of our system in different environments, we conduct experiments at different locations in typical indoor environments, including a hallway, an office room, a conference room, and a student apartment. Figure 6 shows the sample experimental environments.
In each environment, we choose four different locations based on the environment's conditions, including facing a wall, parallel to a wall, facing a corner, and in the middle of the room, to cover different reflection conditions. During these experiments, the distance between the user and the microphone/speaker is fixed