WO2024223361A1

WO2024223361A1 - Method for obtaining spectral data and sensor

Info

Publication number: WO2024223361A1
Application number: PCT/EP2024/060249
Authority: WO
Inventors: Jérôme MAYE; Hugues SALAMIN; Michael Notter
Original assignee: Ams Osram Asia Pacific Pte Ltd
Current assignee: Ams Osram Asia Pacific Pte Ltd
Priority date: 2023-04-25
Filing date: 2024-04-16
Publication date: 2024-10-31
Anticipated expiration: 2025-10-25

Abstract

A method for obtaining spectral data is specified herein, the method comprising the steps of:
- obtaining time series data from a detector,
- transforming the time series data to a frequency domain using a reduced set of basis functions, wherein
- for a given application of the detector, the reduced set of basis functions is selected using machine learning and feature selection.

Further, a sensor is specified herein which is configured for frequency modulated continuous wave (FMCW) light detection and ranging (LIDAR). A discrete Fourier transform (DFT), a fast Fourier transform (FFT), or a sparse FFT can be used to transform the time series data to the frequency domain. A given application of the detector preferably classifies the spectral data. Machine learning trains a classifier with at least a part of the spectral data in order to obtain a classification model for the spectral data. Feature importance investigation can be performed in order to identify which basis functions are most relevant such that the classification task can be successfully performed. The detector preferably comprises a vertical cavity surface emitting laser diode (VCSEL) that is configured for self-mixing interferometry. A sparsified transformation matrix M_n,m can be used to compute the spectral data only in the frequency band of interest between 5 kHz and 25 kHz. In other words, the elements of the transformation matrix M_n,m corresponding to frequencies ω_n outside the frequency band of interest are set to zero. This reduces the computational effort for obtaining the spectral data.

Description

2022PF02461 P2023,0237 WO N

METHOD FOR OBTAINING SPECTRAL DATA AND SENSOR

A method for obtaining spectral data and a sensor are specified herein.

At least one object of certain embodiments is to specify a method for obtaining spectral data, wherein the method has a reduced computational cost. A further object of certain embodiments is to specify a sensor that provides spectral data, wherein providing the spectral data has a reduced computational cost.

According to at least one aspect of the method for obtaining spectral data, time series data from a detector is obtained. For example, the detector provides an electrical signal that comprises the time series data. In particular, the time series data comprises data at a series of discrete time steps, and/or it comprises data that is continuous as a function of time. For example, the time series data is digital or analog. Analog time series data can be converted to digital time series data using an analog-digital converter before further processing, for example.

For example, the detector detects an electric voltage, an electric current, electromagnetic radiation, a pressure, a displacement, and/or other physical quantities or changes thereof. In particular, the detector provides time series data that is proportional to the one or more detected physical quantities or changes thereof. For example, the time series data allows changes of the respective one or more physical quantities to be tracked as a function of time.

According to at least one further aspect of the method for obtaining spectral data, the time series data is transformed to a frequency domain using a reduced set of basis functions. Here and in the following, “spectral data” refers to the time series data that has been transformed to the frequency domain.
By using a reduced set of basis functions for transforming the time series data to the frequency domain, the computational cost of the transformation can be reduced.

For example, the time series data that is transformed to the frequency domain comprises data obtained from the detector during a predetermined observation time period. The observation time period is fixed or comprises a sliding time window, for example. In particular, the sliding time window is a time interval of fixed length that moves with the current time and/or extends up to the current time. For example, the time series data obtained during the observation time period is stored and processed in order to obtain the spectral data by transforming the time series data to the frequency domain.

For example, the transformation of the time series data to the frequency domain comprises or consists of a harmonic analysis of the time series data. For example, a Fourier transformation or a discrete Fourier transformation is used to transform the time series data to the frequency domain. In particular, the time series data is transformed to the frequency domain by choosing a set of frequency dependent basis functions in the time domain and by decomposing the time series data in terms of these basis functions. In other words, the time series data can be at least approximately expressed as a linear combination of these basis functions. For example, coefficients of the respective basis functions in the linear combination correspond to the spectral data.

For example, for discrete time series data $\{x_m\}$ that comprises a number of $N$ data points obtained at corresponding time steps $t_m = mT/N$, i.e. $T$ denotes the observation time period and $m$ takes integer values from 1 to $N$, the set of basis functions can be chosen as

$$\{ e^{i \omega_n t} \},$$

with $i$ the imaginary unit, $t$ the time, and frequencies $\omega_n = 2\pi n/T$, where $n$ is an integer taking values from 0 to $N-1$, for example. For example, the time series data $\{x_m\}$ can be transformed to spectral data $\{y_n\}$ in the frequency domain via

$$y_n = \sum_{m=1}^{N} x_m \, e^{i \omega_n t_m} \qquad (1)$$

In particular, if $y_n$ in equation (1) is computed for each $n$ from 0 to $N-1$, the full set of basis functions is used for the transformation to the frequency domain. Correspondingly, the full spectral data $\{y_n\}$ corresponding to the time series data $\{x_m\}$ is obtained.

For example, the reduced set of basis functions corresponds to a subset of the full set of basis functions $\{ e^{i \omega_n t} \}$. In particular, only a reduced set of frequencies $\omega_n$ is chosen for transforming the time series data to the frequency domain. In other words, $y_n$ in equation (1) above is computed only for selected values of $n$. In particular, this reduces the computational effort, as equation (1) above needs to be evaluated only for a limited number of integers $n$.

According to at least one further aspect of the method for obtaining spectral data, the reduced set of basis functions is selected using machine learning and feature selection for a given application of the detector.

For example, the given application of the detector comprises performing a given task, such as classifying or regressing the time series data and/or the spectral data. For example, the detector is part of a distance sensor and the given application comprises classifying a distance between the sensor and an external object in terms of different distance classes. For example, the detector is part of an electroacoustic transducer and the given application comprises assigning the time series data to different sound sources.

In particular, the reduced set of frequencies is selected using machine learning and feature selection. For example, machine learning and feature selection are used to identify the frequencies that are most informative for the given task. In other words, machine learning and feature selection are used to identify a minimal subset of the spectral data $\{y_n\}$ that is necessary to perform the given task with a desired accuracy.
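The restriction of the transformation to selected frequencies can be illustrated with a minimal sketch. It assumes the sum over data points weighted by the basis functions e^{i ω_n t_m} with t_m = mT/N and ω_n = 2πn/T; the positive sign in the exponent and the absence of a normalization factor are assumptions, and the function name `reduced_dft` and the test signal are hypothetical.

```python
import cmath
import math


def reduced_dft(x, selected_n, period=1.0):
    """Evaluate the transform only for the frequency indices in selected_n.

    x          -- time series samples x_1..x_N (x[0] holds x_1)
    selected_n -- iterable of integer frequency indices n (the reduced set)
    period     -- observation time period T
    """
    num_points = len(x)
    spectrum = {}
    for n in selected_n:
        omega_n = 2.0 * math.pi * n / period
        total = 0j
        for m in range(1, num_points + 1):
            t_m = m * period / num_points
            total += x[m - 1] * cmath.exp(1j * omega_n * t_m)
        spectrum[n] = total
    return spectrum


# Hypothetical test signal: a cosine with 3 cycles per observation period.
N = 64
signal = [math.cos(2.0 * math.pi * 3 * m / N) for m in range(1, N + 1)]
result = reduced_dft(signal, selected_n=[2, 3, 4])
```

Only three of the 64 possible coefficients are computed here; the coefficient at n = 3 carries the signal, while those at n = 2 and n = 4 vanish up to rounding.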
Selecting the frequencies by machine learning and feature selection allows the number of basis functions or frequencies to be reduced in a targeted way and thus reduces the computational effort when transforming the time series data to the frequency domain.

For example, the given application requires a specific task to be performed, such as classifying the time series data or a regression of the time series data. In particular, spectral data obtained by transforming at least a part of the time series data to the frequency domain using the full set of basis functions, i.e. the full spectral data, is used as training data for a machine learning model. For example, the training data is used to train the machine learning model such that it performs the specific task with a desired accuracy. For example, the task may be to detect if a most prominent peak in the spectral data is above or below a given frequency.

Once the training of the machine learning model is completed, the feature selection, for example a feature importance investigation, can be performed in order to identify which basis functions or frequencies are most relevant such that the task can be successfully performed, for example. Based on this feature selection the number of basis functions or frequencies can be reduced, for example, such that the reduced set of basis functions is obtained. In particular, only a fixed number of the most important basis functions as identified by the feature importance investigation are used for the transformation of time series data to the frequency domain during an intended use of the detector. This fixed number is determined by a minimal required model performance, for example. Subsequently, the machine learning model can be retrained using only the reduced set of basis functions, for example.
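The step from importance scores to a reduced set can be sketched as follows. This is only an illustration under stated assumptions: the function `select_reduced_set`, the callback `evaluate` (standing in for retraining and scoring a model on the given frequencies), and all numeric values are hypothetical and not taken from the description.

```python
def select_reduced_set(importance, evaluate, min_accuracy):
    """Return the smallest top-ranked set of frequency indices reaching min_accuracy.

    importance   -- dict mapping frequency index n -> importance score
    evaluate     -- hypothetical callback: list of indices -> accuracy of a
                    model retrained on those frequencies only
    min_accuracy -- minimal required model performance
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    for k in range(1, len(ranked) + 1):
        candidate = ranked[:k]
        if evaluate(candidate) >= min_accuracy:
            return sorted(candidate)
    return sorted(ranked)  # fall back to the full set of frequencies


# Made-up scores and a stand-in accuracy function, for illustration only.
scores = {0: 0.01, 1: 0.40, 2: 0.05, 3: 0.30, 4: 0.02}


def toy_accuracy(indices):
    return min(1.0, 0.5 + sum(scores[n] for n in indices))


reduced = select_reduced_set(scores, toy_accuracy, min_accuracy=0.95)
```

With these toy numbers, the two highest-ranked frequencies already meet the required accuracy, so the reduced set contains the indices 1 and 3.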
In particular, machine learning and feature selection are performed before the intended use of the detector for the given application, for example during manufacturing and/or calibrating of the detector or of a sensor comprising the detector. For example, during the intended use of the detector for the given application, the spectral data is obtained only using the reduced set of basis functions or frequencies. Accordingly, the computational cost for obtaining the spectral data during the intended use of the detector for the given application can be lowered.

According to an embodiment, the method for obtaining spectral data comprises the steps of:
- obtaining time series data from the detector,
- transforming the time series data to the frequency domain using the reduced set of basis functions, wherein
- for a given application of the detector, the reduced set of basis functions is selected using machine learning and feature selection.

In particular, the method for obtaining spectral data is based on a machine learning informed sparse frequency decomposition.

The method disclosed herein is based on the idea of reducing the computational cost of obtaining spectral data. For example, low-cost applications on edge devices, such as electroacoustic transducers, light detection and ranging (short: LIDAR) sensors, and/or self-mixing interferometry (short: SMI) sensors, may require analyzing time series data in the frequency domain. In particular, the time series data may be easier to analyze in the frequency domain than in the time domain. However, the computational cost of fully transforming the time series data to the frequency domain may be too high. For example, a processor, such as a field programmable gate array (short: FPGA) or an application specific integrated circuit (short: ASIC), that is integrated with the sensor lacks the computational resources for a full transformation of the time series data to the frequency domain.
For example, a discrete Fourier transform (short: DFT), a fast Fourier transform (short: FFT), or a sparse FFT can be used to transform the time series data to the frequency domain. For time series data comprising N data points, the full transformation to the frequency domain requires a number of computations proportional to N² in the case of DFT, or proportional to N*log(N) in the case of FFT, for example. However, on low-cost devices with very low computational power and/or memory, the application of DFT, FFT, or sparse FFT might not be feasible or possible.

By transforming the time series data to the frequency domain using the reduced set of basis functions, as described above, the number of required computations can be set to any desired value M. Accordingly, the computational complexity, a time requirement for the transformation to the frequency domain, and/or an energy consumption of the transformation can be advantageously reduced. Moreover, the computational cost can be reduced without reducing a frequency resolution of the spectral data, for example.

According to at least one further aspect of the method, transforming the time series data to the frequency domain comprises a discrete Fourier transformation, DFT, with the reduced set of basis functions. For example, the time series data comprises data at a series of discrete time steps within the observation time period. In particular, the spectral data in the frequency domain is obtained from the time series data via equation (1) stated above, whereby the reduced set of basis functions comprises the functions $e^{i \omega_n t}$ corresponding to the reduced set of frequencies $\omega_n$ as determined by machine learning and feature selection.

According to at least one further aspect of the method, the reduced set of basis functions comprises cosine functions and/or sine functions depending on time and on a reduced set of frequencies.
For example, the reduced set of basis functions comprises the functions $\cos(\omega_n t)$ and/or $\sin(\omega_n t)$, where $t$ denotes time and the frequencies $\omega_n$ are selected from the reduced set of frequencies as determined by machine learning and feature selection.

According to at least one further aspect of the method, the time series data to be transformed to the frequency domain comprises at least 1000 data points, or at least 10000 data points, at a series of subsequent time steps. In other words, $N$ in equation (1) is equal to or larger than 1000, or equal to or larger than 10000. It is also possible that the number of data points is smaller than 1000, depending on the detector and/or the given application.

In particular, the full transformation to the frequency domain would comprise a number of different frequencies that is equal to the number of data points, e.g. at least 1000 or 10000. By reducing the number of frequencies in the reduced set of frequencies to a number significantly smaller than $N$ using machine learning and feature selection, the computational cost for computing the spectral data can be substantially reduced.

According to at least one further aspect of the method, the reduced set of frequencies comprises at most 50, preferably at most 20 different frequencies. For example, the reduced set of frequencies may comprise 10 different frequencies. For example, the number of data points in the time series data is larger than the number of frequencies in the reduced set of frequencies by at least a factor of 10, preferably by at least a factor of 100, and particularly preferably by at least a factor of 1000.

According to at least one further aspect of the method, the given application comprises classifying the spectral data. For example, the spectral data is classified into at least two different classes.
As the spectral data comprises at least a part of the time series data, the given application can also comprise classifying the time series data. The given application can also comprise a regression of the spectral data and/or the time series data, for example.

According to at least one further aspect of the method, machine learning comprises training a classifier with at least a part of the spectral data in order to obtain a classification model for the spectral data. In particular, the classifier is trained with the Fourier transformed time series data, wherein the training is performed before the intended use of the detector for its given application. For example, the classifier is trained during or after manufacturing of the detector, or during or after a calibration of the detector. Machine learning can also comprise training a regression algorithm with at least part of the spectral data in order to obtain a regression model for the spectral data, for example.

In particular, the classifier or regression algorithm is trained using the spectral data obtained by Fourier transforming at least a part of the time series data using the full set of basis functions or the full set of frequencies, rather than the reduced set of basis functions or the reduced set of frequencies. In other words, the classifier or regression algorithm is trained with the full spectral data comprising a full frequency range, for example, rather than a reduced frequency range that corresponds to the reduced set of frequencies.

According to at least one further aspect of the method, the classifier or regression algorithm comprises a random forest classifier or a support vector machine. In particular, the random forest classifier is a machine learning model that comprises a plurality of decision trees that are trained using a bootstrap aggregating algorithm.
The bootstrap aggregating algorithm randomly draws subsets from the training data for training the respective decision trees, for example. An ensemble classifier, such as a majority vote, averages predictions from the individual decision trees, for example.

In particular, the support vector machine is a machine learning model for binary classification that is trained using a supervised machine learning algorithm. For example, the training data comprises d-dimensional data points in specified classes and the support vector machine finds an optimal (d-1)-dimensional hyperplane that separates the training data points according to their classes. For example, the dimension d of each data point is equal to the number of different frequencies in the spectral data.

According to at least one further aspect of the method, the feature selection comprises a feature importance investigation of a machine learning model that was trained with at least a part of the spectral data. For example, the feature importance investigation comprises computing an importance score for each frequency of the spectral data that is an input of the machine learning model. In other words, each input of the machine learning model corresponds to a specific basis function or frequency and the importance score is computed for each basis function or frequency.

In particular, the importance score represents an importance of each input of the machine learning model. In other words, the importance score measures a correlation between the respective input of the machine learning model and its output. This correlation can be quantified by measures such as mutual information, Pearson correlation coefficients, or Gini importance, for example.
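One of the measures named above, the Pearson correlation coefficient, can serve as a simple importance score per frequency. The following is a minimal sketch, assuming numeric class labels and spectral magnitudes as inputs; the function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant input carries no information about the label
    return cov / math.sqrt(var_a * var_b)


def importance_scores(spectra, labels):
    """Score each frequency by |Pearson correlation| with the class label.

    spectra -- list of samples, each a list of spectral magnitudes per frequency
    labels  -- list of numeric class labels (for example 0 and 1)
    """
    num_freqs = len(spectra[0])
    return [abs(pearson([s[n] for s in spectra], labels)) for n in range(num_freqs)]


# Made-up toy data: frequency 1 tracks the label, frequency 0 does not,
# and frequency 2 is constant.
toy_spectra = [[1.0, 0.1, 5.0], [2.0, 0.9, 5.0], [2.0, 0.2, 5.0], [1.0, 1.0, 5.0]]
toy_labels = [0, 1, 0, 1]
scores = importance_scores(toy_spectra, toy_labels)
```

In this toy data set the second frequency receives by far the highest score, so it would be the first candidate for the reduced set.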
In particular, the importance score is high if the input has a large effect on the accuracy of the trained machine learning model, whereas the importance score is low if the input has a small or negligible effect on the accuracy of the trained machine learning model.

According to at least one further aspect of the method, the reduced set of basis functions is selected such that their feature importance is higher than the feature importance of basis functions outside the reduced set. In particular, only the basis functions or frequencies with the largest importance score are included in the reduced set of basis functions or frequencies. For example, the total number of basis functions or frequencies in the reduced set is chosen such that a minimal desired accuracy for performing the given task is obtained.

According to at least one further aspect of the method, the feature importance investigation comprises a recursive feature elimination and/or a permutation feature importance measurement.

For example, during recursive feature elimination, the inputs of the machine learning model with the lowest importance score are iteratively removed until a specified number of inputs, i.e. a specified number of basis functions or frequencies in the reduced set, remains. For example, the basis function or frequency with the lowest importance score is removed and the machine learning model is trained anew without the removed basis function or frequency. Subsequently new importance scores are computed. These steps are repeated until the specified number of basis functions or frequencies in the reduced set remain, and/or the desired minimal accuracy is obtained.

In particular, the permutation feature importance measurement determines the importance score by permuting inputs, e.g. basis functions or frequencies, of the machine learning model.
For example, if the permutation increases an error of the machine learning model, the importance score is high, and vice versa.

According to at least one further aspect of the method, the time series data is downsampled before it is transformed to the frequency domain. For example, the time series data is sparsified by keeping only data points from each second, third, fourth or n-th time step within the observation time window. Accordingly, downsampling reduces the number of data points in the time series data corresponding to the given observation time window, thereby reducing the computational effort for transforming the time series data to the frequency domain.

According to at least one further aspect of the method, the downsampling is performed at a frequency corresponding to twice a maximum frequency of the basis functions in the reduced set of basis functions. For example, using machine learning and feature selection as described above, the reduced set of frequencies was determined beforehand. By sparsifying the time series data such that the maximum frequency in the reduced set of frequencies can still be resolved from the sparsified time series data, the computational complexity of obtaining the spectral data for the given application can be further reduced. Alternatively or in addition, the downsampling can be performed such that a minimal frequency distance between neighboring frequencies in the reduced set of frequencies can be resolved from the downsampled time series data.

According to at least one further aspect of the method, a Boolean data type is used for processing the time series data and/or the basis functions when transforming the time series data to the frequency domain. For example, the time series data and/or the basis functions are stored using a Boolean, Integer, or Float datatype.
By using an Integer or Boolean datatype, for example, memory requirements and/or a computational cost of the transformation of the time series data to the frequency domain can be reduced. In particular, the data type can be adapted to the signal of the detector. For example, if the detector provides digital time series data with 8-bit or 16-bit resolution, the Fourier transform can be computed using a corresponding data type with the same or a similar bit resolution. Depending on the given application, it is also possible that a data type with a lower bit resolution, such as a Boolean data type, is sufficient to obtain the spectral data with sufficient accuracy and/or resolution.

Further, a sensor is specified herein. In particular, the sensor uses the method described above for obtaining spectral data. All features of the method for obtaining spectral data are also disclosed for the sensor, and vice versa.

According to at least one aspect, the sensor comprises a detector that provides time series data.

According to at least one further aspect, the sensor comprises a processing unit that transforms the time series data to a frequency domain by using a predetermined sparsified transformation matrix stored in a memory of the processing unit. For example, the processing unit comprises a field programmable gate array and/or an application specific integrated circuit.

For example, the transformation of the time series data $x_m$ to the spectral data $y_n$ in the frequency domain is performed by arranging the time series data in the form of a vector and multiplying this vector with the sparsified transformation matrix $M_{n,m}$, in particular using the equation

$$y_n = \sum_{m=1}^{N} M_{n,m} \, x_m \qquad (2)$$

For example, elements of the transformation matrix $M_{n,m}$ can be expressed in terms of the basis functions via $M_{n,m} = e^{i \omega_n t_m}$, where $\omega_n$ and $t_m$, as well as the integers $n$ and $m$, have been defined in relation to equation (1) above.

In particular, the transformation matrix is sparsified by only keeping the elements corresponding to the reduced set of basis functions or frequencies. In other words, elements of the transformation matrix that do not correspond to a frequency in the reduced set of frequencies are set to zero. That is, rows of the transformation matrix $M_{n,m}$, where the row index $n$ does not correspond to a frequency $\omega_n$ in the reduced set of frequencies, are set to zero, for example.

According to at least one further aspect of the sensor, the transformation matrix is sparsified using machine learning and feature selection for a given application of the sensor. In particular, the sparsification is performed by setting matrix elements of the transformation matrix that correspond to basis functions outside the reduced set of basis functions or that correspond to frequencies outside the reduced set of frequencies to zero. For example, machine learning and feature selection are performed as described with regard to the method disclosed above.

According to an embodiment, the sensor comprises:
- the detector providing time series data,
- the processing unit that transforms the time series data to the frequency domain by using the predetermined sparsified transformation matrix stored in the memory of the processing unit, wherein
- for a given application of the sensor, the transformation matrix is sparsified using machine learning and feature selection.

According to at least one further aspect of the sensor, the detector comprises a vertical cavity surface emitting laser diode (short: VCSEL) that is configured for self-mixing interferometry.
In particular, the VCSEL comprises a semiconductor layer stack with an active layer for converting an electrical current into electromagnetic radiation. For example, the active layer comprises a pn-junction. The semiconductor layer stack is arranged inside an optical resonator, for example. In particular, an optical axis of the optical resonator is parallel or substantially parallel to a growth direction of the semiconductor layer stack. In other words, the optical axis of the optical resonator is perpendicular or substantially perpendicular to a main extension plane of semiconductor layers of the semiconductor layer stack.

For example, the optical resonator of the VCSEL comprises a front reflector and a back reflector that are arranged on opposite main surfaces of the semiconductor layer stack. The front reflector and/or the back reflector may comprise a metallic mirror or a dielectric Bragg reflector, for example. Electromagnetic radiation generated during operation of the VCSEL may be emitted primarily via the front reflector.

In particular, the VCSEL emits coherent electromagnetic radiation during operation. Coherent electromagnetic radiation has a larger coherence length, a smaller spectral linewidth and/or a higher degree of polarization compared to incoherent electromagnetic radiation as emitted by a light emitting diode, for example. For example, the VCSEL emits electromagnetic radiation in a spectral range between infrared and ultraviolet light.

In particular, during self-mixing interferometry at least a part of the electromagnetic radiation that is emitted by the VCSEL during operation is reflected back to the VCSEL by an external object. For example, the reflected electromagnetic radiation is coupled back into the optical resonator of the VCSEL, such that an amplitude and/or a frequency of the emitted electromagnetic radiation is altered.
For example, the altered amplitude and/or frequency of the emitted electromagnetic radiation can be detected and comprises information about a distance between the detector and the external object.

According to at least one further aspect, the sensor is an optical distance sensor or an electroacoustic transducer. For example, the sensor is a LIDAR sensor or an optical microphone. For example, self-mixing interferometry can be used to measure the distance between the external object and the detector in a LIDAR sensor. For example, self-mixing interferometry can be used for measuring the displacement of a vibrating membrane in an electroacoustic transducer, wherein the vibrating membrane is the external object, for example.

According to at least one further aspect, the sensor is configured for frequency modulated continuous wave (short: FMCW) light detection and ranging. In other words, the sensor is an FMCW LIDAR sensor.

In particular, an FMCW LIDAR sensor determines a distance between the sensor and an external object by continuously emitting electromagnetic radiation whose frequency is modulated such that it increases or decreases linearly as a function of time within a given time period. The emitted electromagnetic radiation is at least partially reflected by an external object and subsequently received and detected by the sensor. By comparing the frequency of the emitted electromagnetic radiation with the frequency of the detected electromagnetic radiation, the distance to the external object can be determined, for example. For example, the emitted electromagnetic radiation and the received electromagnetic radiation are superimposed on a photodetector and a resulting beating frequency that is proportional to the distance can be determined by transforming the detector signal to the frequency domain using the method described above.
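The proportionality between beating frequency and distance can be made concrete with the textbook relation for an ideal linear frequency sweep. The sketch below is an assumption-laden illustration: the sweep bandwidth and sweep duration are parameters that the description does not specify, a stationary object is assumed, and the numeric values are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fmcw_distance(beat_frequency_hz, sweep_bandwidth_hz, sweep_duration_s):
    """Distance from the beat frequency of an ideal linear FMCW sweep.

    Assumes a stationary object (no Doppler shift) and a frequency that
    changes linearly by sweep_bandwidth_hz within sweep_duration_s.
    """
    slope = sweep_bandwidth_hz / sweep_duration_s    # hertz per second
    round_trip_delay = beat_frequency_hz / slope     # seconds
    return SPEED_OF_LIGHT * round_trip_delay / 2.0   # metres (one way)


# Hypothetical values: 1 GHz sweep in 1 ms and a measured 20 kHz beat frequency.
distance_m = fmcw_distance(20e3, 1e9, 1e-3)
```

For these illustrative numbers the beat frequency corresponds to a distance of roughly 3 m; doubling the beat frequency doubles the distance.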
According to at least one further aspect, the sensor is configured for a classification of a distance between the sensor and an external object into at least two distance classes. For example, the sensor classifies the distance as “close” or “far”, depending on the measured distance being smaller or larger than a threshold distance, respectively.

According to at least one further aspect of the sensor, the method described above is used for transforming the time series data to the frequency domain, and a matrix element of the sparsified transformation matrix corresponds to a value of a basis function of the reduced set of basis functions at given frequency and time step, wherein the given frequency and time step determine a row index and a column index of the transformation matrix, respectively. In particular, elements of the transformation matrix are determined via $M_{n,m} = e^{i \omega_n t_m}$, as disclosed with regard to equation (2) above.

Further advantageous embodiments and further embodiments of the method for obtaining spectral data and the sensor become apparent from the following exemplary embodiments described in connection with the figures.

Figure 1 illustrates a full transformation matrix for transforming time series data to the frequency domain, according to an example.

Figure 2 illustrates full spectral data obtained by transforming time series data to the frequency domain using a full set of basis functions, according to an example.

Figures 3A to 3C illustrate transformation matrices for transforming time series data to the frequency domain according to examples and to an exemplary embodiment of the sensor, respectively.

Figures 4A to 4D illustrate spectral data according to different examples and exemplary embodiments of the method for obtaining spectral data.

Figure 5 illustrates classes for classifying spectral data according to an exemplary embodiment of the method.
Figure 6 illustrates an importance score according to an exemplary embodiment of the method for obtaining spectral data.
Figure 7 illustrates an accuracy of the method for obtaining spectral data according to an exemplary embodiment.
Figure 8 shows a schematic cross section of a sensor according to an exemplary embodiment.
Elements that are identical, similar, or have the same effect are denoted by the same reference signs in the figures. The figures and the proportions of the elements shown in the figures are not to be regarded as true to scale. Rather, individual elements may be shown exaggeratedly large for better representability and/or better understanding.
Figure 1 shows a schematic contour plot of a real part Re and an imaginary part Im of an exemplary transformation matrix M_n,m of a discrete Fourier transform for transforming time series data 2 (not shown) to the frequency domain in order to obtain spectral data 1 (not shown). The column index m of the transformation matrix M_n,m corresponds to different data points of the time series data 2 obtained at a series of subsequent time steps t_m, whereas the row index n of the transformation matrix M_n,m corresponds to different frequencies ω_n in the frequency domain. In particular, Figure 1 illustrates a full, non-sparsified transformation matrix M_n,m with N×N entries corresponding to time series data 2 including N=200 data points at different time steps t_m and spectral data 1 (not shown) comprising N=200 different frequencies ω_n. Computing the spectral data 1 from the time series data 2 comprises a matrix multiplication of the time series data 2 with the transformation matrix M_n,m. Accordingly, the transformation to the frequency domain requires a number of computations that is proportional to N².
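As a non-limiting illustration, the matrix multiplication with a full N×N transformation matrix can be sketched as follows. The sketch assumes matrix elements of the form exp(i ω_n t_m) on the standard discrete Fourier frequency grid. The sign of the exponent is an assumption of this sketch and does not affect the power of a real-valued time series. The sample rate, the test tone and all function names are hypothetical.

```python
import cmath
import math


def full_transformation_matrix(num_points, sample_rate_hz):
    """Return the N x N matrix with entries M[n][m] = exp(i * omega_n * t_m)."""
    times = [m / sample_rate_hz for m in range(num_points)]
    omegas = [2.0 * math.pi * n * sample_rate_hz / num_points
              for n in range(num_points)]
    return [[cmath.exp(1j * omega * t) for t in times] for omega in omegas]


def to_frequency_domain(matrix, samples):
    """Matrix-vector product: one multiply-accumulate per matrix element."""
    return [sum(element * x for element, x in zip(row, samples))
            for row in matrix]


N = 200
SAMPLE_RATE_HZ = 2_000_000.0   # hypothetical value for this sketch
matrix = full_transformation_matrix(N, SAMPLE_RATE_HZ)
# Test tone placed exactly on row n = 10 of the frequency grid.
samples = [math.cos(2.0 * math.pi * 10 * m / N) for m in range(N)]
spectrum = to_frequency_domain(matrix, samples)
power = [abs(value) ** 2 for value in spectrum]
multiplications = sum(len(row) for row in matrix)   # N * N
```

For N=200 the product touches 200 × 200 = 40000 matrix elements, which is the quadratic cost that the sparsification described herein aims to avoid.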
Figure 2 illustrates spectral data 1 in terms of a power spectral density P as a function of frequency ω_n that was obtained by multiplying the time series data 2 (not shown) with the full, non-sparsified transformation matrix M_n,m. In other words, Figure 2 shows spectral data as obtained by a discrete Fourier transformation using a full set of basis functions. In particular, the time series data 2 (not shown) comprises N=10000 data points recorded at a sampling rate of 2 MHz. Figure 2 shows only the limited frequency range between 0 kHz and 100 kHz of the spectral data 1, which actually extends up to 1 MHz.
As an example, for a given application the frequencies of interest could be located in a frequency band between 5 kHz and 25 kHz. Using a standard approach, an FFT could be used to compute the full spectral data 1 between 0 MHz and 1 MHz and subsequently the frequency band of interest between 5 kHz and 25 kHz would be analyzed, for example. However, this would require a lot of unnecessary computational effort to determine the spectral data 1 outside the frequency band of interest. Similarly, using a sparse FFT for example, coarse spectral data 1 with a low frequency resolution could be computed first and subsequently the spectral data 1 in the vicinity of the dominant peak around 45 kHz, which may be a noise component, would be computed with an increased frequency resolution. However, the sparse FFT approach would not be suitable to analyze the frequency band of interest between 5 kHz and 25 kHz in sufficient detail.
By contrast, the method specified herein makes it possible to focus on the frequency band of interest and to compute the spectral data 1 within this frequency band with high resolution. For example, it is also possible that the frequencies of interest are distributed over multiple, separated frequency bands.
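As a non-limiting illustration, the figures quoted above (N=10000 data points, a sampling rate of 2 MHz and a frequency band of interest between 5 kHz and 25 kHz) can be turned into a short worked calculation. The sketch assumes that the frequencies ω_n lie on the standard discrete Fourier grid with a spacing equal to the sampling rate divided by N. The variable names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 2_000_000
NUM_POINTS = 10_000
BAND_HZ = (5_000, 25_000)

bin_spacing_hz = SAMPLE_RATE_HZ / NUM_POINTS     # spacing of the frequency grid
nyquist_hz = SAMPLE_RATE_HZ / 2                  # upper end of the full spectrum
first_row = math.ceil(BAND_HZ[0] / bin_spacing_hz)
last_row = math.floor(BAND_HZ[1] / bin_spacing_hz)
rows_in_band = last_row - first_row + 1

# Multiply-accumulate counts of a plain matrix-vector product.
full_matrix_products = NUM_POINTS * NUM_POINTS
band_only_products = rows_in_band * NUM_POINTS
```

Under these assumptions the grid spacing is 200 Hz, the full spectrum extends to 1 MHz, and 101 rows of the transformation matrix fall within the frequency band of interest.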
The method described herein would first identify the most relevant frequencies ω_n and only then would compute the spectral data 1 corresponding to these specific frequencies ω_n during an intended use of the detector 3 and/or sensor 10.
Figures 3A to 3C illustrate the real part of different transformation matrices M_n,m. While Figure 3A shows an example of the full, non-sparsified transformation matrix M_n,m as in Figure 1, Figure 3B shows an example of a sparsified transformation matrix M_n,m that can be used to compute the spectral data 1 only in the frequency band of interest between 5 kHz and 25 kHz, in this example. In other words, the elements of the transformation matrix M_n,m in Figure 3B corresponding to frequencies ω_n outside the frequency band of interest are set to zero. This already reduces the computational effort for obtaining the spectral data 1.
Figure 3C illustrates a sparsified transformation matrix M_n,m according to an exemplary embodiment of the method for obtaining spectral data 1. In particular, machine learning and feature selection have been used to identify a reduced set 4 of relevant frequencies ω_n that are located in the frequency band of interest. Only matrix elements of the transformation matrix M_n,m corresponding to this reduced set 4 of frequencies ω_n are kept, while all other matrix elements of the transformation matrix M_n,m are set to zero. Compared to the transformation matrix M_n,m shown in Figure 3B, this further reduces the computational cost for obtaining the spectral data 1.
Imaginary parts of the respective transformation matrices M_n,m are not shown in Figures 3A to 3C. However, the imaginary parts of the respective transformation matrices M_n,m can have a similar structure as the respective real parts, in particular with regard to the frequency sparsification described in connection with Figures 3A to 3C.
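As a non-limiting illustration, a transformation restricted to a reduced set of frequencies can be sketched by storing and multiplying only the non-zero rows of the transformation matrix. The sketch assumes matrix elements of the form exp(i ω_n t_m). The selected frequencies, the sample count and the function names are hypothetical and stand in for a reduced set that would be obtained by feature selection.

```python
import cmath
import math


def reduced_rows(frequencies_hz, num_points, sample_rate_hz):
    """Store only the rows of M_n,m that belong to the reduced set."""
    return {
        f: [cmath.exp(2j * math.pi * f * m / sample_rate_hz)
            for m in range(num_points)]
        for f in frequencies_hz
    }


def reduced_spectrum(rows, samples):
    """Multiply the time series with the stored rows only; zero rows are skipped."""
    return {f: sum(element * x for element, x in zip(row, samples))
            for f, row in rows.items()}


SAMPLE_RATE_HZ = 2_000_000.0
NUM_POINTS = 1_000
SELECTED_HZ = [6_000.0, 12_000.0, 18_000.0]   # hypothetical reduced set

rows = reduced_rows(SELECTED_HZ, NUM_POINTS, SAMPLE_RATE_HZ)
samples = [math.cos(2.0 * math.pi * 12_000.0 * m / SAMPLE_RATE_HZ)
           for m in range(NUM_POINTS)]
spectrum = reduced_spectrum(rows, samples)
products = len(rows) * NUM_POINTS   # 3 * 1000 instead of 1000 * 1000
```

Because rows that are set to zero contribute nothing to the result, they need neither be stored nor multiplied, so the cost scales with the size of the reduced set rather than with N.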
Figure 4A shows the same spectral data 1 as in Figure 2 that was obtained from the time series data 2 by a discrete Fourier transformation using the full set of frequencies ω_n. In particular, the spectral data 1 was computed by storing and processing both the time series data 2 and the transformation matrix M_n,m using a Float32 data type, i.e. a single-precision floating-point data format. In other words, each data point of the time series data 2 as well as each matrix element of the transformation matrix M_n,m is stored and processed using a resolution of 32 bits in a floating-point data format.
By contrast, Figure 4B shows the data of Figure 4A together with the resulting spectral data 1, if the time series data 2 is processed using a Boolean data type, i.e. using a resolution of only a single bit, whereas the transformation matrix M_n,m is stored and processed using the Float32 data type.
Figure 4C shows the data of Figure 4A together with the resulting spectral data 1, if the time series data 2 is processed using the Float32 data type, whereas the transformation matrix M_n,m is stored and processed using the Boolean data type.
Finally, Figure 4D shows the data of Figure 4A together with the resulting spectral data 1, if both the time series data 2 and the transformation matrix M_n,m are stored and processed using the Boolean data type with a single bit resolution. Figure 4D shows that transforming the time series data 2 to the frequency domain using a data format with a single bit resolution for the time series data 2 as well as for the transformation matrix M_n,m makes it possible to compute the spectral data 1 with sufficient accuracy, depending on the given application. Using the data format with the single bit resolution further reduces the computational cost for obtaining the spectral data 1.
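As a non-limiting illustration, processing with a single bit resolution can be sketched by keeping only the sign of each data point and of each cosine and sine value. The sign-based quantization rule, the test tone, the candidate frequencies and the function names are assumptions of this sketch. The application does not specify how the Boolean values are derived.

```python
import math


def sign_bits(values):
    """Quantize to a single bit: True for non-negative values, False otherwise."""
    return [v >= 0.0 for v in values]


def agreement_sum(bits_a, bits_b):
    """Sum of products when True/False are read as +1/-1 (an XNOR count)."""
    return sum(1 if a == b else -1 for a, b in zip(bits_a, bits_b))


def one_bit_power(sample_bits, frequency_hz, sample_rate_hz):
    """Power estimate from one-bit samples and one-bit cosine and sine rows."""
    phases = [2.0 * math.pi * frequency_hz * m / sample_rate_hz
              for m in range(len(sample_bits))]
    real = agreement_sum(sample_bits, sign_bits([math.cos(p) for p in phases]))
    imag = agreement_sum(sample_bits, sign_bits([math.sin(p) for p in phases]))
    return real * real + imag * imag


SAMPLE_RATE_HZ = 2_000_000.0
NUM_POINTS = 1_000
# Hypothetical test tone at 12 kHz with an arbitrary phase offset.
samples = [math.cos(2.0 * math.pi * 12_000.0 * m / SAMPLE_RATE_HZ + 0.3)
           for m in range(NUM_POINTS)]
sample_bits = sign_bits(samples)
candidates_hz = [6_000.0, 12_000.0, 18_000.0, 24_000.0]
powers = {f: one_bit_power(sample_bits, f, SAMPLE_RATE_HZ) for f in candidates_hz}
strongest_hz = max(powers, key=powers.get)
```

In this sketch each multiplication reduces to a comparison of two bits. One-bit rows behave like square waves and therefore also respond to odd harmonics, which is one reason why the achievable accuracy depends on the given application.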
Figure 5 shows classes 81, 82, 83 for classifying spectral data 1 according to an exemplary embodiment of the method for obtaining spectral data 1. The method is used in a distance sensor, such as an FMCW LIDAR sensor, and the given application or task comprises detecting whether an external object is “close” or “far” from the detector. In particular, the spectral data is classified as “close” 81, if the distance D between the sensor and the external object is between 50 cm and 125 cm, and the spectral data is classified as “far” 82, if the distance D is between 150 cm and 350 cm. Distances D between 125 cm and 150 cm are classified as “uncertain” 83. The method comprises training a random forest classifier with spectral data 1 labelled as “close”, “uncertain” or “far” in order to perform the given task. In particular, the spectral data 1 for training the random forest classifier comprises full spectral data obtained by transforming the time series data 2 to the frequency domain using the full set of basis functions. Moreover, the Float32 data type was used for processing the time series data 2 and the transformation matrix M_n,m.
Figure 6 shows an importance score I as a function of frequency ω_n for the random forest classifier described in connection with Figure 5. In particular, the three panels of Figure 6 show the importance scores I for the random forest classifier that were obtained using three different training data sets. The importance scores I were obtained using a permutation feature importance measurement. Figure 6 shows that the most important frequencies ω_n in order to determine if the external object is in the “close” 81, “uncertain” 83 or “far” 82 class are between 0 and 40 kHz. For example, in the upper panel of Figure 6 the three most important frequencies ω_n for performing the classification described in connection with Figure 5 are marked by an ellipse.
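As a non-limiting illustration, a permutation feature importance measurement can be sketched as the mean drop in accuracy that occurs when the values of one feature are shuffled across the data set. In this sketch a hand-written threshold rule stands in for the trained random forest classifier, and the data are synthetic. The function names, the data and the two-class setup are hypothetical and are not results of this application.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the given label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    scores = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        scores.append(sum(drops) / repeats)
    return scores


# Synthetic stand-in data: three spectral features, only feature 1 is informative.
data_rng = random.Random(1)
labels = ["close" if i % 2 == 0 else "far" for i in range(200)]
rows = [[data_rng.random(),
         (1.0 if label == "close" else 0.0) + data_rng.uniform(-0.2, 0.2),
         data_rng.random()]
        for label in labels]


def threshold_model(row):
    """Hand-written stand-in for a trained classifier; it reads feature 1 only."""
    return "close" if row[1] > 0.5 else "far"


scores = permutation_importance(threshold_model, rows, labels)
most_important = max(range(len(scores)), key=scores.__getitem__)
```

Features whose shuffling leaves the accuracy unchanged receive a score of zero, whereas features on which the model relies receive a high score. Applied to spectral data, each feature corresponds to one frequency ω_n.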
Figure 7 shows plots of a confusion matrix for estimating the performance of the random forest classifier described in connection with Figures 5 and 6 that was retrained with spectral data 1 restricted to the ten frequencies ω_n with the highest importance scores I according to Figure 6. In other words, the reduced set 4 of frequencies ω_n for the transformation of the time series data 2 to the frequency domain comprises only the ten frequencies ω_n with the highest importance scores I according to Figure 6. Consequently, the random forest classifier was retrained using only these ten frequencies ω_n from the reduced set 4 as inputs in order to perform the classification task. In particular, the three confusion matrices shown in Figure 7 correspond to the three different training data sets described with regard to Figure 6. The diagonal entries of the confusion matrix represent the detection accuracy for the three classes “close” 81, “uncertain” 83 and “far” 82. For example, external objects in the “close” class 81 can be detected with accuracies between 79% and 96%, whereas external objects in the “far” class 82 can be detected with accuracies between 96% and 98% using only spectral data 1 corresponding to the ten most important frequencies ω_n.
Figure 8 shows a sensor 10 according to an exemplary embodiment, comprising a silicon substrate 11 with an integrated processing unit 5 and an integrated photodetector 3. The photodetector 3 is a photodiode that is integrated into the substrate 11 and comprises a pn-junction for converting incident electromagnetic radiation 12 into a photocurrent, for example. A VCSEL 6 is arranged directly above the photodetector 3 and configured for self-mixing interferometry. In particular, the sensor 10 is an FMCW LIDAR sensor. Electromagnetic radiation 12 emitted by the VCSEL 6 is at least partially reflected by the external object 7 and coupled back into the VCSEL 6 and further into the photodetector 3.
The photodetector 3 is configured to measure a beating frequency of the emitted and received electromagnetic radiation 12 that is superimposed on the photodetector 3 and that is proportional to the distance D between the sensor 10 and the external object 7. The processing unit 5 receives time series data 2 from the photodetector 3 and transforms it to the frequency domain using a reduced set 4 of frequencies ω_n that was previously obtained by machine learning and feature selection during manufacturing of the sensor 10. For example, the sensor 10 is configured to perform the classification task described in connection with Figures 5 to 7. In particular, the matrix elements of the transformation matrix M_n,m corresponding to the reduced set 4 of frequencies ω_n are stored in a memory of the processing unit 5 and the processing unit 5 performs the matrix multiplication according to equation (2) with the sparsified transformation matrix M_n,m. The processing unit 5 computes the spectral data 1 using a predetermined data type for the time series data 2 and the transformation matrix M_n,m, such as Float32, Boolean, or an Integer data type, for example.
This patent application claims the priority of German patent application DE 102023110530.2, the disclosure content of which is hereby incorporated by reference.
The invention is not restricted to the exemplary embodiments by the description on the basis of said exemplary embodiments. Rather, the invention encompasses any new feature and also any combination of features, which in particular comprises any combination of features in the patent claims and any combination of features in the exemplary embodiments, even if this feature or this combination itself is not explicitly specified in the patent claims or exemplary embodiments.
References
1 spectral data
2 time series data
3 detector
4 reduced set
5 processing unit
6 VCSEL
7 external object
81 “close” class
82 “far” class
83 “uncertain” class
10 sensor
11 substrate
12 electromagnetic radiation
t time
t_m time step
ω_n frequency
M_n,m transformation matrix
n row index
m column index
I importance
Re real part
Im imaginary part
D distance
P power spectral density

Claims

1. Method for obtaining spectral data (1), comprising the steps of:
- obtaining time series data (2) from a detector (3),
- transforming the time series data (2) to a frequency domain using a reduced set (4) of basis functions, wherein
- for a given application of the detector (3), the reduced set (4) of basis functions is selected using machine learning and feature selection.
2. Method according to the previous claim, wherein transforming the time series data (2) to the frequency domain comprises a discrete Fourier transformation with the reduced set (4) of basis functions.
3. Method according to any of the previous claims, wherein the reduced set (4) of basis functions comprises cosine functions and/or sine functions depending on time (t) and on a reduced set (4) of frequencies (ω_n).
4. Method according to the previous claim, wherein
- the time series data (2) to be transformed to the frequency domain comprises at least 1000 data points at a series of subsequent time steps (t_m), and
- the reduced set (4) of frequencies (ω_n) comprises at most 50 different frequencies (ω_n).
5. Method according to any of the previous claims, wherein
- the given application comprises classifying the spectral data (1), and
- machine learning comprises training a classifier with at least a part of the spectral data (1) in order to obtain a classification model for the spectral data (1).
6. Method according to any of the previous claims, wherein the classifier comprises a random forest classifier or a support vector machine.
7. Method according to any of the previous claims, wherein
- the feature selection comprises a feature importance investigation of a machine learning model that was trained with at least a part of the spectral data (1), and
- the reduced set (4) of basis functions is selected such that their feature importance (I) is higher than the feature importance (I) of basis functions outside the reduced set (4).
8. Method according to the previous claim, wherein the feature importance investigation comprises a recursive feature elimination and/or a permutation feature importance measurement.
9. Method according to any of the previous claims, wherein
- the time series data (2) is downsampled before it is transformed to the frequency domain, and
- the downsampling is performed at a frequency corresponding to twice a maximum frequency (ω_n) of a basis function in the reduced set (4) of basis functions.
10. Method according to any of the previous claims, wherein a Boolean data type is used for processing the time series data (2) and/or the basis functions when transforming the time series data (2) to the frequency domain.
11. Sensor (10), comprising:
- a detector (3) providing time series data (2),
- a processing unit (5) that transforms the time series data (2) to a frequency domain by using a predetermined sparsified transformation matrix (M_n,m) stored in a memory of the processing unit (5), wherein
- for a given application of the sensor (10), the transformation matrix (M_n,m) is sparsified using machine learning and feature selection.
12. Sensor (10) according to the previous claim, wherein the detector (3) comprises a vertical cavity surface emitting laser diode (6) that is configured for self-mixing interferometry.
13. Sensor (10) according to any of claims 11 or 12, wherein the sensor (10) is an optical distance sensor or an electroacoustic transducer.
14. Sensor (10) according to any of claims 11 to 13, wherein
- the sensor (10) is configured for frequency modulated continuous wave light detection and ranging, and
- the sensor (10) is configured for a classification of a distance between the sensor (10) and an external object (7) into at least two distance classes (81, 82, 83).
15. Sensor (10) according to any of claims 11 to 14, wherein
- the method according to any of claims 1 to 10 is used for transforming the time series data (2) to the frequency domain, wherein
- a matrix element of the sparsified transformation matrix (M_n,m) corresponds to a value of a basis function of the reduced set (4) of basis functions at a given frequency (ω_n) and time step (t_m), wherein the given frequency (ω_n) and time step (t_m) determine a row index (n) and a column index (m) of the transformation matrix (M_n,m), respectively.