Digitizing data acquisition and time-of-flight pulse processing for ToF-ERDA

A versatile system to capture and analyze signals from microchannel plate (MCP) based time-of-flight detectors and ionization based energy detectors, such as silicon diodes and gas ionization chambers (GIC), is introduced. The system is based on commercial digitizers and custom software. It forms a part of a ToF-ERDA spectrometer, which has to be able to detect recoil atoms of many different species and energies. Compared to the currently used analogue electronics, the digitizing system provides comparable time-of-flight resolution and improved hydrogen detection efficiency, while allowing the operation of the spectrometer to be studied and optimized after the measurement. The hardware, data acquisition software and digital pulse processing algorithms to suit this application are described in detail.


Introduction
The data acquisition in ion beam analysis (IBA) techniques such as Rutherford backscattering spectrometry (RBS) and elastic recoil detection analysis (ERDA) almost always relies on analogue signal processing and a multichannel analyzer (MCA). This is often sufficient, since the resolution needs and count rate capabilities can in most cases be met by standard nuclear instrumentation module (NIM) based electronics.
Digital pulse shaping is a well-established field. Trapezoidal shaping [1], for example, originally developed for gamma ray spectroscopy, is a useful technique to extract high resolution pulse height information even at high count rates. Commercial silicon drift detectors (SDD) with an integrated trapezoidal pulse shaper are often used in high count rate X-ray detection, also in the world of IBA. In an RBS setup a digitizer with a trapezoidal shaper implemented using a field programmable gate array (FPGA) is a clear replacement for the traditional combination of a shaping amplifier and an MCA.
A unique challenge among ion beam analysis techniques is ToF-ERDA, where the time-of-flight measurement from two microchannel plate (MCP) pulses is usually based on a discriminator such as a constant fraction discriminator (CFD), followed by a time-to-amplitude converter (TAC) or a time-to-digital converter (TDC). Coincidences between the time-of-flight and energy detectors are found either using timestamps [2] or a hardware coincidence box. By using list-mode data acquisition, beam-induced changes to the sample can be detected. The count rate of timing detectors is ultimately limited by the recharging of the MCPs [3]. The count rate is usually kept much lower than that, typically less than 20 000 counts per second (cps), in order to reduce background due to multiple particles in the same TDC window. The width of a TDC timing window needs to be roughly 200-500 ns, depending on the incident beam, to detect low energy heavy recoils.
Gas ionization chambers (GIC) with thin entrance windows are used in increasing numbers in ToF-ERDA spectrometers as the energy detector [4,5]. In order to exploit the capabilities of even a simple planar electrode GIC, signals should also be captured from electrodes other than the anode, and advanced pulse shaping methods should be used to analyze those signals. Build-up of charge in the gas and pile-up limit the count rate of the detector to typically 1000 cps. The abovementioned count rates do not present a challenge to a digitizing system, even when full traces are captured from the MCP detectors and an energy detector. The data rate is typically some megabytes to tens of megabytes per second, and therefore it is possible to do the signal processing on a standard PC, which makes it easy to modify the processing to suit the needs of the particular setup with only software modifications. After the experimental setup has been set and MCP traces have been captured, it is possible to fine-tune the timing analysis offline. Digitizing the detector signals is also valuable for the development of the detectors, as the signals can be studied in detail after a measurement.
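The order of magnitude of the data rate quoted above can be checked with simple arithmetic. The following is a minimal sketch only: it assumes that each 10-bit sample is stored in two bytes and it ignores event headers and the energy detector traces. The function name and its parameters are hypothetical and are not part of the acquisition software.

```python
def trace_data_rate(count_rate_cps, window_ns, sample_rate_gsps, channels,
                    bytes_per_sample=2):
    """Return the raw trace data rate in bytes per second.

    Assumption: every sample occupies `bytes_per_sample` bytes and
    per-event headers are neglected.
    """
    # nanoseconds times gigasamples per second gives samples per trace
    samples_per_trace = int(round(window_ns * sample_rate_gsps))
    bytes_per_event = samples_per_trace * channels * bytes_per_sample
    return count_rate_cps * bytes_per_event


# Two MCP channels at 2 GS/s with a 500 ns window at 1000 cps
print(trace_data_rate(1000, 500, 2.0, 2) / 1e6, "MB/s")
# Same channels with a 200 ns window at 20 000 cps
print(trace_data_rate(20000, 200, 2.0, 2) / 1e6, "MB/s")
```

Under these assumptions the two cases give 4 MB/s and 32 MB/s, in line with the range stated in the text.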
Digital timing for positron emission tomography and nuclear physics experiments using multiple scintillation detectors is already established [6,7]. Digitizers have also been used successfully in extracting high-resolution timing information from silicon detectors [8]. The large number of channels in some of these experiments necessitates pulse processing in hardware, either on an FPGA or an application specific integrated circuit (ASIC). In contrast, software based real-time analysis has been applied in positron lifetime spectrometers [9]. The task of measuring time intervals with 200 ps resolution and kHz count rates is similar to ToF-ERDA. Signals from MCP-PMTs in particular are comparable in shape to the MCP signals studied in this paper, but the variance in pulse amplitudes is typically smaller in photodetectors than in the carbon foil based timing detectors used in ERDA. The instrumentation described in this paper has also been used to study scintillator detectors by reading out signals from both traditional photomultiplier tubes (PMT) and silicon photomultipliers (SiPM); the results are unpublished.
In IBA, where the number of acquisition channels is small, usually between one and four, a commercial digitizer with proprietary FPGA firmware is a cost effective choice: the hardware itself is inexpensive and there is no need to invest time in digitizer hardware development or FPGA programming. The digitizing ToF-ERDA system developed in Jyväskylä is described in this paper, including a discussion of the limitations of this approach to data acquisition.

Experimental
The experimental setup consists of the ToF-ERDA spectrometer installed at the +15 degree beamline at the JYFL 1.7 MV Pelletron accelerator. The ToF-ERDA spectrometer consists of two carbon foil timing detectors [10], T1 and T2, for the start and stop signals, respectively, and a gas ionization chamber [5].
The digitizing data acquisition comprises two digitizers: a CAEN S.p.A. N6751 2/4 channel 2/1 GSample/s 10-bit digitizer, which digitizes pulses with a 1 Vp-p scale and an adjustable ±0.5 V DC offset, and a CAEN N6724 4 channel 100 MSample/s 14-bit digitizer with a 2.25 Vp-p input scale. The input impedance of both digitizers is 50 ohm. The setup is shown schematically in figure 1. The N6751 is used to digitize fast MCP signals in two-channel mode at 2 GS/s, and the N6724 with the CAEN DPP-PHA firmware is used to digitize GIC and Si detector signals after they have been amplified by a charge sensitive preamplifier (Amptek CoolFET). The DPP-PHA firmware runs both a trigger and a trapezoidal shaper in real time. This makes it possible to capture energy signals in list mode at high count rates and, more importantly, for each channel to self-trigger, minimizing the dead time. The faster digitizer uses the standard firmware provided by the manufacturer, which implements only a leading edge trigger and returns full traces of all active channels on trigger. With this firmware the digitizer is effectively a digital storage oscilloscope. The digitizers are synchronized using an external clock generator (AD9548) and a daisy-chained start signal with the GPO/GPI connectors on the digitizers. This allows events to be correlated within the 20 ns (N6751) and 16 ns (N6724) timestamp jitter.
In a time-of-flight measurement the trace length defines the width of the timing window. One advantage of the digitizers is the possibility to return samples also from before the trigger, as the digitized signal is continuously saved to a buffer. The triggering time-of-flight gate in ToF-ERDA can therefore be T2 without the use of delay boxes, and the advantage of the smaller background count rate and on average higher pulse heights of T2 can be exploited. In this approach there are no triggering criteria on T1, in contrast to digitizing the signals from the gates independently. When the trace has been captured, the digitizer continues saving the ADC samples to another buffer and ignores any triggers until this buffer is filled and a full trace can be returned on the next trigger. Therefore the trace length also defines the dead time. By allowing the digitizer to capture variable-length events the dead time could be eliminated entirely, but the reconstruction of events would be complicated.
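The pre-trigger capture described above can be illustrated in software with a continuously overwritten buffer. This is a minimal sketch of the principle, not the digitizer firmware: the function name, its parameters and the positive-going pulse polarity are assumptions made for the illustration.

```python
from collections import deque
from itertools import islice


def capture_trace(samples, threshold, pre_trigger, trace_length):
    """Return one fixed-length trace around the first leading-edge trigger.

    The trace holds `pre_trigger` samples from before the trigger sample,
    the trigger sample itself and the samples that follow, `trace_length`
    samples in total (trace_length must exceed pre_trigger).
    Returns None if no complete trace can be formed.
    Assumption: positive-going pulses; reverse the comparison otherwise.
    """
    history = deque(maxlen=pre_trigger)  # continuously overwritten buffer
    stream = iter(samples)
    for value in stream:
        if value > threshold and len(history) == pre_trigger:
            trace = list(history) + [value]
            # fill the rest of the trace from the samples after the trigger
            trace.extend(islice(stream, trace_length - len(trace)))
            return trace if len(trace) == trace_length else None
        history.append(value)
    return None


adc = [0, 0, 1, 0, 0, 5, 9, 4, 1, 0, 0]
print(capture_trace(adc, threshold=3, pre_trigger=3, trace_length=6))
```

With a fixed trace length the timing window follows directly from the sampling rate; for example, 1000 samples at 2 GS/s correspond to a 500 ns window.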
The digitizers are connected by a CAEN optical link to a two channel PCI-Express card (CAEN A3818). Up to eight digitizers can be connected in a daisy chain to one channel with a maximum data rate of 80 MByte/s. The card is installed in a desktop computer with a quad core processor running the Linux operating system.
Since the trigger logic of the N6751 operates at 125 MHz, the local trigger can only be used for pulses exceeding the threshold for 8 sampling periods. This limitation lies in the firmware provided by the digitizer manufacturer and not in the hardware, since with parallelization techniques it is possible to trigger on a leading edge in just one sample [11]. In the ToF-ERDA application short MCP pulses barely above the noise level are observed from hydrogen recoils, and not all of these activate the local trigger. Alternatively an external trigger can be used, for example from a CFD, in which case the detection efficiency can be as good as with an analogue setup. In this study the MCP signals are amplified by a 300 MHz amplifier (Phillips Scientific 776) and then digitized. The amplification (10x) must be this large for the T1 signals, as hydrogen signals below 10 mV are observed even after the amplification. For T2 the signals are attenuated by a factor of two using resistors after the amplifier, since the signals are stronger due to a thicker carbon foil [12], which emits on average more electrons. The root-mean-square (RMS) noise measured from the digitized signal is 0.7 least significant bits (LSB) for T1, approximately the same in millivolts, and 0.8 LSB for T2. These figures include the contributions of both the quantization and the detector noise. Adjusting the voltage over the MCP plates allows the gain of the MCP to be fine-tuned to match the digitizer dynamic range. Figure 2 shows typical pulses from the timing detectors as captured by the digitizer.

Software
The data acquisition software is custom Linux software equipped with a graphical user interface (GUI) built using the Qt5 framework [13]. Digitizer and pulse analysis specific code does not use any Qt5 code, but is instead implemented using the C++ standard library, the Boost C++ libraries [14], the GNU Scientific Library [15], the CAEN digitizer library, and POSIX threads. This makes it possible to reuse the code if a non-GUI version of the software is preferred.
The software is designed to run processing tasks in parallel as much as possible, since modern PCs are equipped with multi-core processors. Each digitizer connection is handled in a separate thread, and any output file saved from the data queues is handled by a data saving thread. Signal processing can be done by a user-supplied number of analyzer threads. In a two-digitizer scenario there are at least six threads running during the data acquisition. The data read by a digitizer thread is pushed immediately to an acquisition event FIFO queue, implemented using C++ std::queue containers. The threads poll for new data as soon as the previously read events are passed to the acquisition queue. The producer-consumer problem in the queues is handled using condition variables. The raw events can be dumped to a file in a custom ASCII format, in a binary format or as gzip compressed binary by a saver thread. The purpose of a single saver thread is to store all data in a single file and keep coincident events as close as possible to each other. This simplifies the offline analysis of raw data compared to saving events to separate files for every digitizer.
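The producer-consumer pattern with condition variables mentioned above can be sketched as follows. The acquisition software itself is written in C++ with std::queue; this Python illustration only shows the principle, and the class and function names are hypothetical. The doubling of each event stands in for the actual pulse analysis.

```python
import threading
from collections import deque


class EventQueue:
    """FIFO queue guarded by a condition variable (illustrative helper)."""

    def __init__(self):
        self._events = deque()
        self._cond = threading.Condition()
        self._closed = False

    def push(self, event):
        with self._cond:
            self._events.append(event)
            self._cond.notify()  # wake one waiting consumer

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()  # let all consumers finish

    def pop(self):
        """Block until an event is available; None once closed and empty."""
        with self._cond:
            while not self._events and not self._closed:
                self._cond.wait()
            return self._events.popleft() if self._events else None


def run_pipeline(n_events, n_consumers=2):
    queue = EventQueue()
    results = []
    results_lock = threading.Lock()

    def consumer():
        while True:
            event = queue.pop()
            if event is None:
                return
            with results_lock:
                results.append(event * 2)  # stand-in for pulse analysis

    workers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for worker in workers:
        worker.start()
    for i in range(n_events):  # stand-in for a digitizer readout thread
        queue.push(i)
    queue.close()
    for worker in workers:
        worker.join()
    return sorted(results)


print(run_pipeline(5))
```

The consumers sleep on the condition variable instead of polling, so idle analyzer threads do not consume processor time.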
After the events have been saved they are passed on to another queue. The events from the queue are read by the analyzer threads, which run the events through any number of preconfigured analysis algorithms. The implemented algorithms are described in more detail in section 3.2. The analyzer threads are all identical, as on the start of the acquisition these threads are launched with clones of the analyzers configured in the user interface. The different analyzers implement a common C++ class for interfacing. Each of the analyzers provides a concise numeric value, and all the values are gathered to a single file in the same fashion as the raw events. Finally the list events are passed to a coincidence processor. The data flow is shown schematically in figure 3. In practice it is not always necessary or convenient to store the full traces, so the software has been designed to be fast enough to do all signal processing on-line. Storing full traces may also limit the achieved count rate if the mass storage is not fast enough.
Due to the large number of queues storing potentially a large number of events, there must be a careful balance between the memory use and data loss. The size of the queues can be changed at run time, but a maximum number of events in each queue has to be set so that the memory use does not increase unexpectedly. In addition to the acquisition program, small helper programs were also written, which allow events to be extracted from the raw data and reanalyzed.

Signal processing for MCP pulses
The timing algorithm that was found to work well for the MCP pulses is based on a digital implementation of the analogue CFD method. The figure of merit for a timing algorithm is not only the timing resolution, but also a stable threshold for small pulses, which determines the hydrogen detection efficiency. An external trigger based on the T2 signal is used to capture the data, therefore the T2 detection efficiency is limited by the trigger. The algorithm uses a simple level threshold and a variable window averaging before the pulse exceeds the threshold to determine the rough location of a pulse and the baseline before it, see Fig. 2. Only the samples near this trigger point are then considered for detailed timing analysis.
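The rough pulse location and baseline estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions: positive-going pulses, an absolute threshold, and a baseline window that ends directly at the trigger sample. The function name and parameters are hypothetical.

```python
def locate_pulse(trace, threshold, baseline_window):
    """Return (trigger_index, baseline), or None if nothing exceeds threshold.

    The baseline is the mean of up to `baseline_window` samples that
    immediately precede the first sample above `threshold`.
    """
    for i, value in enumerate(trace):
        if value > threshold:
            window = trace[max(0, i - baseline_window):i]
            baseline = sum(window) / len(window) if window else 0.0
            return i, baseline
    return None


trace = [2, 2, 2, 2, 3, 10, 20, 8, 2]
print(locate_pulse(trace, threshold=5, baseline_window=4))
```

In a real trace the samples directly before the trigger may already belong to the leading edge, so a practical implementation would probably leave a gap between the baseline window and the trigger point.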
Sub-sample resolution requires interpolation at some stage. In order to preserve the high-frequency signal, an interpolation method was preferred that produces a continuous reconstructed signal, preserves the original samples exactly, and is sensitive only to the nearest few samples. According to the sampling theorem a perfect reconstruction for band-limited signals exists, and this reconstruction could be used to interpolate the original samples using a formula given by Shannon [16]. In this case the saturation of the ADC can severely limit the applicability of this method, and the computational cost of this ideal sinc filter is too high. A more practical windowed sinc filter called a Lanczos filter [17] was implemented and tested using five weights. As a downside, the Lanczos filter shows some oscillatory behavior on step changes. Piecewise polynomials, or splines, were implemented with both 1st order (linear) and 3rd order (cubic) polynomials. Both the Lanczos filter and cubic splines also produce a reconstructed signal with a continuous derivative. The results in this paper are achieved with the use of cubic spline interpolation, unless otherwise noted.
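A windowed sinc interpolation of the Lanczos type can be sketched as follows. The kernel is the standard Lanczos window; the kernel half-width a = 3 and the clamping of indices at the trace edges are illustrative choices and not necessarily the five-weight configuration tested here. The function names are hypothetical.

```python
import math


def lanczos_kernel(x, a):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, zero elsewhere."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_interpolate(samples, t, a=3):
    """Reconstruct the signal at the fractional sample index t."""
    base = math.floor(t)
    total = 0.0
    for i in range(base - a + 1, base + a + 1):
        j = min(max(i, 0), len(samples) - 1)  # clamp at the trace edges
        total += samples[j] * lanczos_kernel(t - i, a)
    return total


samples = [0, 1, 4, 9, 16, 25, 36]
print(lanczos_interpolate(samples, 3.0))  # reproduces the sample at index 3
print(lanczos_interpolate(samples, 3.5))  # value between samples 3 and 4
```

At integer indices the kernel vanishes for all neighbouring samples, so the original samples are reproduced to within floating point rounding, which is the property required of the interpolation in the text.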
After the baseline has been subtracted, the algorithm sums the original pulse V(t) with an inverted, attenuated (attenuation factor f) and time delayed (delay t_d) version of itself. This computed waveform is plotted in Fig. 4. The algorithm seeks the zero crossing V(t) − f · V(t − t_d) = 0, first approaching this point stepwise and finally using Newton's method with a fixed number of iterations. The algorithm is very robust, and anyone used to analogue CFD modules will find it relatively easy to tune. The algorithm can optionally compensate for the input signal saturation by extrapolating the leading edge linearly from the last two samples before the clipping occurs. Using a value for the delay t_d that is shorter than the pulse rise time to saturation will yield the best resolution, as then the algorithm works in the amplitude and risetime compensated (ARC) mode [18,19]. However, a delay time below or near one sampling period might cause the timing to quantize to the sampling periods. For this reason the pulse rise time from noise floor to saturation should always be at least two, preferably three, sampling periods, depending on f and the choice of interpolation method. Even for an analogue implementation the jitter increases for short t_d, while longer values introduce amplitude related walk [18].

Fig. 4. Two T2 pulses processed with the amplitude and risetime compensated (ARC) timing algorithm using parameters t_d = 1.5, f = 0.5. The DC-level has been restored to the shaped pulse for the plot.
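A digital constant fraction zero-crossing search can be sketched as follows. This is a simplified illustration and not the implementation used in this work: it uses the textbook CFD arrangement, in which the prompt pulse is attenuated and the delayed copy is inverted, so the sign and attenuation convention may differ from the expression given above. It also assumes an integer delay in samples, a baseline-subtracted positive-going pulse, and linear interpolation in place of cubic splines and Newton's method. The function name is hypothetical.

```python
def cfd_time(pulse, fraction, delay):
    """Return the fractional sample index of the CFD zero crossing, or None.

    Shaped signal: s[n] = fraction * pulse[n] - pulse[n - delay].
    The crossing from positive to non-positive values is located by
    linear interpolation between the two neighbouring shaped samples.
    """
    shaped = [fraction * pulse[n] - pulse[n - delay]
              for n in range(delay, len(pulse))]
    for k in range(1, len(shaped)):
        if shaped[k - 1] > 0.0 and shaped[k] <= 0.0:
            frac = shaped[k - 1] / (shaped[k - 1] - shaped[k])
            return delay + (k - 1) + frac
    return None  # no zero crossing: the pulse would be discarded


small = [0, 0, 0, 2, 4, 6, 8, 10, 10, 10, 10]
large = [3 * v for v in small]
print(cfd_time(small, fraction=0.5, delay=2))
print(cfd_time(large, fraction=0.5, delay=2))
```

Both pulses have the same rise time but different amplitudes, and the sketch returns the same crossing time for both, which is the amplitude independence that the constant fraction method is designed to provide.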
The optimum f and t_d parameters for the ToF-ERDA setup were found by scattering 5.1 MeV ⁴He ions from an evaporated 1 nm thick gold film on silicon. The resolution with our analogue pulse processing setup has been studied using a similar method [10], since the resulting time-of-flight spectrum has a narrow peak. The He particles at that energy produce both small pulses, resembling those of H recoils, and stronger pulses like those of heavier recoils. The He particles at this energy can be detected by the GIC, but due to their long range they are not entirely stopped within the gas volume, and as such the GIC acts as a delta E detector. A silicon detector at the end of the active volume of the GIC still has enough resolving power to separate He scattered from the gold and from the silicon. This Si detector gating is important, since it reduces the effect of the halos [12] from the lower energy He events overlapping in the time spectrum. The time-of-flight histogram is presented in Fig. 5. The same procedure, but without the gating, was also performed by scattering 6.8 MeV ¹²C ions from the same film.
To study the effect of the MCP gain on timing resolution, the measurements were repeated with different MCP voltages. During offline analysis only the pertinent events, i.e. only the few thousand events resulting from scattering from the thin film instead of the hundreds of thousands of events in total, were re-analyzed with varying parameter permutations. The energy loss in the gold film is negligible for the scattered He ions, but not for the C ions. Additionally, the inhomogeneity of the gold film thickness might result in a non-Gaussian ToF distribution.
A Gaussian distribution was fitted to the peak automatically using Fityk [20]. In Fig. 6 the FWHM of the He time-of-flight peak is plotted with varying T1 timing parameters. In Fig. 7 the C resolution is plotted with varying T2 timing parameters. The results confirm that the best resolution is found only when t_d is short enough to avoid issues with the ADC saturation (clipping). The pulses that have no zero crossing are discarded in these resolution plots. The efficiency of the algorithm must therefore also be investigated, since for any sensible values the threshold should limit the efficiency. Thus the hydrogen detection efficiency must be calibrated every time the timing parameters are changed, but as long as the acquisition triggering conditions or the measurement setup are not changed the efficiency curve can be extracted from previously measured data. The hydrogen detection efficiency curves are plotted in Fig. 8. The final choice of threshold, t_d and f involves some compromise between efficiency, resolution and the stability of these two. It should be noted that sub-sample resolution and high efficiency are achievable with most reasonable parameter combinations. The resolution for all studied cases is well below the kinematic broadening for the recoils. This makes the tuning of the resolution meaningful in practice only if position sensitive detectors are used. When every other sample was discarded, effectively giving a 1 GS/s sampling rate, sub-sample resolution was not achieved. With scattered helium, a time-of-flight resolution of 185 ps was achieved with the 2 GS/s sampling rate using the full 10-bit samples. Degrading the vertical resolution by artificially zeroing the 1, 2 or 3 least significant bits degraded the resolution to 196 ps, 217 ps and 314 ps, respectively, for the same data set. The minor difference between 10 and 9 bits of vertical information can be explained by the presence of noise in the signals.
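The artificial reduction of vertical resolution described above amounts to masking the lowest bits of each ADC code before the timing analysis. The following is a minimal sketch that assumes the samples are available as non-negative integer ADC codes; the function name is hypothetical.

```python
def zero_low_bits(samples, n_bits):
    """Return the samples with the n_bits least significant bits set to zero."""
    mask = ~((1 << n_bits) - 1)
    return [s & mask for s in samples]


codes = [1023, 5, 6, 512]
print(zero_low_bits(codes, 2))  # emulates 8-bit data from 10-bit samples
```

Applying this to the stored traces and repeating the offline analysis gives the degraded data sets without any new measurement.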

Signal processing for GIC pulses
The CAEN DPP-PHA trapezoidal shaper is used online for the GIC/Si detector signals, but no direct software equivalent for this shaper is available. A trapezoidal shaper was therefore also implemented in software, in order to study the shaper parameters and the GIC resolution offline.
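The principle of a trapezoidal shaper can be sketched in software as the difference of two moving averages separated by the flat top time. This is a minimal illustration and neither the DPP-PHA firmware algorithm nor necessarily the shaper implemented in this work: it assumes ideal step-like preamplifier pulses and omits the correction for the exponential decay of real preamplifier signals. It also uses a direct summation rather than the more efficient recursive form. The function name and parameters are hypothetical.

```python
def trapezoidal_shaper(signal, rise, flat):
    """Shape a step-like signal into a trapezoid.

    y[n] = mean(signal[n-rise+1 .. n])
         - mean(signal[n-2*rise-flat+1 .. n-rise-flat])
    `rise` and `flat` are given in samples. Indices before the start of
    the trace are replaced by the first sample.
    """
    def sample(i):
        return signal[i] if i >= 0 else signal[0]

    shaped = []
    for n in range(len(signal)):
        recent = sum(sample(n - i) for i in range(rise))
        older = sum(sample(n - rise - flat - i) for i in range(rise))
        shaped.append((recent - older) / rise)
    return shaped


step = [0] * 5 + [10] * 15  # ideal step of height 10 at sample 5
shaped = trapezoidal_shaper(step, rise=3, flat=2)
print(max(shaped))  # the flat top reproduces the step height
```

For an ideal step the height of the flat top equals the step height, so the pulse height can be read from any sample on the flat top.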
The electronic noise contribution to the anode signal was measured with a precision pulse generator (BNC PB-5) connected to the test input of the anode preamplifier (Amptek CoolFET) during testing before GIC installation. The noise contribution for helium was determined to be 26 keV with an established calibration.
The digitized signal amplitude was increased by a factor of four by the addition of a timing filter amplifier (TFA) to the front-end electronics. The signals from the GIC have plenty of low frequency noise associated with microphonic pickup, so the TFA was set to differentiate the signal with a 100 µs time constant, acting as a high-pass filter. Without the differentiation the dynamic range of the digitizer would not have been sufficient. The shape of the pulses, except for the tail, remained virtually unaffected by the TFA, since the integration stage was turned off. The resolution improved noticeably for the smallest hydrogen and helium pulses, but the exact cause of this improvement is unclear. Both the low effective number of bits (ENOB) and the integral nonlinearity (INL) of the digitizer were suspected, but neither could be definitively identified as the source of the poor performance. The measurements in this paper were performed without the TFA in the signal chain.

Conclusions
Since the MCP pulses are sampled at only 2 GS/s and many of the pulses are clipped even with 10 bits of vertical resolution, pulses that are too fast can create problems. At least two samples must be captured on the rising edge to be able to do sub-sample timing using the ARC method with the highest resolution.
Optimum timing parameters were found by studying a narrow peak with different particle species. Three different interpolation methods were tested using the best possible parameters for ARC timing and only minor differences in resolution were observed.
The system can achieve timing resolution comparable to an analogue setup. In our setup the T2 MCP gain was reduced to improve the timing resolution at the cost of slightly reduced hydrogen detection efficiency. The hydrogen detection efficiency is still noticeably higher than with our analogue setup.
Due to the versatility of the digitizing setup and its cost effectiveness it is expected that many analogue setups in ToF-ERDA will eventually be replaced by a digitizer based solution. The software-based signal processing presented here reduces the time spent on the tuning of the setup. Many of the hardware limitations and problems encountered with commercial digitizers are mitigated as the hardware and firmware continue to develop.

Fig. 2. Typical T1 and T2 pulses from a single particle, plotted as digitized. Note the saturating T2 pulse.

Fig. 3. A simplified block diagram of the software processing data flow. Any number of readout threads can be combined with any number of pulse analysis threads. Raw data, list-mode data and coincidence list-data are saved to files by separate threads.

Fig. 5. Time-of-flight histogram of 5.1 MeV ⁴He particles scattered from a 1 nm thick Au film. The blue subset is from events producing ADC saturating pulses on T2. The FWHM of the fit is 185 ps.

Fig. 6. Time-of-flight resolution for He, with varying T1 timing parameters f and t_d. The delay parameter t_d is expressed in sampling periods (0.5 ns). T2 parameters were kept constant.

Fig. 8. Hydrogen detection efficiency for the current analogue setup and the digitizing setup with two T2 MCP voltages. The difference between the efficiency of the two setups is explained by different T1 detection efficiency. T2 efficiency is limited by the analogue CFD module, which produces the external trigger.