Estimating mean lifetime from partially observed events in nuclear physics

The mean lifetime is an important characteristic of particles to be identified in nuclear physics. State-of-the-art particle detectors can identify the arrivals of single radioactive nuclei as well as their subsequent radioactive decays (departures). Challenges arise when the arrivals and departures are unmatched and the departures are only partially observed. An inefficient solution is to run experiments where the arrival rate is set very low to allow for the matching of arrivals and departures. We propose an estimation method that works for a wide range of arrival rates. The method combines an initial estimator and a numerical bias correction technique. Simulations and examples based on data on the alpha decays of Lutetium isotope 155 demonstrate that the method produces unbiased estimates regardless of the arrival rate.


| INTRODUCTION
Radioactive decay is the textbook example of a Poisson process in time (Cox & Lewis, 1966). The statistical model can be derived directly from the laws of physics: the decay events are independent given the decay rate, and the lifetimes follow an exponential distribution with a mean equal to the inverse of the decay rate. The mean lifetime is an important characteristic of particles to be identified in nuclear physics since it is sensitive to the structure of the underlying quantum mechanical states. For example, a large change in nuclear structure from the initial to the final state makes a radioactive alpha decay slower than predicted by a simple model (Geiger & Nuttall, 1911). The estimation of the mean lifetime under different experimental setups is a recognized statistical problem in particle physics (Lyons, 2008).
In many modern experiments, radioactive species are produced continuously and decays are measured simultaneously. This is enabled by state-of-the-art particle detectors and data-acquisition systems which can identify the arrivals of single radioactive nuclei at a detector as well as their subsequent radioactive decays (Lazarus et al., 2001; Page et al., 2003). The arrivals and the decays, called departures for generality, are unmatched, meaning that there is no physical way to link an arrival and a departure to each other. Only if the arrival rate is low and the mean lifetime is short can we say with a high probability that a certain arrival and departure form a pair, because only one departure has been observed between two consecutive arrivals.
Challenges arise when some arrivals or departures are mislabelled or partially observed. In this paper, we focus on experiments where it is not possible to catch all departures due to the structure of the particle detector. For example, the alpha radioactive nuclei created in fusion reactions and implanted into a detector are typically very close to the surface of the detector, which implies that about half of the departures are not observed due to the escape of the alpha particle from the detector. Combined with the lack of matching, we are left with an unequal number of arrival and departure times that cannot in general be linked to each other. Estimating the mean lifetime from this kind of data is the problem addressed in this paper.
Experimenters have commonly preferred setups where the arrival rate is low compared with the decay rate because then matching an arrival to a departure before the next arrival causes only a negligible bias (e.g. Herzáň et al., 2015, figure 12). However, this approach may lead to inefficient use of the measurement capacity. The matching is not possible for a high arrival rate, but on the other hand the number of events per second is high. The motivation of this paper is to propose a method that can also be used with a high arrival rate in the case of unmatched arrivals and departures.
First we show that the matching of arrivals and departures is not necessary for the estimation of the mean lifetime. Next, we present a simple estimator for the case where all arrivals and departures are observed but not matched. When the departures are only partially observed, we use the same estimator after the thinning of the arrivals and apply a numeric bias correction technique. Simulation experiments demonstrate that the proposed method produces unbiased estimates regardless of the arrival rate.
The paper is organized as follows. In Section 2 the details of the measurement device and the data collection procedure are explained. In Section 3 it is shown that the matching of arrivals and departures is not needed in the estimation. Section 4 describes the proposed methodology for partially observed arrivals and departures. In Section 5, the implementation of the proposed estimation method is described. In Section 6, the properties of the proposed estimator are studied in simulations. In Section 7, the method is applied to 155Lu (Lutetium isotope 155) alpha decay data. The practical implications and future directions are discussed in Section 8.

| Technical description of the physical data collection
The data were collected in the Accelerator Laboratory of the University of Jyväskylä in an experiment aiming to measure the charged particle radioactivity of 160Os produced in a fusion evaporation reaction (unpublished data). A beam of 304 MeV 58Ni from the K130 cyclotron (Liukkonen, 1993) was used to bombard a self-supporting 106Cd target of thickness around 1 mg/cm². An average beam intensity of 3 pnA (19 × 10⁹ particles/s) was used over 14.6 h. The radioactive 155Lu was one of the products of the fusion evaporation reaction.
Fusion evaporation residues (recoils, called arrivals in this paper) were separated from the unreacted primary beam with the recoil-mass spectrometer MARA (Sarén, 2011), employing electric and magnetic fields. At the focal plane of MARA the recoils are detected first with a gaseous multi-wire proportional counter (MWPC) transmission detector and then implanted into a double-sided silicon strip detector (DSSD) consisting of 192 × 72 strips in the horizontal and vertical directions, respectively. The full size of the DSSD is 128 × 48 mm² (width × height). An ionizing particle causes a simultaneous local signal in both a horizontal and a vertical strip of the DSSD, and therefore the event can be assigned to a virtual pixel. The transmission position in the horizontal and vertical directions of a recoil flying through the MWPC detector is obtained with time-to-amplitude converter (TAC) modules utilizing the time differences between charge signals from opposite ends of the wire planes read through delay lines. The directions of the decay products (departures) of the implanted recoils (arrivals) are random, and the DSSD covers half of the possible directions of decays, which leads to a detection rate of 50%. Figure 1 presents a schematic illustration of the experimental setup.
All detector events in the MWPC and the DSSD are recorded by a self-triggering data-acquisition system employing analogue-to-digital converter (ADC) modules and field-programmable gate array (FPGA) chips. The kinetic energy of a recoil implanted into the DSSD and the radioactive decay of a recoil implanted earlier are calculated in the FPGA chips running a moving window deconvolution algorithm for the voltage signal generated by charge-integrating preamplifiers.
Events observed in the DSSD are classified into three categories: recoils (arrivals), decays (departures) and background. A DSSD event is a recoil if it was also observed in the MWPC detector, the DSSD energy and the time of flight between the MWPC and the DSSD are inside a two-dimensional gate, and both the horizontal and vertical positions were recorded successfully at the MWPC. Due to the last condition, the probability of classifying a recoil as background was 8.7%. A DSSD event without any signal at the MWPC and having an energy between 500 keV and 10,000 keV is classified as a decay event, which in this reaction can be an alpha, beta or proton radioactivity. All remaining events are classified as background.
It is possible to continue measuring decays even after the fusion evaporation reaction has ended. This means that right censoring due to the end of the follow-up can be avoided.

| Dataset 1
Dataset 1 is a subset of the full dataset consisting of 155Lu decay events related to the alpha energy of 7390 keV. The subset was deliberately chosen so that the arrivals and departures can be matched with a high probability. A record containing the time of a 155Lu alpha decay, the time of the previous recoil in the same DSSD pixel, and the horizontal and vertical strip numbers was written into a file if the time between these events was less than 1.5 s and there were no other decay events in between. The arrivals misclassified as background were ignored. The dataset contains 16,320 matched arrival-departure pairs and lifetimes obtained as the differences of the departures and the arrivals. Lifetimes shorter than approximately 0.008 ms are not detectable due to technical limitations, that is, the data are left truncated at 0.008 ms. The time of arrivals varies from 0 ms to 52,577,456 ms (about 14.6 h), implying that the arrival rate is 0.000310 events/ms.

F I G U R E 1 A schematic view of the production, separation and detection of products of a nuclear reaction when an in-flight recoil separator is used. (a) An almost continuous beam of primary particles from an accelerator hits a target foil. The electromagnetic recoil separator selects the products and transports them to a detector setup. (b) A typical focal plane detector setup consists of a transmission detector (typically a multi-wire proportional counter) and a double-sided silicon strip detector (DSSD). The products are implanted into the surface of the DSSD and the position of the implantation is registered. The arrivals referred to in the text form a subset of these events. (c) A radioactive decay of an implanted product can be observed later and can be correlated spatially and temporally with the implantation. Four different decay events are illustrated: (i) a full-energy alpha decay takes place in a pixel defined by orthogonal strips of the DSSD, (ii) a decay occurs at a pixel boundary and part of the decay energy might be lost, (iii) an alpha particle escapes from the surface of the DSSD and only part of its energy is registered, and (iv) the product undergoes a beta decay and the beta particle escapes through the detector (a beta particle has a much longer range in silicon than an alpha particle). The departures in the text are the case (i) events, cases (ii) and (iii) describe two different mechanisms of missing a departure, and case (iv) illustrates a different decay mode creating background events.
The lifetimes contain some outliers, that is, extremely long lifetimes, which are assumed not to be real. We choose 80 ms as the cut-point on the basis of a visual inspection and remove the 251 lifetimes longer than the cut-point. Thus, the data are truncated to the interval [0.008 ms, 80 ms]. After the truncation, the number of arrivals is 16,069 and the arrival rate is 0.000306 events/ms. These 16,069 records are called Dataset 1.
The lifetimes are modelled by the exponential distribution. This assumption is supported by the data even after the truncation. The mean of the lifetimes, 3.760 ms, is very close to their standard deviation, 3.798 ms, and the form of the empirical distribution closely resembles the exponential distribution (Figure 2). For the exponential distribution with mean lifetime 3.760 ms, the probability mass outside the range [0.008 ms, 80 ms] is only about 0.0021. Thus, it seems justifiable to treat the lifetimes as independent observations from an exponential distribution.

| Datasets 2, 3 and 4
Datasets 2, 3 and 4 are derived from Dataset 1 for illustrative purposes by breaking the matching, modifying the arrival rate and removing a part of the departures. This allows us to compare the estimates obtained under different conditions with those obtained from Dataset 1. The potential misclassification of arrivals was not considered because the focus was on partially observed departures. Starting with the arrival times a_1, …, a_16069 and the corresponding lifetimes x_1, …, x_16069 of Dataset 1, we derive modified arrival times a_i/c and the corresponding departure times a_i/c + x_i, where c is a positive scaling constant. Dataset 2 is a densified, unmatched variant of Dataset 1. The arrival times are divided by c = 1000 and the departure times are recalculated by adding the lifetimes to the modified arrival times. When the lifetimes are removed, Dataset 2 contains only arrival and departure time instances without any direct information on the matching or the lifetimes.
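As a sketch, the derivation of a Dataset 2-style dataset can be written as follows. The function name `densify` and the toy numbers are ours for illustration, not the real Dataset 1 values:

```python
def densify(arrivals, lifetimes, c):
    """Rescale arrival times by 1/c and recompute departure times by
    adding the unchanged lifetimes to the rescaled arrivals."""
    new_arrivals = [a / c for a in arrivals]
    # Sorting the departures discards the matching information.
    new_departures = sorted(a + x for a, x in zip(new_arrivals, lifetimes))
    return new_arrivals, new_departures

# Toy numbers standing in for Dataset 1 (hypothetical, not real data).
arr2, dep2 = densify([0.0, 10.0, 25.0, 40.0], [3.0, 1.5, 4.0, 2.0], c=1000)
```

After the sort, only the unmatched arrival and departure time instants remain, mirroring how Dataset 2 is constructed.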
Datasets 4A-4F are derived in the same way as Datasets 3A-3F, but each departure is removed with probability 0.75 instead of 0.5. Datasets 4A-4F are used to demonstrate that the proposed estimation method works even if the detection probability is lower than in the real-life experiment. Altogether we have 14 datasets for the analysis.

| Classical setup with matched pairs
The process of interest is defined by two independent Poisson processes in time: an arrival process with arrival rate λ_a, and a lifetime process with decay rate λ and mean lifetime μ = 1/λ. We observe arrival times a_1 < … < a_n and the corresponding departure times d_1 < … < d_n. As the arrivals and departures are matched, we can calculate the lifetimes directly as x_i = d_i − a_i. The lifetimes x_1, …, x_n are independent observations from an exponential distribution with mean lifetime μ. This is the setup for Dataset 1.
As there are no censored observations, the likelihood function is given by

L(μ) = ∏_{i=1}^{n} μ^{−1} exp(−x_i/μ).   (1)

The maximum likelihood estimator for μ is then

μ̂ = (1/n) Σ_{i=1}^{n} x_i,   (2)

with the standard error

SE(μ̂) = μ̂ / √n.   (3)
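For matched lifetimes, the maximum likelihood estimate and its standard error are a sample mean and a scaled mean; a minimal sketch with made-up lifetimes (the function name `mle_matched` is ours):

```python
from math import sqrt

def mle_matched(lifetimes):
    """MLE of the mean lifetime and its standard error for matched,
    uncensored exponential lifetimes (Equations 2 and 3)."""
    n = len(lifetimes)
    mu_hat = sum(lifetimes) / n          # sample mean, Equation (2)
    return mu_hat, mu_hat / sqrt(n)      # standard error, Equation (3)

mu_hat, se = mle_matched([3.0, 1.5, 4.0, 2.0, 4.5])
```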

| Unmatched arrivals and departures
Now, let us assume that all arrival and departure times are observed, but the information on the matching between arrivals and departures is missing. This is the setup for Dataset 2. In practice, the experiment has a certain activity period when all arrivals are observed. After this period, new arrivals do not occur but departures can still be observed. The follow-up period is the activity period extended with a sufficiently long post-activity period to catch all departures with a high probability. The activity time refers to the length of the activity period and the follow-up time to the length of the follow-up period.
The lack of matching may appear to be a serious problem at first. Attempts to enumerate all possible ways to match arrivals and departures quickly fail because of combinatorial explosion. Combinatorics can be avoided completely by writing estimator (2) in the form

μ̂ = (1/n) (Σ_{i=1}^{n} d_i − Σ_{i=1}^{n} a_i),   (4)

where only the sum of the departure times and the sum of the arrival times are needed to estimate μ. We see that the matching does not give any additional information on μ and the maximum likelihood estimator is the same as in the classical setup. The standard error of estimator (4) is given by Equation (3).
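The sum-based form of the estimator can be sketched as follows; the hypothetical event times and the name `mle_unmatched` are ours. Because only the two sums enter, shuffling the departures (i.e. losing the matching) leaves the estimate unchanged:

```python
def mle_unmatched(arrivals, departures):
    """Estimator based on the two sums alone: the missing matching
    information is irrelevant for estimating the mean lifetime."""
    n = len(arrivals)
    return (sum(departures) - sum(arrivals)) / n

arr = [0.0, 1.0, 2.0]
mu_a = mle_unmatched(arr, [2.5, 1.8, 4.0])   # one ordering
mu_b = mle_unmatched(arr, [4.0, 2.5, 1.8])   # shuffled ordering
```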

| ESTIMATION METHODS FOR INCOMPLETE DATA
In the more complex case present in Datasets 3A-3F, arrivals and departures are unmatched and on average half of the departures are not observed. This situation is illustrated in Figure 3. The observed data consist of n arrival times a_1, …, a_n and m < n unmatched departure times d_1, …, d_m.
A straightforward extension of estimator (4) for data with partially observed departures is

μ̃ = (1/n) (2 Σ_{j=1}^{m} d_j − Σ_{i=1}^{n} a_i),   (5)

where the constant 2 compensates for the unobserved departures. The crucial difference compared with estimator (4) is that half of the arrival-departure pairs are replaced by randomly chosen arrivals and departures. It follows that the variance of Equation (5) is larger than the variance of Equation (4) because the variance of d_j − a_i, j ≠ i, is larger than the variance of d_i − a_i. In the Appendix we show that the variance of estimator (5) increases as a function of n and thus the estimator is not usable.
The likelihood inference requires that the arrivals are thinned, that is, some arrivals are removed so that the number of remaining arrivals equals the number of observed departures m. The thinning is defined by selection indicators s_1, …, s_n for which s_i = 0 if arrival a_i is removed and s_i = 1 otherwise. Here, a thinning s = (s_1, …, s_n) is called compatible with arrivals a_1, …, a_n and departures d_1, …, d_m if Σ_{i=1}^{n} s_i = m and for any time t

Σ_{j=1}^{m} 1(d_j ≤ t) ≤ Σ_{i=1}^{n} s_i 1(a_i ≤ t),

that is, after the thinning, the cumulative number of departures never exceeds the cumulative number of arrivals. A compatible thinning guarantees that it is possible to match the arrivals and the departures but does not define the matching uniquely.
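The compatibility condition (cumulative departures never exceeding cumulative kept arrivals) can be checked with a single sweep over the merged event sequence; a sketch with hypothetical times, where `is_compatible` is our name:

```python
def is_compatible(arrivals, departures, s):
    """A thinning s is compatible if it keeps exactly len(departures)
    arrivals and, at every time point, the cumulative number of
    departures never exceeds the cumulative number of kept arrivals."""
    kept = [a for a, keep in zip(arrivals, s) if keep == 1]
    if len(kept) != len(departures):
        return False
    # Merge the events; at equal times the arrival (kind 0) sorts first.
    events = sorted([(a, 0) for a in kept] + [(d, 1) for d in departures])
    balance = 0
    for _, kind in events:
        balance += 1 if kind == 0 else -1
        if balance < 0:   # a departure precedes any unmatched kept arrival
            return False
    return True

arr = [1.0, 2.0, 3.0, 6.0]
dep = [2.5, 7.0]
ok = is_compatible(arr, dep, [1, 0, 0, 1])   # keeps a_1 < d_1 and a_4 < d_2
bad = is_compatible(arr, dep, [0, 0, 1, 1])  # d_1 = 2.5 precedes both kept arrivals
```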
Denoting the arrivals by a = (a_1, …, a_n), the observed departures by d = (d_1, …, d_m), and the set of compatible thinnings by S(a, d), we can write the likelihood of μ for fixed a as a sum over the thinnings in S(a, d). The challenge for the maximum likelihood estimation is the number of compatible thinnings, which grows exponentially as a function of the sample size. For instance, in the example of Figure 3(a) with four departures and six arrivals, there are nine compatible thinnings, and with additional new events a_7, a_8 and d_5, the number would increase to 23. It seems difficult to draw a random thinning from S(a, d), which would be needed in Monte Carlo approximations of the likelihood. Rejection sampling, where random binary sequences are checked for compatibility, fails in practice because the number of possible sequences (choose m out of n) grows even faster than the number of compatible thinnings, leading to very low acceptance rates. One could consider various sequential sampling techniques where the sampling proceeds iteratively until a compatible thinning is obtained, but these approaches seem to lead to incorrect probabilities of compatible thinnings.
Since the maximum likelihood estimator appears to be computationally intractable, we shift the focus to estimators that are based on a certain compatible thinning. When the thinning is fixed, we may use formula (4) to obtain the maximum likelihood estimate given the thinning. This leads to estimators that are in general biased for μ but may be useful as initial estimators.
A particular estimator of interest, μ̂_0, is a minimum contrast estimator defined as the solution to the minimization problem

min_s (Σ_{j=1}^{m} d_j − Σ_{i=1}^{n} s_i a_i),   (6)

where the thinning s_1, …, s_n is required to be compatible with a_1, …, a_n and d_1, …, d_m. As the density function of the exponential distribution decreases as a function of x for any μ, the likelihood is maximized when the lifetimes are as short as possible. This is achieved when the arrival times selected in the thinning are as large as possible. Given the minimizing thinning ŝ_1, …, ŝ_n, the estimate μ̂_0 is obtained by applying estimator (4) in the modified form

μ̂_0 = (1/m) (Σ_{j=1}^{m} d_j − Σ_{i=1}^{n} ŝ_i a_i).   (7)

The compatibility requirement for ŝ_1, …, ŝ_n makes estimator (7) fundamentally different from estimator (5) in two respects: estimator (7) is always non-negative, and its variance does not increase as a function of the activity time because each departure is implicitly matched with a nearby arrival from the same block. The obtained initial estimator μ̂_0 underestimates μ because the thinning selects the arrivals that lead to the shortest possible lifetimes.
In Section 5, we present an estimation procedure that starts with estimate μ̂_0 and corrects the bias by means of simulation. We generate datasets with different candidate values of the parameter μ and calculate the initial estimate μ̂_0(μ) using formula (7). The goal is to find a value of μ that produces an initial estimate close to the initial estimate μ̂_0 obtained from the observed data. This specifies an implicit statistical model (Diggle & Gratton, 1984; Hoel & Mitchell, 1971) for the data. To formally justify this approach, we should prove that E[μ̂_0(μ^(1))] < E[μ̂_0(μ^(2))] whenever μ^(1) < μ^(2) and the arrival rate and the number of arrivals are fixed. It seems hard to show this analytically because the impact of the thinning is challenging to evaluate. Therefore, we rely on simulations when studying the properties of the initial estimator and its bias-corrected version.

| Formation of blocks
As the first step of the proposed estimation procedure, we divide the timeline into blocks on the basis of the observed arrival and departure times. We work with a sequence v_1, …, v_{n+m} of events, where each v_i is either an arrival or a departure. The number of arrivals is greater than or equal to the number of departures, and the sequence must start with an arrival and end with a departure. The blocks are constructed by processing the events sequentially from the end v_{n+m} to the beginning v_1. The numbers of arrivals and departures in the current block are tracked cumulatively during the processing. When the following three conditions are met in this process, a cut-point between the current block and the next block is set between events v_{i−1} and v_i: (a) event v_i is an arrival, (b) event v_{i−1} is a departure, and (c) the number of arrivals in the block currently under construction is greater than the number of departures.
The next block starts with departure v_{i−1} and the counters for the cumulative numbers of arrivals and departures are set to zero. Let us consider the example sequence of ten events shown in Figure 3. When events v_5 = d_2 and v_6 = a_4 are processed, we have a situation where event v_6 is an arrival, event v_5 is a departure, the number of arrivals is 3 and the number of departures is 2. Thus all three conditions are met and the cut-point is set between events v_5 and v_6. Proceeding in the same manner we end up with three blocks, which are (a_1, d_1), (a_2, a_3, d_2) and (a_4, a_5, a_6, d_3, d_4).
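The backward scan described above can be sketched as follows; events are encoded as ('a'|'d', index) pairs (our encoding), and the example reproduces the three blocks of the Figure 3 sequence:

```python
def form_blocks(events):
    """Backward scan: cut between events[i-1] and events[i] when
    events[i] is an arrival, events[i-1] is a departure, and the block
    built so far contains more arrivals than departures."""
    blocks, current = [], []
    n_arr = n_dep = 0
    for i in range(len(events) - 1, -1, -1):
        kind, _ = events[i]
        current.append(events[i])
        if kind == 'a':
            n_arr += 1
        else:
            n_dep += 1
        if (i > 0 and kind == 'a' and events[i - 1][0] == 'd'
                and n_arr > n_dep):
            blocks.append(list(reversed(current)))   # close current block
            current, n_arr, n_dep = [], 0, 0
    blocks.append(list(reversed(current)))           # first block
    return list(reversed(blocks))

# The ten-event example sequence of Figure 3.
ev = [('a', 1), ('d', 1), ('a', 2), ('a', 3), ('d', 2),
      ('a', 4), ('a', 5), ('a', 6), ('d', 3), ('d', 4)]
blocks = form_blocks(ev)
```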

| Thinning of arrivals in blocks
In the second step, some of the arrivals are removed to restore the balance between arrivals and departures. We have chosen to use the thinning that provides the solution to the minimization problem (6). If a block has A arrivals and D departures, the A − D smallest arrival times are removed. In the example of Figure 3, the blocks after thinning would be (a_1, d_1), (a_3, d_2) and (a_5, a_6, d_3, d_4). From the formed blocks, the initial estimate (7) is calculated.
The division into blocks is needed to ensure the compatibility of the thinning. Sometimes 'wrong' arrivals are removed in the thinning, which causes bias in the initial estimate μ̂_0. For instance, in the example shown in Figure 3, arrivals a_2 and a_4 are removed, even though they could be the ones that match the observed departures.
A referee proposed an alternative approach that leads to the same initial estimate as the formation of blocks and the thinning described above. In this approach, the departures are processed starting from the end, and each departure time d_j is matched with the largest possible arrival time a_i, a_i < d_j, that has not been matched yet. After the matching, estimator (2) can be used to obtain the same initial estimate μ̂_0 as described above. This matching is illustrated in Figure 2, where we see that the distribution of the obtained pseudo-lifetimes clearly differs from the exponential distribution fitted to the original data. Conceptually, we prefer the block-based definition because it does not require explicit matching and keeps the formation of the blocks and the thinning separate from each other.
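The matching view suggests a compact way to compute the initial estimate; a sketch under the assumption of valid data (every departure has some earlier arrival), with the function name and toy times being ours:

```python
def initial_estimate(arrivals, departures):
    """Process departures from the last to the first and pair each
    with the largest not-yet-matched arrival preceding it; the mean of
    the resulting pseudo-lifetimes is the initial estimate."""
    free = sorted(arrivals)
    pseudo = []
    for d in sorted(departures, reverse=True):
        # Index of the largest free arrival strictly before d.
        i = max(k for k, a in enumerate(free) if a < d)
        pseudo.append(d - free.pop(i))
    return sum(pseudo) / len(pseudo)

mu0 = initial_estimate([1.0, 2.0, 3.0, 6.0], [4.0, 7.0])
```

Here d = 7.0 is matched with 6.0 and d = 4.0 with 3.0, giving pseudo-lifetimes of 1.0 each.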

| Bias correction
In order to find an unbiased estimator for the mean lifetime μ we apply an implicit statistical model and simulation-based estimation. The initial estimate μ̂_0 obtained from the thinned data by estimator (7) provides a starting point. We simulate data with different values of μ, apply the thinning as described above, and estimate the mean lifetime with estimator (7). We aim to find a value of μ that produces an estimate sufficiently close to μ̂_0. Binary search is applicable for this task because the parameter of interest is one-dimensional. However, we must apply noisy binary search (Ben-Or & Hassidim, 2008; Karp & Kleinberg, 2007; Rivest et al., 1980), which takes sampling variation into account. In practice, this means that several datasets must be generated to ensure that the mean of the obtained estimates is close to μ̂_0 with a high confidence. The detailed procedure for the bias correction is given in Algorithm 1, whose steps and notations are described below.
In Algorithm 1, we use the following notations. The candidate for μ, denoted by M, can have values in the interval (L, U), initialized as (0, ∞). Variable M_0 stands for the average of the estimates (7) calculated (line 8) from samples simulated from an exponential distribution with mean M (lines 5-7). For each M, we simulate at least K_min samples (line 4) and at most K_max samples (line 13). The distance between M_0 and the estimated value μ̂_0 is denoted by D. The goal is to find the value of M that gives the smallest D. The value M is accepted and returned (line 10) if |D| ≤ δ μ̂_0, where δ is the tolerance given by the user. The maximum number of steps, that is, the maximum number of M candidates considered, is J_max (line 22).
If the candidate M is not accepted on the basis of the K_min samples, the confidence interval D ± z_{1−α/2} SE(M_0) is calculated, where SE(M_0) is the standard error of M_0 estimated from the K_min samples and z_{1−α/2} is the 1 − α/2 quantile of the standard normal distribution. If the confidence interval includes zero, the loop (line 13) continues simulating samples with the current M until M is accepted (line 19), K_max is reached (line 13) or the recalculated confidence interval no longer includes zero (line 13).
In the latter two cases, if the maximum number of M candidates J_max has not been reached (line 22), the lower or upper limit of M is first updated (lines 24-25 or lines 27-28). The new value of M is then calculated in two ways depending on the value of U. If U < ∞ (lines 26 and 29), or if U is ∞ and D > 0 (lines 23 and 25), the value of M is the average of L and U. If U is ∞ and D < 0 (line 24), the current M is doubled.
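The search logic of Algorithm 1 can be sketched as follows. This is a simplified sketch, not the exact algorithm: `simulate_estimate` stands in for the full simulate-thin-estimate pipeline, and the toy model at the end (an estimator that underestimates by 20%) is purely illustrative:

```python
import random
from statistics import mean, stdev

def bias_correct(mu0_hat, simulate_estimate, tol=0.01, z=1.96,
                 k_min=5, k_max=100, j_max=100, rng=None):
    """Noisy binary search: find a candidate mean lifetime M whose
    simulated initial estimates average out to the observed initial
    estimate mu0_hat, doubling M while no upper bound exists and
    bisecting once one does."""
    rng = rng or random.Random(0)
    lower, upper = 0.0, float('inf')
    m = mu0_hat
    for _ in range(j_max):
        samples = [simulate_estimate(m, rng) for _ in range(k_min)]
        while True:
            d = mean(samples) - mu0_hat
            if abs(d) <= tol * mu0_hat:
                return m                               # M accepted
            se = stdev(samples) / len(samples) ** 0.5
            if abs(d) > z * se or len(samples) >= k_max:
                break                                  # move the bracket
            samples.append(simulate_estimate(m, rng))  # CI covers 0: resample
        if d > 0:
            upper = m        # simulated estimates too large: M too large
        else:
            lower = m        # simulated estimates too small: M too small
        m = (lower + upper) / 2 if upper < float('inf') else 2 * m
    return m

# Toy implicit model: the initial estimator underestimates mu by 20%,
# so the corrected value should be close to mu0_hat / 0.8 = 1.25.
def toy(mu, rng):
    return 0.8 * mu + rng.gauss(0.0, 0.01)

corrected = bias_correct(1.0, toy)
```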
Choosing large K_min, K_max and J_max and a small tolerance δ leads to more precise estimates but requires more computational time. This trade-off has practical importance in simulation experiments where the analysis is repeated thousands of times.

| Parametric bootstrap
We apply the parametric bootstrap (Efron & Tibshirani, 1993) to obtain the standard error and confidence intervals for the bias-corrected estimate of the parameter μ. When simulating bootstrap samples, the lifetimes are generated from the exponential distribution using the bias-corrected estimate as the mean lifetime. The arrival times are fixed to the arrival times in the data. The departure times are obtained by adding the lifetimes to the arrival times, and each departure is removed with probability 0.5. For each bootstrap sample, the proposed estimation method with four steps (forming blocks, thinning, estimating the initial value and applying the bias correction) is applied in order to obtain the bias-corrected estimate. The standard error is estimated by the standard deviation of the bias-corrected estimates and the confidence intervals are calculated with the percentile method.
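A sketch of the bootstrap loop; `estimate_fn` stands in for the full four-step estimation method, and the stand-in estimator and arrival times below are hypothetical illustrations, not the real procedure or data:

```python
import random
from statistics import stdev

def bootstrap_ci(arrivals, mu_hat, estimate_fn, B=200, p_detect=0.5, seed=1):
    """Parametric bootstrap: arrivals stay fixed, lifetimes are drawn
    from an exponential with mean mu_hat, each departure is observed
    with probability p_detect, and the estimator is re-applied."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        deps = []
        for a in arrivals:
            d = a + rng.expovariate(1.0 / mu_hat)
            if rng.random() < p_detect:      # departure observed?
                deps.append(d)
        deps.sort()
        estimates.append(estimate_fn(arrivals, deps))
    estimates.sort()
    se = stdev(estimates)
    lo = estimates[int(0.025 * (B - 1))]     # percentile interval
    hi = estimates[int(0.975 * (B - 1))]
    return se, (lo, hi)

# Simple sum-based stand-in estimator (the real procedure would apply
# block formation, thinning and bias correction here instead).
def est_naive(arr, deps):
    return (2 * sum(deps) - sum(arr)) / len(arr)

arrivals = [float(i) for i in range(20)]
se, (lo, hi) = bootstrap_ci(arrivals, 2.0, est_naive, B=200)
```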

| SIMULATION EXPERIMENTS
Next we study the properties of the proposed estimator in simulation experiments. In general, simulations may not lead to a sufficient understanding of the performance of a new method because it is difficult to ensure that all relevant setups are considered. Here, simulations provide relatively strong evidence because all setups can be essentially characterized with only two quantities: the number of events, which depends linearly on the activity time, and the ratio of the arrival rate λ_a and the decay rate λ. A wide range of values for these quantities can be easily covered in simulations. In addition, Algorithm 1 has control parameters for the numeric accuracy.
The arrivals follow a Poisson process with known arrival rate λ_a and the probability of detecting each departure is 0.5. The mean lifetime μ = 1/λ is fixed to 1 without loss of generality and the arrival rate is varied. In all simulations, the control parameters of Algorithm 1 have the values: tolerance δ = 0.001, risk level α = 0.05, minimum sample size K_min = 5, maximum sample size K_max = 100 and maximum number of iterations J_max = 100.

| Simulations with fixed number of arrivals
In the first simulation, reported in Table 1 and in Figure 4, the activity time is chosen so that the expected number of arrivals during the activity period is always 2000. It is seen from Table 1 that the initial estimates are often strongly biased but the bias correction technique described in Section 5.3 reduces the bias to a few per cent or less in all setups. The biases and standard errors are calculated from 10,000 simulation runs per arrival rate.
We consider the limiting behaviour of the estimators, understanding that the setups with a very small or very large arrival rate may not be implementable in real experiments. In the setups of the first simulation, the initial estimate is unbiased when the arrival rate approaches zero or infinity. When the arrival rate approaches zero, a departure (observed or unobserved) almost always occurs before the next arrival. This means that with a high probability a block will contain only one departure, and this departure and the arrival preceding it correspond to each other. In the thinning, the other arrivals in the block (if any) are removed and only the departure and its corresponding arrival remain. When the arrival rate approaches infinity and the expected number of arrivals is fixed, all arrivals occur in a short interval and almost all departures occur after this interval. The thinning causes only a small bias because all arrival times are almost the same. Consequently, the situation is close to an ideal experiment where all arrivals occur at the same time and the departures are observed without interference. The limiting standard error 1/√1000 ≈ 0.032 is obtained from Equation (3), where n = 1000 equals the number of observed departures (half of 2000 arrivals).
When we look at the results as a function of the arrival rate λ_a, keeping in mind that the expected number of arrivals was fixed at 2000, there are two effects in opposite directions. When the arrival rate increases, it becomes more difficult to match the arrivals and the departures, and the bias caused by the thinning increases. When the activity time becomes shorter, more departures occur after the activity period (Table 1, third column). The expected number of these departures can be calculated with the formula (see Appendix for the derivation)

(λ_a μ / 2) (1 − e^{−t/μ}),   (8)

where t is the activity time. The departures in the post-activity period are highly informative because it is known for certain that the corresponding arrivals occurred during the activity period. In Table 1, the initial estimate has the largest bias when λ_a = 50. In this case, the average number of post-activity departures is still quite small (24.9 departures) but the bias caused by the thinning is already high.
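The post-activity formula can be cross-checked against Table 1 and against a direct simulation: for λ_a = 50 with 2000 expected arrivals, the activity time is t = 40, and the formula gives about 25 observed post-activity departures, in line with the reported 24.9. The simulation below is our sketch, not the authors' code:

```python
import math
import random

def expected_post_activity(lam_a, mu, t, p_detect=0.5):
    """Expected number of observed departures occurring after an
    activity period of length t (the post-activity formula)."""
    return p_detect * lam_a * mu * (1.0 - math.exp(-t / mu))

def simulated_post_activity(lam_a, mu, t, p_detect=0.5, seed=0):
    """Monte Carlo check: Poisson arrivals on [0, t], exponential
    lifetimes, each departure observed with probability p_detect."""
    rng = random.Random(seed)
    count = 0
    time = rng.expovariate(lam_a)
    while time < t:
        lifetime = rng.expovariate(1.0 / mu)
        if rng.random() < p_detect and time + lifetime > t:
            count += 1
        time += rng.expovariate(lam_a)
    return count

exp_n = expected_post_activity(50, 1.0, 40)   # Table 1 case: about 25
```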
The bias of the initial estimate, the standard error of the initial estimate and the standard error of the corrected estimate are closely related. When the bias is large, the initial estimate μ̂_0 is small and its standard error is also small because it is proportional to the mean (Equation 3). A large bias means that it is more difficult to correct the bias by Algorithm 1, and these difficulties are reflected in the standard error of the corrected estimate. The uncertainty of the corrected estimate is the price we pay for the bias in the initial estimate.

Notes to Table 1: For each setup, the reported statistics are based on 10,000 simulation runs. The average bias (in units of 10^-6) was calculated as the average of the differences of the 10,000 estimates and the true mean lifetime μ = 1. The standard errors (SE, in units of 10^-6) were calculated as the standard deviations of the 10,000 estimates. The third column reports the average number of observed departures during the activity time and after the activity time.
The results give indirect information on the relative efficiency of the proposed estimation method compared with maximum likelihood estimation. The optimal method (maximum likelihood) cannot perform better than the theoretical lower limit (standard error 0.032). We see that if the arrival rate λ_a is smaller than 0.5 or larger than 500, the optimal method cannot be much better than the proposed method. When λ_a is between 0.5 and 500, there may be room for improvement. The maximum standard error, 0.209, was obtained when λ_a = 15. This means that in this imaginary extreme case, the optimal method might be 0.209/0.032 ≈ 6.5 times more efficient. In other words, we have shown that the practical gain from a more efficient estimation method could be considerable but not huge.

| Simulations with fixed activity time
In the second simulation, reported in Table 2 and in Figure 5, the activity time is fixed (Table 2: 2000; Figure 5: 1000, 2000 or 3000) but the arrival rate is varied from 0.1 to 1000. In this setup, the smallest standard error of the bias-corrected estimate is obtained when the arrival rate is very large. Practically all information then comes from departures observed after the activity period. Equation (8) tells us that for long activity times, the expected number of these departures is approximately λ_a μ/2, which does not depend on the activity time. For this reason, the curves for different activity times coincide for large arrival rates in Figure 5.
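This limiting behaviour is easy to verify numerically: for t much larger than μ, the factor 1 − e^(−t/μ) equals 1 to machine precision, so Equation (8) reduces to λ_a μ/2 for every activity time used in Figure 5. A minimal check (the function name is our own):

```python
import math

def expected_post_activity(lam_a, mu, t, p_obs=0.5):
    # Equation (8): expected number of observed post-activity departures
    return p_obs * lam_a * mu * (1.0 - math.exp(-t / mu))

# For mu = 1 and the activity times of Figure 5, exp(-t/mu) underflows to 0,
# so the expected count equals lam_a * mu / 2 regardless of the activity time.
values = [expected_post_activity(1000.0, 1.0, t) for t in (1000.0, 2000.0, 3000.0)]
```

All three activity times give exactly λ_a μ/2 = 500 expected post-activity departures, which is why the curves coincide at large arrival rates.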

FIGURE 4 The standard error for the bias-corrected estimate of the mean lifetime as a function of the logarithm of the arrival rate in the first simulation experiment. The expected number of arrivals is always 2000 and the true mean lifetime is 1. The dashed line shows the theoretical lower limit for the standard error. The results are based on 10,000 simulation runs per arrival rate.
When we look at the results as a function of the arrival rate λ_a, there are again effects in opposite directions. When the arrival rate increases, the number of arrivals increases, which as such decreases the uncertainty of the initial estimate (Table 2, fifth column). At the same time, it becomes more difficult to match the arrivals and the departures, which increases the bias of the initial estimate (Table 2, fourth column). A high arrival rate also means that the number of departures after the activity period is high. Using Equation (8) we learn that the ratio of the expected number of post-activity departures to the expected number of all observed departures, λ_a μ(1 − e^(−t/μ))/2 divided by λ_a t/2, equals μ(1 − e^(−t/μ))/t ≈ μ/t, which does not depend on the arrival rate. In Table 2, the expected proportion of post-activity departures is thus 1/2000.

Notes (Table 2): For each setup, the reported statistics are based on 5000 simulation runs. The average bias (in units of 10^−6) was calculated as the average of the differences between the 5000 estimates and the true mean lifetime μ = 1. The standard errors (SEs, in units of 10^−6) were calculated as the standard deviations of the 5000 estimates. The third column reports the mean number of observed departures during the activity time and after the activity time.
As a composition of these different effects, we see the non-trivial behaviour illustrated in Figure 5. There seems to be a local minimum at λ_a = 1, that is, when the arrival rate and the decay rate are equal. The location of the local maximum varies as a function of the activity time. For activity times 1000, 2000 and 3000, the local maxima are located around λ_a = 30, λ_a = 40 and λ_a = 50, respectively.

| Simulations studying the effect of sample size
In the third simulation, reported in Table 3 and in Figure 6, we study how the standard error of the estimated mean lifetime behaves as a function of the expected number of arrivals λ_a t. It is seen from Figure 6 that for low arrival rates, λ_a = 0.1 and λ_a = 1, the relation between the standard error and the expected number of arrivals is ordinary, that is, approximately linear on the log-log scale. When λ_a = 1, the standard errors are about 1.7 times the standard errors (the dashed line) one would obtain if the arrivals and the observed departures were matched.
For high arrival rates, λ_a = 100 and λ_a = 500, increasing the expected number of arrivals does not decrease the standard error after a saturation point, as also seen in Figure 5. In these cases, the information on μ comes mainly from the departures observed after the activity period. Increasing the activity time has only a negligible effect on the number of these departures if the activity time is already long, because it is unlikely that an arrival observed at the beginning of the activity period matches with a departure observed after the activity period. For intermediate arrival rates, λ_a = 5 and λ_a = 20, the curves in Figure 6 are concave but seem to approach linear behaviour when the expected number of arrivals increases.

FIGURE 5 The standard error for the bias-corrected estimate of the mean lifetime as a function of the logarithm of the arrival rate. The activity time is 1000, 2000 or 3000 and the true mean lifetime is 1. The results are based on 5000 simulation runs per combination of arrival rate and activity time.
We apply the methods of Section 3 to Datasets 1 and 2, and the method of Sections 4 and 5 to Datasets 3A-3F and 4A-4F. Table 4 summarizes the estimates and their confidence intervals.
The original data (Dataset 1) and its unmatched densified variant (Dataset 2) give the same estimates, for reasons explained in Section 3.2. An estimate for the half-life, 2.61 ms (2.57, 2.65), is obtained by multiplying the mean lifetime by log(2). This estimate is close to the estimates reported in the literature (Page et al., 1996).
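The conversion between the two parametrizations is the standard relation t_1/2 = μ log 2, so the reported half-life can also be mapped back to the mean-lifetime scale. A minimal sketch (the helper names are ours):

```python
import math

def half_life(mean_lifetime):
    # t_1/2 = mu * ln(2): the time by which a given nucleus has decayed
    # with probability 1/2 under the exponential lifetime model
    return mean_lifetime * math.log(2)

def mean_lifetime_from_half_life(t_half):
    # inverse mapping: mu = t_1/2 / ln(2)
    return t_half / math.log(2)

# Mapping the reported half-life estimate of 2.61 ms back to the
# mean-lifetime scale gives about 3.77 ms.
mu_hat = mean_lifetime_from_half_life(2.61)
```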
For incomplete data (Datasets 3A-3F), the point estimates are not exactly the same but still relatively close to the estimate obtained from Dataset 1. The uncertainty of the estimates behaves similarly to the first simulation experiment. (The standard errors and the arrival rates should be divided by the mean lifetime and adjusted for the different sample sizes before they can be compared with those in Table 1.) When the activity time decreases and the arrival rate increases, the bias of the initial estimate increases. In Table 4 this is seen as an increase in the standard error of the bias-corrected estimate, especially in Datasets 3D and 3E. However, an increasing arrival rate leads to a higher number of departures after the activity period. Due to these departures, the standard error is smaller for Dataset 3F than for Datasets 3D and 3E. It is remarkable that the activity time of 53 ms (Dataset 3F) gave the standard error 0.10, which is only twice the standard error obtained with the activity time of 1.46 h (Dataset 3A).

Notes (Table 3): For each setup, the reported statistics are based on 5000 simulation runs. The average bias (in units of 10^−6) was calculated as the average of the differences between the 5000 estimates and the true mean lifetime μ = 1. The standard errors (SE, in units of 10^−6) were calculated as the standard deviations of the 5000 estimates. The third column reports the mean number of observed departures during the activity time and after the activity time.

FIGURE 6 The standard error for the bias-corrected estimate of the mean lifetime as a function of the expected number of arrivals. The reference line (dashed) has slope −0.5 on the log-log scale, which corresponds to the factor 1/√n in the formula of the standard error. The results are based on 5000 simulation runs per combination of arrival rate and expected number of arrivals.

Notes (Table 4): The number of arrivals is 16,069 for all datasets. The confidence intervals for Datasets 1 and 2 are obtained using the asymptotic properties of the maximum likelihood approach. For Datasets 3A-3F and 4A-4F, the standard errors and confidence intervals were calculated using parametric bootstrap with 1000 replications.
The results for Datasets 4A-4F show that the method works even if the probability of observing a departure is only 0.25. We see that the ratio of the standard errors for Datasets 3A and 4A, 1.37, is close to √2, as one would expect when the sample size is halved. However, the same ratio for Datasets 3D and 4D is 2.2, indicating that in this setup the estimation becomes relatively more difficult when the probability of observing a departure drops from 0.5 to 0.25.
The results demonstrate that the additional loss of information due to the thinning is not necessarily large. The standard errors for Datasets 1 (16,069 departures) and 3A (8042 departures) are 0.030 and 0.046, respectively. If we scale the standard error of Dataset 1 from sample size 16,069 to sample size 8042, we obtain 0.042, which is 9% smaller than 0.046.
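The 1/√n scaling used in this comparison can be reproduced directly; a minimal check of the arithmetic above:

```python
import math

# Scale the Dataset 1 standard error from n = 16,069 departures to n = 8042,
# using the usual SE proportional to 1/sqrt(n) relation.
se_dataset1, n1, n2 = 0.030, 16069, 8042

se_scaled = round(se_dataset1 * math.sqrt(n1 / n2), 3)   # 0.042
shortfall = 1.0 - se_scaled / 0.046                       # about 9% below 0.046
```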
As an additional analysis, we repeated the creation of Datasets 3A-3F 100 times (regenerating the random removal of departures) and used the same estimation procedure. Figure 7 shows the obtained point estimates and their 95% confidence intervals. The results support the conclusion that even if the uncertainty of the estimate depends on the arrival rate, the proposed estimation method works reliably for a wide range of arrival rates.

| DISCUSSION
We have proposed a method for the estimation of the mean lifetime in situations where the arrivals and the departures are unmatched and the departures are missing completely at random. The method is derived for experiments in nuclear physics where the goal is to characterize particles by their expected lifetime. The estimation relies on the exponential distribution of lifetimes, an assumption that follows from the laws of physics. Due to the memoryless property of the exponential distribution, the information on the matching of arrivals and departures is not needed in the estimation. In principle, the method could be applicable to other problems as well if the assumption of the exponential distribution can be justified.

FIGURE 7 The point estimates (dots) and their 95% confidence intervals (vertical lines) for Datasets 3A-3F with 100 independent repetitions of the random removal of departures. For each dataset, the results are ordered by the point estimate in increasing order from left to right. The horizontal dashed line shows the point estimate from Dataset 1.
Simulations and real data examples demonstrate that the mean lifetime can be successfully estimated from partially observed departures. The estimation method combines an initial estimator and a numerical bias correction technique. As the exponential distribution has only one parameter, the noisy binary search can be applied to find an unbiased estimator in the minimum contrast estimation. One could apply the same bias correction technique with some other initial estimator that is based on a different thinning strategy. The probability of observing a departure was known in our examples but could be estimated from the data as well.
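The noisy-binary-search idea can be illustrated with a generic sketch. This is not the paper's Algorithm 1: it only shows the technique under the assumption that the expected value of the noisy objective is monotone in the parameter, and the toy objective, function names and tolerances are all our own.

```python
import random

def noisy_binary_search(noisy_f, target, lo, hi, steps=30, reps=200):
    """Find x with E[noisy_f(x)] = target on [lo, hi], assuming the
    expectation is increasing in x. Each bisection decision averages
    `reps` noisy evaluations so that a wrong step is unlikely."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        estimate = sum(noisy_f(mid) for _ in range(reps)) / reps
        if estimate < target:
            lo = mid   # expectation too small: search to the right
        else:
            hi = mid   # expectation too large: search to the left
    return 0.5 * (lo + hi)

# Toy objective: E[f(x)] = x / 2 with Gaussian noise, so the root of
# E[f(x)] = 0.4 is x = 0.8.
random.seed(0)
f = lambda x: x / 2 + random.gauss(0.0, 0.05)
x_hat = noisy_binary_search(f, 0.4, 0.0, 2.0)
```

In the paper's setting the noisy objective would be a simulated expectation of the initial estimator as a function of the candidate mean lifetime; because the exponential model has a single parameter, one-dimensional bisection of this kind suffices.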
We focused on the problems caused by partially observed departures and did not consider misclassification of arrivals in this paper. Further challenges arise in experiments where a large proportion of the arrivals cannot be reliably separated from the background (as defined in Section 2.1). There is also a risk in some experiments that some background events are mislabelled as departures. Whether good estimators can be derived for these situations is an open question.
The simulation results reveal an interesting relation between the arrival rate and the standard error of the estimate. If the activity time is fixed (the setup of the second simulation experiment), the arrival rate should be either equal to the decay rate or very large. The latter case, however, has some limitations. First, very large arrival rates are usually unreachable in real-world experiments. Second, it is crucial that the follow-up period is longer than the activity period. Third, increasing the activity time may have only a very small impact on the accuracy of the estimates after a saturation point has been reached, as seen from Figure 6. Therefore, if any prior information on the mean lifetime is available, it seems that an optimal design should have an arrival rate equal to the expected decay rate. The theoretical justification for this empirical observation remains an open problem.
In nuclear physics, the estimation of the mean lifetime of a particle may be only one of multiple goals in an experiment. This means that the arrival rate might be far from optimal for a particular estimation task. Fortunately, the proposed estimation method can be applied with a wide range of arrival rates. This enables using the data from all DSSD strips. Compared to approaches that utilize only the strips where the arrival rate is sufficiently low for the matching of arrivals and departures, the proposed estimation method easily multiplies the amount of available data. As a consequence, the estimates will be more accurate, or the required accuracy can be achieved in a shorter experiment.
The estimator (A1) can be written in the form (A2). The estimator is unbiased for μ because R_i is independent of a_i and x_i. To see that the estimator is not consistent, we consider the variance of estimator (A1), starting from the form (A2). Now the first and the third expectations are positive, and the second expectation converges in probability to 0 by Slutsky's theorem when n approaches infinity. We therefore consider only the first expectation, which simplifies because the independence of R_i and R_j, i ≠ j, implies E[(2R_i − 1)(2R_j − 1)] = 0, and the independence of a_i and R_i allows us to factorize the first term. If the sequence a_1, a_2, …, a_n increases linearly on average, the variance increases as a function of n. As a demonstration, let the time between subsequent arrivals be a constant μ_a so that a_i = μ_a i. Then the expression increases linearly as a function of n, and we conclude that estimator (A1) is not consistent because its variance increases as a function of n.

In practice this means that estimator (A1) cannot be used. Instead, Section 4 describes an approach where the complete data estimator (4) is applied with a bias correction.
A.2 | Expected number of departures after the activity period

We derive formula (8) for the expected number of post-activity departures. Let t be the activity time and assume that the follow-up continues without limits. The arrivals follow a Poisson process with rate λ_a and the lifetimes are exponentially distributed with mean μ. For an arrival at time a, the corresponding departure occurs after t with probability P(d > t | a) = exp(−(t − a)/μ).

As the arrivals are uniformly distributed on the interval [0, t] and independent of the lifetimes, we obtain the unconditional probability

P(d > t) = (1/t) ∫₀ᵗ exp(−(t − a)/μ) da = (μ/t)(1 − exp(−t/μ)). (A3)

Formula (8) is obtained when Equation (A3) is multiplied by λ_a t, the expected number of arrivals, and it is taken into account that only half of the departures are observed.
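The closed form of the integral in (A3) can be double-checked numerically, for instance with a midpoint-rule approximation. A minimal sketch (the function names are ours):

```python
import math

def post_activity_prob_closed(mu, t):
    # Equation (A3): P(d > t) = (mu/t) * (1 - exp(-t/mu))
    return (mu / t) * (1.0 - math.exp(-t / mu))

def post_activity_prob_numeric(mu, t, n=100_000):
    # Midpoint-rule approximation of (1/t) * integral_0^t exp(-(t - a)/mu) da
    h = t / n
    return sum(math.exp(-(t - (i + 0.5) * h) / mu) for i in range(n)) * h / t

closed = post_activity_prob_closed(1.0, 2.0)
numeric = post_activity_prob_numeric(1.0, 2.0)
```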

FIGURE 2 The quantile-quantile plot of Dataset 1 and the fitted exponential distribution contrasted with Dataset 3C. The pseudo-lifetimes for Dataset 3C are obtained by applying the thinning and matching explained in Section 5.

FIGURE 3 An illustration of the process of unmatched arrivals and departures when on average half of the departures are missing. Arrows pointing upward represent arrivals and arrows pointing downward represent departures. The activity period ends somewhere between a_6 and d_3 and the follow-up period ends a long time after the last observed departure d_4. (a) The original data containing six arrivals and four departures. (b) The data after the division into blocks and the thinning of the arrivals. The vertical dashed lines show the cut-points between the blocks.

TABLE 1 The bias and the standard error of the estimates of the mean lifetime as functions of the arrival rate when the expected number of arrivals is fixed to 2000.

TABLE 2 The bias and the standard error of the estimates of the mean lifetime as functions of the arrival rate λ_a in the second simulation experiment where the activity time is fixed to 2000.

TABLE 3 The bias and the standard error of the estimates of the mean lifetime as functions of the expected number of arrivals for various arrival rates λ_a.