Dedicated diffusion phantoms for the investigation of free water elimination and mapping: insights into the influence of T2 relaxation properties

Conventional diffusion‐weighted (DW) MRI suffers from free water contamination due to the finite voxel size. The most common case of free water contamination occurs with cerebrospinal fluid (CSF) in voxels located at the CSF‐tissue interface, such as at the ventricles in the human brain. Another case refers to intra‐tissue free water as in vasogenic oedema. In order to avoid the bias in diffusion metrics, several multi‐compartment methods have been introduced, which explicitly model the presence of a free water compartment. However, fitting multi‐compartment models in DW MRI represents a well known ill conditioned problem. Although during the last decade great effort has been devoted to mitigating this estimation problem, the research field remains active.


| INTRODUCTION
Diffusion-weighted (DW) MRI is a widely used, non-invasive technique for measuring water molecular diffusion in biological tissue. 2 The most common use of DW MRI in clinical practice is based on the observation of DW images without further processing. Due to its simplicity, diffusion tensor imaging (DTI) currently remains the most frequent method used for clinical analysis of DW MRI data. 3 During the last two decades, DTI metrics have been extensively used in the study of several neurodegenerative diseases, 4 brain neoplasms 5 and stroke, 6,7 as well as in studies relating to brain development and ageing. 8,9 DTI metrics can only be considered to be tissue specific if the voxel under analysis contains a single type of tissue. In cases when voxels contain more than one tissue type, DTI metrics integrate the microstructural characteristics of all tissue types within the voxel. This phenomenon is normally referred to as the partial volume effect (PVE), which is ubiquitous in DW MRI at conventional resolutions on clinical scanners (2 3 -3 3 mm 3 ). 10,11 In particular, the PVE with cerebrospinal fluid (CSF) represents a serious limitation for DTI.
From the point of view of particle diffusion, CSF behaves as free water. That is, as water molecules do not experience restrictions, the apparent diffusion coefficient (ADC) is approximately equal to 3 μm 2 /ms at 37 C, ie almost four times larger than the mean ADC observed in human brain tissue. 12 As a consequence, even small amounts of CSF within the voxel can strongly bias the tissue-specific DTI parameters. In particular, the ADC becomes elevated, whereas the corresponding fractional anisotropy (FA) is reduced. 10 Free water contamination in DW MRI can happen in two ways. CSF surrounds the brain parenchyma and is also confined inside the ventricles. Therefore, DTI parameters for voxels located at the CSF-tissue interface will be affected. In this case, the PVE depends on the positioning of the imaging grid and also on the size of the structure under investigation. 13 Thus, small structures (eg the fornix) will be more affected by CSF contamination compared with larger structures. 14 Another type of free water contamination refers to the presence of intra-tissue water compartments sufficiently large to reflect free diffusivity, which has been observed, for example, in the case of vasogenic oedema. 15,16 There are two kinds of approach for the reduction of free water contamination in DW MRI, namely, sequence based and model based. The most common sequence-based approach is FLAIR DW MRI. 17 This sequence makes use of the fact that the T 1 of CSF is longer than that of tissue, allowing one to choose an inversion-time such that the CSF signal is cancelled at the echo-time (T E ). However, this sequence suffers from a high specific absorption rate and longer acquisition times. 16 Another approach avoids an increase in acquisition time, compared with conventional settings, 18 by using shorter repetition times and a slice acquisition reordering to reduce the signal of the CSF. However, a common limitation of these sequences is the loss of signal-to-noise ratio compared with conventional settings (28% and 36%, respectively 18 ).
The model-based approaches do not require a priori specially designed pulse sequences, but they do need to explicitly model the existence of a free water compartment within the voxel. Pierpaoli and Jones 15 initially noticed that vasogenic oedema and CSF have similar diffusion properties and therefore contribute to the total DW MRI signal in a similar manner. In order to obtain tissue-specific DTI parameters, they proposed the use of a two-compartment model, with one of the compartments referring to free water and the other to tissue. The free water compartment is characterised by isotropic, Gaussian diffusion with a diffusion coefficient equal to that of free water at the corresponding temperature, and the tissue compartment characterised by restricted diffusion, described by a diffusion tensor. 15,19 However the parameter estimation problem in twocompartment models is generally ill conditioned, 20,21 and in some circumstances ill posed. 16 In order to overcome this limitation, Pierpaoli and Jones proposed the acquisition of several diffusion weightings (b-values) in the range [0, 1.2] ms/μm 2 . Although this approach mitigates the ill conditioned problem, it can greatly increase the acquisition time well beyond clinical constraints. Thus, several approaches have been proposed to optimize the b-values, the number of shells and the number of field gradient directions per shell, using simulations 21,22 or a minimization of the Cramér-Rao lower bound of the parameters of interest. 23,24 Several works have also been devoted to avoiding the ill posed estimation problem in the case of single-shell experiments. Pasternak et al 16 initially proposed adding constraints to the model and a spatial regularization of the tissue diffusion tensor. This approach was shown to reduce the degeneracy of the estimation problem. However, it has been recently shown that such regularization does not fully cancel the degeneracy of the problem. 25 Another method using subject-based constraints in the mean (MD) or axial diffusivities showed some advantages for the investigation of small structures, such as the fornix, due to the lack of spatial regularization of the diffusion tensor. 14 So far, free water elimination (FWE) DTI models have been shown to reduce bias in DTI metrics and fibre tractography 13,14,19,26 and have also improved test-retest reproducibility. 27 Moreover, the free water volume fraction map has been shown to be a sensitive biomarker in vasogenic oedema 16 and neurodegenerative diseases such as dementia, 28 Alzheimer's disease 29 and Parkinson's disease. [30][31][32] Although the aforementioned methods stabilize the parameter estimation problem, imposing these model assumptions leads to the risk of producing biased results. Recently, there has been a great effort to make the estimation problem well conditioned and to stabilize the solution without priors and regularizations. 23,24,[33][34][35] In particular, it has been demonstrated that, due to the difference in the T 2 of free water and tissue, the addition of a T 2 attenuation term to each of the compartments and the use of different T E values leads to a more accurate, precise and robust estimation of the model parameters. 36 The underlying idea of this approach is inspired by the broadly studied multidimensional NMR methods used in the investigation of porous media and tissue. [37][38][39][40][41][42] However, there is a fundamental difference between the aforementioned correlation methods, which are based on the inversion of the Laplace transform, and the FWE model with explicit account of T 2 attenuation (FWET 2 ), in which the number of physical compartments is defined a priori. 36 Moreover, the difference in T 2 of the different compartments leads to a T E dependence of the relative free water fraction if the difference in T 2 is not explicitly modelled. 36,43 Diffusion phantoms have become an indispensable tool for a broad set of applications, including the calibration of diffusion pulse sequences, the optimization of tractography algorithms and the validation of theoretical diffusion models, data post-processing techniques and numerical simulations. Existing phantoms with independently known diffusion properties are based on simple liquids 44 or liquid solutions/mixtures 45 that are useful for the calibration of field gradients, for quality assurance of various scanner conditions, for inter-scanner reproducibility etc. However, these phantoms are too simplistic and do not reflect most features of diffusion in brain tissue. On the other hand, when the complexity of a phantom microstructure is increased in order to match it to the enormous complexity of in vivo brain tissue, the possibility of deriving its diffusion and relaxation properties based on the microstructural properties decreases. Thus, although the terms "gold standard" or "ground truth" are often used in the jargon, 46,47 they refer rather to the possibility of verifying specific features of interest addressed by the suggested models/methods under investigation, while the lack of "true" gold standard phantoms is generally recognized by the community. 48,49 The requirements of physical phantoms, therefore, mostly relate to the needs of a specific problem and should fulfil the following criteria 46 : a. exhibits the microstructural property of interest affecting the measured signal; b. is well characterised in terms of its microstructural properties; c. is easy to assemble and reproducible; d. is stable and non-toxic.
An excellent discussion on this topic can be found in a recent review by Fieremans and Lee. 46 Given the great attention that FWE has gained during recent years, here we aim to introduce the design of two artificial, anisotropic diffusion fibre phantoms and to characterise their NMR properties in the framework of FWE models. In particular, we investigate the recently introduced FWET 2 . 36 One of the phantoms is designed to mimic the first case of free water contamination described here, ie voxels at the CSF-tissue interface. This phantom contains sections of different thicknesses of parallel fibre bundles, separated by compartments of free water of different sizes.
The second phantom is based on our previously published prototype 50 and is intended to mimic the situation of PVE with intra-tissue free water compartments. The phantoms were constructed using Dyneema polyethylene fibres (SK75 dtex1760, DSM, Geleen, The Netherlands), which have been shown to be useful in mimicking some features of water diffusion in brain white matter. [51][52][53][54][55][56] Given the highly complex process of deriving ground truth parameters based on microstructural features, we instead construct "reference truth" maps of the model parameters using the median of each parameter of interest, taken over several scan sessions acquired at two different sites. Therefore, in order to ensure the stability of the phantoms, we first investigate the variability of both intra-site (several experiments at a single site) and inter-site (experiments at two sites) model parameters. 1 We expect our prototypes to benefit the design and optimization of experimental protocols for DW MRI, as well as the development of data pre-and post-processing methodologies.

| THEORY
Within the diffusion tensor distribution (DTD) framework, 57 the MRI signal can be regarded as the result of the contribution of several domains characterised by a transverse relaxation time T 2 and a diffusion tensor D (we neglect longitudinal relaxation effects throughout this work). The total MRI signal is thus expressed in terms of the distribution P(T 2 , D) and the response kernel K(T E , b, T 2 , D), according to 58 where S 0 is the proton signal, b is the diffusion encoding tensor 59 and Ω is the integration domain for D.

| FWE DTI
Conventional FWE DTI is based on a two-compartment model in the slow-exchange limit. The signal for the free water compartment is modelled as an isotropic, Gaussian diffusion signal, whereas the signal for the tissue compartment is described by an anisotropic, DTI-like signal. 15,16 The total signal within the voxel is written as follows: where S 0 0 = S 0 exp − T E =T 2 ð Þis the T 2 -weighted (T 2 W) proton signal, f and D w are the relative fraction and diffusion coefficient for the free water compartment and D t is the second-rank symmetric, positive-definite diffusion tensor for the tissue compartment. The fraction f in the case of compartments with different transverse relaxation times depends on the echo-time, T E (see below). Within the DTD framework, this model is is the Dirac delta function and I is fibre volume fraction on the length-scale of a typical voxel size is expected to be homogeneous. Thus, depending on the position of the imaging grid and voxel size, one can generate data sets with different degrees of PVE. A similar phantom has been published in the literature, ie containing sections of fibre bundles of different thicknesses. 11 In contrast to the cylindrical design of the phantom in Reference 11, the sections of fibre bundles in our design are straight and can, therefore, be positioned at a given angle with respect to the external magnetic field. This, in turn, enables the susceptibility-induced, background gradients to be controlled. 61

| Phantom 2
This is a multi-section phantom, similar to the one published in our previous work 50 ( Figure 1B). This phantom was constructed by stacking layers of fibres in alternating directions, crossing at right angles. The thickness of each layer lies in the range from approximately 180 μm to 220 μm. As a result of this configuration, a free water compartment of planar structure between the layers with increasing volume is formed (from left to right in Figure 1B). Therefore, several compartments will be present within a voxel, thus producing the desired intra-tract PVE free water contamination.

| MRI experiments
Experiments were performed with a 3 T Siemens Prisma scanner (Siemens, Erlangen, Germany) and a 64-channel receive coil provided by the manufacturer. A 2D DW spin-echo pulse sequence with monopolar pulse field gradients and EPI readout was used. Two types of experiment were performed: DW and T 2 W, both using coronal slices. The protocol parameters for the DW experiment were T E = 70 ms,   50 The section of interest for this work (red ROI) consists of nearly parallel fibre layers, which go from an area of high packing density (left-hand side) to an area of low packing density (right-hand side). In addition, the phantom contains two sections with fibres crossing at right angles and spatially constant volume fraction, and one section of parallel fibres with spatially constant volume fraction. These two fibre configurations are not considered in this work For the analysis of intra-and inter-site variability, the former protocol was performed on the same phantom three times at "Site 1" and three times at "Site 2". Both sites were equipped with a 3 T Siemens Prisma scanner.

| Model parameter estimation
All images were processed using in-house MATLAB scripts (MATLAB 2015a, MathWorks, Natick, MA, USA). In order to avoid a T 1 modulation of the signal due to different T R values between the two experiments, we normalized the signal of the T 2 W experiment, S T2W T E ð Þ , as FWET 2 parameter estimation was performed in two steps.
i. In the first step, DTIT 2 model parameters were estimated by fitting Equation 3 to the whole experimental data set (DW and T 2 W), in order to generate an initial guess for T 2,t and D t . Additionally, DTIT 2 parameters in the bulk area of the phantom were used to assess T 2,w and D w . The estimation of DTIT 2 parameters was performed via weighted linear least squares (WLLS). 62 The WLLS estimator for the DTIT 2 model parameters, θ DTIT2 = lnS 0 ,D 11 , D 12 , D 13 ,D 22 , D 23 , D 33 ,1=T 2 ½ T , is written as follows: where y (N × 1) is a column vector whose elements are given by the natural logarithm of the N measured signals, ie y i = lnS i (i = 1,…,N), and the weighting matrix, w (N × N), is a diagonal matrix whose elements are given by w i,i = S 2 i . The ith row of the design matrix, X (N × 8), is written as ii. In the second step, FWET 2 model parameters, θ FWET2 = S 0 ,f w , D 11 ,D 12 , D 13 , D 22 , D 23 , D 33 , T 2,t ½ T , were estimated via constrained non-linear minimization of the least-squares estimator, which is written as follows: where S θ FWET2 ð Þis the signal model. The minimization of Equation 7 was performed using the function fmincon available in MATLAB using the interior-point algorithm, subject to the following linear and non-linear constraints: The minimization process in this step was initialized using the values of T 2 and D generated in Step (i) and f w = 0.1. The free water parameters, D w and T 2,w , estimated in Step (i) in the bulk of the phantoms were fixed during minimization.
FWE model parameter estimation was also carried out following Steps (i) and (ii), with the exception that, in this case, the parameters were estimated using only the DW data set.
DTI parameters FA, MD and radial diffusivity (RD) were evaluated for each case based on the corresponding diffusion tensor as described elsewhere. 64 In the case of voxels containing only free water (f w = 1), the tissue diffusion tensor can erroneously fit the signal from the free water compartment. In this case, given that both the free water and the tissue compartment can fit the total signal equally well, the free water fraction can take any value between 0 and 1, as explained in References 21 and 22. Therefore, the analysis of voxels for which f w ≥ 0.95, λ 2,3 ≥ 2.0μm 2 /ms or T 2,t ≥ 2.4 s was skipped.

| Proton density and fibre volume fraction
In order to assess the relative proton density, PD, we first computed the intensity bias field from the parameter S 0 in Equation 4 following the method in Reference 65 . Then S 0 was normalized using the calculated intensity bias field (so that PD = 1 in the bulk). The fibre volume fraction, ϕ, in the case of a voxel containing a single compartment is simply given by ϕ = 1 − PD. However, if the voxel contains a free water compartment with a relative fraction f w , as described by Equation 4, the true fibre volume fraction at the fibre compartment can be shown to be provided that f w < 1 ( Figure 4E).

| Reference truth maps and intra-and inter-site variability
Reference truth DTIT 2 and FWET 2 model parameter maps were constructed by taking the voxel-wise median of all time points and sites.
On the other hand, voxel-wise deviation maps were assessed via the interquartile range, ie the distance between the first and the third quartile. The main underlying assumption of this approach for constructing reference truth maps, which integrates information from experiments from different sites, is that the inter-site variability is negligible and the main contribution to between-sessions variations is of intra-site nature. In other words, there is no site bias. 1 This will be discussed in detail in Subsection 5.3 in the context of the current results.
Variability of DTIT 2 and FWET 2 model parameters, both intra-and inter-site, was investigated following the framework proposed by Walker et al. 1 It assumes that experiments have been performed at p = 1, ..., P sites and q = 1, …, Q p time points (here P = 2 and Q p = 3 for p = 1, 2; see Appendix A for details) and includes the following steps.
i. Outlier identification for all DTIT 2 and FWET 2 parameter maps by subtracting a given map at site p and time point q, from the corresponding median taken over all sites and scan sessions. Significant deviations from the median are considered to be outliers.
ii. Evaluation of intra-and inter-site variability maps, σ intra and σ inter (Equations A1, A2), as well as the intra-and inter-class correlation coefficients, ICC intra and ICC inter (Equations A3, A4), for all model parameters.
Before the analysis of variability was performed, all FA p,q maps from DTIT 2 were coregistered to FA 1,1 (Site 1, Scan Session 1) with the help of the toolkit FLIRT available in FSL, 66,67 using the rigid body model. The resulting transformation matrix was subsequently applied to the remaining parameter maps.
It should be noted that, in order to avoid any site and/or scan session bias in MD and T 2 maps due to possible differences in the room temperature, we normalized each of these maps by the corresponding bulk value of each scan session, before the variability analysis and the construction of reference truth maps. In particular, the leftmost section, which is 2 mm thick, is hardly visible in the DTIT 2 FA map (yellow arrows in Figure 2A and 2D). However, the PVE with free water in the parameters can be reduced when the FWET 2 model is used. See, for instance, the yellow arrow in Figure 2B and 2E,

| Phantom 1
where the diffusion properties for the same section appear more similar to those voxels unaffected by PVE. Note that voxels containing only free water were not analysed with FWET 2 and are displayed in black (see Subsection 3.3).
The profiles of MD, FA, T 2 and f w taken along the blue, horizontal line indicated in Figure 2B are shown in Figure 2C  , MD and T 2 maps were normalized to the corresponding bulk value of each scan session, so as to avoid any bias due to possible differences in the room temperature. Voxels where f w ≥ 0.95, λ 2,3 ≥ 2.0μm 2 /ms or T 2,t ≥ 2.4 s (ie bulk water) were not considered in the analysis and are therefore not displayed for FWET 2 interface between the section of fibres and the bulk water. The presence of a relative fraction of free water strongly affects the estimated diffusion and relaxation properties of the section of fibres. However, FWET 2 allows the PVE in these parameters to be reduced, given that a more constant profile is expected, based on the homogeneity of the fibre volume fraction in the length-scale of the sections of fibre bundles. Therefore, any major change in the diffusion or relaxation characteristics is expected to be due to free water contamination.
It is worth mentioning that, in the construction of the reference truth maps and the corresponding profiles, those voxels located at the interface between the bulk water and sections of fibre bundles reflect not only variations due to the intrinsic noise of the imaging protocol and the phantom itself, but also variations introduced by the position of the imaging grid, which changes from session to session in a random manner and therefore increases the parameters' variability.  Figure 3A. That is, MD and T 2 increase, whereas FA decreases. However, these changes become less pronounced in the corresponding maps from FWET 2 . In particular, the transverse relaxation time in the fibre compartment is visually constant, within the large standard deviation of this parameter (see grey shaded areas in Figure 3C).

| Phantom 2
In order to better visualize these changes, we evaluated spatial profiles for each map, shown in Figure 3C (single session) and 3F (reference truth), for the DTIT 2 (green) and FWET 2 (orange) parameters. The abscissas refer to the horizontal position in the blue ROI. Solid lines and grey, shaded areas in Figure 3C denote the mean and standard deviation of a given parameter taken along a vertical stripe in the blue ROI (where f w was observed to be approximately constant). Thus the grey, shaded areas in Figure 3C show the variations of a given parameter along each vertical stripe in the blue ROI. In contrast, solid lines and coloured, shaded areas in Figure 3F correspond to the mean along a vertical stripe in the blue ROI of the first quartile map (lower line of the coloured, shaded area), median map (central line) and third quartile map (upper line). Therefore, in this case, the colour-shaded areas in Figure 3F denote the variations of the phantom parameters across the six scan sessions. A comparison of the profiles of DTIT 2 versus FWET 2 parameters demonstrates that a change in f w is accompanied by changes in DTIT 2 metrics. However, these changes are less pronounced when FWET 2 is used. The change in MD and FA from FWET 2 ( Figure 3B and 3E), which allows one to more accurately assess the diffusion and relaxation properties of the fibre compartment, suggests that there is a decrease in the fibre volume fraction of the fibre compartment (from left to right). However, this change in the fibre volume fraction does not produce a significant change in T 2,t ( Figure 3C and 3F). Therefore, the observed change in T 2 from DTIT 2 is mainly due to the increase in the size of the free water compartment. This feature is further analysed in the following subsection. In Phantom 2, the spatial change in the size of the free water compartment corresponds to an increase of PD (from left to right in the blue ROI in Figure 4C) and a slight decrease of ϕ. This is clearly shown in the profiles plotted in Figure 4D for PD (orange), f w (blue) and ϕ (grey). The increase in f w is accompanied by an increase in the total PD, whereas a decrease of approximately 7% (0.78 to 0.73) is observed in ϕ. Although the change in ϕ is rather small, it is seemingly enough to induce changes in the diffusion properties, as shown in Figure 3C and 3F, but not observable changes in T 2,t . For a quantitative comparison, RD/D w and FA from FWET 2 (ie in the fibre compartment) are plotted against ϕ in Figure 5A and 5B. Additionally, the theoretical predictions for the RD in a system of parallel cylinders in the long-time limit (Equations B1-B3) under different conditions of fibre volume fractions and packing geometries (square and hexagonal lattices, see Appendix B for details) are also plotted. One can see that, although the arrangement of the fibres in the phantom follows neither the square nor the hexagonal lattices but is rather random, the

| f versus f w
In order to evaluate the effects of accounting for differences in the transverse relaxation times between the water and fibre compartments on the estimation of the free water fraction, we exploit the fact that we have access to a range of free water volume fractions in Phantom 2 and  Grey colour refers to fibres, whereas light blue and dark blue refer to water in the restricted and free water compartments, respectively transverse relaxation times for in vivo brain CSF, T 2,w = 500 ms, 68 and tissue, T 2,t = 60 ms at 3 T, 69 and T E = 80 ms. In this case, f can be up to 75% larger than f w .

| Inter-site and intra-site variability
The outlier identification step did not reveal outliers in any of the six scans (see Supplementary Figure S1). Figure 7 summarizes the intraand inter-site variability maps, σ intra and σ inter (Equations A1 and A2), as well as ICC intra and ICC inter maps (Equations A3 and A4), for all DTIT 2 and FWET 2 parameters. All intra-site variability maps show significantly higher intensity values than the inter-site ones. This behaviour is more pronounced in the ICC parameters, where ICC intra shows much higher values than ICC inter . In the case of Phantom 1, one can also observe that both variability values are higher in voxels located at the edge of the sections of fibre (red arrows) compared with those voxels unaffected by PVE. This is not the case in Phantom 2, where both variability and ICC parameters show a fairly homogeneous distribution in the blue ROI.

| DISCUSSION
In this work, we introduced the design, and demonstrated the use, of two anisotropic, diffusion fibre phantoms for the investigation of FWE models. In particular, we investigated the recently introduced FWE DTI model, which explicitly takes into account differences in the transverse relaxation times between compartments (FWET 2 ). Our phantoms fulfil the four criteria described in Section 1, as follows.
a. 'Exhibit the microstructural property of interest affecting the measured signal'. The effect of both types of free water contamination on the DW and T 2 W signal described in Section 1 was successfully mimicked. The diffusion-resolution phantom (Phantom 1, Figure 1A) was designed to mimic the PVE occurring in voxels at the CSF-tissue interface. The multi-section phantom (Phantom 2, Figure 1B), on the other hand, was c. 'Is easy to assemble and reproducible'. Although the fibres were manually wound around the Perspex platforms, given the design and dimensions of the platforms ( Figure 1) the assembly and reproducibility of both phantoms is relatively easy. However, higher levels of reproducibility, especially regarding fibre packing density, could potentially be achieved using a winding machine as in References 11 and 70 d. 'Is stable and non-toxic'. One of the advantages of artificial fibres such as polyethylene is that, given its hydrophobic properties, they are more stable than biological tissue or hydrophilic materials. 52,71 The phantoms presented in this work are constructed using only polyethylene fibres, Perspex platforms and distilled water. Therefore, the absence of any biological contamination can be guaranteed with the help of a preservative (eg NaN 3 ). In particular, our experiments span a time line of 564 days, and during this period our variability study did not show significant bias in the model parameter maps.

| Phantom 1
The diffusion-resolution phantom ( Figure 1A) was conceived to investigate the CSF contamination and the elimination of the bias in diffusion models in voxels located at the CSF-tissue interface. The causes can be on the one hand random processes, such as the positioning of the imaging grid, and on the other hand characteristics relating to the dimensions of the object being imaged relative to the voxel size. The geometrical configuration of the phantom includes five sections of fibre bundles of different thicknesses separated by water and Perspex (noise) compartments of different sizes. We observed that DTIT 2 parameters were heavily affected by free water contamination in voxels located at the bulk-fibre interface, whereas FWET 2 parameters showed a more constant behaviour within each section of fibre bundles. This is clearly seen in the grey areas in the plotted profiles in Figure 2C and 2F. In particular, the leftmost section (2 mm thickness), which was almost invisible in the conventional FA, was completely recovered in FWET 2 FA. Moreover, a comparison of DTIT 2 and FWET 2 demonstrates that not only diffusion properties are affected by free water contamination, but also relaxation properties can be significantly altered. It should be noted that voxels presumed to be unaffected by PVE with free water, ie not positioned at the edge between the section of fibres and the bulk water, still show f w values between~0.25 and~0. 35. This is most likely due to the presence of free water pockets formed by the imperfect packing of the fibre bundles. This, in turn, is a result of the fact that the fibre bundles (~700 single fibre strands) do not have a square cross-section (which would be required to have a perfect bundle packing), but rather have amorphous crosssections. This feature can be seen in both single scan and reference truth maps.
These observations have important implications for in vivo DW MRI, given that CSF contamination is particularly crucial for some brain structures, such as the fornix, due to its small size and proximity to the ventricles. 13,14 In such cases CSF, contamination induces bias in DTI metrics that becomes a confounding macroscopic effect entangled with changes in the tissue microstructure, such as white matter atrophy. 13 Hence, the elimination of this bias becomes paramount. In this regard, our Phantom 1 could be utilized to optimize, among other things, the voxel size in order to minimize or eliminate CSF contamination in small structures.
The presence of thin sections of fibre bundles and small water compartments allows one to investigate more subtle processes leading to free water contamination. For instance, image distortions induced by eddy-currents depend on the direction of the field gradient. Therefore, a failure F I G U R E 6 Orange circles and light blue bars depict the mean and standard deviation of f from FWE (equation 2) versus f w from FWET 2 (equation 4), calculated using a clustered mask based on the map of f w , using eight equally spaced bins in the range [0.05, 0.7]. Equation 5 is plotted for comparison using T E = 70 ms and T E = 500 ms and the mean transverse relaxation times for the bulk T 2,w = (2.53 ± 0.15) s and fibre T 2, to fully correct these distortions will induce a small contamination dependent on the direction of the field gradient. Another case for which Phantom 1 is potentially useful is in the investigation of Gibbs ringing artefacts in DWI, which have been shown to strongly bias diffusion metrics in other models. 72 However, the study of such effects is beyond the aim of this work.

| Phantom 2
The previously introduced multi-section fibre phantom 50,73 includes a section that consists of parallel fibre bundles, conforming layers ( Figure 1B), stacked in alternating directions. As a result, a free water compartment of variable volume is generated between the layers of fibres (volume increasing from left to right in Figure 1B). This feature results from the fact that each fibre bundle is made out of This intra-bundle free water compartment mimics, although in a simplified manner, the situation encountered in intra-tissue free water contamination, in the sense that there are several free water compartments within a single voxel. We have assessed this feature using the FWET 2 model, which indeed shows that the relative free water fraction f w spans approximately the range [0.27,0.62] in the investigated slice ( Figure 3). As a consequence of this increase in the size of the free water compartment, conventional DTIT 2 parameters show a strong free water contamination. For a given voxel in this phantom, f w is determined by the thickness of the fibre layers, the distance between the layers and their relation to the voxel dimensions ( Figure 1B). The thickness of the layers and their separation are spatially variable and, as a consequence of this variation, f w is also spatially dependent. However, as the positioning of the imaging grid is random in nature and also influences the free water fraction, the values of f w cannot be a priori determined.
In Figure 3 we showed that MD, FA and T 2 from DTIT 2 show a spatial change that correlates with the estimated increase in the free water fraction. Yet the actual diffusion properties at the fibre compartment (ie estimated with FWET 2 ) also show a spatial change. Therefore, one can conclude that there is indeed a change in the fibre volume fraction at the fibre compartment, ϕ, which directly affects the diffusion properties.
Conversely, the transverse relaxation time of the fibre compartment did not show an observable change, which may be a consequence of the higher variance of this parameter compared with the diffusion parameters. This high variance is a result of the relatively short range of echo-times used in our experiments in relation to the long relaxation times.
We have taken advantage of the fact that we have also access to the PD in order to quantify the change in actual fibre volume fraction free volume theory 75 and the hydrodynamic approach. 76 The applicability of these models was previously tested in binary solvent-polymer solutions and gels, 77 and was shown to hold in dilute systems or, in some experiments, at moderate concentrations. However, they were shown to break down for concentrated solutions. In the case of diffusion in the space between parallel cylinders, the Maxwell Garnett equation ) arrangements. 79 However, Dyneema fibres are arranged in a random packing geometry, as previously observed using micro-CT. 51 This fact restrains an a priori derivation of the diffusion properties based on the geometrical configuration of the fibres.
Regarding the validity of the long-time approximation (Equations B1-B3), we observed that the root mean square (rms) displacement during the experiment, l t , for the fibre compartment ranges from l t ≈ 8 μm (high ϕ) to l t ≈ 11 μm (low ϕ), calculated using an observation time of 70 ms.
Similarly, one can calculate the structural length scale, l s , based on the values of ϕ, 52,80 to be of the order of l s ≈ 5 μm (high ϕ) to l s ≈ 6 μm (low ϕ).
This means that molecules travel a distance of the same order as l s in both cases, and therefore the long-time limit is not fully developed. However, one can observe the expected trend. The transverse relaxation time at the fibre compartment, in contrast, showed almost no change across the phantom, which means that the change in ϕ is still not sufficiently large so as to induce significant changes.
The fact that the phantom shows different transverse relaxation times between the bulk and the fibre compartments (T 2,w ≈ 2.6 s and T 2,t ≈ 0.7 s) allowed us to investigate the T E dependence of the free water volume fraction as predicted by Equation 5. Thus, we exploited another of the benefits of FWET 2 by showing that, in that case, the free water fraction can be overestimated compared with the T Eindependent free water fraction. This overestimation was fairly small in our experiments due to the fact that T E in our DW experiments (70 ms) is small compared with the values of T 2,w and T 2,t . However, the green line in Figure 6 demonstrates that the overestimation can potentially be much greater in in vivo experiments. Therefore, accounting for this effect becomes of paramount importance given the fact that several works in the literature have already shown f to be a useful biomarker in the assessment of different brain conditions, such as Parkinson's disease, [30][31][32] Alzheimer's disease 29 or vasogenic oedema. 16 In all of these works, the difference in transverse relaxation was not taken into account. Another condition that could be assessed a priori using FWET 2 is traumatic brain injury, which has been shown to involve the formation of vasogenic oedema. 81,82

| Reference truth maps and intra-and inter-site variability
With the purpose of constructing the reference truth maps, we first studied the variability of all DTIT 2 and FWET 2 maps in six scan sessions acquired at two different sites and with a time span of 564 days between the first and the last scan session. For the analysis of variability, we followed the framework published by Walker et al. 1 The first step of this framework consists in the detection of outliers using the distance from each scan session to the corresponding median calculated using all scan sessions. The median is a measure that, in contrast to the mean, is robust to outliers. Thus, in the case of a systematic difference, the presence of outliers produces only a small shift in the median value, allowing for easier identification of outliers when looking at the difference between the median and each single map. We did not observe significant outliers in our results ( Figure S1).
The analysis indicated that intra-site variability values were larger than inter-site ones for all model parameters. In an ideal situation, inter-site and intra-site variability values should be similar, indicating that there is no site bias and that the amount of variance is similar within each site and through all sites. 1 As a consequence, in an ideal case, ICC intra ≈ ICC inter ≈ 0.5. Conversely, one would expect greater ICC inter in cases where the mean value between sites differs, whereas ICC intra would be larger in those situations where the within-site variance differs between sites. The latter was the observed situation in our results for all ICC maps, ie ICC intra > ICC inter . Therefore, we conclude that the mean values between sites were rather similar (ie no site bias), which validates the framework we proposed for the construction of reference truth maps described previously.
The reference truth maps for all model parameters were constructed based on the former premise, using the median of all six scan sessions for each map. All reference truth maps were comparable to the corresponding single scan sessions (Figures 2-4). Moreover, we used the interquartile range (difference between the first and the third quartiles), which is also more robust to the presence of outliers than the standard deviation, to assess the voxel-wise uncertainty of the reference truth maps. As depicted in the corresponding profiles ( Figures 2C, 2F, 3C, 3F, 4B and 4D), the uncertainty range for all parameters was within reasonable ranges.

| Other considerations
Regarding the pulse sequence, in this work we have used the paradigm of linear diffusion encoding originally proposed by Stejskal and Tanner. 83 However, more advanced diffusion encodings, such as double or triple diffusion encodings 84 or arbitrary q-trajectory encoding, 59,85 have been recently suggested and could potentially prove useful tools in the quest for free-water elimination and mapping. 86 Concerning the fibre material for the construction of phantoms, an important issue requiring consideration is its magnetic susceptibility. Differences between the magnetic susceptibility of the fibres and the surrounding bulk liquid generate field gradients, which in turn produce image distortions due to the bias in the phase accumulated towards the end of the EPI readout. In the case of Dyneema fibres, the difference in magnetic susceptibility between the fibres and distilled water is~1 ppm. 61 Yet, as observed in our results, this difference does not lead to appreciable image distortions, compared with the distortions that one normally observes in vivo induced by the presence of air, which has a difference in the magnetic susceptibility of nearly 8.5 ppm compared with tissue. 87

| A brief, qualitative digression about exchange
One of the assumptions of FWE models is that water exchange between compartments is negligible. In this subsection we make some geometrical considerations of this situation.
In the case of Phantom 1, the rms displacement during the experiment for molecules in the bulk and fibre compartments is l w ≈ 17 μm and l t ≈ 9 μm (in the radial direction), respectively (using an observation time of 70 ms). During this time, only water molecules predominantly within distances l w (in the bulk compartment) and l t (in the fibre compartment) can reach the interface and exchange. A schematic representation of this F I G U R E 8 Qualitative representation of the fraction of water molecules within a voxel that have the probability to exchange in the different configurations for phantom 1 (A, B) and phantom 2 (C, D). Note that the dimensions of fibres and shaded areas are qualitative situation is shown in Figure 8. Green and yellow areas represent the layers with thicknesses l w and l t , respectively. One can approximate the fractions of water molecules that have the possibility to exchange as the fractions located within these layers. For a voxel side length of 2 mm and assuming that the bulk-fibre interface is parallel to the voxel side, for example, one can demonstrate that these fractions are 14% (green area, Figure 8A) and 2% (green area, Figure 8B) for the bulk compartment with f w = 0.2 and f w = 0.8, respectively. In contrast, for the fibre compartment, these fractions are equal to 2% (yellow area, Figure 8A) and 4% (yellow area, Figure 8B) for f w = 0.2 and f w = 0.8, respectively. These estimates justify the slow-exchange limit in Phantom 1.
The situation with Phantom 2 is more complex since each voxel contains alternating layers of fibres and bulk water ( Figure 8C and 8D). In this phantom, the rms displacement in the bulk compartments is l w ≈ 18 μm, whereas for the fibre compartment it ranges from l t ≈ 8 μm (high ϕ, Figure 8C) to l t ≈ 11 μm (low ϕ, Figure 8D). Conversely, the thickness of each layer of fibres was estimated to range from approximately 180 μm (high ϕ) to 220 μm (low ϕ), whereas the thicknesses of the bulk compartments ranged from approximately 80 μm (low f w ) to 200 μm (high f w ). Thus the fraction of water molecules that have the possibility of exchanging within the fibre compartments is 9% (yellow areas, Figure 8C) and 10% (yellow areas, Figure 8D), respectively. These fractions for the bulk compartments are 45% (green areas, Figure 8C) and 18% (green areas, Figure 8D), respectively. Therefore, based on this simplistic description, one can conclude that the effect of water exchange in Phantom 2 is a priori not negligible, especially in the region of high ϕ (low f w ).
For conventional DTI, an increase in the amount of exchange will cause an increase in bulk diffusivity and a decrease in FA. In the case of FWE models, it is expected to induce a bias in the estimation of f w and therefore in the tissue-specific diffusion tensor parameters. 16 Although water exchange has been extensively studied in the field of NMR, [88][89][90] open questions remain in relation to the exchange rate in the in vivo brain and its effect on the DW MRI metrics. 16,[91][92][93][94][95][96] In particular, special care has to be taken with vasogenic oedema, given that tissue permeability plays an important role. 16 Thus we expect our Phantom 2 to become a useful tool for the design, validation and optimization of experimental techniques for the assessment of exchange.

| LIMITATIONS
A limitation of the current experimental setup of the phantoms is the long transverse relaxation times observed in the bulk and fibre compartments (2.53 s and 0.76 s, respectively), which are much larger than the values observed in the brain tissue (500 ms for CSF and an average of 60 ms for brain tissue at 3 T). 68,69 Therefore, in order to observe a clear difference between the T E -dependent and T E -independent free water fraction, one needs large T E values. Doping the water with MgCl 2 Á6H 2 O (as in Reference 61 or with CuSO 4 in order to manipulate, 97 among other things, T 2 , needs further investigation and is in our future plans. Another possibility is to use agarose gel, which has been shown to drastically reduce T 2 . 98 However the construction process is challenging given that, due to the size of the agar chains, the solution barely fills the interstitial space via capillary force. One possibility requiring further investigation is the construction of the phantom while submerged in an agar bath.
Regarding the fibre material, one possibility is to use hydrophilic materials, such as rayon, 52,55 which induce strong transverse relaxation as a result of surface interactions.
Although our Phantom 2 spans nearly 40% of the whole range of f w , it still lacks the range [0,0.2], which is particularly important given that the free water fraction in the healthy, in vivo interstitial space is known to lie in this range. 99 This represents a limitation to our Phantom 2.
A limitation related to Dyneema fibres is that they are not hollow. Therefore, one can only mimic the water diffusion in the extra-cellular space. Other new materials though, which are able to reproduce both intra-and extra-cellular spaces, 46 could be used to construct phantoms with the same macroscopic characteristics as ours.
Regarding the estimation of the PD, a more exhaustive analysis should correct for the transmit field inhomogeneity and the receiver inhomogeneity separately, as demonstrated in several works in the field of quantitative water content mapping. 65,100,101

| CONCLUSIONS
We have introduced the design and characterised the NMR properties of two artificial, anisotropic diffusion fibre phantoms useful for the investigation of FWE models. The first phantom was designed to mimic the case of free water contamination occurring via PVE in voxels at the CSFtissue interface. The second phantom was used to reproduce the situation of free water contamination encountered in the intra-tissue case. Thus, investigation of FWE methods with such phantoms can be done on the basis of real experimental data. Moreover, the phantoms can be measured using clinical MRI scanners fitted with conventional receive coils.
We demonstrated the use of our phantoms to investigate the recently introduced FWE DTI model with explicit account of differences in the transverse relaxation times between compartments. Important diffusion and relaxation features, as well as the size of the free water compartment space, were captured by the FWET 2 model. Furthermore, we investigated the variability of different diffusion and transverse relaxation times in six scan sessions at two different sites, which indicated a negligible site bias. Based on this, we constructed reference truth maps for each of the model parameters using the median of all six scan sessions.
Our phantoms are expected to be useful tools for the design and optimization of experimental protocols for FWE models as well as for the development and improvement of different data post-processing pipelines. Moreover, our phantoms represent useful systems for the validation and optimization of sequence-based approaches for FWE.