Dense Data - Methods to Handle Massive Data Sets without Compromise

Objectives: Histamine iontophoresis with laser Doppler monitoring (HILD) is a robust, dynamic surrogate for the histamine-mediated microvascular response. We characterized histamine pharmacodynamics in adult participants using HILD; however, several distinct challenges had to be solved before subsequent data modelling could occur. This type of data collection presents a rare situation for pharmacometricians: an abundance of data that must be pared down to a useful and usable size while preserving its variability.

Methods: HILD data were obtained from 16 adults, as previously described (1), in a convenience sample for the evaluation of HILD. The data were aligned by applying a second-derivative function to identify the rise from baseline, the maximal effect, and, where possible, the return to baseline. A non-compartmental analysis and a non-linear mixed-effects model with a linked-effect PK/PD model provided estimates of the area under the effect curve (AUEC), the maximal response over baseline (EffmaxNT), and the time of EffmaxNT (Tmax), using Phoenix® WinNonlin version 6.2 (Pharsight, Mountain View, CA). ANOVA and regression analyses were used for sub-group comparisons in R statistical software.
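The second-derivative alignment step described above can be sketched as follows. This is a minimal illustration, not the authors' actual code: the function name, the smoothing window, and the use of the pre-peak curvature maximum as the rise onset are all our assumptions about one reasonable way to implement such an alignment.

```python
import numpy as np

def align_trace(signal, fs=40.0, smooth_win=81):
    """Shift one laser-Doppler trace so that t = 0 falls at the onset of
    rise from baseline, located via the second derivative.

    signal: 1-D array of flux measurements sampled at fs Hz.
    Returns (time, signal) with time (in seconds) zeroed at the onset.
    Illustrative sketch only; parameters are hypothetical.
    """
    # Moving-average smoothing: raw high-frequency flux is too noisy
    # to differentiate twice directly.
    kernel = np.ones(smooth_win) / smooth_win
    smoothed = np.convolve(signal, kernel, mode="same")

    # First and second numerical derivatives.
    d1 = np.gradient(smoothed)
    d2 = np.gradient(d1)

    # Maximal response is taken as the global peak of the smoothed trace.
    i_peak = int(np.argmax(smoothed))

    # Onset of rise: largest upward curvature before the peak.
    i_onset = int(np.argmax(d2[:i_peak])) if i_peak > 0 else 0

    t = (np.arange(signal.size) - i_onset) / fs
    return t, signal
```

Once each subject's trace carries a time axis zeroed at its own onset, traces recorded with arbitrary recorder start times can be overlaid and compared directly.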

Results: Distinct histamine response phenotypes were identified among the adult participants after data sampling. This method of data handling is needed because the HILD technique itself generates data at 40 Hz, producing nearly 30,000 measurement/time pairings over a 2-hour run. In addition, because the data stream begins when the recorder starts rather than at a directly timed event, aligning data across multiple subjects poses rarely seen logistical challenges. Data reduction revealed no change in time to peak response after alignment, and intra-individual variability was preserved.
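One simple reduction strategy consistent with the stated goal, paring the trace down while preserving intra-individual variability, is block-wise decimation that carries a within-block dispersion measure alongside each retained point. The sketch below is our assumption of such an approach; the function name and the choice of block means with standard deviations are illustrative, not taken from the abstract.

```python
import numpy as np

def decimate_preserving_variability(signal, fs=40.0, out_fs=1.0):
    """Reduce a dense trace from fs Hz to out_fs Hz using non-overlapping
    block means, keeping the within-block standard deviation alongside
    each point so intra-individual variability is not discarded.
    Illustrative sketch only.
    """
    step = int(fs / out_fs)            # samples per output point
    n_blocks = signal.size // step
    blocks = signal[: n_blocks * step].reshape(n_blocks, step)
    means = blocks.mean(axis=1)        # reduced signal
    sds = blocks.std(axis=1)           # preserved variability record
    t = (np.arange(n_blocks) + 0.5) * step / fs  # block-centre times (s)
    return t, means, sds
```

At 40 Hz, reducing to 1 Hz shrinks a 2-hour run by a factor of 40 while the per-block standard deviations retain a record of the fast fluctuations the means average away.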

Conclusions: Data quality and integrity remain the most important considerations when assessing large datasets, although current modelling software struggles with very large data sets because it was designed to handle far fewer samples per individual than techniques such as HILD produce. Automated processes that allow adequate sampling and subsequent modelling are clearly needed.