Data Collection

The ASPREE trial developed a data suite that was specifically designed to support data collectors and produce high-quality data: the ASPREE Web Accessible Relational Database (AWARD). The AWARD suite consists of four communicating modules: AWARD-Data, AWARD-General, AWARD-Adjudicator and AWARD-Access Management System (AMS) (1). All data collection for ASPREE was facilitated by one of the four modules of the AWARD suite and included collection of operational fields (e.g. visit due date and booked date) and analytical fields (e.g. height, date of MI). Analytical data was annotated for quality via AWARD using a commentary code system. For more information on commentary codes please see Quality Control within the section About the Data Set.

Overview of the structure and function of the AWARD suite.

The AWARD suite SQL database is hosted in the Monash University Data Centre in Clayton, within the Monash ‘Red Zone’. This is a specialised facility with tightly controlled data access and storage. All data is encrypted at 2048 bits in transit via SSL through the web server and IPsec tunnels to the database cluster. The data centre is secured and monitored electronically and can only be entered by authorised personnel. Data centre processes are detailed by Monash University’s Information Security Management System (ISMS), which is certified to ISO 27001 international standards (Certification No: ITGOV40017). These processes are continually reviewed and improved as part of the continuous improvement process of the ISMS. The Data House at Clayton (Victoria) is mirrored to a second centre at Noble Park (Victoria) to provide redundancy. Only system administrators have unrestricted access to the ASPREE database; no users have direct access.

Operational and analytical data collected on case report forms (CRFs) were entered into AWARD via the web application by study staff. Each visit conducted in the field was created in the web application as an electronic visit. Following creation of the electronic visit, staff entered the data from each CRF into the electronic case report form (eCRF) on AWARD-Data (see Figure 2 below).

Example of electronic data entry - Visit 1 Lifestyle CRF and eCRF.

Questionnaire Design

Validated questionnaires were utilised where possible. These included:

  • Short Form 12 (SF-12) (2);
  • Modified Mini-Mental State Examination (3MS) (3);
  • Hopkins Verbal Learning Test—Revised (HVLT-R) (4);
  • Controlled Oral Word Association Test (COWAT) (5);
  • Symbol-Digit Modalities Test (SDMT) (6);
  • Center for Epidemiologic Studies—Depression (CES-D) assessment tool, Version 10 (7); and
  • LIFE Disability Form (8).

ASPREE developed case report forms to capture demographics, lifestyle information, physical examination results, concomitant medication use, clinical events, and study medication compliance.

In-Field Data Collection

In-field data collection was conducted by staff during face-to-face assessments with participants. Data collection was driven by national standard operating procedures (SOPs) for visit conduct and individual test administration. To ensure that SOPs were understood and implemented correctly, new staff underwent an intensive three- to five-day training course when they first commenced at ASPREE. Prior to this training, staff were asked to view the online training videos and read all relevant visit conduct and test administration SOPs.

During training, staff were informed of the principles underpinning good clinical practice and the ethical considerations of undertaking research involving human subjects, and were given guidance on how to collect informed consent. All visit-related questionnaires and data collection forms, in particular the cognitive assessments, were reviewed in detail and then rigorously practised to allow for immediate feedback. On day three of the training schedule, staff were assessed on cognitive test administration by the ASPREE-nominated neurologist, and a pass was required before staff could conduct these cognitive tests independently. Staff then observed the conduct of ASPREE visits for a period of three to four weeks, during which they conducted small portions of the visit themselves.

NOTE: In 2010, new staff observed visits with staff involved with collection of data for the ASPREE vanguard cohort.

When a staff member was deemed ready, ASPREE trainers formally accredited them for independent visit conduct by assessing their competency to accurately collect all relevant data in compliance with visit conduct SOPs and good clinical practice.

Conduct of In-Field Data Collection

In-field data collection occurred via participant visits with ASPREE staff. These visits were booked at the participant’s local GP practice, a community venue, a clinical trial study site or at their home. To conduct a study visit, the following equipment and documentation were required:

  • A printout of the Annual Visit Summary Report (AVSR);
  • Relevant visit-specific source documentation pack;
  • Annual study medication from the ASPREE Study Medication Cabinet;
  • A working BP monitor (and spare batteries) and both medium and large BP cuffs;
  • Timer (and spare batteries);
  • Calculator;
  • Ruler or equivalent;
  • Pathology provider request forms;
  • Weighing scales;
  • Figure Finder tape measure for abdominal circumference;
  • Stopwatch (to 1/100 sec) for 3m gait speed test;
  • Retractable tape measure for 3m gait speed test;
  • Masking tape (24 mm wide) for 3m gait speed test; and
  • Hand grip dynamometer.

The Annual Visit Summary Report (AVSR) was a summary of relevant information previously entered into AWARD-Data. This report was available for download via AWARD-Data and designed to negate the need to bring the primary participant files to a study visit. Staff recorded relevant information about family medical history, clinical events, concomitant medication use and study medication use on the AVSR and then transcribed the information into AWARD-Data at the time of data entry.

Potential Errors During In-Field Data Collection

While equipment was serviced regularly, a device malfunction occasionally prevented collection of data at a given visit. Where device error was the reason for missing data, a commentary code of 4 was applied (see Quality Control within the section About the Data Set).

If an annual visit was missed for any reason (e.g. the participant was away for an extended period of time), data collection linked with the missed visit was conducted at the next annual visit. For example, if the missed visit was a Year 2 annual visit, the grip strength and gait speed tests would be administered and the data collected at the Year 3 visit. In this situation, the date of the Year 2 annual visit is the same as the Year 3 annual visit date. Data that is collected at every annual visit, and hence cannot be ‘caught up’, is left blank for the missed annual visit (i.e. the Year 2 annual visit in this example). Where data is missing due to the conduct of a visit at the same time as another visit, a commentary code of 12 was applied.

If any data collection field was missed due to staff error (e.g. not taking the correct CRFs to a visit, failing to administer a mandatory measure, etc.), a commentary code of 5 was applied.
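The commentary codes named in this document can be held in a simple lookup when working with the data set. The sketch below is illustrative only: the dictionary and function names are hypothetical, the descriptions paraphrase this document, and the full code list is defined in Quality Control within the section About the Data Set, not here.

```python
# Hypothetical lookup of the commentary codes mentioned in this document.
# The complete list lives in the Quality Control section; this is a partial sketch.
COMMENTARY_CODES = {
    4: "Data missing due to device error",
    5: "Data missing due to staff error",
    6: "Pathology result requested from third party but never provided",
    12: "Data missing because visit was conducted at the same time as another visit",
}


def describe_commentary_code(code: int) -> str:
    """Return a short description for a commentary code, if it is one listed above."""
    return COMMENTARY_CODES.get(code, "Not described here (see Quality Control)")
```

For example, `describe_commentary_code(12)` returns the description for a visit conducted at the same time as another visit.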

Collection of Clinical Events

Scheduled collection of ASPREE clinical events occurred at all six-month phone calls and annual visits through completion of the following forms and assessments:

  • Six-month Phone Call form
  • Six-month LIFE Disability form
  • Annual Visit LIFE Disability form
  • Annual Visit 3MS (and CES-D 10)
  • Annual Visit ConMeds form
  • Annual Visit Participant Medical History Update form (PMHU)
  • Annual Visit review of medical records

Participants were able to contact ASPREE at any time and report a clinical event, which was recorded in AWARD-Data for follow-up either prior to or at the time of the next annual visit.

Death could be detected at any point in time. In Australia, linkage with the Ryerson Index of obituaries was performed on a weekly basis to detect deaths. A search of the National Death Index was conducted in both countries in November 2017.

Clinical Event Collection

With regard to scheduled clinical event collection, at six-month phone calls participants answered a series of seven questions regarding ASPREE endpoints (i.e. cancer, clinically significant bleeding, dementia, depression, hospitalisation for heart failure, myocardial infarction, stroke), one question regarding other hospitalisations, 19 structured questions about new diagnoses of specific medical conditions of interest, and one open question about other new diagnoses. This part of the six-month phone call mirrored the questions on the PMHU administered at annual visits. In this way, clinical event reporting was consistent at all scheduled collection time points. Clinical events reported via the seven standard questions at a six-month phone call or via the PMHU at annual visits were subject to collection of supporting documentation and adjudication. Although not an ASPREE primary or secondary endpoint, cause of death was also subject to collection of supporting documentation and adjudication.

Fact of death was subject to confirmation with two independent sources (e.g. family, GP/PCP and published obituary).

Data Driven Event Collection

In addition to the clinical events detected as described above, participants could trigger data-driven events at six-month phone calls and annual visits. Data-driven events included:

  • Persistent physical disability (detected via the LIFE Disability questionnaire)
  • Depression (detected via the CES-D 10 questionnaire); and
  • Dementia (detected via the 3MS assessment or entry of a medication).

With regard to the persistent physical disability endpoint, at six-month phone calls, annual visits and Milestone annual visits, participants were asked a modified list of Katz ADL questions related to difficulty performing daily activities. A response of ‘a lot of difficulty’ or ‘unable to perform’ an activity, or requirement of assistance to complete the activity was considered a trigger for the physical disability endpoint. If an equivalent response (i.e. ‘a lot of difficulty’ or ‘unable to perform’ an activity, or requirement of assistance) was reported at re-administration approximately six months later, the trigger was deemed to be confirmed as persistent and hence a physical disability endpoint.
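The confirmation rule above can be sketched in code. This is a minimal illustration, not the AWARD implementation: the data layout (a dictionary mapping each activity to a response and an assistance flag), the function names, and the assumption that the "equivalent response" must concern the same activity at both administrations are all hypothetical.

```python
# Responses that count as a trigger, as described in the text above.
TRIGGER_RESPONSES = {"a lot of difficulty", "unable to perform"}


def triggered_activities(responses: dict) -> set:
    """Return the activities whose response triggers the disability endpoint.

    `responses` maps an activity name to a (difficulty, needs_assistance) pair.
    """
    return {
        activity
        for activity, (difficulty, needs_assistance) in responses.items()
        if difficulty in TRIGGER_RESPONSES or needs_assistance
    }


def is_persistent_disability(first: dict, followup: dict) -> bool:
    """True if some activity triggered at both administrations.

    Assumption: the follow-up took place approximately six months later and the
    equivalent response must relate to the same activity.
    """
    return bool(triggered_activities(first) & triggered_activities(followup))
```

For example, a participant reporting "a lot of difficulty" bathing at a six-month phone call and requiring assistance with bathing at the following annual visit would be flagged as persistent under these assumptions.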

With regard to the depression endpoint, participants were asked to complete the CES-D 10 questionnaire at year 1, 3 and 5 visits between 1 March 2010 and 31 Dec 2014, and at all annual visits from 1 Jan 2015 onwards. Any assessment with a score of eight or more was considered to be a depression endpoint. The determination of depression endpoint sub-type (i.e. incident, recurrent or persistent) was automated based on previously entered data. If all previous CES-D 10 scores were less than eight, the event was considered to be incident. If one or more previous CES-D 10 scores were eight or more but the immediately preceding CES-D 10 score was less than eight, the event was considered recurrent. If the immediately preceding CES-D 10 score was also eight or more, the event was considered persistent.
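The sub-type rules above can be expressed as a short function. This is a sketch under stated assumptions rather than the automated logic used in AWARD: the function name and the representation of prior scores as a chronologically ordered list are hypothetical.

```python
from typing import Optional, Sequence

CESD10_THRESHOLD = 8  # a score of eight or more is a depression endpoint


def classify_depression_event(
    previous_scores: Sequence[int], current_score: int
) -> Optional[str]:
    """Classify a CES-D 10 result as incident, recurrent or persistent.

    `previous_scores` holds earlier CES-D 10 scores in chronological order
    (assumed layout). Returns None when the current score is below threshold.
    """
    if current_score < CESD10_THRESHOLD:
        return None  # no depression endpoint at this assessment
    if all(score < CESD10_THRESHOLD for score in previous_scores):
        return "incident"  # never at or above threshold before
    if previous_scores[-1] < CESD10_THRESHOLD:
        return "recurrent"  # above threshold earlier, but not last time
    return "persistent"  # immediately preceding score also eight or more
```

Note that with an empty history the function returns "incident" for any score of eight or more; how the first assessment was handled in practice is not stated in this document.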

With regard to the dementia endpoint, endpoint triggers were driven by the completion of the 3MS assessment. This assessment was completed at baseline and then re-administered at the year 1, 3 and 5 visits and the Milestone Visit. A predicted long-term 3MS score was calculated for each participant based upon their raw score at baseline. A post-randomisation 3MS score that was below 78, or that showed a drop of more than 10.15 points from the participant’s predicted 3MS score, generated a dementia endpoint trigger depending on the CES-D 10 results (see Table 1 below).

Table 1. Dementia endpoint trigger decision rubric based on CES-D 10 score at baseline and time of 3MS examination conduct.

Baseline CES-D 10 result | CES-D 10 result linked with 3MS < 78 or > 10.15 drop from predicted score | Dementia endpoint trigger?
<8                       | <8                                                                        | Yes
8+                       | <8                                                                        | Yes
<8                       | 8+                                                                        | No*
8+                       | 8+                                                                        | Yes

*participant required reassessment of 3MS and CES-D 10 in three months
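The decision rubric in Table 1 can be sketched as follows. This is illustrative only: the function names and return labels are hypothetical, and the calculation of the predicted 3MS score is not described in this document, so it is taken as an input.

```python
CESD10_THRESHOLD = 8      # CES-D 10 score of eight or more ("8+" in Table 1)
THREEMS_CUTOFF = 78       # a 3MS score below 78 flags the assessment
THREEMS_MAX_DROP = 10.15  # a drop greater than this from the predicted score flags it


def threems_flagged(score: float, predicted: float) -> bool:
    """True if the post-randomisation 3MS score meets either flagging criterion."""
    return score < THREEMS_CUTOFF or (predicted - score) > THREEMS_MAX_DROP


def dementia_trigger(
    score: float, predicted: float, baseline_cesd: int, current_cesd: int
) -> str:
    """Return 'none', 'trigger' or 'reassess' following the Table 1 rubric."""
    if not threems_flagged(score, predicted):
        return "none"
    # Only a new high CES-D 10 result (baseline <8, current 8+) defers the trigger;
    # the participant is then reassessed on 3MS and CES-D 10 in three months.
    if baseline_cesd < CESD10_THRESHOLD and current_cesd >= CESD10_THRESHOLD:
        return "reassess"
    return "trigger"
```

Under this sketch, a participant with a 3MS score of 75, a baseline CES-D 10 score of 3 and a current CES-D 10 score of 9 is returned as "reassess", matching the third row of the table.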

Dementia triggers were followed up by the Dementia Assessment team. Where possible, genuine dementia triggers resulted in an in-person dementia assessment visit. This visit included administration of the ADAS-Cog (9) (10) (for aphasia, apraxia), Color Trails (11) (for executive functioning), Confusion Assessment Method proforma short form (12) (13) (for delirium), a visual agnosia test (14) (for agnosia), and the ADCS IADL scale (15) based on both self-report and surrogate information (for functional decline). This set of cognitive tests was chosen to ensure that the major cognitive domains were tested. The results of these tests were utilised by the Dementia Endpoint Adjudication Committee but have not been included in the data set.

Event coding and adjudication is outlined in the section Endpoints and Adjudication.

Clinical event trigger data has been included in Sections B5, C1, E1 and E2 of the ASPREE Longitudinal data set.

Study Medication Tracking

Study medication was tracked via an online log called the Drug Log in AWARD-Data. At randomisation, each participant was assigned a unique medication number. All subsequent bottles of study medication provided were labelled with this unique number.

New bottles of study medication were dispensed and old bottles retrieved at each annual visit. When bottles were retrieved the following data was collected and entered into the Drug Log for the bottle in question:

  • Returned date
  • Count of returned pills
  • If not retrieved, the reason the bottle could not be returned

If a bottle was lost or accidentally destroyed, participants were provided with a new bottle of medication from the emergency medication supply. Data from the Drug Log has been included in Section G1 of the ASPREE Longitudinal data set.

Concomitant Medication Collection

Concomitant medications (ConMeds) were collected directly from participants wherever possible. At baseline and each annual visit, participants were asked to bring along all their medications. Prescription medications were recorded on the AVSR and entered into AWARD-Data by staff either by selecting an option from a dropdown list of medication or, if the option was not available, entering the medication name in free text.

At the conclusion of the study, a neural network was trained to ingest free text ConMed data and produce a list of probable ATC medication codes based on publicly available large medication data sets (16). The list of probable ATC medication codes was then reviewed by two staff who independently recorded the correct ATC code. Discordant cases were reviewed by the Data Manager and resolved.
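The double-review step above amounts to comparing two independent code assignments and routing disagreements for resolution. The sketch below illustrates that comparison only; it does not reproduce the neural network or the ASPREE review tooling, and the function name and data layout (free text entry mapped to the ATC code each reviewer recorded) are assumptions.

```python
def reconcile_atc_reviews(reviewer_a: dict, reviewer_b: dict):
    """Split free text entries into agreed codes and discordant cases.

    Each argument maps a free text medication entry to the ATC code recorded
    by one reviewer. Entries missing from either review count as discordant.
    """
    agreed = {}
    discordant = []
    for entry in sorted(reviewer_a.keys() | reviewer_b.keys()):
        code_a = reviewer_a.get(entry)
        code_b = reviewer_b.get(entry)
        if code_a is not None and code_a == code_b:
            agreed[entry] = code_a
        else:
            # In ASPREE, discordant cases went to the Data Manager for resolution.
            discordant.append((entry, code_a, code_b))
    return agreed, discordant


# Illustrative entries only; not drawn from the ASPREE data set.
agreed, discordant = reconcile_atc_reviews(
    {"atorvastatin 20mg": "C10AA05", "asprin low dose": "B01AC06"},
    {"atorvastatin 20mg": "C10AA05", "asprin low dose": "N02BA01"},
)
```

Here the atorvastatin entry is agreed and the misspelt aspirin entry is returned as discordant, because the two reviewers chose different ATC codes for it.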

ConMed data including medication name, ATC code and whether or not the medication was taken during each calendar year of follow-up can be found in Section B5 of the ASPREE Longitudinal data set.

Pathology Measure Collection

The following pathology measures were collected as per the schedule shown in Table 2 below:

  • Haemoglobin
  • Fasting lipids (total cholesterol, HDL, LDL, triglyceride)
  • Fasting glucose
  • Serum creatinine
  • Urine albumin:creatinine ratio

In Australia, all pathology measures were collected each year. In the US, haemoglobin was collected each year and other measures were only collected at certain timepoints.

In general, ASPREE did not conduct pathology measures directly (point-of-care haemoglobin measures were available at some US sites). Rather, participants were provided with pathology slips and asked to attend a local pathology centre for blood and urine collection and analysis. ASPREE details were included in the pathology slip to enable feedback of results.

If the urine albumin:creatinine ratio was > 2.5 mg/mmol (males) or > 3.5 mg/mmol (females), a letter was sent to the participant’s GP asking them to follow up with the participant and arrange a second urine albumin:creatinine ratio test. The results of these follow-up urine albumin:creatinine ratio tests are labelled fw. Completion of this additional test was at the discretion of the PCP/GP.
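The sex-specific thresholds above can be checked with a few lines of code. This is a minimal sketch: the function name and the "male"/"female" labels are hypothetical and may not match the coding used in the data set.

```python
# Thresholds in mg/mmol, as stated in the text; a follow-up is prompted only
# when the ratio is strictly greater than the threshold.
ACR_THRESHOLD_MG_PER_MMOL = {"male": 2.5, "female": 3.5}


def needs_acr_followup(acr_mg_per_mmol: float, sex: str) -> bool:
    """True if the urine albumin:creatinine ratio exceeds the threshold for this sex."""
    return acr_mg_per_mmol > ACR_THRESHOLD_MG_PER_MMOL[sex]
```

For example, a ratio of 3.0 mg/mmol would prompt a follow-up letter for a male participant but not for a female participant.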

Results returned to ASPREE were entered into AWARD-Data. Where pathology results were requested from a third-party pathology provider but never provided, a commentary code of 6 has been applied.

Pathology measure data has been included in Section B4 of the ASPREE Longitudinal data set.

Data Collection Schedule

The ASPREE Measurement and Study Activity schedule is summarised below.

Cognitive function was measured at baseline, annual visits 1, 3 and 5, and the Milestone Visit using a 30 minute cognitive battery that includes the Symbol-Digit Modalities Test (SDMT) (6), Hopkins Verbal Learning Test—Revised (HVLT-R) (4), and Controlled Oral Word Association Test (COWAT) (5).

Depression was measured each year using a self-reported questionnaire, the Center for Epidemiologic Studies—Depression (CES-D) assessment tool (7).

Physical function was measured at baseline, annual visits 2, 4 and 6, and the Milestone Visit using the performance-based measures of gait speed and hand grip strength, together with self-reported ADLs and instrumental activities contained within the LIFE Disability Form (8). Quality of life was measured using the Short Form 12 (SF-12) (2).

Hospitalisations for reasons other than primary or secondary endpoints were also captured.

ASPREE data collection by visit type.

X indicates when each category of measures was carried out. Superscripts a-c specify the tests within each category that were performed at the designated time point. Lack of a superscript indicates that all category measures were carried out at the designated time point. ** The Milestone Visit took place in years 3 to 7 depending on year of randomisation. Final visit measurements will be the same as those indicated for the Milestone Visit.


  1. Lockery JE, Collyer TA, Reid CM, Ernst ME, Gilbertson D, Hay N, et al. Overcoming challenges to data quality in the ASPREE clinical trial. Trials. 2019 Dec;20(1):686. doi: 10.1186/s13063-019-3789-2.

  2. Ware J, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996 Mar;34(3):220–33.

  3. Teng EL, Chui HC. The Modified Mini-Mental State (3MS) examination. J Clin Psychiatry. 1987 Aug;48(8):314–8.

  4. Benedict RHB, Schretlen D, Groninger L, Brandt J. Hopkins Verbal Learning Test – Revised: Normative Data and Analysis of Inter-Form and Test-Retest Reliability. Clin Neuropsychol. 1998 Feb 1;12(1):43–55.

  5. Ross TP. The reliability of cluster and switch scores for the Controlled Oral Word Association Test. Arch Clin Neuropsychol. 2003 Mar;18(2):153–64.

  6. Smith A, Services (Firm) WP. Symbol digit modalities test : manual [Internet]. Los Angeles, Calif. : Western Psychological Corporation; 2002 [cited 2018 Nov 1]. Available from:

  7. Radloff LS. The CES-D Scale: A Self-Report Depression Scale for Research in the General Population. Appl Psychol Meas. 1977 Jun 1;1(3):385–401.

  8. LIFE Study Investigators, Pahor M, Blair SN, Espeland M, Fielding R, Gill TM, et al. Effects of a physical activity intervention on measures of physical performance: Results of the lifestyle interventions and independence for Elders Pilot (LIFE-P) study. J Gerontol A Biol Sci Med Sci. 2006 Nov;61(11):1157–65.

  9. Rosen WG, Mohs RC, Davis KL. A new rating scale for Alzheimer’s disease. Am J Psychiatry. 1984 Nov;141(11):1356–64.

  10. Mohs R. Alzheimer Disease Assessment Scale-Cognitive subscale; Adapted from the Administration and Scoring Manual for the Alzheimer’s Disease Assessment Scale, 1994 Revised Edition. The Mount Sinai School of Medicine; 1994.

  11. D’Elia L, Satz P, Uchiyama C, White T. Color Trails Test: Professional Manual. Psychological Assessment Resources, Inc; 1996.

  12. Inouye SK, van Dyck CH, Alessi CA, Balkin S, Siegal AP, Horwitz RI. Clarifying confusion: the confusion assessment method. A new method for detection of delirium. Ann Intern Med. 1990 Dec 15;113(12):941–8.

  13. The Confusion Assessment Method (CAM). Training Manual and Coding Guide. Yale University School of Medicine; 2003.

  14. Lezak M. Neuropsychological assessment, 3rd ed. New York: Oxford University Press; 1995.

  15. Galasko D, Bennett DA, Sano M, Marson D, Kaye J, Edland SD, et al. ADCS Prevention Instrument Project: assessment of instrumental activities of daily living for community-dwelling elderly individuals in dementia prevention clinical trials. Alzheimer Dis Assoc Disord. 2006 Dec;20(4 Suppl 3):S152-169.

  16. Lockery JE, Rigby J, Collyer TA, Stewart AC, Woods RL, McNeil JJ, et al. (2019) Optimising medication data collection in a large-scale clinical trial. PLoS ONE 14(12): e0226868. 10.1371/journal.pone.0226868