Welcome to the NCEAS Data Quality Assurance Tool

This page is a proof-of-concept prototype. Further development will allow the technology developed here to be used in production environments for ecological research. The work was written up and presented at the Third IEEE Computer Society Metadata Conference. The citation is:

Nottrott, R., M. B. Jones, and M. Schildhauer. 1999. Using XML-structured metadata to automate quality assurance processing for ecological data. Proceedings of the Third IEEE Computer Society Metadata Conference. Bethesda, MD. April 6-7, 1999.

This web page lets scientists run automated quality assurance tests on their data files. It uses structured metadata (XML) to describe expectations about a data set, and then uses a combination of Perl and SAS to generate a quality assurance script that validates the data set. Any errors in the data are formatted in HTML and returned to the scientist, along with other standard quality assurance output.
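
The general idea of checking data against expectations declared in metadata can be sketched as follows. This is a minimal illustration in Python, not the Perl and SAS pipeline that the tool itself uses; the `expectations` structure, its field names, the sample data, and the `check_rows` function are all assumptions made for this example.

```python
import csv
import io

# Hypothetical expectations, standing in for what variable-level
# metadata might declare about each column (assumed structure).
expectations = {
    "site": {"type": str, "allowed": {"A", "B"}},
    "temp_c": {"type": float, "min": -50.0, "max": 60.0},
}


def check_rows(csv_text, expectations):
    """Return (row_number, column, message) for each violated expectation."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column, rule in expectations.items():
            raw = row.get(column)
            if raw is None or raw == "":
                errors.append((row_number, column, "missing value"))
                continue
            try:
                value = rule["type"](raw)  # convert to the declared type
            except ValueError:
                errors.append((row_number, column, f"cannot parse {raw!r}"))
                continue
            if "allowed" in rule and value not in rule["allowed"]:
                errors.append((row_number, column, f"unexpected code {raw!r}"))
            if "min" in rule and value < rule["min"]:
                errors.append((row_number, column, f"{raw} is below the minimum"))
            if "max" in rule and value > rule["max"]:
                errors.append((row_number, column, f"{raw} is above the maximum"))
    return errors


# Fictional sample data with three deliberate problems.
sample = "site,temp_c\nA,21.5\nC,19.0\nB,abc\nA,75.2\n"
errors = check_rows(sample, expectations)
for error in errors:
    print(error)
```

Each reported tuple identifies the data row, the column, and the reason the value failed its check, which is the kind of information a report back to the scientist would need.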

To use the form, you must submit three files for processing: a file metadata file, a variable metadata file, and a data file.

Currently, we support only data files in comma-separated value (CSV) format, although this will expand as we make more effective use of the metadata describing each data set.

In addition, metadata files must conform to the Ecological Metadata Language (EML), which was developed at NCEAS and is based on the FLED metadata standards. The two relevant sections of metadata are file-level descriptors (eml-file) and variable-level descriptors (eml-variable). Links to example files are provided below so that you can test the system without developing metadata for your own data sets. To use the example files, download them to your local hard drive and then submit them using the form below as if you had created them yourself. Note that the example data are fictional.
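
To illustrate how variable-level descriptors written in XML could be turned into machine-checkable expectations, here is a small Python sketch. The element and attribute names used (`variables`, `variable`, `name`, `type`, `min`, `max`) are invented for this example and are not the actual eml-variable structure; consult the example files for the real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata fragment. This is NOT the real eml-variable
# schema, only an invented stand-in for illustration.
HYPOTHETICAL_METADATA = """
<variables>
  <variable name="site" type="text"/>
  <variable name="temp_c" type="number" min="-50" max="60"/>
</variables>
"""


def load_expectations(xml_text):
    """Build a dict of per-variable rules from the XML descriptors."""
    root = ET.fromstring(xml_text)
    expectations = {}
    for var in root.findall("variable"):
        rule = {"type": var.get("type")}
        # Numeric bounds are optional; include only those that are declared.
        for bound in ("min", "max"):
            if var.get(bound) is not None:
                rule[bound] = float(var.get(bound))
        expectations[var.get("name")] = rule
    return expectations


rules = load_expectations(HYPOTHETICAL_METADATA)
print(rules)
```

Keeping the expectations in the metadata rather than in the checking code means the same checker can be reused for any data set that has a descriptor file.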

Please submit the three files below, and then press the Process button to start the analysis. Note that processing time for large data sets may be long and depends partly on Internet bandwidth.

File metadata file to upload: Download example file
Variable metadata file to upload: Download example file
Data file to upload: Download example file

Note: this form has been tested with Netscape 4 and Internet Explorer 4 -- your mileage may vary.