Draft Address Data Standard - Data Quality
Part 3: Street Address Data Quality
The purpose of this section is to provide a set of methods for applying spatial quality standards to addressing. Data quality refers to the description of how to express the applicability or essence of a data set or data element, and includes data quality, assessment, accuracy, and reporting or documentation standards.
Quality standards that apply to addressing consist of a series of standards that apply to spatial data generally, as well as a standard from the National Emergency Number Association (NENA). General data quality standards cannot speak to a specific type of database. The NENA standard describes address quality relative to an established Automatic Location Information (ALI) file, as is appropriate for emergency services. It provides an example of very specific tests that assess the fitness of a dataset for emergency services. Similarly, the United States Postal Service (USPS) Postal Addressing Standards describe addresses as used for mailing.
There remains a gap in guidance for assessing the quality of addresses themselves, independent of a specific format or use. Local variations in address data sets are so great that very specific tests, such as those described in the NENA standard, are not applicable to every situation. A series of more general methods can provide assistance in a variety of situations, working within the framework described by more broadly stated spatial quality standards. This section seeks to support existing standards, providing content tests for specific address uses and describing ways of testing the quality classifications required for complete metadata.
Part 3: Data Quality
(in .pdf form)
- Standard_Part_Section:
- 3.2 About Observing Address Quality
- Type:
- Editorial
- Name:
- Anne O'Connor
- Organization:
- U.S. Census Bureau
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- anne.v.o.connor@census.gov
- Date:
- 10 Jan 2006
Comments
Section 3.2, 2nd paragraph -- "Similarly, quality control for addresses takes both...
Proposed_Changes
Add an 's' to "take".
- Standard_Part_Section:
- 3.3 Testing Address Quality - Generally
- Type:
- Substantive
- Name:
- Mike Walls
- Organization:
- Organization_Comment:
- Email:
- michael.walls@atosorigin.com
- Date:
- 11 Jan 2006
Comments
It is unclear to me at what stage of the data provisioning process the testing should occur. When and how should we test: source data destined for transformation into a standards-compliant data set, the allegedly compliant data itself, or data sets that result from importing a transfer?
Proposed_Changes
1) Add a section providing an overview of how you conceptualize this data flow.
2) Cross-reference tests to stages in the flow. For example, format tests only make sense for data that is loaded into a standards-compliant data set. Source or destination data sets may well be structured quite differently.
3) Distinguish more systematically between tests of data and tests that verify the outcome of a process. A quality test on the output records may identify defects in the input data OR defects in the loader application.
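The distinction between defects in the input data and defects in the loader application can be illustrated with a minimal sketch. It assumes that source and output records share a record identifier and that a comparable test can be written for each stage; the function name `attribute_defects`, the record layouts, and the sample data are all hypothetical and are not taken from the draft standard.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def attribute_defects(
    source: Dict[str, Record],
    output: Dict[str, Record],
    source_test: Callable[[Record], bool],
    output_test: Callable[[Record], bool],
) -> Dict[str, List[str]]:
    """Attribute each failing output record to the input data or to the loader.

    If the output record fails and the matching source record also fails its
    own test, the defect is attributed to the input data. If the source record
    passes (or cannot be found), the defect is attributed to the loader.
    """
    findings: Dict[str, List[str]] = {"input_defect": [], "loader_defect": []}
    for rec_id, out_rec in output.items():
        if output_test(out_rec):
            continue  # output record is fine, nothing to attribute
        src_rec = source.get(rec_id)
        if src_rec is not None and not source_test(src_rec):
            findings["input_defect"].append(rec_id)
        else:
            findings["loader_defect"].append(rec_id)
    return findings


# Hypothetical source layout: one free-text address field.
source = {
    "a1": {"addr": "123 Main St"},
    "a2": {"addr": "Main St"},      # no address number in the source
    "a3": {"addr": "45 Oak Ave"},
}
# Hypothetical output layout: number and street name in separate elements.
output = {
    "a1": {"number": "123", "street": "Main St"},
    "a2": {"number": "", "street": "Main St"},
    "a3": {"number": "", "street": "45 Oak Ave"},  # loader failed to split
}


def source_has_number(rec: Record) -> bool:
    return rec["addr"].split(" ", 1)[0].isdigit()


def output_has_number(rec: Record) -> bool:
    return rec["number"].isdigit()


findings = attribute_defects(source, output, source_has_number, output_has_number)
print(findings)
```

In this sketch the same quality question (is an address number present?) is asked at two stages with two different tests, because the source and output layouts differ.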
- Standard_Part_Section:
- 3.4 Tests for Simple Elements - Generally
- Type:
- Substantive
- Name:
- Mike Walls
- Organization:
- Organization_Comment:
- Email:
- michael.walls@atosorigin.com
- Date:
- 11 Jan 2006
Comments
Testing for simple element compliance needs expansion.
Proposed_Changes
1) We need testing for MANDATORY vs. OPTIONAL element content, integrated with (a) presence of element in data set record layout and (b) conformance and consistency with domains. For example, a mandatory element may have blank or 'UNKNOWN' as part of its domain. (Most databases only exclude NULL values from a column declared to be mandatory.)
2) One straightforward test of data formatting is to run the data through a known-good loader procedure and look at what records made it into the output and what did not. This is a fundamental test of structure, but needs some explication in terms of how to interpret the results within the overall quality context. (If a record loads it is by definition in the right format.)
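A test of mandatory versus optional content, as proposed in item 1, might be sketched as follows. The element names, the `ElementRule` type, and the domain values are hypothetical illustrations, not definitions from the draft standard. The sketch treats a missing element and a NULL value as failures for a mandatory element, while allowing blank or 'UNKNOWN' when the domain lists them.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class ElementRule(NamedTuple):
    mandatory: bool
    domain: Optional[Set[str]]  # None means no enumerated domain is checked


def check_record(
    record: Dict[str, Optional[str]], rules: Dict[str, ElementRule]
) -> List[str]:
    """Return a list of problems found in one record (empty list if none)."""
    problems: List[str] = []
    for name, rule in rules.items():
        if name not in record:
            # (a) presence of the element in the record layout
            if rule.mandatory:
                problems.append(f"{name}: mandatory element missing from record layout")
            continue
        value = record[name]
        if value is None:
            # NULL is rejected only for mandatory elements
            if rule.mandatory:
                problems.append(f"{name}: mandatory element is NULL")
            continue
        # (b) conformance with the domain; blank or 'UNKNOWN' pass if listed
        if rule.domain is not None and value not in rule.domain:
            problems.append(f"{name}: value {value!r} not in domain")
    return problems


# Hypothetical rules: a mandatory element whose domain includes blank and 'UNKNOWN'.
rules = {
    "street_type": ElementRule(mandatory=True, domain={"ST", "AVE", "UNKNOWN", ""}),
    "post_directional": ElementRule(mandatory=False, domain={"N", "S", "E", "W"}),
}

ok = check_record({"street_type": "UNKNOWN", "post_directional": None}, rules)
null_mandatory = check_record({"street_type": None}, rules)
bad_values = check_record({"street_type": "BLVD", "post_directional": "NE"}, rules)
print(ok, null_mandatory, bad_values, sep="\n")
```

Keeping 'UNKNOWN' inside the domain, rather than treating it as a special case in the code, keeps the mandatory test and the domain test independent of each other.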
- Standard_Part_Section:
- 3.7.2 Thoroughfare Addresses
- Type:
- Substantive
- Name:
- Steve Grise
- Organization:
- ESRI
- Organization_Comment:
- Email:
- sgrise@esri.com
- Date:
- 13 Jan 2006
Comments
3.7.2.1 – I don’t see how this standard provides an indication of how to share geometry for Thoroughfares (in fact this is excluded in the scope), but the standard is quite specific about segmentation rules:
- Lines associated with addresses do not overlap, except where schemas are intermingled along a linestring.
- Lines do not break except at intersections, dead ends and locations where streetnames change.
These seem like rules for a specific master repository and not something people could commit to providing and validating for every address exchange.
Proposed_Changes
Remove geometric and topological rules from the standard unless geometry is part of the scope, and there is substantial agreement on segmentation and topological rules.
- Standard_Part_Section:
- Introduction
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
In the third paragraph of the Introduction, there are two sentences that incorrectly end with double periods.
Proposed_Changes
In the third paragraph of the Introduction, remove the duplicate periods at the end of the second and third sentences.
- Standard_Part_Section:
- 3.1.1 Objectives of Existing Standards
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.1.1 Objectives of Existing Standards: Table -- In the Purpose for the “Geographic information – Data quality measures (ISO 19138)” Standard, there is a spelling error.
Proposed_Changes
In the Purpose for the “Geographic information–Data quality measures” Standard, change “standardising” to “standardizing”.
- Standard_Part_Section:
- 3.4.1 Testing Data Types of Simple Elements
- Type:
- Substantive
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
In the Pseudocode Example: Testing the Conformance of a Data Set section, the beginning of the sentence stating “If the data are in field that conforms to type” is not clear and requires further clarification.
Proposed_Changes
In the Pseudocode Example: Testing the Conformance of a Data Set, rewrite the sentence in question.
- Standard_Part_Section:
- 3.4.1 Testing Data Types of Simple Elements
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
In the Measure Description section, there is an unnecessary occurrence of the word “they” in the very last sentence.
Proposed_Changes
In the Measure Description section, delete the word “they” from the last sentence.
- Standard_Part_Section:
- 3.4.2 Domains and Sources of Values for Simple Elements
- Type:
- Substantive
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
The second paragraph identifies how CSDGM classifies domains but nowhere in this Standard could we find definitions for the four domain classifications listed. Since these four classes are used throughout the Data Quality Section, it is suggested that these four classes be defined in this section.
Proposed_Changes
In the second paragraph, add definitions for the “enumerated”, “range”, “codeset”, and “unrepresentable” domains.
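To illustrate why the four domain classes matter for testing, the sketch below models them as simple types and shows that each class implies a different kind of check. It assumes the usual CSDGM reading of the classes: an enumerated domain lists its valid values, a range domain gives a minimum and maximum, a codeset domain points to an external list of valid values, and an unrepresentable domain is described only in text. The class names, the `conforms` function, and the sample values are hypothetical and are not part of the draft standard.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Union


@dataclass(frozen=True)
class Enumerated:
    values: FrozenSet[str]          # explicit list of valid values


@dataclass(frozen=True)
class Range:
    minimum: int                    # inclusive lower bound
    maximum: int                    # inclusive upper bound


@dataclass(frozen=True)
class Codeset:
    name: str                       # name of the external code list
    lookup: Callable[[str], bool]   # assumed lookup against that list


@dataclass(frozen=True)
class Unrepresentable:
    description: str                # text description; no automated test


Domain = Union[Enumerated, Range, Codeset, Unrepresentable]


def conforms(value: str, domain: Domain) -> Optional[bool]:
    """Return True/False for testable domains, None when no test is possible."""
    if isinstance(domain, Enumerated):
        return value in domain.values
    if isinstance(domain, Range):
        try:
            number = int(value)
        except ValueError:
            return False
        return domain.minimum <= number <= domain.maximum
    if isinstance(domain, Codeset):
        return domain.lookup(value)
    return None  # Unrepresentable: conformance needs human review


directionals = Enumerated(frozenset({"N", "S", "E", "W"}))
numbers = Range(1, 999)
states = Codeset("state codes (hypothetical subset)", {"NY", "PA"}.__contains__)
free_text = Unrepresentable("free-text location description")

print(conforms("N", directionals), conforms("250", numbers),
      conforms("NY", states), conforms("near the old mill", free_text))
```

Returning `None` for the unrepresentable class keeps "could not be tested" distinct from "failed", which matters when results are reported in metadata.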
- Standard_Part_Section:
- 3.6.1 Location Attributes
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.6.1.1 Address X Coordinate, Address Y Coordinate and Address Longitude, Address Latitude -- The Measure Name for the US National Grid Coordinate has a capitalization error.
Proposed_Changes
In the Measure Name section, change “Usng” to “USNG”.
- Standard_Part_Section:
- 3.6.2 Non-Location Attributes
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.6.2.9 Location Description -- In the Evaluation Procedure section, there is an unnecessary occurrence of the words “It can” at the beginning of the second sentence.
Proposed_Changes
In the Evaluation Procedure section, delete “It can” at the beginning of the second sentence.
- Standard_Part_Section:
- 3.7.2 Thoroughfare Addresses
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.7.2.2 Check for Address Number Range Completeness -- This section appears to be a subsection for 3.7.2.1 and should not have a section number associated with its heading.
Proposed_Changes
From the heading, remove the “3.7.2.2” section number.
- Standard_Part_Section:
- 3.7.2 Thoroughfare Addresses
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.7.2.1 Address Number Range: Check Address Number Range against Scheme Origin -- In the Evaluation Procedure section, there are two typographical errors.
Proposed_Changes
In the Evaluation Procedure section, change “distinces” to “distances” in the first sentence and change “is great than” to “is greater than” in the last sentence.
- Standard_Part_Section:
- 3.7.2 Thoroughfare Addresses
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.7.2.1 Address Number Range: Check Address Number Range against Address Scheme Axes -- In the Measure Description section, there are several typographical errors.
Proposed_Changes
In the Measure Description section, change “touching the centerline” to “touches the centerline” in the second sentence. Also, change “Axes” to “Axis” throughout the entire “Check Address Number Range against Address Scheme Axes” section, including the title.
- Standard_Part_Section:
- 3.7.2 Thoroughfare Addresses
- Type:
- Substantive
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.7.2.1 Address Number Range (page 103) -- In the Check Consistency of Odd and Even Parity section, if this Standard is expanded to include “Both” as a valid value for Parity, then this test could check an Address Number Range whose Parity is “Even” or “Odd”, but it could not test a range whose Parity is “Both”.
Proposed_Changes
Modify this test to confirm that the low and high address numbers agree with their specified parity unless the Parity is specified as “Both”, in which case no parity validation is possible.
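The proposed parity test can be sketched in a few lines. This is an illustration of the commenter's proposal only, assuming that Parity takes the values “Even”, “Odd”, or “Both” and that the low and high address numbers are integers; the function names are hypothetical.

```python
from typing import Optional


def parity_of(number: int) -> str:
    """Return 'Even' or 'Odd' for an integer address number."""
    return "Even" if number % 2 == 0 else "Odd"


def check_range_parity(low: int, high: int, parity: str) -> Optional[bool]:
    """Check that low and high agree with the stated parity.

    Returns True or False for 'Even' and 'Odd', and None for 'Both',
    because no parity validation is possible in that case.
    """
    if parity == "Both":
        return None
    if parity not in ("Even", "Odd"):
        raise ValueError(f"unexpected parity value: {parity!r}")
    return parity_of(low) == parity and parity_of(high) == parity


print(check_range_parity(100, 198, "Even"))
print(check_range_parity(101, 198, "Odd"))
print(check_range_parity(1, 99, "Both"))
```

A result of `None` signals that the range was skipped rather than passed, so a report can count untestable ranges separately.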
- Standard_Part_Section:
- 3.9.1 Test for conformance to Address Number Range or domain
- Type:
- Editorial
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.9.1 Test for conformance to Address Number Range or domain -- In the Measure Description section, there is an unnecessary occurrence of the word “is” in the first sentence of the second paragraph.
Proposed_Changes
In the Measure Description section, change “This is test is…” in the second paragraph to “This test is…”.
- Standard_Part_Section:
- 3.10 Postal Service Delivery Addresses – Generally
- Type:
- Substantive
- Name:
- Cheryl Benjamin
- Organization:
- NYS GIS Standards & Data Coordination Work Group
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- cheryl.benjamin@cscic.state.ny.us
- Date:
- 13 Jan 2006
Comments
3.10 Postal Service Delivery Addresses -- The first sentence references checks for uniqueness “as noted above” but this is too vague and should instead identify by section the uniqueness checks that should be performed on Postal Service Delivery Addresses.
Proposed_Changes
In the first sentence, replace “as noted above” with references to the section(s) that contain the uniqueness checks that should be performed.
- Standard_Part_Section:
- Introduction
- Type:
- Substantive
- Name:
- Wendy Blake-Coleman
- Organization:
- Organization_Comment:
- I am commenting on behalf of my organization.
- Email:
- blake-coleman.wendy@epa.g
- Date:
- 16 Jan 2006
Comments
Much of the data standard does not support the benefits of spatial data quality, which was a key benefit of doing a data standard on Street Address.