Draft Address Data Standard - Data Quality

Data Quality

Part 3: Street Address Data Quality

The purpose of this section is to provide a set of methods for applying spatial quality standards to addressing. Data quality refers to the description of how to express the applicability or essence of a data set or data element, and includes data quality, assessment, accuracy, and reporting or documentation standards.

Quality standards that apply to addressing consist of a series of standards that apply to spatial data generally, as well as a standard from the National Emergency Number Association (NENA). General data quality standards cannot speak to a specific type of database. The NENA standard describes address quality relative to an established Automatic Location Information (ALI) file, as is appropriate for emergency services. It provides an example of very specific tests for assessing the fitness of a dataset for emergency services. Similarly, the United States Postal Service (USPS) Postal Addressing Standards describe addresses as used for mailing.

There remains a gap in guidance for assessing the quality of addresses themselves, independent of a specific format or use. Local variations in address data sets are so great that very specific tests, such as those described in the NENA standard, are not applicable to every situation. A series of more general methods can provide assistance in a variety of situations, working within the framework described by more broadly stated spatial quality standards. This section seeks to support existing standards, providing content tests for specific address uses and describing ways of testing the quality classifications required for complete metadata.

Part 3: Data Quality
(in .pdf form)




Standard_Part_Section:
3.2 About Observing Address Quality
Type:
Editorial
Name:
Anne O'Connor
Organization:
U.S. Census Bureau
Organization_Comment:
I am commenting on behalf of my organization.
Email:
anne.v.o.connor@census.gov
Date:
10 Jan 2006

Comments

Section 3.2, 2nd paragraph -- "Similarly, quality control for addresses takes both...

Proposed_Changes

Add an 's' to take.


Standard_Part_Section:
3.3 Testing Address Quality - Generally
Type:
Substantive
Name:
Mike Walls
Organization:
Organization_Comment:
Email:
michael.walls@atosorigin.com
Date:
11 Jan 2006

Comments

It is unclear to me at what stage of the data provisioning process the testing should occur. When and how should we test source data for transformation into a standards-compliant data set, the allegedly compliant data, or data sets resulting from importing a transfer?

Proposed_Changes

1) Add section providing overview of how you conceptualize this data flow.
2) Cross-reference tests to stages in the flow. For one example, format tests only make sense for data that is loaded into a standards-compliant data set. Source or destination data sets may well be structured quite differently.
3) Distinguish more systematically between tests of data vs. tests that verify outcome of a process. A quality test on the output records may identify defects in the input data OR defects in the loader application.


Standard_Part_Section:
3.4 Tests for Simple Elements - Generally
Type:
Substantive
Name:
Mike Walls
Organization:
Organization_Comment:
Email:
michael.walls@atosorigin.com
Date:
11 Jan 2006

Comments

Testing for simple element compliance needs expansion.

Proposed_Changes

1) We need testing for MANDATORY vs. OPTIONAL element content, integrated with (a) presence of element in data set record layout and (b) conformance and consistency with domains. For example, a mandatory element may have blank or 'UNKNOWN' as part of its domain. (Most databases only exclude NULL values from a column declared to be mandatory.)
2) One straightforward test of data formatting is to run the data through a known-good loader procedure and look at what records made it into the output and what did not. This is a fundamental test of structure, but needs some explication in terms of how to interpret the results within the overall quality context. (If a record loads it is by definition in the right format.)
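
The distinction raised in item 1) can be sketched as follows. This is only an illustration, assuming records are held as Python dictionaries; the function name, the element name "StreetName", and the placeholder set are hypothetical and are not taken from the draft standard. It separates four cases that a NOT NULL database constraint alone would not distinguish: element absent from the record layout, NULL, placeholder value (blank or 'UNKNOWN'), and populated.

```python
from typing import Any, Dict, Iterable, Mapping

# Hypothetical placeholder values that a domain might admit for a mandatory element.
PLACEHOLDERS = {"", "UNKNOWN"}


def classify_mandatory(records: Iterable[Mapping[str, Any]], element: str) -> Dict[str, int]:
    """Count how a mandatory element is filled across a set of records."""
    counts = {"missing": 0, "null": 0, "placeholder": 0, "populated": 0}
    for rec in records:
        if element not in rec:
            # Element is not present in the record layout at all.
            counts["missing"] += 1
        elif rec[element] is None:
            # Present but NULL: the case most databases already reject.
            counts["null"] += 1
        elif str(rec[element]).strip().upper() in PLACEHOLDERS:
            # Present and non-NULL, but carrying no real content.
            counts["placeholder"] += 1
        else:
            counts["populated"] += 1
    return counts


sample = [
    {"StreetName": "MAIN"},
    {"StreetName": None},
    {"StreetName": " unknown "},
    {},
    {"StreetName": ""},
]
result = classify_mandatory(sample, "StreetName")
```

Whether a placeholder count is acceptable depends on whether the element's documented domain admits blank or 'UNKNOWN'; the sketch only reports the counts and leaves that judgment to the reviewer.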


Standard_Part_Section:
3.7.2 Thoroughfare Addresses
Type:
Substantive
Name:
Steve Grise
Organization:
ESRI
Organization_Comment:
Email:
sgrise@esri.com
Date:
13 Jan 2006

Comments

3.7.2.1 – I don’t see how this standard provides an indication of how to share geometry for Thoroughfares (in fact this is excluded in the scope), but the standard is quite specific about segmentation rules:
· Lines associated with addresses do not overlap, except where schemas are intermingled along a linestring.
· Lines do not break except at intersections, dead ends and locations where streetnames change.
These seem like rules for a specific master repository and not something people could commit to providing and validating for every address exchange.

Proposed_Changes

Remove geometric and topological rules from the standard unless geometry is part of the scope, and there is substantial agreement on segmentation and topological rules.


Standard_Part_Section:
Introduction
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

In the third paragraph of the Introduction, there are two sentences that incorrectly end with double periods.

Proposed_Changes

In the third paragraph of the Introduction, remove the duplicate periods at the end of the second and third sentences.


Standard_Part_Section:
3.1.1 Objectives of Existing Standards
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.1.1 Objectives of Existing Standards: Table -- In the Purpose for the “Geographic information – Data quality measures (ISO 19138)” Standard, there is a spelling error.

Proposed_Changes

In the Purpose for the “Geographic information–Data quality measures” Standard, change “standardising” to “standardizing”.


Standard_Part_Section:
3.4.1 Testing Data Types of Simple Elements
Type:
Substantive
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

In the Pseudocode Example: Testing the Conformance of a Data Set section, the beginning of the sentence stating “If the data are in field that conforms to type” is not clear and requires further clarification.

Proposed_Changes

In the Pseudocode Example: Testing the Conformance of a Data Set, rewrite the sentence in question.


Standard_Part_Section:
3.4.1 Testing Data Types of Simple Elements
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

In the Measure Description section, there is an unnecessary occurrence of the word “they” in the very last sentence.

Proposed_Changes

In the Measure Description section, delete the word “they” from the last sentence.


Standard_Part_Section:
3.4.2 Domains and Sources of Values for Simple Elements
Type:
Substantive
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

The second paragraph identifies how CSDGM classifies domains but nowhere in this Standard could we find definitions for the four domain classifications listed. Since these four classes are used throughout the Data Quality Section, it is suggested that these four classes be defined in this section.

Proposed_Changes

In the second paragraph, add definitions for the “enumerated”, “range”, “codeset”, and “unrepresentable” domains.
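
As a rough illustration of how the four domain classes differ when tested, the sketch below assumes the CSDGM meanings: an enumerated domain lists its valid values, a range domain gives a minimum and maximum, a codeset domain refers to an externally maintained list, and an unrepresentable domain can only be described, not enumerated. The dictionary layout, key names, and function name are hypothetical and are not part of the draft standard or of CSDGM.

```python
from typing import Any, Mapping, Optional


def check_domain(value: Any, domain: Mapping[str, Any]) -> Optional[bool]:
    """Return True/False when the value can be tested, None when it cannot."""
    kind = domain["kind"]
    if kind == "enumerated":
        # Valid values are listed explicitly in the metadata.
        return value in domain["values"]
    if kind == "range":
        # Valid values lie between a stated minimum and maximum, inclusive.
        return domain["minimum"] <= value <= domain["maximum"]
    if kind == "codeset":
        # Valid values come from an external list, assumed here to be
        # loaded already into domain["codes"].
        return value in domain["codes"]
    if kind == "unrepresentable":
        # Only a description exists, so no automatic membership test applies.
        return None
    raise ValueError(f"unknown domain kind: {kind!r}")


parity_domain = {"kind": "enumerated", "values": {"Even", "Odd"}}
number_domain = {"kind": "range", "minimum": 1, "maximum": 99999}
free_text_domain = {"kind": "unrepresentable"}
```

Returning None for the unrepresentable case keeps "not testable" distinct from "tested and failed", which matters when results are summarized in metadata.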


Standard_Part_Section:
3.6.1 Location Attributes
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.6.1.1 Address X Coordinate, Address Y Coordinate and Address Longitude, Address Latitude -- The Measure Name for the US National Grid Coordinate has a capitalization error.

Proposed_Changes

In the Measure Name section, change “Usng” to “USNG”.


Standard_Part_Section:
3.6.2 Non-Location Attributes
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.6.2.9 Location Description -- In the Evaluation Procedure section, there is an unnecessary occurrence of the words “It can” at the beginning of the second sentence.

Proposed_Changes

In the Evaluation Procedure section, delete “It can” at the beginning of the second sentence.


Standard_Part_Section:
3.7.2 Thoroughfare Addresses
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.7.2.2 Check for Address Number Range Completeness -- This section appears to be a subsection for 3.7.2.1 and should not have a section number associated with its heading.

Proposed_Changes

From the heading, remove the “3.7.2.2” section number.


Standard_Part_Section:
3.7.2 Thoroughfare Addresses
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.7.2.1 Address Number Range: Check Address Number Range against Scheme Origin -- In the Evaluation Procedure section, there are two typographical errors.

Proposed_Changes

In the Evaluation Procedure section, change “distinces” to “distances” in the first sentence and change “is great than” to “is greater than” in the last sentence.


Standard_Part_Section:
3.7.2 Thoroughfare Addresses
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.7.2.1 Address Number Range: Check Address Number Range against Address Scheme Axes -- In the Measure Description section, there are several typographical errors.

Proposed_Changes

In the Measure Description section, change “touching the centerline” to “touches the centerline” in the second sentence. Also, change “Axes” to “Axis” throughout this entire “Check Address Number Range against Address Scheme Axes” section, including the title.


Standard_Part_Section:
3.7.2 Thoroughfare Addresses
Type:
Substantive
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.7.2.1 Address Number Range (page 103) -- In the Check Consistency of Odd and Even Parity section, if this Standard is expanded to include “Both” as a valid value for Parity, then the test here would be to check the Parity of an Address Number Range that is “Even” or “Odd” but you could not test for “Both”.

Proposed_Changes

Modify this test to confirm that the low and high address numbers agree with their specified parity unless the Parity is specified as “Both”, in which case no parity validation is possible.
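
The proposed test can be sketched as below. This is a minimal illustration, assuming integer address numbers and the parity values “Even”, “Odd”, and “Both”; the function names are hypothetical. It returns None when Parity is “Both”, since no parity validation is possible in that case.

```python
from typing import Optional


def parity_of(number: int) -> str:
    """Return "Even" or "Odd" for an integer address number."""
    return "Even" if number % 2 == 0 else "Odd"


def check_range_parity(low: int, high: int, parity: str) -> Optional[bool]:
    """Check that both ends of an address number range match the stated parity.

    Returns True or False for "Even" and "Odd" ranges, and None for "Both",
    where the test does not apply.
    """
    if parity == "Both":
        return None
    if parity not in ("Even", "Odd"):
        raise ValueError(f"unexpected parity value: {parity!r}")
    return parity_of(low) == parity and parity_of(high) == parity
```

For example, check_range_parity(100, 198, "Even") passes, check_range_parity(101, 198, "Odd") fails because the high number is even, and check_range_parity(1, 99, "Both") is reported as not testable.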


Standard_Part_Section:
3.9.1 Test for conformance to Address Number Range or domain
Type:
Editorial
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.9.1 Test for conformance to Address Number Range or domain -- In the Measure Description section, there is an unnecessary occurrence of the word “is” in the first sentence of the second paragraph.

Proposed_Changes

In the Measure Description section, change “This is test is…” in the second paragraph to “This test is…”.


Standard_Part_Section:
3.10 Postal Service Delivery Addresses – Generally
Type:
Substantive
Name:
Cheryl Benjamin
Organization:
NYS GIS Standards & Data Coordination Work Group
Organization_Comment:
I am commenting on behalf of my organization.
Email:
cheryl.benjamin@cscic.state.ny.us
Date:
13 Jan 2006

Comments

3.10 Postal Service Delivery Addresses -- The first sentence references checks for uniqueness “as noted above” but this is too vague and should instead identify by section the uniqueness checks that should be performed on Postal Service Delivery Addresses.

Proposed_Changes

In the first sentence, replace “as noted above” with references to the section(s) that contain the uniqueness checks that should be performed.


Standard_Part_Section:
Introduction
Type:
Substantive
Name:
wendy blake-coleman
Organization:
Organization_Comment:
I am commenting on behalf of my organization.
Email:
blake-coleman.wendy@epa.g
Date:
16 Jan 2006

Comments

Much of the data standard does not support the benefits of spatial data quality which was a key benefit to doing a data standard on Street Address.

Proposed_Changes