PROCEEDINGS OF THE NORAC SIXTH
BREEDING BIRD ATLAS CONFERENCE
DATA COLLECTION AND MANAGEMENT
Chair: Bruce Peterjohn
With the first round of atlases, Ohio was using punch cards, and the atlases began and
ended within state limits.
NORTH AMERICAN DATABASE
Steve Kelling (BirdSource Project Leader,
Cornell Lab of Ornithology)
- Creating a database for breeding bird atlases across North America would require
building a relational database using the most current technologies.
- A NA database would be easier to maintain, since adding new data and changing
to new database technologies in a single location would be more efficient
and cost effective.
- This would require both a secure location and accessibility.
- Internet-based tools are needed for analysis at any level, across state,
within state, physiographic region, township, etc.
- Technologies now exist on the web where you can select any area of interest for
analysis.
- Crucial to georeference data
Proposal: Create a single repository for all State and Provincial Breeding Bird Atlases
- Build a relational database using the most current database technologies.
- Maintain the database. Keep the database up to date both in terms of new data, as well
as in current technologies. Locate it in a secure environment with an
off-site backup.
- Develop Internet-based tools for the analysis and dissemination of atlas results.
Ensure that the analysis can occur at any level and that data results can
be analyzed both across states and provinces in physiographic regions, and
by specific townships, ownership parcels, or atlas blocks.
Database Structure
- A major factor in the creation of a North American Atlas database would be to find
the common denominators across all atlas projects, and not impose a rigid
set of rules that would restrict individual state and provincial atlas
project data collection.
- The database structure must be sufficiently flexible to incorporate all manner of atlas
information, but have a set of data that would be collected across all
atlas projects.
- In terms of database structure, a set of common denominators should be implemented
across all atlas projects. This represents the core data for the
North American Database. This core can be likened to the center of an
onion and might include the following data fields:
- Participant identification (necessary for data editing)
- Location (at minimum this could be the centroid of each atlas block. This is
crucial because it georeferences each data point for further analysis)
- Date of sampling
- Time of sampling
- Duration of sampling
- Indication of whether one is reporting all birds or a subset
- Species identification
- Nesting category (pull-down menu of observed behavior codes)
- Measure of abundance (from simple estimates of numbers of each species to specific
point counts)
- The above represent the minimum information necessary for inter- and intra-atlas
analysis.
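The core fields listed above could be sketched as a single record type. This is only an illustration, not a design adopted by the group: every field name, the coordinate checks, and the sample values (participant, coordinates, date, species, breeding code) are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, time
from typing import Optional


@dataclass(frozen=True)
class CoreAtlasRecord:
    """One observation holding only the proposed core fields (names are hypothetical)."""
    participant_id: str              # needed for later data editing
    latitude: float                  # e.g. centroid of the atlas block
    longitude: float
    sample_date: date
    start_time: time
    duration_minutes: int
    complete_list: bool              # True = all birds reported, False = a subset
    species: str
    breeding_code: str               # observed behavior code; code set is project-defined
    abundance: Optional[int] = None  # simple count estimate; None if not recorded

    def __post_init__(self):
        # Basic sanity checks so every stored record can be georeferenced.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        if self.duration_minutes <= 0:
            raise ValueError("duration must be positive")
        if self.abundance is not None and self.abundance < 0:
            raise ValueError("abundance cannot be negative")


# Illustrative values only; not taken from any atlas.
record = CoreAtlasRecord(
    participant_id="P001",
    latitude=42.48, longitude=-76.45,
    sample_date=date(1999, 6, 12), start_time=time(6, 30),
    duration_minutes=90, complete_list=True,
    species="Wood Thrush", breeding_code="FY", abundance=2,
)
```

Storing a coordinate pair rather than only a block name is what allows a record to be combined with other georeferenced data, whatever grid a project uses.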
- Other layers can then be added to the onion - the core is simply the basis to start an
atlas project from, NOT the end point. The database design must be
flexible enough to ensure that each atlas project can gather additional
information depending on state or provincial needs. These layers might
include:
- Habitat identifier
- Weather conditions during observations
- Block size variations
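One way to picture the onion is a record that always carries the core fields and may carry any number of project-specific extras. The sketch below is an assumption-laden illustration: the abbreviated core field list, the function name split_layers, and both sample rows are hypothetical.

```python
from typing import Dict, Tuple

# Abbreviated, hypothetical core; a real core would be agreed on by the atlas projects.
CORE_FIELDS = ("participant_id", "block_id", "date", "species", "breeding_code")


def split_layers(row: Dict[str, object]) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Separate a submitted row into the shared core and the project-specific outer layers."""
    missing = [name for name in CORE_FIELDS if name not in row]
    if missing:
        raise KeyError(f"missing core fields: {missing}")
    core = {name: row[name] for name in CORE_FIELDS}
    extras = {key: value for key, value in row.items() if key not in CORE_FIELDS}
    return core, extras


# Two made-up projects that add different outer layers to the same core.
row_a = {"participant_id": "P001", "block_id": "A-17", "date": "1999-06-12",
         "species": "Wood Thrush", "breeding_code": "FY",
         "habitat": "deciduous forest"}
row_b = {"participant_id": "Q042", "block_id": "0331", "date": "1999-06-20",
         "species": "Wood Thrush", "breeding_code": "S",
         "weather": "overcast", "block_size_km": 10}

core_a, extras_a = split_layers(row_a)
core_b, extras_b = split_layers(row_b)
```

Cross-atlas analysis would read only the core dictionaries, which have identical keys for every project, while each project keeps its own extras.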
The implications of a North America Breeding Bird Database are enormous.
- All atlas data would be in the same general format and at a single location.
Specific differences across atlases could be housed within the database
and available.
- Atlas information could be made widely available and in a form that would be
amenable to further analysis.
- We are building for the long haul. Consider the next iteration of atlases, or the
fourth. All atlas data could be maintained, updated, and housed in a
secure manner, ensuring its long-term availability.
- Core data ensures across-atlas analysis capabilities.
- Georeferencing each data point ensures that the North American Breeding
Bird Atlas database can be part of a larger network of environmental and
geographical data.
Distributed data sets, wherein data from different projects exist at different locations that are
connected electronically, are a possible alternative to a centralized
database.
There is interest in georeferencing and distributed databases from USGS, EPA, and GIS companies.
By georeferencing data, we will be getting into a position where data will be very valuable in
conservation decision-making.
Conclusions
- Recommend the creation of a single database for North American Breeding Bird Atlases
- The atlas database should be housed in a secure environment
- A core thread of information (determined by a group such as this) would be a part
of each state and provincial atlas.
- The database would be sufficiently flexible to incorporate any specific data-collecting
requirements for a particular state or province.
- Internet technologies should be developed to make atlas results available in
various formats and georeferenced to ensure that they remain part of the
development of a distributed network of data sets.
DISCUSSION
Mark Wimer: Half of the value of a published atlas is the involvement of local people. Need to
retain this ability to customize.
Charles Francis: The product should be the database. In order to use it, you need to visualize
it. A book is one way to visualize it, and different analyses will use the
data in different ways.
Steve Kelling: A centralized database would eliminate duplication of hardware in each state
and province.
Bob Budliger: How much will it cost?
Scott Sutcliffe: First need commitment of states and agencies and vision of what it should look
like before estimating cost.
DATA CENTRALIZATION ISSUES
Bruce Peterjohn and Keith Pardieck (Patuxent Wildlife Research Center)
Advantages
- Improves uses of data
- Examination of regional patterns of distribution
- Addressing of conservation issues
- Implications for management of species/habitats
- Improves regional comparability of data
- Allows for additional scientific uses of data
- Biogeographic studies
- Landscape ecology studies
Disadvantages
- May reduce comparability with previously collected data from a state, province, or
county
- Issues associated with atlas effort should be addressed
- Is effort standardization possible?
- Need to include effort data in databases
- How can effort be documented?
- Effort to locate species
- widespread vs. localized/specialized species
- Effort to document breeding status
- Observers' identification skills cannot be quantified
DISCUSSION
Impossible to standardize effort - not just hours in the field, also includes observer id
skills, hearing, awareness
Effort to locate species and effort to document breeding status are different
Methods now exist that relate species richness to level of effort; they can be used to estimate
the number of species present in an area from the effort invested
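The discussion does not name a specific method. As one illustration of the general idea, the sketch below uses the bias-corrected Chao2 incidence estimator, which is an assumption swapped in here and not necessarily what the participants had in mind. It treats each block visit as one sampling unit and uses the species seen on exactly one or exactly two visits to estimate how many species were missed. The function name and the sample visits are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def chao2_estimate(visits: Iterable[Set[str]]) -> float:
    """Bias-corrected Chao2 richness estimate from per-visit species lists."""
    visit_list = list(visits)
    m = len(visit_list)                      # number of visits (sampling units)
    if m == 0:
        raise ValueError("need at least one visit")
    # For each species, count the number of visits on which it was detected.
    incidence = Counter(sp for visit in visit_list for sp in visit)
    s_obs = len(incidence)                   # species actually observed
    q1 = sum(1 for n in incidence.values() if n == 1)  # seen on exactly one visit
    q2 = sum(1 for n in incidence.values() if n == 2)  # seen on exactly two visits
    # Estimated undetected species; the (q2 + 1) term avoids division by zero.
    undetected = (m - 1) * q1 * (q1 - 1) / (m * 2 * (q2 + 1))
    return s_obs + undetected


# Made-up example: three visits to one block, species labeled A to E.
visits = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "E"}]
estimate = chao2_estimate(visits)
```

In this made-up example five species were observed, three of them on a single visit and one on two visits, so the estimate is 5 + (2 x 3 x 2) / (3 x 2 x 2) = 6 species. A block with many single-visit species relative to its visit count would be a candidate for more coverage.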
If a species is not documented does it mean it is not there or that no one looked? If a species
is documented you know it is there, if not, you don't know whether or not it
is there.
CENTRALIZED DATABASE: ADDITIONAL ISSUES
- Standardization of data collection
- Grid designs
- How to deal with different grid designs?
- Data collection issues
- Standardization of "safe dates," acceptable codes for different
species, ...
- Emphasis on species detection or confirmation of breeding
- Abundance data
- Should a single method be adopted by all atlas projects?
- Data management issues
- Develop a single database format?
- States responsible for quality control
- Need to record effort data
- Proprietary/publication issues
- How important are data proprietary issues in the next atlases?
- Internet publication vs. paper publication
- Extensive publication on the Internet may preclude paper publications
DISCUSSION
Charles Francis: 2 issues - centralized database or distributed databases that can talk to each
other - distributed databases enable local control but are much more
expensive.
Charlie Smith: Federal databases have standards for map accuracy, metadata, and spatial
referencing. If we wish to allow access or want federal funding, we will need to
adhere to Federal Geographic Data Committee standards. A National Vegetation
Standard has been developed and adopted, and could be used for habitat
designations.
With development of the Internet, remote access and management is now feasible.
Steve Kelling: ESRI is developing Internet tools to enable users to create maps from databases.
Bruce Peterjohn: States that are well-supported by agencies may be able to do much more and have
their own data management, but other states may benefit greatly from
centralized data management to reduce costs.
Coverage standards are a complicated issue, involving not only hours of effort for each block but
also special efforts for particular species, e.g., use of tapes for owls and
marsh birds.
Consensus of the group: Sending the data to a central place once the atlas is completed is highly
desirable. Databases need to be compatible.
The remaining issues need work. A subcommittee volunteered to work on recommendations to be
reviewed first by this group, then by the full NORAC Committee.
Committee on Core Data Standardization/Central Depository for North American Atlas
Data
- Mike Cadman, Chair
- Chris Elphick
- Charles Francis
- Steve Kelling
- Bruce Peterjohn
- Chan Robbins
- Charles Smith
- Scott Sutcliffe
- Joan Walsh
- Mark Wimer
OPTIONS FOR DATA ENTRY ONLINE / WEB BASED ATLASES
Chair: Scott Sutcliffe
Steve Kelling
- Creation of Internet-based data form for data entry
- Internet use and technology are growing exponentially
- Enormous potential as tool for collection of ornithological data
- Cornell's Backyard Bird Survey Website had three and a half million hits, and 44
thousand checklists.
- Web-based data entry provides rapid access, error detection and correction, and
prompt identification of areas or species needing further effort.
- How do you go about collecting data for the web?
- Can be problematic - a link breaks, a form doesn't work
- Would need to work on an easily navigable and quick-entry site
- Can tinker with form as data is being collected
- Propose standardized form with fields for core data but flexible to enable
customization to each state and province
- Can create smart form in which expert has created set of filters that restrict the
kinds of responses allowed, to avoid mistakes
- Some fields mandatory, others optional
- Ability to catch errors during data entry
- Can develop useful online help procedures that would help people navigate through data
form
- Once you have a centralized data collection system, ability to manage it will
reduce costs for everyone.
- Won't need server and team of developers for every state and province.
- Set of tools can be developed for atlas project where project leaders can develop their
own form for state or province, no need for redundancy.
Why has Lab begun to emphasize this approach?
- Problems with scanning technologies
- Great expense in developing scanning materials
- If need to change form, big problem
- Some scanning idiosyncrasies - with Project Feeder Watch, which used a booklet of
forms, blots on adjacent pages caused problems
- Birds in Forested Landscapes - most participants had graduate degrees,
but 30-40 percent had great difficulty filling out the forms and had to be
called for explanations before data could be entered.
- Opportunity to take advantage of available and developing technologies which will
increasingly become a part of the way people communicate and operate in
North America.
- Smart Forms help to eliminate typos by setting limits on ranges of data that are
entered - species and geographic areas, dates, numbers, etc. - by
questioning the data and requiring confirmation that entry is correct,
also flagging entry for evaluation by experts
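The smart-form behavior described above (limits set by an expert, a request for confirmation when an entry looks unusual, and a flag for expert evaluation) might look roughly like the sketch below. The filter structure, the status names, the function check_entry, and all dates and limits are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass
class SpeciesFilter:
    """Expert-defined limits for one species in one region (all values hypothetical)."""
    earliest: Tuple[int, int]  # (month, day) start of the expected window
    latest: Tuple[int, int]    # (month, day) end of the expected window
    max_count: int             # counts above this are questioned


def check_entry(species: str, obs_date: date, count: int,
                filters: Dict[str, SpeciesFilter],
                confirmed: bool = False) -> Tuple[str, List[str]]:
    """Return (status, messages) for one form entry."""
    # Hard limits: the form simply does not allow these responses.
    if species not in filters:
        return "rejected", ["species is not on the list for this region"]
    if count < 1:
        return "rejected", ["count must be at least 1"]
    rule = filters[species]
    warnings = []
    if not rule.earliest <= (obs_date.month, obs_date.day) <= rule.latest:
        warnings.append("date is outside the expected window")
    if count > rule.max_count:
        warnings.append("count is unusually high")
    if not warnings:
        return "accepted", []
    # Soft limits: question the entry first, then keep it but flag it for experts.
    if confirmed:
        return "flagged_for_review", warnings
    return "needs_confirmation", warnings


# Made-up filter for a single species.
filters = {"Wood Thrush": SpeciesFilter(earliest=(5, 15), latest=(8, 1), max_count=20)}

first_try = check_entry("Wood Thrush", date(1999, 4, 2), 3, filters)
after_confirm = check_entry("Wood Thrush", date(1999, 4, 2), 3, filters, confirmed=True)
```

Here the April date falls outside the made-up window, so the first attempt asks the observer to confirm, and the confirmed entry is kept but flagged for review. Separating hard rejections from soft warnings lets genuinely unusual records through without silently accepting typos.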
WEB-BASED DATA ENTRY ISSUES
Bruce Peterjohn
- Not every atlas participant will have access to the Internet
- Have to develop alternative(s) for those without Internet access
- Data entry requirements for participants may reduce efforts in the field
- Procrastinators will procrastinate even longer
- Quality control issues remain after data are entered over the Internet
- Edits in data entry interface can reduce likelihood of errors
- Quality of entered data will vary between participants, and it may be necessary to
check some/all of the entered data against original field cards
SCANNING AS A DATA ENTRY OPTION
- Minimize/avoid character recognition
- Character recognition increases data verification time
- Still faster than manual data entry
- Use of "bubbles" is preferred
- Efficient processing of data forms
- Minimal data verification time is required
- Current software is very reliable
Disagreements regarding expense of scannable option and difficulty of
modifying form.
- Scanning technology is improving and costs are decreasing.
- Optical character recognition is improving but still has a way to go; it can
distinguish whether a field contains a character or a number.
- Takes 15 minutes per form to edit BBS scanned form, an hour per form to enter and
verify by hand.
- Economy of scale is considerable for hardware and data management costs.
DISCUSSION
Need to weigh observer preference for kinds of data - bubble, character, electronic entry.
Joan Walsh: Even if 30% of people enter their own data it will be a big help. Data proofing will
still be necessary.
Rick West: The BBS is much more complex than atlases. Atlases could use other alternatives, such
as a form returned as an e-mail attachment rather than online data entry.
HABITAT DATA COLLECTION
Dan Brauning
Discussion emphasized the importance of adding new information to earlier atlas data to justify
the effort.
The NORAC website is hosted and maintained by Bird Studies Canada