data center chronicles
Volume II, Issue 2

Welcome to the fifth issue of the Southern California Earthquake Data Center's electronic newsletter. We produce this compilation of news and information about the SCEDC as part of our continuing efforts to keep users informed about the Data Center and promote the data, tools and services we provide at the SCEDC.

For a web-based version of this newsletter, please click on the link below or paste the URL into your browser's address bar: http://www.data.scec.org/about/chronicle/vol2issue2.html

If you would like to subscribe to our mailing list, you can sign up (or unsubscribe) at: http://www.data.scec.org/mailman/listinfo/scedc_users. Please send your questions, comments and suggestions on this newsletter or any SCEDC issues to: vikki_at_gps.caltech.edu.

In This Issue:
A. The Archive
B. What's new with STP (Seismic Transfer Program)?
C. What's new on the SCEDC Website?
D. Searchable Catalog of Scanned Analog Seismic Records
E. Searchable Catalog of Moment Tensor Solutions
F. Google Output Available from the SCEDC
G. Location Codes: Coming Soon to a Seismogram Near You!
H. Highlight: the Station Information System
I. Email Virus Alert


A. The Archive

The Archive: By the Numbers

Number of earthquakes in the 1932-present Caltech/USGS catalog: 623,872 earthquakes
Total size of the waveform archive: 6,893 GB
Size of SCEDC parametric and waveform database: 239,552,775 rows

Data transferred via STP:

Q1: January 1-March 31:

  • 17,325,347 waveforms = average of 191,667 waveforms daily = 2.22 waveforms per second!
  • 532 gigabytes of waveform data = average of 6,051 megabytes daily = 70 kilobytes per second

Q2: April 1-June 30:

  • 3,156,771 waveforms = average of 34,690 waveforms daily
  • 465 gigabytes of waveform data = average of 5,111 megabytes daily = 59 kilobytes per second.

Q3: July 1-September 30:

  • 1,671,318 waveforms = average of 18,570 waveforms daily
  • 271 gigabytes of waveform data = average of 3,018 megabytes daily = 35 kilobytes per second.
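
The per-second figures quoted above follow from the daily averages divided by the 86,400 seconds in a day. A minimal sketch of that arithmetic is shown here; it assumes decimal units (1 megabyte = 1,000 kilobytes), which is our assumption and not something the summaries state, and the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_average):
    """Convert a daily average into a per-second average."""
    return daily_average / SECONDS_PER_DAY


def kilobytes_per_second(megabytes_per_day):
    """Assumes decimal units: 1 megabyte = 1,000 kilobytes."""
    return per_second(megabytes_per_day * 1000)


# Daily averages taken from the quarterly summaries above.
print(round(per_second(191_667), 2))      # Q1 waveforms per second
for mb_daily in (6_051, 5_111, 3_018):    # Q1, Q2, Q3 megabytes per day
    print(round(kilobytes_per_second(mb_daily)))
```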

From January 1 - Sept 30, 2005, the SCEDC archived:

  • 11,975 events
  • 3,037,653 waveforms
  • 239,855 arrivals
  • 890,742 amplitudes

Magnitude    Number of local events (le)
0-1          3,118
1-2          6,012
2-3          1,290
3-4          132
4-5          20
5-6          4


# of events    Event type
10,576         le (local event)
473            qb (quarry blast)
936            re (regional event)
99             sn (sonic blast)
578            ts (teleseism)
12,662         TOTAL


Six month summary of requests for catalog information:

Month       Requests
January     74,812
February    60,048
March       67,981
April       120,219
May         166,136
June        690,487
Total       1,179,683


Continuous Archiving of High-Sample Rate Data

The SCEDC continuously archived high sample-rate data (HH_, HL_ (80 sps) and/or EH_, EL_ (100 sps)) for the following significant events:

Obsidian Butte Swarm
EVID: 14179736 Mag = 5.1
Origin date/time: 2005/09/02, 01:27:19
lat/long: 33.1598, -115.6370
EVID: 14178184 Mag = 4.6
Origin date/time: 2005/08/31, 22:47:45
lat/long: 33.1648, -115.6357
channels/time available: HH_, HL_, EH_
Continuous archive from 2005/08/31,00:00:00 to 2005/09/06,00:00:00

Anza/ Yucaipa Events
EVID: 14151344 Mag = 5.2
Origin date/time: 2005/06/12, 15:41:46
lat/long: 33.5288, -116.5727
EVID: 14155260 Mag = 4.9
Origin date/time: 2005/06/16, 20:53:2
lat/long: 34.058, -117.0113
channels/time available: HH_, HL_, EH_
Continuous archive from 2005/06/12,12:00:00 to 2005/06/18,00:00:00

Wheeler Ridge Event
EVID: 14138080 Mag = 5.2
Origin date/time: 2005/04/16, 19:18:13
lat/long: 35.0272, -119.1783
channels/time available: HH_, HL_ / -6h, +12h

More information on this topic is available at: http://www.data.scec.org/about/sigeventsshot.html


B. What's new with STP (Seismic Transfer Program)?

Additional STP Server

When significant events occur and large numbers of users log on to STP, you may find yourself waiting in line. To accommodate the increasing number of STP users, we have added a third server that accesses a read-only database, which is replicated from our two main databases. This addition will increase the reliability of our service and allow more simultaneous users. If one of our servers is at full capacity, the STP client will automatically connect to the next server in a seamless process.

SAC2000 Module for Macintosh

Last year, the SCEDC released SAC2000 modules for Linux and Solaris that enable users to issue STP commands directly from within SAC2000. Now our Mac users will have the same flexibility. We have developed a SAC2000 module for Mac OS X that we are currently testing. This module will be available for download very soon.


C. What's new on the SCEDC Website?

3D Velocity Model for Southern California: Version 4 now available

The Three-Dimensional Community Velocity Model for Southern California provides a unified reference model for the several areas of research that depend on the subsurface velocity structure in their analysis. These include strong motion modeling, seismicity location, and tomographic velocity modeling. It is also hoped that the geologic community will find the basin models useful because they are based on structures and interfaces that are largely derived from geologic structure models.

The Community Velocity Model has been released in progressive versions, and it is recommended to use version 4 over previous versions. Version 4 of the SCEC model is available at: http://www.data.scec.org/3Dvelocity/

One-Stop SCEDC Software and Downloads Page

The SCEDC has made all of our software tools, catalogs, models and waveform retrieval tools available from a single download page at: http://www.data.scec.org/research/downloads.html. This page has the most recent versions of all of the products and the software we produce and host.

Website Map

The SCEDC has a lot of great information available on our website. To make sure that you can find what you're looking for, or discover something you didn't know we had, we've built a website map at: http://www.data.scec.org/sitemap.html. This map is accessible from most of the SCEDC's web pages via the red "website map" link below the left-hand navigation menu.


D. Searchable Catalog of Scanned Analog Seismic Records

The Caltech Seismological Lab recently completed a project to scan pre-digital analog recordings of major earthquakes recorded in Southern California. We have scanned records for M>3.5 earthquakes between 1962 and 1992 and other significant teleseisms. These scans are now available for download through our new search page at: http://www.data.scec.org/research/scans/. Search features include the ability to search by date, station, instrument, and orientation; the option to sort search results by date; and the option to download multiple files as a single zipped archive.

There are two output formats for the scanned results:

  1. Raster image format (TIFF) for 1-90 intermediate period record (1 sec seismometer free period, 90 sec galvanometer free period) and 30-90 long-period record (30 sec seismometer free period, 90 sec galvanometer free period).
  2. High-resolution JPEG format for WA (Wood-Anderson) records with file sizes ranging from 3-8 Megabytes.

The naming format for the scanned records follows the convention:

NET_STA_BAND_INSTR_DIR_YYYYMMDD_HHMM

Example:
CI_PAS_30-90_N_19690108_1500.tiff
The north-south 30-90 record for Pasadena beginning at 1500 UTC on January 8, 1969.
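
For users who want to process downloaded scans in bulk, a filename such as the example above can be split into its parts programmatically. The sketch below is only an illustration: it assumes that the first two underscore-separated fields are network and station, that the last three are direction, date and start time, and that whatever sits in between (band and/or instrument) can be passed through untouched. The function name `parse_scan_name` is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def parse_scan_name(filename):
    """Split a scanned-record filename into its underscore-separated parts.

    Assumption: fields are ordered network, station, one or more
    band/instrument descriptors, direction, YYYYMMDD, HHMM (UTC).
    """
    fields = Path(filename).stem.split("_")
    if len(fields) < 6:
        raise ValueError(f"unexpected scan filename: {filename!r}")
    direction, date, time = fields[-3:]
    return {
        "network": fields[0],
        "station": fields[1],
        "descriptors": fields[2:-3],  # band and/or instrument, left as-is
        "direction": direction,
        "start": datetime.strptime(date + time, "%Y%m%d%H%M"),
    }


record = parse_scan_name("CI_PAS_30-90_N_19690108_1500.tiff")
print(record["station"], record["descriptors"], record["start"])
```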


E. Searchable Catalog of Moment Tensor Solutions

The SCEDC is currently archiving Moment Magnitudes and Moment Tensor Solutions (MTS) produced by the SCSN in real-time and post-processing solutions for events spanning back to 1999. These solutions are in the SCEDC searchable database and are available for distribution from the consolidated catalog search page (Moment Tensors tab) at: http://www.data.scec.org/catalog_search/CMTsearch.php.

The automatic MTS runs on all local events with Ml>3.0, and all regional events with Ml>=3.5 identified by the SCSN real-time system. The solution is emailed to SCSN personnel about 10 minutes after an event. If the quality of waveform fits is good enough, and the event is within the SCSN reporting region, it is automatically distributed to the outside world. The distributed solution automatically creates links from all USGS Simpson Maps to a text e-mail summary solution, creates a .gif image of the solution, and updates the moment tensor database tables at the SCEDC. The solution can also be modified using an interactive web interface, and re-distributed. The SCSN Moment Tensor Real Time Solution is based on the method developed at UC Berkeley by Doug Dreger.


F. Google Output Available from the SCEDC

KML (Google Earth) Catalog Output

If you frequently use our catalog search at http://www.data.scec.org/catalog_search/, you may notice a new output format, KML. KML, or Keyhole Markup Language, is an XML-based language for creating files that can be loaded into Google Earth, a 3D application that functions as an interactive 3D globe, letting you seamlessly zoom in on any part of the world from a global scale down to a few meters above the ground.

Viewing your search results in Google Earth offers many exciting features:

  • Zoom in on event epicenters.
  • Tilt the view to study terrain from different angles, with 3D rendering in some areas.
  • "Fly-by" tour of your search results.

Google Earth support has been implemented for date/magnitude/location, event ID, and polygon searches. To use this new feature, select KML in the output format pull-down box when you search. If you are directly saving your search results to an output file, make sure that its name ends in .kml. If your search results are being displayed in a web page, copy and paste the complete results to a text file whose name ends in .kml. Load your search results by opening Google Earth and then opening your .kml file. Placemarks for the search results will be displayed in the left-hand menu under "Temporary Places" in a subfolder named "SCEDC Catalog Search Results."
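
To give a feel for what a .kml file contains, here is a minimal sketch that builds a KML folder of placemarks from a list of events. It is not the SCEDC's actual output: the element layout is simply a small valid KML document, the namespace is the current OGC KML 2.2 one, and the function name `events_to_kml` is hypothetical. The sample event is the M5.1 Obsidian Butte earthquake listed in Section A.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def _tag(name):
    """Qualify an element name with the KML namespace."""
    return f"{{{KML_NS}}}{name}"


def events_to_kml(events, folder_name="SCEDC Catalog Search Results"):
    """Return a KML string with one placemark per (evid, mag, lat, lon)."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(_tag("kml"))
    folder = ET.SubElement(kml, _tag("Folder"))
    ET.SubElement(folder, _tag("name")).text = folder_name
    for evid, mag, lat, lon in events:
        placemark = ET.SubElement(folder, _tag("Placemark"))
        ET.SubElement(placemark, _tag("name")).text = f"{evid} M{mag}"
        point = ET.SubElement(placemark, _tag("Point"))
        # KML orders coordinates as longitude,latitude,altitude.
        ET.SubElement(point, _tag("coordinates")).text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


kml_text = events_to_kml([(14179736, 5.1, 33.1598, -115.6370)])
print(kml_text)
```

Writing `kml_text` to a file whose name ends in .kml gives a file that can be opened in Google Earth in the same way as a saved search result.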

Google Earth can be downloaded from http://earth.google.com. More information about the KML schema is available at http://code.google.com/apis.html#earth. Google Earth is currently available only for Windows.

Google Map Catalog Output

The second new product available from the drop-down "Output Format" menu on the catalog search page is "Google Map," which will plot the results of your query directly onto a Google Map. The map icons are color-coded by magnitude; for example, all magnitude 1-2 markers are white, magnitude 2-3 markers are purple, etc. The earthquake's magnitude is displayed when you mouse over the icon, and more information (time/date, event ID, latitude/longitude, depth and magnitude) about the event is displayed when you click on the event's icon. This development is ongoing, and we are working to improve the response for larger queries, which currently take much longer to process, so kindly limit your queries to shorter time periods.


G. Location Codes: Coming Soon to a Seismogram Near You!

The SCEDC uniquely describes the seismograms we archive and distribute using the FDSN (Federation of Digital Seismograph Networks) system, which includes the following four fields:

  1. Network (2 characters)
  2. Station (3-5 characters)
  3. SEED Channel (3 characters; see SEED Reference Manual Appendix A)
  4. Location Code (2 characters)
e.g., NN.SSS.CCC.LL
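
As an illustration of the four-field layout, the sketch below splits a dotted identifier and checks the field widths listed above. The class and function names are hypothetical, and the checks cover only the widths given in this newsletter, not the full SEED rules.

```python
from typing import NamedTuple


class StreamId(NamedTuple):
    network: str
    station: str
    channel: str
    location: str


def parse_stream_id(text):
    """Parse 'NN.SSS.CCC.LL' and check the field widths listed above."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {len(parts)}")
    network, station, channel, location = parts
    if len(network) != 2:
        raise ValueError("network must be 2 characters")
    if not 3 <= len(station) <= 5:
        raise ValueError("station must be 3-5 characters")
    if len(channel) != 3:
        raise ValueError("channel must be 3 characters")
    if len(location) != 2:
        raise ValueError("location code must be 2 characters")
    return StreamId(network, station, channel, location)


stream = parse_stream_id("CI.USB.HLE.01")
print(stream)
```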

The FDSN standard has always included the Location Code field, but it was not used by the SCSN until recently. The SCEDC will now use the Location Code field to uniquely identify SCSN data streams.

Location Code is used to distinguish between multiple seismograms with identical station and channel names. For example, a station equipped with both STS-1 and STS-2 broadband high gain seismometers would produce two data streams with the same net.sta.cha identification. Also, the SCSN uses orientation codes of [1,2,3] for data channels from downhole sensors and [Z,N,E] for traditionally-oriented surface channels. Without Location Codes, we cannot have multiple downhole sensor packages without changing the station or channel names on the second to n-th downhole sensors.

Currently, the default value for SCSN Location Codes is blank, i.e., that field contains two blank spaces. For instances where a different Location Code is necessary to uniquely describe a data stream, the SCSN will follow the SEED convention of allowed characters (A-Z, 0-9, space) and identify streams with "01" for the first non-unique stream, "02" for the second, etc. Users (or their software) should not assume that Location Codes have a meaning; the SCSN will not use this field to encode information like emplacement depth, preferred channel, sensor type, orientation, etc. However, the full SCNL description can be used as a unique key into complete descriptive information about the characteristics of the data stream.

SCEDC users should be aware that if you do not specify a Location Code in your data request, the Data Center will provide all seismograms that match that net.sta.channel, so you may receive multiple seismograms where you only expect one. For ASCII output where Location Code is a whitespace-delimited field, a blank-blank Location Code will be assigned "--" and when parsing ASCII input, "--" should be interpreted as 2 blanks.
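
User software that reads or writes whitespace-delimited ASCII may want to handle the "--" convention explicitly. The following is a minimal sketch under our own assumptions: streams are represented as plain (net, sta, cha, loc) tuples, and the function names are hypothetical rather than part of any SCEDC tool. It also mimics the request behavior described above, where omitting the Location Code matches every stream for that net.sta.channel.

```python
BLANK = "  "  # default SCSN Location Code: two blank spaces


def location_from_ascii(field):
    """Interpret an ASCII field: '--' stands for two blanks."""
    return BLANK if field == "--" else field


def location_to_ascii(code):
    """Write a Location Code as an ASCII field: two blanks become '--'."""
    return "--" if code == BLANK else code


def select_streams(streams, net, sta, cha, loc=None):
    """Return the (net, sta, cha, loc) tuples that match a request.

    With loc=None, every Location Code for that net.sta.cha matches.
    """
    return [s for s in streams
            if s[:3] == (net, sta, cha) and (loc is None or s[3] == loc)]


archive = [("CI", "USB", "HLE", BLANK), ("CI", "USB", "HLE", "01")]
everything = select_streams(archive, "CI", "USB", "HLE")
default_only = select_streams(archive, "CI", "USB", "HLE",
                              location_from_ascii("--"))
print(len(everything), len(default_only))
```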

The naming convention for seismograms will be to only include the Location Code in the filename if it is something other than the default value of blank-blank:

Triggered Waveforms
Now:
14176696.CI.USB.HLE.sac
Future:
14176696.CI.USB.HLE.sac and
14176696.CI.USB.HLE.01.sac (if there is an 01 Loc Code)

Continuous Waveforms
Now:
20050831000000.CI.USB.HLE.sac
Future:
20050831000000.CI.USB.HLE.sac and
20050831000000.CI.USB.HLE.01.sac (if there is an 01 Loc Code)
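
The filename rule above can be sketched in a few lines. This is an illustration only; the function name and its arguments are hypothetical, and it treats both two blanks and the ASCII form "--" as the default Location Code.

```python
def waveform_filename(prefix, net, sta, cha, loc="  ", ext="sac"):
    """Build a seismogram filename following the convention above.

    prefix is an event ID (triggered waveforms) or a YYYYMMDDHHMMSS
    start time (continuous waveforms).
    """
    parts = [str(prefix), net, sta, cha]
    if loc not in ("  ", "--"):  # default blank-blank code is left out
        parts.append(loc)
    parts.append(ext)
    return ".".join(parts)


print(waveform_filename(14176696, "CI", "USB", "HLE"))
print(waveform_filename(14176696, "CI", "USB", "HLE", loc="01"))
print(waveform_filename("20050831000000", "CI", "USB", "HLE", loc="01"))
```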


H. Highlight: the Station Information System

The Data Center has developed the Station Information System (SIS) to manage station metadata for the California Integrated Seismic Network (CISN) Southern California Management Center (SCMC). The goal of this project was to develop a simplified database-driven system that can interact with a single database source to enter, update and retrieve station metadata easily and efficiently.

Over the course of this project, the SCEDC:

  • redesigned the database schema;
  • built a dynamic PHP website that allows users to view hardware and other station information held in the SIS database at: http://www.data.scec.org/stations/views/sta_hardware.php (also available from the "Display Station Hardware Information" link on our main Stations/Instrumentation page at: http://www.data.scec.org/stationinfo.html);
  • built a Graphical User Interface to interact with the database; and
  • migrated all of the online broadband, K2 and short-period telemetered station data into the SIS database.

All station field changes that result in a change of a station's response have been recorded in the SIS using the SIS GUI since 11/01/2005. Dataless SEED volumes are now generated from data held in SIS via a stored procedure, and all SEED volumes created since 11/01/2005 are from the SIS database. The IRIS DMC has verified the dataless volumes produced by this system.

Redesigned Database Schema

The SCEDC staff has considerable education and experience with databases, so we knew that we should invest significant time and effort in the data model, which has had a very positive impact on the end product. The SIS's highly normalized logical data model (an ERD is available from http://www.data.scec.org/stations/SIS/SIS_ERDV4.jpg) is implicitly designed for performance. When a database is not well modeled, the problems quickly become apparent to the applications and the users. The SIS's well-designed data model reduces the need for programming changes and increases application maintainability.

Normalization in a database design allows for efficient access and storage of data in a relational database. The purpose of the normalization process is to reduce redundancy (same information stored more than once) and secure data integrity (ensure that the database contains valid information). This is achieved by breaking large entities down into several smaller entities, which together contain the same information without repeating it.

Every time a decision to denormalize the database is made, a price is paid. The cost is lost flexibility, future scalability, performance, and data integrity. By denormalizing, there is data redundancy in the database, which needs to be managed through program code, either at the GUI or by using triggers. Denormalization may solve one part of dealing with performance, but it creates possible performance problems in several other areas and data integrity is at high risk. A clean, normalized database always gives good performance and preserves data integrity.
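
To make the normalization idea concrete, here is a small sketch using Python's built-in sqlite3 module. The two tables, their column names, the station "XX.DEMO" and its coordinates are all made up for illustration and are not the SIS schema. Because station coordinates are stored once and channels refer to them by key, a single update is seen by every channel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE station (
        station_id INTEGER PRIMARY KEY,
        net TEXT NOT NULL,
        sta TEXT NOT NULL,
        lat REAL NOT NULL,
        lon REAL NOT NULL,
        UNIQUE (net, sta)
    );
    CREATE TABLE channel (
        channel_id INTEGER PRIMARY KEY,
        station_id INTEGER NOT NULL REFERENCES station(station_id),
        seedchan TEXT NOT NULL,
        UNIQUE (station_id, seedchan)
    );
""")

# Placeholder station and coordinates, stored exactly once.
cur = conn.execute(
    "INSERT INTO station (net, sta, lat, lon) VALUES (?, ?, ?, ?)",
    ("XX", "DEMO", 34.0, -118.0),
)
station_id = cur.lastrowid
conn.executemany(
    "INSERT INTO channel (station_id, seedchan) VALUES (?, ?)",
    [(station_id, chan) for chan in ("HHZ", "HHN", "HHE")],
)

# Correcting the latitude touches one row, not one row per channel.
conn.execute("UPDATE station SET lat = ? WHERE station_id = ?",
             (34.5, station_id))
rows = conn.execute(
    "SELECT c.seedchan, s.lat FROM channel c "
    "JOIN station s USING (station_id) ORDER BY c.seedchan"
).fetchall()
print(rows)
```

In a denormalized layout that repeated the coordinates on every channel row, the same correction would need three updates kept in step by program code or triggers.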

Database Packages, Procedures and Functions

With this project, the SCEDC requested that embedded SQL statements NOT be allowed in applications. All SQL that is routinely executed was written as stored procedures and functions, contained in database packages. Database stored procedures for most common field operations (sensor swaps, new station installs) have been written and are used by the GUI. A user can record a sensor swap for a station and have all updates to gain and epochs for all relevant channels immediately available in the database in one step.

What are the benefits?

  1. Programmers do not need to worry about the database structure, which is beneficial because most programmers don't like working with databases that are as well-designed as the SISDB.
  2. Changes in the database structure/access paths do not influence application logic.
  3. Tuning of SQL is done independently of how many times (or places) this access path is in use.
  4. Stored procedures outperform any programmer's SQL - our DBA can write the statements to utilize indexes and performance improving hints that programmers typically aren't aware of.
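
The "one step" sensor swap described above can be sketched as a single routine that applications call instead of embedding SQL. The example below is a stand-in only: SQLite has no stored procedures, so a Python function plays that role, and the table `channel_epoch`, its columns, the station name, serial numbers, gain values and dates are all hypothetical placeholders, not the SIS packages or data.

```python
import sqlite3


def record_sensor_swap(conn, sta, new_serial, new_gain, swap_date):
    """Close the open epoch of every channel at `sta` and open a new one."""
    with conn:  # one transaction: commit on success, roll back on error
        channels = [row[0] for row in conn.execute(
            "SELECT seedchan FROM channel_epoch "
            "WHERE sta = ? AND offdate IS NULL ORDER BY seedchan", (sta,))]
        conn.execute(
            "UPDATE channel_epoch SET offdate = ? "
            "WHERE sta = ? AND offdate IS NULL", (swap_date, sta))
        conn.executemany(
            "INSERT INTO channel_epoch "
            "(sta, seedchan, sensor_serial, gain, ondate, offdate) "
            "VALUES (?, ?, ?, ?, ?, NULL)",
            [(sta, chan, new_serial, new_gain, swap_date)
             for chan in channels])
    return channels


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE channel_epoch (sta TEXT, seedchan TEXT, "
    "sensor_serial TEXT, gain REAL, ondate TEXT, offdate TEXT)")
conn.executemany(
    "INSERT INTO channel_epoch VALUES ('DEMO', ?, 'OLD-1', 1.0, "
    "'2004-01-01', NULL)",
    [("HHZ",), ("HHN",), ("HHE",)])

swapped = record_sensor_swap(conn, "DEMO", "NEW-2", 2.0, "2005-11-01")
open_epochs = conn.execute(
    "SELECT seedchan, sensor_serial, gain FROM channel_epoch "
    "WHERE offdate IS NULL ORDER BY seedchan").fetchall()
print(swapped, open_epochs)
```

The calling code never sees the table layout, so the schema or the SQL inside the routine can change without touching the application.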

SIS Graphical User Interface

The SIS interface is a Java program that directly accesses and updates the SIS database. The users requested that the design provide drop-down lists, radio buttons, and drill-down (nested tree) capabilities. For instance, a user can type in a minimum set of parameters and then assemble a station from drop-down lists of components in inventory (i.e., components that are not installed at other stations). The use of drop-down lists, radio buttons, and forms that are pre-populated with default configurations has significantly reduced problems associated with typos and cut-and-paste errors. When a technician enters a new instrument into inventory, s/he can accept the nominal values for an instrument of that type, or can modify the values if desired. Field staff make their changes directly in the SIS interface and trigger a pre-compiled email to the operators' mailing list when the changes are submitted.

Process

During the development of this system, the Data Center met with many users of metadata to determine what their needs were for a Station Information System, and we discovered that each user community had different priorities and used station data in different ways. We investigated a number of existing systems that deal with station metadata. We thieved the best ideas from areas that these systems handled particularly well and developed strategies to avoid methods that did not work well in some systems.

As our development progressed, we held regular bi-weekly meetings with the SCSN field technicians, showed them our progress, and received feedback that helped guide our work to meet their needs. The field staff provides two different styles of information to the SIS, so they have two function-based areas: one tab for fieldwork (Field Maintenance) and another for lab-based actions (Hardware Maintenance).

We went through a period of extensive testing, including user testing in which all users of the SIS GUI were given a worksheet package containing scenarios to work through. These scenarios tested the interface and helped determine whether any modifications were necessary to improve the system.

Future Work:

  • The Data Center will populate the SIS with as much information as we have available for historic stations. We have file cabinets of paper records for historical stations that have never been looked at, because the systems that we had in place were not flexible enough to allow entry of incomplete information or of non-standard station configurations.
  • As a direct result of our SIS efforts, the SCEDC is finally in a position where we can effectively manage our metadata and easily exchange our metadata with our CISN partners and other organizations. We are looking forward to this next phase of SIS development.

Acknowledgements

The Station Information System project was financed with joint special funding to the Data Center from the USGS/ANSS and IRIS. This project has been a tremendous success, which we could not have achieved without this financial sponsorship. The SIS Gang sincerely thanks these organizations for their support.


I. Email Virus Alert

Recently there has been a worm spreading through emails that claim to be from the SCEDC. The message may appear to come from an address like webmgr@quakedc.gps.caltech.edu or postmaster@quakedc.gps.caltech.edu (but is actually forged) and have a subject line similar to "hi, ive a new email address." The body of the email may look like:

hey its me, my old address dont work at time. i dont know why?! in the last days ive got some mails. i' think thaz your mails but im not sure!

plz read and check ... cyaaaaaaa

and will include an attachment with a name similar to mailtext.zip or mail_body.zip.

If you receive such an email, simply delete it. Do NOT download any attachments.

