Section 2: The Distribution of Earthquakes


Concept: No definite connections have been found between any potential influencing factor and the occurrence of earthquakes.



In this activity you will look at the possibilities of connections between seismic activity and some sort of daily cycle and/or natural forces. To do this, you'll study the distribution of earthquakes from a variety of data sets, with respect to the time of day those events occurred.

You'll start out by looking into one of the more popular ideas about earthquake timing. Each of the two exercises is basically self-explanatory, so begin whenever you're ready.

Exercise 1: Do Large Earthquakes Always Happen in the Morning?

In recent years, a popular notion has taken root in the minds of many southern Californians: large earthquakes always happen in the morning! The magnitude 7.3 Landers earthquake and its largest aftershock, the Big Bear earthquake, shook awake a lot of people in 1992. Those events were still fresh in their minds when another large rupture, the magnitude 6.7 Northridge earthquake, disturbed the sleep of millions in and around the Los Angeles area. Most recently, the Hector Mine earthquake of October 1999 struck at 2:47 am. Never mind that the Joshua Tree earthquake, which ultimately led to the Landers rupture two months later, struck just before 10:00 pm Pacific Daylight Time; a pattern was perceived independently by countless residents, especially those who could remember that the 1987 Whittier Narrows earthquake occurred just before 8:00 am. Bringing up this "discovery" over lunch or coffee with others who'd noticed the same thing only served to reinforce it. Now it is part of the "earthquake culture" in this area.

But does this idea deserve any credence? Is it total nonsense, or does it seem to be more of a rule, with only a few exceptions? As we've already shown (by mentioning the Joshua Tree earthquake), large earthquakes don't always happen in the morning, but do our records of earthquakes show any bias toward a particular time of day?

To try to answer that question, you'll need an appropriate set of data. This has already been "distilled" for you: a list of the 119 earthquakes (but not aftershocks) in southern California between 1933 and 1997 with a magnitude of 4.8 or greater.

After you finish reading these directions, link to the earthquake list page. That list contains every earthquake of magnitude 4.8 or greater to strike southern California between 1933 and 1997. Aftershocks are not included in this table, because they might tend to bias the data in favor of the times when the largest earthquakes have occurred. Preceding each earthquake's date and magnitude is the time of day, given in Pacific Standard Time, at which that earthquake struck.

Make a chart or graph with 24 divisions (technically, a histogram), representing each of the hours in a day. Count the number of earthquakes in the list that happened during each hour, and add those figures to your histogram. When you are done, study the results so that you can answer the questions below.
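If you would rather tally the list with a short program than by hand, the counting step might look something like the sketch below. It assumes each time has been typed in as a 24-hour "HH:MM" string, which may not match the layout of the list page, and the sample times are placeholders for illustration only, not entries from the real earthquake list.

```python
from collections import Counter


def hourly_histogram(times):
    """Count events per hour of day from 'HH:MM' strings (24-hour clock)."""
    counts = Counter(int(t.split(":")[0]) for t in times)
    # Return a list of 24 counts so that empty hours show up as zeros.
    return [counts.get(hour, 0) for hour in range(24)]


# Placeholder times for illustration only; NOT taken from the real list.
sample_times = ["01:12", "01:47", "09:30", "14:05", "14:59", "23:00"]

hist = hourly_histogram(sample_times)

# Print a crude text histogram: one '#' per earthquake in each hour bracket.
for hour, n in enumerate(hist):
    print(f"{hour:02d} {'#' * n}")
```

Replacing `sample_times` with the times from the list page would give you the same 24 totals you would otherwise count by hand.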

    1. Is your histogram fairly regular (in the number of quakes per hour) or does it have sharp dips and rises?

    2. If there are sharp dips and rises, do they occur fairly randomly, or do they favor certain times of day?

    3. Do large earthquakes happen only in the morning?

  1. What would you say your results suggest regarding a connection between time of day and the onset of large earthquakes?

Your histogram should have revealed two fairly conspicuous peaks, one in the morning, and one in the afternoon. Compare your results to these example histograms of the same data. One is a simple bar graph, the other a fancier graph that plots not only number of earthquakes per hour bracket, but also the year of occurrence and magnitude. Take a look at these to see how they compare to your histogram and then return here.

Now that we've established that the "peaks" in this data do exist, the important question to ponder is "Are they significant?"

Realize that we have taken a set of 119 data points, which may seem like a lot, but that we've separated it into 24 different brackets. That makes for an average of about five per bracket. This is not a very large sample set, and that's important, because when you work with a too-small sample set, chance can produce "patterns" that might go away with a larger set of data.
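To get a rough feel for how much an hourly total can wander from that average by chance alone, here is a small sketch. It rests on an assumption that is not part of the exercise itself: that each of the 119 earthquakes is equally likely to fall in any of the 24 hours, independently of the others, so that the count in one bracket follows a binomial distribution.

```python
import math

n_events = 119   # earthquakes in the list
n_bins = 24      # one bracket per hour of the day

# Assumed model: every earthquake lands in a given hour with probability 1/24.
p = 1 / n_bins

mean = n_events * p                          # expected count per bracket
std = math.sqrt(n_events * p * (1 - p))      # binomial standard deviation

print(f"mean per bracket:   {mean:.2f}")     # 4.96
print(f"standard deviation: {std:.2f}")      # 2.18
```

Under that assumption, a bracket holding two or three earthquakes more or fewer than the average would not be unusual.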

For example, if you flipped a coin only five times, there is a small but real chance (1 in 32) that it would come up "heads" every time. If you stopped there and performed no more trials, you might then conclude that the coin could only come up "heads", which, assuming it wasn't a trick coin, would be a poor conclusion. On the other hand, if after 85 flips of the coin it had not once come up "tails", you would have good reason to believe it never would, because it was somehow rigged to always land with the "heads" side up. (Regardless of the odds in your favor, you might still be wrong!)
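To put numbers on the coin example, the sketch below computes the chance of an unbroken run of "heads", assuming a fair coin and independent flips. The function name is just for illustration.

```python
def prob_all_heads(flips):
    """Probability that a fair coin lands heads on every one of `flips` tosses."""
    return 0.5 ** flips


print(prob_all_heads(5))    # 0.03125, i.e. 1 in 32
print(prob_all_heads(85))   # about 2.6e-26
```

Five heads in a row is uncommon but will happen now and then; 85 in a row from a fair coin is so improbable that a rigged coin is by far the better explanation.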

What if we could run some random trials and see how likely we are to get a configuration like the graph we got from our 119 large earthquakes? If most of the random trials showed a very even distribution, we might conclude that there is something affecting the timing of large earthquakes in southern California. However, if it's relatively easy to produce large peaks in the "data", then there's no reason to assume we have a pattern on our hands.

Fortunately, you can do just that! If you go back to the page of example histograms, you will see a link at the bottom that says "Generate a random sample of 119 earthquakes". Take one last look at our real histogram, and then follow this link. The page it takes you to will generate data sets similar to the one we used, but entirely random. A script will automatically generate a histogram of this data that resembles our real sample set.
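If you are curious what such a generator might be doing, or want to run many more trials than the page allows, the following is a minimal sketch. It is not the script used on the page; it simply assumes that each of the 119 "earthquakes" is assigned an hour uniformly at random, and the function names are made up for this example.

```python
import random


def random_histogram(n_events=119, n_bins=24, rng=random):
    """Drop n_events into n_bins uniformly at random and return the counts."""
    counts = [0] * n_bins
    for _ in range(n_events):
        counts[rng.randrange(n_bins)] += 1
    return counts


def summarize(trials=10, seed=None):
    """Run several random trials; report (tallest bracket, empty brackets) for each."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        hist = random_histogram(rng=rng)
        results.append((max(hist), hist.count(0)))
    return results


# Output differs from run to run because no seed is given here.
for tallest, empty in summarize(trials=10):
    print(f"tallest bracket: {tallest:2d}   empty brackets: {empty}")
```

Comparing the tallest bracket in each trial with the tallest hour on the real histogram is one simple way to judge whether the real peaks stand out.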

Use the "Generate another random sample" link at the bottom of the random-generator page to generate at least 10 different random sets of data and their corresponding histograms. Then use the other link to return here and answer the questions below.

    1. Did any of the random plots come up looking at all like the distribution of the real histogram you made (large peaks)?
    2. What was the highest single bracket you saw?
    3. Was this total larger or smaller than the highest hourly total on the real histogram?
    4. Did you see any zeroes in the random histograms, and were there any in the real histogram?
    5. Were any other "patterns" apparent?

    1. All in all, what is your revised analysis of our results?
    2. Would you still conclude the same thing you did before?
    3. Is it reasonably possible (i.e., the odds against it are far from astronomical) that a random process could have produced the distribution you saw in the real histogram?

  1. Would you say that our sample set of 119 earthquakes is just too small to be useful? (If so, cheer up -- Exercise 2 focuses on a much larger set of data!)

  2. If you feel that the peaks on the real histogram might be significant, can you think of anything that might cause surges in seismicity at those hours of the day? Keep in mind, as you move on to Exercise 2, that it is generally assumed that all earthquakes start in the same way, and that large ruptures are simply small ruptures that failed to stop propagating. Hence, we should be able to look for increases in earthquake frequency at all magnitudes, not just for large earthquakes.

  3. (Optional -- this can be used as the basis for a fairly simple independent research project.) Obtain records of the timing of past weather, tides, or other repeating natural phenomena (eclipses, the sunspot cycle, etc.). Compare the timing of any one of these phenomena to the list of earthquakes used above. Are your results much different?

Exercise 2: Daily Variations in Seismicity?

As mentioned above, it is thought that all earthquakes start in the same way, as tiny ruptures along fault surfaces. For whatever reason, some continue to propagate and grow, while others stop relatively quickly. Only a few continue to rupture for more than a second, and a very rare one may continue on for hundreds of kilometers, lasting over a minute. But initially, they all seem to be equal. Hence, if we want to know if large earthquakes are more likely to occur during a particular time of day, we should be able to look at the hourly occurrence of small earthquakes and extrapolate from this much larger, more reliable data set.

So what do you see if you make histograms of all earthquakes recorded by the Southern California Seismic Network for the years 1995 through 1998? Take a look and see for yourself.

Did you see an obvious trend in the data? You should have. There is a definite, broad peak between hour 02 and hour 14, in every year's data set. Bear in mind that this is GMT, Greenwich Mean Time. Hour 08 thus corresponds to 12:00 midnight, Pacific Standard Time.
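The conversion can be written as a one-line rule: subtract eight hours and wrap around midnight. The sketch below assumes Pacific Standard Time (GMT minus 8) all year, ignoring daylight saving time, and the function name is only illustrative.

```python
def gmt_hour_to_pst(hour_gmt):
    """Convert an hour of day (0-23) from GMT to Pacific Standard Time (GMT - 8)."""
    # The modulo wraps early-morning GMT hours back into the previous PST evening.
    return (hour_gmt - 8) % 24


print(gmt_hour_to_pst(8))   # 0, i.e. midnight PST, matching the text
```

You can apply the same function to the two ends of the peak when you answer the first question below.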

  1. Given that time conversion, what hours of the day, in Pacific Standard Time, does the peak in seismicity cover?

  2. Does this span of time coincide with any natural cycle you can think of? (Think simply.)

  3. If this peak is some sort of "artifact" in the data, what could we do to minimize it? Is it possible that the peak isn't so much a peak, but that the lower part of the graph is more of a trough?

Since the peak you see in the data corresponds with nighttime, it's possible that higher daytime noise levels on seismic recording instruments, resulting from solar radiation, wind, and human activity, might be "drowning out" the signals from smaller earthquakes. Perhaps if we set a cut-off for minimum earthquake magnitude, we could eliminate this bias. This will reduce the size of our sample set, but hopefully not by so much that our results become heavily affected by chance.

We have trimmed the same set of earthquakes we used in the previous plot in such a way that only earthquakes of magnitude 2.0 and greater are considered valid data points. Compare this diagram with the previous one. Then return and answer a few final questions.
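The trimming step itself is simple to express in code. The sketch below is only an illustration of the idea, not the procedure used to make the diagrams: it assumes each earthquake is stored as an (hour, magnitude) pair, and the handful of events shown are placeholders, not real catalog entries.

```python
def hourly_counts(events, min_magnitude=None):
    """Count events per hour, optionally skipping those below min_magnitude."""
    counts = [0] * 24
    for hour, magnitude in events:
        if min_magnitude is not None and magnitude < min_magnitude:
            continue  # too small to be a valid data point under the cut-off
        counts[hour] += 1
    return counts


# Placeholder (hour_gmt, magnitude) pairs; NOT real catalog data.
events = [(3, 1.2), (3, 2.4), (10, 0.9), (10, 1.7), (15, 2.1), (20, 3.0), (5, 1.1)]

all_counts = hourly_counts(events)
trimmed_counts = hourly_counts(events, min_magnitude=2.0)

print(sum(all_counts), sum(trimmed_counts))   # 7 3
```

Notice how quickly the sample shrinks once the cut-off is applied, which is the trade-off described above.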

  1. How did the histogram of earthquakes of magnitude 2.0 and greater compare with the histogram of all earthquakes? (Aside from the fact that one is a bar graph and the other a line graph.) Was the peak at all present in the more refined data set?

  2. Do you think the peak was just an artifact, or could it be that we cut our sample set down so small that it was overwhelmed by chance distribution?

  3. Approach the question a different way: if the data for magnitude 2.0 and greater showed no peak, then what would the data for the set of all earthquakes no larger than magnitude 1.9 look like? These earthquakes are near the limit of detection, even with modern seismic networks. Does it make sense that the process of recording them would be susceptible to even slight interference?
