Waveform onset data set from Meier et al., 2019, JGR
====================================================

Data set file name: onsetWforms_meier19jgr_pub1_0_woJP.h5 
Version:            1.0
Format:             hdf5
Compression:        GZIP
Date:               January 10, 2019
Contact:            mmeier@caltech.edu
Citation:           Meier, M.-A., Ross, Z. E., Ramachandran, A., Balakrishna, A., 
                    Nair, S., Kundzicz, P., et al. (2019). Reliable real-time 
                    seismic signal/noise discrimination with machine learning. 
                    Journal of Geophysical Research: Solid Earth, 124. 
                    https://doi.org/10.1029/2018JB016661 


COMMENTS
--------
This data set contains only the noise, quake and teleseismic records from the
Caltech/USGS Southern California Seismic Network (SCSN). If you use these SCSN
data, please acknowledge:
    
    Analyzed SCSN data; doi: 10.7914/SN/CI; stored at the Southern California
    Earthquake Data Center. doi:10.7909/C3WD3xH1.

We do not have permission to redistribute the NIED data from Japan that were
used in this study. The Japanese data are available from
http://www.kyoshin.bosai.go.jp/ (Aoi, S., Kunugi, T. and Fujiwara, H., 2004.
Strong-motion seismograph network operated by NIED: K-NET and KiK-net. Journal
of Japan Association for Earthquake Engineering, 4(3), pp. 65-74). On request,
we can provide guidance on how to download these data and process them into the
same format.

Note that this data set was compiled in a largely automated fashion and may
therefore contain some errors and misclassifications. Comments and suggestions
are welcome. We did not apply a minimum signal-to-noise ratio (SNR) to the
quake signals. If only clean quake signals are desired, select records using
the log10(snr) values given in the numMeta field, or recompute the SNR from the
waveforms directly. All ground motion values are given in SI units.
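
The SNR screening suggested above could be sketched as follows in Python with
NumPy. The function name clean_mask and the threshold of 10 are illustrative
assumptions and not part of the data set. How the log10(snr) values are
extracted from numMeta depends on the stored array layout, which should be
checked against the file.

```python
import numpy as np


def clean_mask(log10_snr, min_snr=10.0):
    """Return a boolean mask that is True where SNR >= min_snr.

    log10_snr : log10(SNR) values, one per record (as stored in numMeta).
    min_snr   : linear SNR threshold; 10.0 is an arbitrary example value.
    """
    log10_snr = np.asarray(log10_snr, dtype=float)
    # NaN entries compare as False and are therefore excluded.
    return log10_snr >= np.log10(min_snr)


# Toy values only; real values come from the numMeta data set.
mask = clean_mask([0.2, 1.5, 3.0, float("nan")])
```

The resulting mask can be used to index the records of wforms, numMeta and
featureVals along whichever axis counts the records.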


CONTENTS
--------
There are three main HDF5 data groups (see the paper for details):
/quake          local earthquake records
/noise          signals that triggered the OnSite STA/LTA trigger but were not
                associated with local earthquakes in the SCEDC catalog
/tele           teleseismic records that triggered the OnSite STA/LTA trigger
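
As a quick check of the file layout, the three groups can be listed with a
short Python sketch. It assumes the third-party h5py package is installed. The
helper names summarize_groups and summarize_file are hypothetical and were
chosen for this example only.

```python
GROUPS = ("quake", "noise", "tele")  # group names as listed in this README


def summarize_groups(h5file):
    """Map each group name to {data set name: shape}.

    h5file can be an open h5py.File or any nested mapping with the same
    layout. Items without a shape attribute are reported as None.
    """
    summary = {}
    for group in GROUPS:
        summary[group] = {
            name: getattr(item, "shape", None)
            for name, item in h5file[group].items()
        }
    return summary


def summarize_file(path):
    """Open the HDF5 file read-only and summarize its three groups."""
    import h5py  # imported here so summarize_groups has no dependencies

    with h5py.File(path, "r") as f:
        return summarize_groups(f)
```

Calling summarize_file("onsetWforms_meier19jgr_pub1_0_woJP.h5") should return
one entry per group. The reported shapes show how the array dimensions are
ordered in the file.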

Each group contains the following data sets:
wforms          3D array of 6 s long ground velocity waveform time series: three
                components in the order N, E, Z, sampled at 100 sps, from 2 s
                before until 4 s after the impulsive signal onset. Amplitudes
                are in SI units. Waveforms have been gain-corrected,
                zero-centered and filtered with a 2nd-order causal Butterworth
                high-pass filter with a corner frequency of 0.075 Hz.
        
numMeta         numerical meta data: magnitude, hypocentral distance, hypocentral depth, 
                log10(snr), unique recordID, pickIndex, stationLat, stationLon,
                PGA, PGV, PGD, numeric origin time, back-azimuth
            
featureVals     3D array of feature values, computed at 1, 2, 3 and 4 s after
                the impulsive signal onset (first array dimension). See the
                publication for feature definitions.

featureNames    Names of the provided features in each column of featureVals.
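
The wforms and numMeta descriptions above can be turned into small helpers, as
in the following Python sketch with NumPy. It rests on two assumptions that
should be checked against the file: 6 s at 100 sps corresponds to 600 samples
per trace, and the numMeta values are stored in the order listed above. The
field names and function names are hypothetical labels chosen for this example.

```python
import numpy as np

SPS = 100.0        # sampling rate from the wforms description
PRE_ONSET_S = 2.0  # seconds of data before the signal onset

# ASSUMPTION: the stored order matches the numMeta listing in this README.
NUM_META_FIELDS = (
    "magnitude", "hypocentral_distance", "hypocentral_depth", "log10_snr",
    "record_id", "pick_index", "station_lat", "station_lon",
    "pga", "pgv", "pgd", "origin_time", "back_azimuth",
)


def time_axis(n_samples, sps=SPS, pre_onset_s=PRE_ONSET_S):
    """Time of each sample in seconds relative to the signal onset."""
    return np.arange(n_samples) / sps - pre_onset_s


def name_meta_fields(num_meta):
    """Map each field name to a 1D array with one value per record."""
    num_meta = np.asarray(num_meta)
    n_fields = len(NUM_META_FIELDS)
    if num_meta.ndim != 2 or n_fields not in num_meta.shape:
        raise ValueError(f"expected a 2D array with one axis of length {n_fields}")
    # Put the fields on the first axis. If both axes have length 13,
    # the fields are assumed to be on the first axis already.
    if num_meta.shape[0] != n_fields:
        num_meta = num_meta.T
    return dict(zip(NUM_META_FIELDS, num_meta))


t = time_axis(600)  # 600 samples is an assumption; use the real trace length
```

With these assumptions the onset falls at t = 0 s, which is sample index 200.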