Principal Investigator:
Darrell Whitley, Associate Professor
Computer Science Department, Colorado State University, Fort Collins, CO

Graduate Student:
Ron Cogswell
Computer Science Department, Colorado State University, Fort Collins, CO

Collaborating Company:
Landmark Graphics (Barry Fish, Company Representative)
Contents

1 Preface
2 Introduction
  2.1 Problem Definition
  2.2 Potential Pitfalls
4 New Approaches
  4.1 Pixel-Thresholding
  4.2 Linear Extraction Picking
    4.2.1 Multiple-Segments
    4.2.2 Dual-Line Search
5 Results
1 Preface
The following report summarizes our results from September 1, 1996 to September 1, 1997. Our original goals were somewhat different from the actual work carried out in this project. One key element of the original proposal that is still at the core of our current work is problem representation. Part of the original goal was to explore new forms of problem representation for the "First Break Picking" problem both analytically and by experimentation. The work contained here does indeed present a radically different approach to representing this problem.

Another major goal of the original proposal was to use the new representations to explore and evaluate various neural network approaches compared to more traditional heuristic solutions. By heuristic solutions, we mean methods used in geophysics that draw on signal processing and statistics, but which still are basically heuristic in nature. We also proposed to explore the relationship between the neural network models and the heuristic solutions by building an intermediate fuzzy logic model. One of several advantages provided by a fuzzy logic model is that it could also function as a "rapid prototyping" tool for exploring the relationship between the problem representation and solution quality.

In our initial studies of this problem we decided on a rather new and innovative way of representing it: we decided to treat the "First Break Picking" problem as a problem in computer vision. Existing heuristic methods and neural network models attempt to use key features from a single signal (or a small set of 3 to 5 adjacent signals) to decide when the first pulse (i.e., the first break) of a seismic event is present in a recorded signal. On the other hand, when first breaks are picked manually, a user views an image composed of large blocks of signals. There is spatial continuity in the overall seismic survey: sensors are laid out at regular intervals.
There is also a good deal of redundancy in the seismic survey. Thus, first breaks should be picked in a consistent way across the seismic survey so as to exploit this redundancy. The global information contained in the entire survey can be used when picking each first break. When stacked side by side (which is how the data is presented to the human first break picker), the first breaks appear as a relatively smooth "surface" across the set of seismic signals.

Specifically, we started to view the first break picking problem as being similar to finding an "edge" in an image in a computer vision problem. Thus, locating the "edge" representing the set of first breaks is not unlike locating the edge of a continuous and relatively regular surface such as the side of a house or the top of a hill. Noise makes this problem more difficult, but there are methods designed to filter out the noise, particularly when we can make strong assumptions about what the "edge" should look like. The best method we have developed attempts to fit a line along the edge corresponding to the first breaks across a large set of signals when the signals are represented as an image like that used for manual first break picking. We can then use more traditional methods to look for key features near this line which would indicate a first break. This approach led us
Figure 1: A simplified drawing of a seismic survey.

away from neural networks and fuzzy logic models. In effect, this representation simplified the problem to the point that a pattern recognition system was no longer needed. The "edge detection" algorithm (which we will also refer to as the "line extraction" algorithm) also has a built-in evaluation function. Thus, we no longer needed to build a pattern recognition system to do rapid prototyping.

By May of 1997 we had to make a decision as to whether to continue developing and testing the "line extraction" based methods we were working on, or to work on neural networks and fuzzy logic methods. We decided on a compromise. We decided to finish developing a set of "Line Extraction" methods for first break picking, and then to compare our methods to the heuristic and neural network tools for first break picking that are available in the ProMAX package provided to us by Landmark Graphics. Our results suggest that our Line Extraction methods for first break picking are very good compared to the heuristic and neural network tools currently available in ProMAX.

This report describes the first break picking problem as well as various approaches to it. This includes a discussion of manual picking, basic methods in geophysics for first break picking, neural network approaches and, finally, our new methods based on Line Extraction. Most of this report is based on Ron Cogswell's MS thesis.
2 Introduction
Geophysicists perform seismic surveys in order to learn more about the underlying geologic structure of the Earth's crust. A basic model of a simple two-dimensional seismic survey is shown in Figure 1. The basic premise of geophysical surveying is to arrange for a signal to be generated by the source. This signal will penetrate the surface of the Earth and then, upon striking a change of strata or a layer composed of a different material, a portion of the signal is reflected back towards the surface. This primary signal is very important and will be analyzed for the purpose of first break picking. A line of receivers or geophones measures the returning signals at set time intervals. The arrangement and locations of these receivers are set up by the survey team to maximize the effectiveness of the survey. The process of generating and recording is repeated multiple times by moving the shot location and possibly the locations of receivers. A common movement strategy is to move the shot location to a receiver location
Figure 2: A single seismic signal.

further down the line and continue in this manner until the shot has moved from one extreme of the line to the other.

Geophysicists use a number of methods to generate the initial energy or impulse to create the measurable seismic wave. The use of dynamite or other explosive compounds creates significant seismic disturbances, and explosives are often used as survey sources. However, increased concern for the environment and the availability of less destructive and safer methods have led to the use of alternatives such as vibroseis, air guns, and various other methods. The most generally used non-impulsive energy source is vibroseis. The vibroseis system uses a one square meter pad pressed firmly to the ground. The pad is mounted on a truck supported by hydraulic jacks. These jacks allow for a fixed amplitude vibration, using an oscillation of the fluid in the jacks and the weight of the truck to create a controlled seismic disturbance. The oscillation is carefully controlled to produce a continuously varying frequency for a specified period of time. The use of vibroseis is increasing since it is less destructive to property and does not require the drilling of shot holes. The lack of drilling also makes vibroseis more rapid to deploy and survey [9].

The receiver is called a geophone. The modern seismometer converts particle velocity into electrical voltages [10]. The oscillation of a magnet inside a coil is often employed to achieve this electrical output. The problem here is the frequency sensitivity of the geophone. For seismic purposes the highest sensitivity is in the range between 5 and 40 hertz, since the frequency of the seismic signal is generally low [9].

The result of recording the signal, referred to as a seismic trace or just a trace, can be seen in Figure 2. This is a wiggle plot with variable width shading of a signal from one of the data sets used in this project. Variable width shading fills in the positive cycles in the plot.
The next figure, Figure 3, shows a group of signals from the same survey. The seismic survey generates a large amount of data. A receiver recording for six seconds at a sampling rate of four milliseconds has 1,500 samples. Each of these samples is four bytes
in length. A single trace in the survey is six kilobytes. A single shot with 96 receivers is 576 kilobytes. A single data set of only 100 shots would be more than 50 megabytes for a simple two-dimensional survey. Shell Services Company recently processed more than two terabytes of geophysical data in a single day [8]. This enormous amount of data drives the search for more efficient methods of processing.
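The arithmetic above can be checked with a short sketch. The constants are the report's example figures, not parameters of any particular survey:

```python
# Back-of-the-envelope data volume for the 2-D survey example above.
RECORD_SECONDS = 6        # recording time per trace
SAMPLE_INTERVAL_MS = 4    # sampling rate
BYTES_PER_SAMPLE = 4
RECEIVERS_PER_SHOT = 96
SHOTS = 100

samples_per_trace = RECORD_SECONDS * 1000 // SAMPLE_INTERVAL_MS
bytes_per_trace = samples_per_trace * BYTES_PER_SAMPLE
bytes_per_shot = bytes_per_trace * RECEIVERS_PER_SHOT
bytes_per_survey = bytes_per_shot * SHOTS

print(samples_per_trace)   # 1500 samples
print(bytes_per_trace)     # 6000 bytes, i.e. six kilobytes
print(bytes_per_shot)      # 576000 bytes
print(bytes_per_survey)    # 57600000 bytes, over 50 megabytes
```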
into sharp relief. The accentuation of the positive portion of the peak draws the human picker to place first break picks along an almost continuous line of positive peaks. The bias of the display impacts the accuracy of these manual picks. The display method does decrease the accuracy of the picks, however, because it does not take into account possible phase shifts in the signal. This does not invalidate these picks, since they also have the advantage of being consistent and are what the human interpreter expects to see for a first break pick. If an automatic picking program does not come close to the results of manual picking, users will generally not use the tool.

If the same data is given to different geophysicists, the results will in all probability not be the same. Experience and training differ from person to person, and the expectations of each person are not the same. Some scientists will pick early in the trace while others may be more conservative and pick a later sample.
2. "A test on the shape of the signal which checks the sign of the four consecutive differences after the proposed kick. If all the differences have the same sign and are either all increasing or decreasing, the kick is passed [6]." This is a frequency test. The number of samples being checked after the pick attempts to place the break at the leading edge of the signal. If the sampling rate is 4 milliseconds, the frequency must be less than 62.5 hertz in order to pass this test. This will eliminate high frequency noise from triggering a false pick.

3. A test which checks the continuity of the strong signal at the kick. The standard deviation of the four samples located half the expected dominant frequency later in time is calculated. (10 milliseconds is appropriate for shallow refraction arrivals.) The kick is passed if its standard deviation is less than the later one [6]. This test provides for a degree of noise immunity by verifying that the kick found is not an isolated occurrence.

4. A test which predicts the values at the kick from previous values using the technique of linear least squares prediction. Such a test was first used by Wadsworth et al. (1953) to identify seismic events on the basis of a prediction failure. The present technique uses Robinson's (1967) subroutines, and the proposed kick is passed if there is a significant prediction error at the kick and for the two terms after it. Significance is established by making comparisons with the prediction errors within the noise [6]. This linearity requirement is based both in the expectations of the manual picking and the consistency expected in the data.

Three of the four algorithms used by Hatherly are concerned directly with the values present in a single trace. The fourth algorithm or test is the only portion which examines the more global structure of the first break picking problem. The difficulty in this test is that it requires that the earlier picks are already correct.
There exists no previous pick to test against for the first break chosen on the first trace. If this trace is picked in error, then the error propagates and affects all of the subsequent picks. This algorithm has proved practical for some seismic data having a high signal-to-noise ratio.

Algorithm development has continued to use the principles that underlie Hatherly's algorithms. The work by Gelchinsky and Shtivelman applied statistical techniques to first break picking with an emphasis on the spatial correlation not present in Hatherly's work. The algorithm first attempts to destroy a possible spatial correlation present in the noise. This is done by suppressing noise below a specified amplitude. After the noise is reduced, statistical estimation techniques are used to determine first arrivals obtained by summation along straight lines in given directions [4].
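Hatherly's shape test (test 2 above) is simple to state in code. The sketch below is a minimal reading of the published description; the function name and the representation of a trace as a list of amplitudes are our own assumptions:

```python
def passes_shape_test(trace, kick):
    """Check the sign of the four consecutive sample-to-sample
    differences after the proposed kick: all four must have the
    same sign (a monotonically rising or falling signal)."""
    diffs = [trace[kick + i + 1] - trace[kick + i] for i in range(4)]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)

# A clean rising onset passes; an oscillating (high-frequency) one fails.
print(passes_shape_test([0, 0, 1, 2, 4, 7, 9], 1))  # True
print(passes_shape_test([0, 0, 1, 0, 2, 0, 3], 1))  # False
```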
This algorithm will appear again in the testing section for comparison purposes.
3. The power ratio (PR) between a forward and reverse sliding of the five-sample window, given by PR(t) = MPL(t + 2) / MPL(t - 2); and

4. The envelope slope (ES) at the peak value, given by ES(t) = ΔE(t) / Δt [11].

The use of these Hilbert attributes was based on their discriminatory value in the classification of first break peaks, as demonstrated in previous work by Aminzadeh and Chatterjee in 1984 and by Taner in 1988 [11].
The network uses three consecutive traces. The peaks on the traces to the left and right of the current trace are used to give a spatial context to identify whether a peak on the center trace is a first arrival. The use of this spatial information turned out to be extremely important for good performance [11]. Other ANN research has used larger networks to encompass a larger neighborhood to increase the spatial coherence in the picks.

The back propagation network (BPN) used by Veezhinathan used 12 input units, 5 hidden units, and a single output unit to give a FAR or non-FAR result on the peak. The number of hidden units was the result of empirical testing to find the balance between overfitting and underfitting [11]. The neural net could classify multiple peaks or zero peaks in a trace as FAR peaks. In a trace with multiple picks, incorrect picks are false alarms. They evaluate the network according to:

1. the percentage of peaks correctly classified, and
2. the percentage of traces absolutely correctly picked [11].

Allowing multiple peaks to be picked let the network score high with respect to the number of peaks correctly classified, but lower on the number of traces picked without error or false alarm picks. A post-processing step using a linear least squares fit line through the picks and choosing the FAR closest to the line eliminated the false picks. This line approximates the average refraction velocity curve and helps reject traces that deviate from the overall trend. This post-processing step was later removed by adding another input to the network in the form of the distance to a reduced traveltime curve [11].
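The least squares post-processing step can be sketched as follows. This is an illustration only: the pick representation and function names are ours, not Veezhinathan's:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a*x + b through the candidate
    picks; the line approximates the refraction velocity trend."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def resolve_false_alarms(trace_no, far_times, a, b):
    """Among several FAR candidates on one trace, keep the pick whose
    time deviates least from the fitted trend line."""
    predicted = a * trace_no + b
    return min(far_times, key=lambda t: abs(t - predicted))

a, b = fit_line([0, 1, 2, 3], [10, 12, 14, 16])     # trend: t = 2*x + 10
print(resolve_false_alarms(4, [12, 18, 30], a, b))  # 18, closest to trend
```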
The implementation of the network is similar to the work by Veezhinathan et al. There are eight neighboring peaks and troughs within a window rather than five. The number of traces examined remains three, with the classification of the center trace dependent on the data in the right and left traces. The main difference is in the criteria used for classification of the peaks. This network uses:

1. The amplitude of the peak.
2. The RMS amplitude of the quiet, or "noise," window.
3. The RMS amplitude of the high amplitude, or "signal," window.
4. The frequency of the peak.
5. The ratio between the frequencies in the two windows.
6. The shift between the time of the peak and the value predicted for it by cross correlation with the previous trace.
7. The time shift between the peak and the maximum signal-to-noise ratio. Each geophysicist will have a unique picking style; some will pick first breaks very early, and some very late. This attribute allows the network to "learn" the individual preference of the user training it.
8. The trace offset. By using this parameter, we can compensate for some of the gross amplitude and signal-to-noise ratio differences in the data.
9. The map coordinates of the traces and of the source [7].
The first three amplitude-dependent values are similar to the values used by Veezhinathan et al. They are similar to the envelope or window tests used by a number of first break picking algorithms. The frequency information is new; it is not used by Veezhinathan et al. This application also uses the cross correlation with the previous trace in a manner similar to the reduced traveltime curves being used by a number of both heuristic and neural network algorithms. The number of hidden nodes in the network is higher than in the back propagation network used by Veezhinathan et al., but the number of features and inputs is also higher. The training phase adds 6 to 15 hidden nodes. The type of data tends to determine the number of nodes, with dynamite requiring fewer nodes than vibroseis [7].

These approaches all consider the seismic trace in terms relating to the physics that produced the signal. There are some attempts made to look at the human factors, particularly in the neural network algorithms, but there exists very little global consistency. Manual first breaks are chosen based on the display image. The new approaches examined in the next chapter take a more image-based approach to the problem.
4 New Approaches
In manual first break picking, the human is not examining traces as individual elements but rather in the context of the contribution of each trace to a more coherent image of a shot record or multiple shot records. In order to create an algorithm that produces a satisfactory first break solution, the method must work in the domain of image processing as well as seismic data processing. The problems encountered with previous methods can be summarized as follows:

1. Noise sensitivity,
2. Lack of similarity to the manual picking results,
3. A process requiring an inordinate amount of human interaction,
4. Algorithms that do not generalize well across data sets,
5. Global trend information not utilized,
6. Inability to handle special cases such as cycle skips in vibroseis data.
In order to examine the first break picking problem, it was decided early to make the algorithms utilize as much global information as possible. The data is highly structured due to repetition and the similarities present as a result of survey methodology. These methods specifically increase the similarities between data both within a single shot record and across all shot records within a survey. This spatial coherence present in the data should be exploited and respected by an automated first break picking algorithm.

There are common trends in how all people examine the visual display of the data. We naturally see coherent lines running across the traces. People tend to pick along these lines in the data when picking horizons such as the first break. How these lines are perceived depends upon how the seismic data is displayed. The importance of viewing the trace data as images led to approaches based in image processing. We have explored three new methods developed from an image-based approach. The algorithms are:

Pixel-Thresholding: This algorithm uses local pixel windowing, thresholding, and erosion to classify pixels as either in the data region or the noise region of the image.

Multiple-Segments: The Multiple-Segments algorithm exploits the linear characteristics in first break picking by fitting linear segments approximating the first break line.

Dual-Line: The Dual-Line algorithm uses linearity and the constraints of common traveled distance to narrow the search for the first break.
4.1 Pixel-Thresholding
The change to a "whole image" based algorithm moves the search for first breaks from a search space of individual traces to an image representation of the data. This is similar to the approach used by manual picking, since the choices there are made based on the displayed data. This algorithm defines the first break as a "line" through the first pixels in each trace after applying a threshold function over a pixel neighborhood. The inability of this simple algorithm to correctly classify the edges of the regions makes this algorithm pick very early in time. It cannot handle the cycle-skip problem in vibroseis data. The algorithm is able to process clean data with a high signal-to-noise ratio.

The first step in the pixel-thresholding approach classifies each pixel or data value in a seismic trace as either signal or noise. An initial classification image is defined:

    I(x, y) = 1 if a_x(y) > theta, and 0 otherwise,    (3)

where I(x, y) is the image, x is the trace number, y is the sample number in time, a_x(y) is the amplitude of trace x at sample y, and theta is a user-defined threshold on the minimum amplitude to display.

Classification is refined by applying a window operator to each pixel. Window size is defined by four variables: forward, reverse, up, and down. By defining the window in this manner, the location of the value to be classified is not restricted to the center of the matrix. A pixel value is labeled as signal or noise based on its value as well as the values of its neighbors within the window. The result of the application of thresholding is a new image:

    T(x, y) = 1 if V(x, y) > phi, and 0 otherwise,    (4)

where T(x, y) is the threshold image value, V(x, y) is the window value, and phi is a user-specified threshold. The window value V(x, y) is

    V(x, y) = sum_{i = x - r}^{x + f} sum_{j = y - d}^{y + u} I(i, j),    (5)

where y is time, u is the up variable, d is the down variable, f is the forward variable, r is the reverse variable, and I is the image. The user-specified value for phi is chosen based on the characteristics of the data, the size of the window, and the desired results. Surveys with small offset values utilize a higher value, while larger surveys will need a smaller value to handle the attenuation of signal at distant receivers. The larger the value used, the later in time picks will be chosen.
The top graph in Figure 6 shows I(x, y), the data for six shots prior to the application of the local pixel window search. A black pixel shows values greater than the user-specified threshold. The bottom graph in Figure 6 shows T(x, y) for the six example shots. The interior data is classified correctly in each case. The first break, however, is not dependent on the interior but rather on the edges of the classified regions. The algorithm does not have a problem with removing the noisy traces and other isolated pockets of noise present in the survey. Yet the edges of the region are not correct, since shingling, ringing, and other noise present very close to the onset of signal are not removed.

We tried several variations on Pixel-Thresholding. These methods are adequate in some areas and inadequate in others. The algorithm does work quickly and with a minimum of user interaction required. The weakness stems from the limited amount of information used. There is no usage of the global and linear trend information present in the data. The pixel-thresholding algorithm has difficulties dealing with the traces at the edges of a shot record. There is not enough horizontal information present at these locations. But Figure 6 also provided motivation for more complex "edge detection" methods that attempt to treat the set of first break picks across a set of signals as a single unified decision process.
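Equations 3 through 5 translate directly into code. The sketch below assumes the image is stored as a list of traces, each a list of sample amplitudes, and clips the window at the data edges; the names theta and phi stand in for the two user thresholds:

```python
def classify(amps, theta):
    """Initial classification image I (Eq. 3): 1 where the amplitude
    exceeds the display threshold theta, 0 otherwise."""
    return [[1 if a > theta else 0 for a in trace] for trace in amps]

def window_value(I, x, y, f, r, u, d):
    """Window sum V(x, y) (Eq. 5): r traces back to f traces forward,
    d samples down to u samples up, clipped at the image edges."""
    return sum(I[i][j]
               for i in range(max(0, x - r), min(len(I), x + f + 1))
               for j in range(max(0, y - d), min(len(I[0]), y + u + 1)))

def threshold(I, phi, f=1, r=1, u=1, d=1):
    """Refined image T (Eq. 4): a pixel is kept as signal only if
    enough of its windowed neighborhood is also above threshold."""
    return [[1 if window_value(I, x, y, f, r, u, d) > phi else 0
             for y in range(len(I[0]))] for x in range(len(I))]
```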
    g(c_i) = exp( -(c_i - y)^2 / (2 sigma^2) ),    (6)

where y is the intercept point and c_i is the value to be checked to see if it is in the window. The window width is based on the value of sigma, as shown in Figure 7. Only one test is needed for both a positive and a negative offset from the intercept value. The window is terminated either by the edge of the data or by the Gaussian value being less than 0.05. An entire line is
Figure 6: The top figure shows the data prior to search. The bottom figure shows the data after a single pass of pixel thresholding.
evaluated by the energy function

    E = sum_{x = 1}^{t} [ g(y_i) - g(y_p) ],    (7)

where x is the trace number, t is the number of traces intercepted by the line, g is the Gaussian function, y_i is the sample at which the line intercepts the trace, and y_p is the first peak occurring earlier in time than y_i. This dual window system is similar to those used by previously described methods (see Figure 5). The utilization of the Gaussian distribution captures both amplitude and frequency information. The emphasis is placed on the peaks, which have been observed to be a major visual cue for manual picking. The ability to change the width of the Gaussian function provides flexibility and can be changed to accommodate the dominant frequency present in the seismic data.

The line segments are defined by their endpoints. An endpoint is a trace number and the sample within the trace. The trace numbers are the horizontal or "x" component and the sample is the "y" component. The lines can be manipulated by moving the endpoints. There are functions to move either endpoint independent of the other or both endpoints together. Another function calculates the intermediate points on the line using the slope. The movement functions combined with the energy function allow a local search algorithm to be utilized in the manipulation of the lines. The energy function provides a measure of how well the line fits the first breaks in the traces within the segment endpoints. Given this method to evaluate the fitness of a line representing the first breaks across a group of traces, the next task was to discover methods to place the lines. This leads to the linear extraction of these lines using simple search algorithms.
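A minimal sketch of the energy evaluation follows. The exact role of the Gaussian in Equation 7 is under-specified by the surviving text, so the code assumes g is centered on the nearest peak of each trace and evaluated at the line's intercept; the data layout is likewise an assumption:

```python
import math

def g(y, peak, sigma=2.0):
    """Gaussian window (Eq. 6 sketch) centered on a peak sample."""
    return math.exp(-((y - peak) ** 2) / (2.0 * sigma ** 2))

def line_energy(intercepts, peaks, sigma=2.0):
    """Line energy (Eq. 7 sketch). For each trace x, intercepts[x] is
    the sample y_i where the line crosses the trace, and peaks[x] is a
    pair (nearest_peak, earlier_peak). The g(y_i) term rewards a line
    passing close to a peak; the -g(y_p) term penalizes a line with a
    peak just above it (earlier in time)."""
    return sum(g(yi, nearest, sigma) - g(yi, earlier, sigma)
               for yi, (nearest, earlier) in zip(intercepts, peaks))

# A line running along the peaks scores near +1 per trace.
print(line_energy([10, 11], [(10, 4), (11, 5)]))
```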
Figure 8: Maximum Line Energy Search. The colored area shows the search limits.
4.2.1 Multiple-Segments
The procedure for the multiple-segment search is as follows:

1. Place the initial maximum line.
2. Determine suitable break points for the line.
3. Perform the needed segment searches.
4. Move individual breaks to the nearest peak.
The first linear algorithm developed used the multiple-segment idea expressed above. The algorithm begins by fitting the required number (either 1 or 2) of lines to the data. The number of lines needed is based on the location of the shot with respect to the receivers. If the shot occurs at either the extreme left or right location of a shot line, then only a single line is required. Shot locations not at the extreme require a pair of lines, one for the receivers on the right side of the shot and another for the receivers on the left. The procedures for the right and left are identical; the following presents the simple case where the shot occurs at an extreme.

The first step in the multiple-segment algorithm is to locate a best fit for a single line. This first line, or initial maximum line, is the result of a depth first search from an initial placement and a local search in a neighborhood around this initial placement. The initial placement is determined by a simple search at the shot location and the extreme location. This simple search is controlled by a user-specified threshold and terminates when this threshold is exceeded, returning the location at which this occurs. The local search neighborhood is also defined by the user. The search locates the maximum energy single line within the neighborhood of the initial starting point. Figure 8 shows the salient points of the search for this maximum energy line. The line formed by the two different colored halves is the initial line. The search algorithm then searches all of the possible lines from the leftmost edge of the two colored region to the right edge. In this case there are 302 possible
lines to be searched. This can be increased or decreased based on the offset values used to define this region. This same search is used for any line segment defined by its end points and the upper and lower bound at each end point. The line with the highest energy value is returned from this search. The highest energy line will be the line with the most points on the edge between the data region and the noise region, since the energy function returns higher values at these points.

The single line found above matches the gross location of the first break line but does not capture the variations required. In order to accomplish this task, the sections of this line that are not well aligned with the first breaks must be found. The locations of these sections constitute breaking points in the line. The determination of suitable breaking points for the line is done by searching for cases of negative energy values in the single line defined in step 1. These negative energy locations are places that are either not located on a peak of the data or are too deep into the data. A segment length variable is used to keep the search from fracturing the line into individual segments of length one. A segment suitable for searching must exceed the user-defined length, in this case ten traces. The starting and ending negative energy values are used to define a segment. The search uses a look-ahead to ensure individual variations do not interfere with segment definition.

Each time a segment is defined, a local search is performed to maximize the segment energy. There are constraints placed on this local search neighborhood to ensure that the slope of the new segment does not diverge from the segments above and below it by more than two. This is accomplished by limiting the neighborhood in which the search occurs. The occurrence of section endpoints in noisy traces, such as those contaminated by 60-cycle power lines, prompted this constraint. If a higher energy value for the section is found, the section is moved.
This continues until no more target sections can be defined in the line. The final phase of the search simply accommodates the user's picking style and provides a consistent phase choice. This could be modified to pick zero crossings or troughs rather than the current choice of peaks. This method provided adequate results similar to the stabilized power algorithm in ProMAX. Its advantages were based in the linearity requirements and constraints.
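The break-point determination (step 2 of the procedure) can be sketched as a scan for long runs of negative per-trace energy. This simplified version omits the look-ahead described above; the ten-trace minimum length mirrors the user-defined value in the text:

```python
def find_break_segments(energies, min_len=10):
    """Return (start, end) trace index pairs for runs of negative
    energy values long enough to justify re-fitting a segment."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e < 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(energies) - start >= min_len:
        segments.append((start, len(energies) - 1))
    return segments

# Twelve misaligned traces form a segment; a three-trace dip does not.
print(find_break_segments([-1] * 12 + [1] * 5 + [-1] * 3))  # [(0, 11)]
```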
Figure 9: Maximum Line Location. The best single line fit (Maximum Energy Line) is shown in red.

and left sides of the shot. Geophone placement offset values are the same for both the right and left sides. This forces the arrival times to be similar. The differences in times depend upon the differences in the depth of the weathered layer upon which the geophones are placed. The similarity constraint also applies to receivers at the same offset in different shots in the same survey. These constraints are particularly useful for the cycle skip problem in "loopy" vibroseis data. The reason the first break is picked at lower cycles in this type of data is based on the human observation of the linear trend associated with the traveltime of the seismic signal to points at a common offset.

The addition of these constraints allows a simpler search method to be applied. The user defines a tolerance in terms of the amount each value is allowed to differ from the trace at equal offset within the shot, if it exists. These constraints are reinforced by a similarity constraint between the values at similar offsets between shots. These tolerances are strictly a guideline the user provides to the search in order to enforce a degree of coherency. The dual-line algorithm uses a strictly local, energy-based search within the confines of the interception points of a maximum energy line and a first energy line for each trace. The maximum energy line is found by the same initial line algorithm used by the multiple-segment method. The shot location and extreme are searched to find an initial location. The neighborhood of this initial location is searched to find the maximum energy line. This line serves as the lower bound on the search. Figure 9 shows this line. The upper bound on the search, or the first energy line, is found using a sampling technique. A user-defined number of samples are taken.
These samples are also taken in a user-defined area between the shot and an ending trace, which can fall anywhere between the shot and the extreme edge of the survey. The samples are taken at equal intervals between the shot and the chosen end point. Each sample is the first energy location, found by a simple search until the trace value exceeds a user-defined value related to displayable peaks. The simplest and most effective choice has been to use the minimum image display value for peaks.
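The sampling step can be sketched as below: pick equally spaced traces between the shot and the chosen end trace, and on each one record the first sample whose amplitude exceeds the displayable-peak threshold. Names and the list-of-lists trace layout are assumptions for illustration.

```python
def sample_first_energy(traces, shot_idx, end_idx, n_samples, threshold):
    """Sample first-energy locations on equally spaced traces.

    `traces` is a list of traces (each a list of amplitudes),
    `shot_idx`/`end_idx` bound the user-defined sampling area, and
    `threshold` plays the role of the minimum displayable peak value.
    Returns (trace_index, sample_index) pairs for fitting the line."""
    step = max((end_idx - shot_idx) // max(n_samples - 1, 1), 1)
    points = []
    for t in range(shot_idx, end_idx + 1, step):
        for s, amp in enumerate(traces[t]):
            if amp > threshold:  # first sample exceeding the threshold
                points.append((t, s))
                break
    return points
```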
Figure 10: First Energy Line Location. Red line is the maximum energy line and the blue line is the upper bound line.

The upper line is fitted to the sampled points using an average slope rather than regression, because of an empirically observed phenomenon in the data: in all of the sample data there exists a ringing or shingling in the traces closest to the shot. This occurs in both dynamite and vibroseis data. Figure 10 shows both the maximum energy and the first energy lines. Prior to searching the traces between the first energy line and the maximum energy line, the constraint tests are performed to move the lines into proximity with each other across the shot and across the survey. The average is taken for the location of the line at each offset, and if a line is outside the average by more than two standard deviations it is moved to the average location. Once the lines are placed, we have created globally consistent constraints across the entire shot line. The final task is to locate the first break on each trace by simply searching for the maximum energy point between the limiting lines. If no maximum is found between the lines, then the first positive peak below the maximum energy line is chosen for the first break. Figure 11 shows the result for the shot portion displayed in Figures 9 and 10.
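The final per-trace pick described above can be sketched as a bounded search between the two lines, with the stated fallback to the first positive peak below the maximum energy line. This is a sketch under assumptions (sample-index bounds per trace, squared amplitude as "energy"); it is not the report's implementation.

```python
def pick_between_lines(trace, upper, lower):
    """Pick the first break on one trace.

    `upper` is the sample index of the first energy line (upper bound)
    and `lower` that of the maximum energy line (lower bound) on this
    trace.  The pick is the maximum-energy sample between them; if the
    interval is empty, fall back to the first positive peak below the
    maximum energy line."""
    segment = trace[upper:lower + 1]
    if segment:
        # maximum energy point between the limiting lines
        best = max(range(len(segment)), key=lambda i: segment[i] ** 2)
        return upper + best
    # fallback: first positive peak after the maximum energy line
    for i in range(lower + 1, len(trace) - 1):
        if trace[i] > 0 and trace[i - 1] < trace[i] >= trace[i + 1]:
            return i
    return lower
```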
5 Results
Testing of the algorithms developed during this study was performed on two data sets. The first data set demonstrates an impulsive source, such as dynamite. This data is from the Marmousi data set generated at the Institut Francais du Petrole in 1990 and used for a workshop on the practical aspects of seismic data inversion [12]. This data was provided by John Scales of the Colorado School of Mines for a project last year concerning a related topic in geophysics: residual static correction.
Figure 11: First Break Results Line. The red line is the first break result.

The second data set was provided by Barry Fish from Landmark Graphics. This set is representative of a vibroseis data set. It was provided to us because it exhibits the difficulties encountered when working with non-impulsive data sources. It shows both coherent and incoherent noise problems. The shingling present in the data requires that the picking algorithm be able to handle the required cycle skips.
Figure 12: Algorithm results on 2 shots from the Marmousi data. Dual-Line result in red.
In the second shot it can be seen that the pick falls off to a later pick on the extreme right-hand trace. This can be attributed to the attenuation problem resulting from the distance between the shot and the trace. A correction in the picking algorithm's amplitude selection results in a better pick. The algorithm picks the correct first break on over 95 percent of the traces when compared with manual picks. The incorrectly picked traces are these extreme traces. The missed picks are located on the peak following the actual first break. This is part of a compromise between the separation of noise and the picking on the extremes of the data.
                       Dual Line  Neural Net 1  Neural Net 2  Neural Net 4  Stab. Power
Same Sample                29          16            20            15           12
Sample + or - 1            51          27            34            26           21
Same Peak                  78          43            53            44           46
Previous or Next Peak      92          69            77            68           73

Table 1: Percentage of Picks Within Bounds

There is a great deal less jumping from a linear group of early first picks to a later group. There are still a number of spikes in which the picks change a great deal. The network trained on four shots used shots one, three, five, and seven as the training data. This was the longest of the training times required. The network did not converge to a suitable group of weights the first time; the search was terminated when the network had more than 140 units. A second attempt resulted in a network of 98 units, with 71 hidden units, but required 136 minutes to train on the four shots. The results were not better than training the network on only two shots. Table 1 shows comparative results of algorithm performance on the first nine shots (540 traces). The horizontal categories in the table represent ranges in terms of samples and peaks in the seismic signals. The values reported are the percentage of picks within the indicated classification range of the hand-picked data. For instance, row one represents an algorithm pick at the same sample as a manual pick. The successive rows increase the window of acceptance around the hand pick: to plus or minus one sample (Sample + or - 1) in row two, to the same peak in row three, and finally to a three-peak window (Previous or Next Peak) in row four. Figure 19 shows the same information in a graphical format. The dual-line algorithm is the best of the algorithms tested. In row one, the dual-line algorithm's performance is higher than that of any of the other algorithms tested. If we increase the window of acceptance to plus or minus one sample, the dual-line algorithm's performance is almost double that of the other algorithms. A reasonable range of the same peak in row three shows the dual-line algorithm still performing at a level almost twice that of the competition.
Further loosening the restriction to a three-peak window, a fairly wide margin, the performance of the dual-line algorithm is still significantly higher than that of the other algorithms.
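The sample-based rows of Table 1 amount to counting picks within a symmetric window around the hand pick. A minimal sketch of that scoring is below (the peak-based rows would additionally need the trace data to locate peaks); the function name is an assumption.

```python
def accuracy_within(picks, hand_picks, window):
    """Percentage of algorithm picks falling within +/- `window`
    samples of the corresponding hand pick.  window=0 reproduces the
    'Same Sample' row; window=1 the 'Sample + or - 1' row."""
    hits = sum(1 for p, h in zip(picks, hand_picks) if abs(p - h) <= window)
    return 100.0 * hits / len(picks)
```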
Figure 13: Dual-Line Algorithm and Hand-picking Results, Shots 1-3. Dual-Line in red and Hand Picks in blue.
Figure 14: Dual-Line Algorithm and Hand-picking Results, Shots 4-6. Dual-Line in red and Hand Picks in blue.
Figure 15: Dual-Line Algorithm and Stabilized Power Algorithm Results. Dual-Line in red and Stabilized Power in blue.
Figure 16: Dual-Line Algorithm and Neural Net Results, 1-Shot Training. Dual-Line in red and Neural Net in blue.
Figure 17: Dual-Line Algorithm and Neural Net Results, 2-Shot Training. Dual-Line in red and Neural Net in blue.
Figure 18: Dual-Line Algorithm and Neural Net Results, 4-Shot Training. Dual-Line in red and Neural Net in blue.
Figure 19: Comparison of Algorithm Performance.

the cycle skips present in this data. The use of similarity constraints, both within a shot and across all of the shots within a survey, has proved to be a very valuable tool. Local coherency of the picks is enforced by requiring similarity between the first break picks at receivers with equal offset. The same holds for a more global coherence through the use of constraints on the picks at equal offsets between shots. The possibilities for using constraints within the data of a single shot must be explored further. The global similarities between the data in all shots must be utilized with care, in order to maintain coherence but not to the exclusion of changes that are justified by the data. There is still a great deal of work to be done on the algorithms investigated here. Additional testing of the current algorithms on a larger set of seismic data with differing characteristics is needed. It would also be possible to combine the line extraction algorithm with a neural net or other heuristic method to determine the segments which make up the first break line. The proximity of the lines to each other along the edge and the similarity of slopes could be used as guides for development. The search space would no longer be pixel values but rather lines in the display image. A number of lines, such as strictly horizontal and vertical lines, could be eliminated from consideration. The similarity constraints for offsets could be applied. The first task would simply be to identify the line segments that could possibly represent the first break horizon in the data, and then to use a dual-window energy test, such as the one utilized by the dual-line algorithm, to test the fitness of the potential first break line segments.
References
[1] F. Coppens. First arrival picking on common-offset trace collections for automatic estimation of static corrections. Geophysical Prospecting, 33:1212-1231, 1985.

[2] Scott Fahlman and Christian Lebiere. The cascade correlation learning architecture. In D. Touretzky, editor, Advances in Neural Information Processing Systems 2. Morgan Kaufmann, 1990.

[3] Pascal Fua and Yvan G. Leclerc. Model driven edge detection. In Proceedings: Image Understanding Workshop - 1988, pages 1016-1021. DARPA, Morgan Kaufmann, April 1988.

[4] B. Gelchinsky and V. Shtivelman. Automatic picking of first arrivals and parameterization of traveltime curves. Geophysical Prospecting, 31:915-928, 1983.

[5] Landmark Graphics. ProMAX Reference Manual. Advance Geophysical Corp., Englewood, Colorado 80112, 1995.

[6] P. Hatherly. A computer method for determining seismic first arrival times. Geophysics, 47(10):1431-1436, 1982.

[7] T. Kusuma and B. Fish. Towards more robust neural-network first break and horizon pickers. In Proceedings of the Society of Exploration Geophysicists 63rd Annual International Meeting, pages 238-241, 1993.

[8] J. Martin. Personal conversation, 1997.

[9] E. Robinson and C. Coruh. Basic Exploration Geophysics. John Wiley and Sons, New York, New York, 1988.

[10] E. Robinson and S. Treitel. Geophysical Signal Analysis. Prentice Hall, Englewood Cliffs, New Jersey 07632, 1980.

[11] J. Veezhinathan, D. Wagner, and J. Ehlers. First break picking using a neural network. In F. Aminzadeh and M. Simaan, editors, Expert Systems in Exploration. Society of Exploration Geophysicists, 1991.

[12] R. Versteeg. Sensitivity of prestack depth migration to the velocity model. Geophysics, 58(6):873-882, 1993.