PROCESS CLEANLINESS
Using categorized defect learning to optimize photo processes
Eric H. Bokelberg, James L. Goetz, and Michael E. Pariseau, IBM Microelectronics
Although defect detection, analysis, and resolution have always played important roles in semiconductor manufacturing, as feature sizes have shrunk below 0.5 µm, the need to address any anomalies on production wafers, regardless of size, has spurred a new generation of inspection equipment and analysis techniques. Gone are the days when large environmental contaminants and human-generated particles were the primary concerns and small defects were considered inconsequential to device performance. Now, defects as small as 0.12 µm can limit wafer yields, and processes must be engineered for close to zero-defect performance.1,2
In the standard approach to defect identification and analysis, or defect learning, wafers are inspected at various stages of production to identify "events," which are then reviewed to determine if they represent defects that can adversely affect device performance.3 It is not unusual for several different types of equipment, each suited for a particular task, to be used at various levels of a process.4–6 In order to create a common database, some facilities are investing significant resources to network all their defect inspection and review equipment. The database could then provide real-time access to a large volume of process-specific defect information, which should result in accelerated defect learning.7
Photolithographic processes, in particular, can create a variety of different defects that directly affect device patterns.8,9 Typically, efforts for improving the defect performance of such processes are concentrated on reducing the total number of events to an arbitrarily low level.10–12 Reducing defects beyond this level is difficult because of the limited sample sizes for specific defect types. Also, prior-level defects can add noise to the inspection data, thereby decreasing a system's ability to detect low-level defects.
This article presents a defect learning method that uses category data, generated during defect review, to understand the types of defects that occur and to facilitate process optimization. After the term categorized defect learning is introduced and its application to photolithographic processes is described, the method is demonstrated through a series of experiments that investigated a puddle-develop process. The use of categorized defect learning for tool monitoring is also discussed briefly.
Categorized Defect Learning
Categorized defect learning can be defined as process optimization that focuses on specific types or categories of defects identified through automated inspection and manual review. Until recently, reviewing and verifying defect data was a very time-consuming process, which limited the amount of data that could be gathered. Process engineers could quantify the total number of defects and review a representative sampling of the defect types, but it was impossible to accurately quantify the extent of any one defect.
Fortunately, advances in the capabilities of inspection systems have made categorized defect learning a viable engineering process. While the detection of irregularities on semiconductor wafers by such techniques as laser surface scans and digital image-pattern recognition is nothing new, the ability of various systems to use the resulting defect-location data, stored in a common format, has allowed more-rigorous and timely analysis using a diverse array of equipment. Defects can be discovered with one type of tool and then, using the defect-location data, quickly relocated, investigated, and ultimately identified by other tools better suited for that analysis.5
Recognizing defects that occur during a particular process becomes more difficult at each successive level because the inspection equipment also locates prior-level defects. Partitioning (inspecting the same wafers at different stages of processing and subtracting the prior-level defect data) makes the in-line defect data at later process steps much more relevant, but it is difficult to manage logistically in a production setting.7 A better approach is to use monitor wafers processed with recipes identical to those used in production.10 The monitor wafers can be bare silicon or built with film stacks to better represent in-process wafers at a particular production phase. As long as their surface cleanliness is characterized prior to processing, any new defects found on the monitor wafers will have occurred during the current process step.
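Whether it is applied to production wafers or to precounted monitor wafers, the subtraction of prior-level data comes down to matching defect locations between two inspections. The following is a minimal Python sketch of that idea, not the method used by any particular inspection tool. It assumes that both inspections report (x, y) locations in the same wafer coordinate frame; the function name `new_defects`, the matching radius `tol`, and all coordinates are hypothetical.

```python
import math

def new_defects(prior, current, tol=5.0):
    """Return locations in `current` with no prior-level defect within `tol`.

    `prior` and `current` are lists of (x, y) locations in the same units.
    `tol` is a hypothetical matching radius chosen for illustration only.
    """
    added = []
    for cx, cy in current:
        # A current event counts as prior-level if any earlier event lies nearby.
        matched = any(math.hypot(cx - px, cy - py) <= tol for px, py in prior)
        if not matched:
            added.append((cx, cy))
    return added

# Hypothetical precount and post-process inspections of one monitor wafer.
precount = [(100.0, 200.0), (5000.0, -1200.0)]
postcount = [(101.5, 199.0), (5000.0, -1200.0), (-3300.0, 870.0)]
added = new_defects(precount, postcount)
```

In practice the tolerance would have to reflect the stage repeatability of the tools involved; a value that is too small counts prior-level defects as new, and one that is too large hides real adders.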
In contrast to the traditional process optimization method of experimenting with production wafers and relying on electrical tests and in-line inspections for results, the categorized defect learning method can generate large amounts of data in a relatively short period of time. This capability provides manufacturers the option of running a series of short-loop experiments to accelerate process learning.
Application to Photolithographic Processes. The categorized defect learning method is well suited to optimizing photolithographic processes, especially those based on clustered tools (photoclusters). Photoresist patterns can be created on bare silicon monitor wafers using processes identical to those used for production wafers, and the sensitivity levels of automated inspection equipment can be easily tailored for recognizing defects in (or on) these patterns.
A typical categorized defect learning study, such as the lithographic process optimization described later in this article, consists of five simple steps:
- Precount. Bare silicon monitor wafers are first inspected to determine their baseline cleanliness levels. For best results, there should be fewer than 50 particles (≥0.3 µm) on any wafer used.
- Process. The monitor wafers are then fully processed (i.e., coated with resist, exposed, and developed) through a photocluster. The photomask can be an existing product reticle or something specifically designed for defect learning.10 Process conditions should be adjusted as necessary for the experiment, and a process-of-record (POR) cell must always be included to standardize results from day to day.
- Inspect. High-throughput digital image-pattern recognition equipment is used to identify anomalies or unusual events in the resist pattern on the processed wafers, and the location of each event relative to reference sites on the wafer is stored for use in subsequent analysis.
- Review. Following the initial inspection, the location data are downloaded to an inspection station consisting of a microscope, a video camera, and automated positioning stages. An operator then reviews and classifies each wafer's events into predetermined defect categories (see Table I). By using location data to drive the automated stages, hundreds of events can be reviewed in a short period of time, making it possible to generate a statistically significant sample for each defect category. Usually, there are specific defect types that are common to a particular photo process; the relative amounts may change from experiment to experiment, but the defect categories remain the same.
- Analyze. Finally, the defect category data are analyzed using various statistical techniques. This task is not necessarily straightforward: the data must often be grouped in a particular fashion to provide meaningful results. Possible groupings include total events, total unclustered events, combinations of two or more categories, and individual categories. With time and experience, it becomes easier to identify which defect categories should be grouped for analysis.
Defects that do not fit the established categories can be studied further using scanning electron microscopy, energy-dispersive x-ray spectroscopy, or other advanced techniques.13,14 Here again, the ability to use location data from other systems speeds processing.
| Type | Abbreviation | Description |
|---|---|---|
| Residual resist | RR | Resist flakes, webbing between printed lines |
| Foreign material | FM | Small, nonresist particles |
| Handling damage | HD | Any damage to the photoresist pattern (e.g., scratches) |
| Printed defect | PT | Areas of imaged photoresist, often irregular in shape, that are not part of the reticle pattern |
| Short | SH | A printed defect that bridges two resist lines |
| Foreign material short | FS | Bridging between two lines with foreign material visibly present |
| Mound | MD | Raised spherical area of resist, seen in large areas of unexposed photoresist |
| Ball bearing | BB | Aerosol resist particle that has landed on the wafer |
| No defect | ND | Nothing visible during defect review |
Table I: Representative defect categories used for process optimization.
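The groupings mentioned in the Analyze step (total events, combinations of categories, and individual categories) can be tallied directly from the review output. The sketch below is illustrative only: it assumes the review station exports one Table I abbreviation per reviewed event, and the event list, the group names "residue" and "printed", and the function `summarize` are all hypothetical.

```python
from collections import Counter

# Hypothetical review output: one Table I abbreviation per reviewed event.
reviewed = ["FM", "RR", "FM", "ND", "SH", "FM", "MD", "RR", "BB", "FM"]

def summarize(events, groups):
    """Count events per category and per user-defined group of categories."""
    counts = Counter(events)
    summary = {"total": sum(counts.values())}
    for name, members in groups.items():
        # Counter returns 0 for categories that never appeared.
        summary[name] = sum(counts[m] for m in members)
    return counts, summary

# Hypothetical groupings; which categories belong together is an
# engineering judgment that the article says comes with experience.
groups = {"residue": ("FM", "RR"), "printed": ("PT", "SH", "FS")}
counts, summary = summarize(reviewed, groups)
```

Keeping the per-category counts alongside the grouped totals makes it easy to regroup later without repeating the review.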
Precautions. It should be stressed that using the categorized defect learning method on monitor wafers does not eliminate the need for production wafer inspections; some correlation will always be required. Nitride or oxide surfaces may have a greater affinity than bare silicon for photo-process by-products, for example, necessitating different rinse conditions. Furthermore, a defect that appears threatening after a photo process may disappear during postprocessing (UV harden, etch, and so forth), and efforts to eliminate it in the photo process will not affect overall wafer yields.
Also, sorting defects into categories is a subjective task: one person may view a particular defect quite differently than someone else. As more people are involved in a defect review and more subjective opinions come into play, the cumulative defect category data become less accurate. Defect classification responsibilities must therefore be limited to as few people as possible. Advances in automated defect classification algorithms will lessen this dependency over time.15
While the large volume of data that can be generated with the categorized defect learning method can be beneficial, it can also be burdensome. Experimental trends are not always evident on initial review of the data but, as mentioned earlier, experience will help operators to identify defect categories that should be grouped to identify significant trends.
Process Optimization Case Studies
The experiments described below demonstrate the use of categorized defect learning in optimizing single-wafer puddle-develop processes. Two photoresist conditions were investigated: a basic single-layer resist and a resist with an aqueous top antireflective coating (resist + TARC). In both cases, the wafers were developed with a standard single-puddle process in nonsurfacted 0.26-N tetramethylammonium hydroxide (TMAH) developer.16 Deionized (DI) water was used for final rinsing. The process parameters investigated (rinse delay, final rinse spin speed, rinse-nozzle height, number of rinse nozzles, and puddle refresh) were selected based on their potential to reduce defects below POR levels. Attention was focused on the DI-water rinse because one of its primary functions is to eliminate by-products of the develop process.
Rinse Delay. At the end of the dwell phase of a puddle-develop process, the developer meniscus is laden with dissolved resist, azo dye, and other process by-products. When rinsing with DI water first begins, the alkalinity of the puddle solution decreases dramatically, and some compounds can fall out of solution and adhere to the wafer surface. The effect is especially pronounced in the resist + TARC process because the coating dissolves into the developer solution before the exposed photoresist does, saturating the puddle and increasing the likelihood of sedimentation. With a rinse delay, the solution is spun off the wafer before rinsing with DI water begins, and minimal residue is left behind.
Figure 1: The effects of a rinse delay on defect levels of (a) resist + TARC and (b) single-layer resist processes.
In the POR processes studied, wafer spinning and DI-water rinsing began simultaneously. As Figure 1 shows, adding a rinse delay resulted in a marked improvement in the defect performance of the resist + TARC process and a slight improvement in the single-layer resist process. Note that the defect scale for the resist + TARC process represents total events, whereas the scale for the single-layer resist process represents foreign material (FM) defects only. In general, defect counts for resist + TARC tend to be 7–8× higher than those for a single-layer resist developed with the same process.
Figure 2: Wafer spin speeds during the puddle-develop processes' rinse step.
Figure 3: Defect correlation to final rinse spin speed for (a) resist + TARC and (b) single-layer resist processes.
Final Rinse Spin Speed. As seen in Figure 2, wafer spin speed accelerated gradually for the processes studied, with or without a rinse delay. The final rinse spin speed is the maximum speed achieved and held constant during the DI-water rinse. The experimental categorized defect data, shown in Figure 3, revealed that the number of total events in the resist + TARC process decreased steadily when the final spin speed was increased, reaching minimum levels at ~ 2500 rpm. (The data shown represent several experiments; therefore, the defect scale is normalized with respect to the POR operation.) In the single-layer resist process, increasing the spin speed had the opposite effect: defect counts increased at spin speeds of 2000 rpm and higher.
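Because a POR cell is run with every experiment, results from different days can be put on a common scale before they are combined. The sketch below shows one simple way to do this, assuming that normalization means dividing each cell's count by the same-day POR count; the article does not state the exact formula, and the cell labels and counts here are hypothetical.

```python
def normalize_to_por(day_results, por_key="POR"):
    """Express each cell's defect count as a ratio to the same-day POR count."""
    por = day_results[por_key]
    if por <= 0:
        raise ValueError("POR cell must have a positive defect count")
    return {cell: count / por for cell, count in day_results.items()}

# Hypothetical total-event counts from two experiment days.
day1 = {"POR": 120, "2000 rpm": 90, "2500 rpm": 60}
day2 = {"POR": 200, "1500 rpm": 220, "2500 rpm": 104}

norm1 = normalize_to_por(day1)
norm2 = normalize_to_por(day2)
```

On this scale the POR cell is always 1.0, so a value below 1.0 indicates fewer defects than the process of record on that day.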
Figure 4: Evidence of resist scalloping at high rinse speeds, shown at varying magnifications.

Microscopic examination of the mound (MD) defects seen in the single-layer resist at higher spin speeds revealed that the rinse was actually "scalloping" the resist, taking divots out of the top surface of unexposed resist and depositing them elsewhere on the wafer (Figure 4). It is possible that the combination of a high DI-water flow rate and a high spin speed created sufficient lateral force to shear the resist pieces out of the film surface. This phenomenon may not occur in the resist + TARC process because the coating has changed the surface properties of the photoresist and made it pliable enough to dissipate shear forces. Of course, the higher levels of other defects for the resist + TARC process could have masked any increase in MD defects caused by resist scallops at higher spin speeds.
Figure 5: Comparative defect performance at standard and raised rinse-nozzle heights for the resist + TARC and single-layer resist processes.
Rinse-Nozzle Height. In these experiments, the "standard" rinse-nozzle height was as close to the wafer as the dispense nozzles could get without touching the developer meniscus; the "raised" height was the highest nozzle position possible. As the comparative data shown in Figure 5 indicate, moving the rinse nozzles to the raised position increased the number of FM and residual resist (RR) defects on the wafers for both the resist + TARC and single-layer resist processes.
Figure 6: Rinse-nozzle alignments for experiments using (a) two and (b) three nozzles. The dotted lines indicate the flow paths between the nozzles and the impact points on the wafer.
The rinse configuration on the develop module used in the experiments consisted of two nozzles, one oriented vertically and positioned to strike the center of the wafer, and the other mounted to strike the wafer with some lateral velocity slightly beyond midradius (see Figure 6a). When the rinse nozzles were raised, the flow rate of the lateral rinse was reduced to maintain its impact location on the wafer; this change could explain the increase in defects with that height. Also, the DI-water stream flowing from each rinse nozzle at the raised height separated into several smaller streams before reaching the wafer, which probably compromised the rinse paths on the wafer, leaving some areas with less water and resulting in less-effective rinsing.
Number of Rinse Nozzles. It has been reported elsewhere that two rinse nozzles are more effective than one for removing develop process residue.9 The experiments reported here compared three nozzles to two. The two rinse-nozzle configurations are shown in Figure 6 (dotted lines indicate the path of rinse flow above the wafer). In the three-nozzle setup, the additional nozzle was mounted to the side of the wafer and aligned so that its rinse stream struck the wafer midradius.
Figure 7: Comparative defect performance with two- and three-nozzle rinsing. T-test: Equality of means rejected with at least 95% confidence.
As seen in Figure 7, there was a decrease in unclustered defects for the three-rinse setup compared with the two-rinse process. This result was not surprising: more water and better rinse coverage should result in more-effective rinsing.
Although the improvement in defect performance was not dramatic, it suggests that other changes in the rinse elements (specialized nozzle designs or alternate rinse solutions, for example) might further minimize defects.
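A comparison of means like the t-test noted in the Figure 7 caption can be sketched in a few lines of Python. The example below uses Welch's unequal-variance form of the test, which is an assumption; the article does not say which variant was applied. The per-wafer counts are hypothetical and are not the experimental data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical unclustered-defect counts per monitor wafer.
two_nozzle = [42, 38, 45, 40, 44, 39]
three_nozzle = [33, 36, 30, 35, 31, 34]

t, df = welch_t(two_nozzle, three_nozzle)
```

Equality of means is rejected at a chosen confidence level when the absolute value of `t` exceeds the tabulated two-sided critical value for `df` degrees of freedom.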
Puddle Refresh. In many production puddle-develop processes, fresh developer is added to the puddle at a certain point within the dwell period to enhance process resolution. These experiments compared the POR develop operation, a single-puddle (no refresh) procedure, with three different methods of puddle refresh: a standard refresh, where the original meniscus of developer is displaced by dispensing additional developer onto the wafer; a spin-off refresh, where spinning the wafer removes the original meniscus before fresh developer is added; and a rinse-off refresh, where the original meniscus is rinsed off of the wafer with DI water before the developer is replaced. With all three methods, the refresh was initiated halfway through the dwell period.
Figure 8: Comparative defect levels for four puddle-refresh methods used with resist + TARC and single-layer resist processes.
As Figure 8 indicates, with the standard and spin-off refresh methods, the number of defects for both the resist + TARC and single-layer resist processes actually increased. The agitation of developer solution that occurs during these procedures may cause some semideveloped residue to adhere permanently to the resist or wafer surface. The rinse-off refresh method (essentially two complete dispense, dwell, and rinse cycles) was equivalent in defect performance to the single-puddle process without developer refresh.
Although the data are not shown, the point at which developer refresh occurs was also investigated, with times ranging from 7 to 90% of the dwell period. No correlations were found; the standard refresh generated more defects than the POR, regardless of when it was initiated.
In sum, the results of the process optimization experiments highlighted some critical factors to consider when attempting to reduce defect levels. It was found that spinning the developer meniscus off the wafer before rinsing begins is cleaner than rinsing onto the meniscus, especially with multilayer resist processes. In addition, the rinse spin speed should be optimized for each specific process, rinse nozzles should be configured for consistent coverage across the wafer, and more nozzles can lead to better defect results. Finally, when a puddle refresh is used, it is important to understand how the refresh method can affect defect performance. Fewer defects will result if the developer puddle is rinsed off the wafer with DI water before being refreshed than if displacement or spin-off techniques are used.
Monitoring Defect Categories for Tool Control
Although total-event data from automated inspection equipment are often used with statistical process control (SPC) for process or tool monitoring, this approach is usually not very effective because SPC that is based only on total events is subject to extensive variability.3,6,17 However, if tool-specific SPC charts can be established for each defect category, the data gathered with a daily monitor wafer can accurately reflect current process conditions on a given tool, thereby ensuring that proper responses are initiated if specified defect levels are reached. Because monitor wafers are used rather than production wafers, only tool-induced defects will be found and there is no risk of prior-level events generating false alarms.
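A per-category control chart of this kind can be sketched as follows. The example assumes a c-chart (a standard SPC chart for counts, with three-sigma limits derived from the baseline mean); the article does not specify a chart type, and the baseline counts, today's counts, and function names are hypothetical.

```python
import math

def c_chart_limits(baseline_counts):
    """Return (LCL, center line, UCL) of a c-chart built from baseline counts."""
    c_bar = sum(baseline_counts) / len(baseline_counts)
    sigma = math.sqrt(c_bar)  # for Poisson-distributed counts, variance = mean
    return max(0.0, c_bar - 3 * sigma), c_bar, c_bar + 3 * sigma

def out_of_control(count, limits):
    """True if a daily count falls outside the control limits."""
    lcl, _, ucl = limits
    return count < lcl or count > ucl

# Hypothetical daily monitor-wafer counts for two Table I categories on one tool.
baseline = {"FM": [4, 6, 5, 3, 7, 5], "RR": [1, 0, 2, 1, 1, 1]}
limits = {cat: c_chart_limits(vals) for cat, vals in baseline.items()}

today = {"FM": 14, "RR": 2}
alarms = [cat for cat, n in today.items() if out_of_control(n, limits[cat])]
```

Charting each category separately means that an excursion in one defect type is not diluted by the normal variation of the others, which is the weakness of total-event SPC described above.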
Some modifications to the categorized defect learning method are required to use it effectively as a tool control. In contrast to process-optimizing experiments, where inspection equipment sensitivities are tuned to find every event on a wafer, for tool control applications, the sensitivities must be detuned to a level where only gross excursions are identified. This detuning will shorten inspection turnaround time, reduce the frequency of excursions, and increase the urgency of a response when an SPC chart indicates an out-of-control process. The defect categories must also be grouped and simplified to compensate for the increased number of people who will be classifying events on production monitors. Fewer categories will leave less room for errors caused by individual interpretation.
The feasibility of monitoring photocluster process performance with defect category data is under review at the IBM Microelectronics semiconductor production facility in Essex Junction, VT. Initial efforts are concentrated on establishing an optimal photomask pattern for defect identification, characterizing the sensitivity and process window of the defect monitor, establishing the defect-level baselines of a few specific tools, and training manufacturing personnel. Based on the success of using categorized defects for process optimization, the use of a similar approach for tool control looks very promising.
Conclusion
The categorized defect learning method is a valuable tool for understanding and optimizing the defect performance of a particular photolithographic process. After silicon monitor wafers are fully processed and evaluated using pattern recognition methods, anomalies in the resist pattern are highlighted, reviewed, and categorized. By providing a statistically significant sample of categorized defects, the method enables process engineers to design short-loop experiments that can provide a wealth of information. A small segment of the process can be quickly dissected and analyzed to find its specific effects on defect levels. With minor modification, the method can also be used for process or tool control monitoring, generating real-time data that accurately reflect current defect performance.
Acknowledgments
This article is a revised version of a paper originally presented at the Olin Microlithography Seminar, Interface '96, San Diego, October 1996. The authors wish to thank Dana Basiliere for his skilled and service-oriented support of the inspection tool programming, and Joe Mundenar for his motivational and managerial assistance when needed.
References
1. Kamoshida M, Inni H, Ohta T, and Kasama K, "Sizes and Numbers of Particles Being Capable of Causing Pattern Defects in Semiconductor Device Manufacturing," IEICE Transactions on Electronics, E79-3(3):264–271, 1996.
2. Childs KD, Paul DF, and Clough SP, "Sub-0.25-Micron Defect Analysis on 200-mm Semiconductor Wafers," Proceedings of SPIE: The International Society for Optical Engineering, 2725:255–260, 1996.
3. Tobin KW, Gleason SS, Karnowski TP, et al., "An Image Paradigm for Semiconductor Defect Data Reduction," Proceedings of SPIE: The International Society for Optical Engineering, 2725:194–205, 1996.
4. Martin H, and Bichebois P, "Defect Detection Methodology on the Back-End Process: A Case Study," Proceedings of SPIE: The International Society for Optical Engineering, 2725:233–241, 1996.
5. Bichebois P, Perret P, Martin H, et al., "Review and Characterization of Defects after Automatic Optical Inspection on Patterned Wafers," Proceedings of SPIE: The International Society for Optical Engineering, 2725:261–272, 1996.
6. Schwart J, "Process and Machine Mastering Employing WF-710 Wafer Inspection System," Proceedings of SPIE: The International Society for Optical Engineering, 2725:242–254, 1996.
7. Singh H, Lakhani F, Proctor P, and Zazakoff A, "Defect Data Management System at Sematech," Solid State Technology, 38(12):75–80, 1995.
8. Pratt LD, "Photoresist Aerosol Particle Formation During Spin-Coating," Proceedings of SPIE: The International Society for Optical Engineering, 1262:170–179, 1990.
9. Mirth G, "Reduction of Post-Develop Residue Using Optimal Developer Chemistry and Develop Rinse Processes," Proceedings of SPIE: The International Society for Optical Engineering, 2635:268–275, 1995.
10. Alvis JR, and Satterfield MJ, "Using Optical Pattern Filtering Defect Inspection Tools and Process Induced Defects per Wafer Pass for Process Defect Control," Proceedings of SPIE: The International Society for Optical Engineering, 2725:217–232, 1996.
11. Costigan JG, and Wolf TM, "Defect Reduction in the Resist Apply Area," Proceedings of SPIE: The International Society for Optical Engineering, 1802:81–94, 1992.
12. Nestle GJ, "Reducing Photolithography Defects to Meet Submicron Device Requirements," Microcontamination, 11(2):35–38, 1993.
13. Brundle D, Uritsky Y, and Pan JT, "Extending Particle Characterization Limits in Wafer Processing," MICRO, 13(7):43–56, 1995.
14. Sullivan N, and Arsenault S, "SEM/EDS Analysis Method for Bare Silicon Particle Monitor Wafers," in Proceedings of the IEEE/SEMI 1994 Advanced Semiconductor Manufacturing Conference and Workshop, pp 293–296, 1994.
15. Bennett MH, Tobin KW, and Gleason SS, "Overview of Automatic Defect Classification," in Proceedings of the IEEE/SEMI 1994 Advanced Semiconductor Manufacturing Conference and Workshop, p 272, 1994.
16. Bokelberg E, and Goetz J, "Manufacturing Requirements for a Single-Wafer Develop Process," Proceedings of SPIE: The International Society for Optical Engineering, 2438:737–746, 1995.
17. Henis NB, "Yield Enhancement through Monitoring of Real-Time Manufacturing Processes," Proceedings of SPIE: The International Society for Optical Engineering, 2334:91–101, 1994.
Eric H. Bokelberg is a staff engineer and team leader for photolithographic process and equipment engineering at IBM Microelectronics' semiconductor production facility in Essex Junction, VT. Since joining IBM in 1991, he has been involved in photolithographic process optimization and photocluster productivity enhancement. Bokelberg received his BS and MS degrees in mechanical engineering from Penn State University. (Bokelberg can be reached at 802/769-8964.)
James L. Goetz is a technical lab specialist and member of the photolithographic process and equipment engineering team at IBM Microelectronics' Essex Junction facility. Since joining IBM as a technician in photolithography inspection in 1977, he has held assignments in defect inspection, industrial engineering, and process engineering. His current responsibilities include single-wafer develop process optimization and support. Goetz has an associates degree in applied science from Champlain College, Burlington, VT.
Michael E. Pariseau is also a technical lab specialist and member of the photolithographic process and equipment engineering team at Essex Junction, with responsibility for develop process optimization and defect reduction. He joined IBM Microelectronics in 1979 and has held positions in photomask fabrication, drafting services, equipment maintenance, and process engineering. Pariseau earned an AS degree in electrical engineering from Vermont Technical College in 1987.

Questions/comments about MICRO Magazine? E-mail us at cheynman@gmail.com.
© 2007 Tom Cheyney
All rights reserved.