Short Course 2: Record Linkage

Instructor: Bill Winkler (US Bureau of Census)


Record linkage is used in creating a frame, removing duplicates from files, or combining files so that relationships on two or more data elements from separate files can be studied. Automated record linkage can yield substantial cost and time savings. Rather than develop a special survey to collect data for policy decisions, it might be more appropriate to match data from administrative data sources. For econometric modeling, an economist might wish to link a list of companies and the energy resources they consume with a comparable list of companies and the types, quantities, and dollar amounts of the goods they produce.

This course will cover the reasons for record linkage, how to preprocess files, means of estimating ‘optimal parameters’ and error rates, and some of the limitations of the methods. It describes computer matching techniques based on formal mathematical models that can be tested with statistical and other accepted methods. Methods of adjusting analyses for matching error in merged databases will also be described. Classroom notes, computer science and mathematical preliminaries, and a bibliography will be provided.

Maximum course size: 40.

Language: The course will be in English.

