Project Format

This project relies on multiple regression analysis to analyze a data set that is of interest to you. The final report should be a 4-7 page paper that describes the questions of interest, how you used your data set to analyze these questions (with details on the steps in your analysis), your findings, and the limitations of your study. Specifically, your report should contain the following:

1. Introduction. The introduction succinctly states the problem you are interested in, briefly describes your data and the method of analysis, and summarizes your main conclusions. It should convey what you set out to learn and what you ended up finding; in short, it should summarize the entire report.

2. Data Description. This section provides the details of the data sources, any transformations you have done to the data (for example, changing the units of some variables), gives a table of summary statistics (means and standard deviations) of the variables, and provides scatterplots and/or other relevant plots of the data. If there are outliers other than those arising from corrected typographical or computer errors, this is the place to point them out.

3. Regression Analysis. Describe how you used multiple regression to analyze the data set. Specifically, discuss how you carried out the steps of analysis covered in class: exploring the data to find a reasonable initial model, checking that model, and revising it based on those checks.

4. Empirical Results. This section provides the main empirical results in the paper. Conventionally, regression results are presented in tabular form, with footnotes clearly explaining the entries. The initial table of results should present the main results; sensitivity analysis using alternative specifications can be presented in additional columns in that table or in subsequent tables. The text should provide a careful discussion of the results, including assessments both of statistical significance and of economic significance, that is, the magnitude of the estimated relations in a real-world sense.

5. Summary and Discussion. This section summarizes your main empirical findings and discusses their implications for the original question of interest. Describe any limitations of your study and how they might be overcome in future research, and provide brief conclusions about the results of your study.
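
The table of summary statistics requested in the Data Description section (means and standard deviations of each variable) can be drafted with a few lines of code. The sketch below is only an illustration: the variable names and values are hypothetical placeholders, not data from any real source, and the helper functions summary_table and format_table are introduced here for convenience.

```python
import statistics

# Hypothetical data: each key is a variable name, each list holds its observations.
data = {
    "income": [32.0, 45.5, 51.2, 38.9, 60.1],
    "education": [12.0, 16.0, 18.0, 14.0, 20.0],
}


def summary_table(columns):
    """Return one row per variable: (name, n, mean, sample standard deviation)."""
    rows = []
    for name, values in columns.items():
        rows.append(
            (name, len(values), statistics.mean(values), statistics.stdev(values))
        )
    return rows


def format_table(rows):
    """Render the rows as aligned plain text suitable for a report draft."""
    lines = [f"{'Variable':<12}{'N':>4}{'Mean':>10}{'Std. Dev.':>12}"]
    for name, n, mean, sd in rows:
        lines.append(f"{name:<12}{n:>4}{mean:>10.2f}{sd:>12.2f}")
    return "\n".join(lines)


print(format_table(summary_table(data)))
```

Replace the placeholder dictionary with your own variables; the same two functions then produce a table you can paste into the report and annotate with units.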

Project Checkpoint 1: Topic Selection

The model and the data are the starting points of an econometric project. The first step in formulating a model is to select a topic of interest and to consider the model’s scope and purpose. In particular, thought should be given to the objectives of the study, what boundaries to place on the topic, what hypotheses might be tested, what variables might be predicted, and what policies might be evaluated. Close attention must also be paid to the availability of adequate data. In particular, the model must involve causal relations among measurable variables.

The topic selected can be economic or noneconomic. It could be a particular market (the market for UIUC graduates, the market for economists, the market for cellular phones), a process (economic development, inflation, unemployment), demographic phenomena (birth rates, death rates), environmental phenomena (water quality, air quality), political phenomena (elections, voting behavior of legislatures), some combination of these, or some other topic.

You are free to choose your own topic, subject to approval from your instructor. Some example paper titles are listed below:

Air Pollution and Population

Differential Growth in U.S. Cities

Birth Rates, Death Rates, and Economic Growth in Developing Economies

Economic and Social Determinants of Infant Mortality in the United States

The Relationship between Exports and Growth in Less Developed Countries

Remember that the titles above are merely examples of reasonable topics. You should be original and follow your own interests. Perhaps the best choice of topic is one in which you have prior experience or knowledge.

Keep in mind that this project studies the impact of some independent variable (or variables) X on a dependent variable Y. Since in most cases many variables X influence Y, it is important to include all of those variables on the right-hand side of the equation. To ensure that the model is both interesting and manageable, it should contain at least three to four independent variables on the right-hand side; in other words, we want a multiple regression model. The model should be formulated as an algebraic, linear, stochastic equation along with a corresponding verbal statement of the meaning of the equation. The expected signs of all the coefficients should be considered. All relevant multipliers, short-run and long-run, should be identified and considered.

Project Checkpoint 2: Hypothesis and Research Question

Particular thought should be given to the objectives of the study, what boundaries to place on the topic, what hypotheses might be tested, what variables might be predicted, and what policies might be evaluated.

Once you have a general understanding of your topic, narrow it down into a manageable research question or hypothesis. This will help you define the parameters of your research, as well as your argument. A research hypothesis is a statement of expectation or prediction that will be tested by research.

Hypotheses look very much like “mini-arguments”; the objective of the research paper is to present evidence that supports or refutes those hypotheses.

Before formulating your research hypothesis, read about the topic of interest to you. From your reading, which may include articles, books and/or cases, you should gain sufficient information about your topic that will enable you to narrow or limit it and express it as a research question. The research question flows from the topic that you are considering. The research question, when stated as one sentence, is your Research Hypothesis.

In your hypothesis, you are predicting the relationship between variables. Through the disciplinary insights gained in the research process throughout the year, you “prove” your hypothesis. This is a process of discovery that leads to greater understanding and to conclusions.

Project Checkpoint 3: Identify Variables for the Study

Keep in mind that this project studies the impact of some independent variable X on a dependent variable Y. Since many variables X have an influence on Y, it is important to include all of those variables on the right-hand side of the equation.

To ensure that the model is both interesting and manageable, it should contain at least three to four independent variables on the right hand side. The model should be formulated as an algebraic, linear, stochastic equation along with a corresponding verbal statement of the meaning of the equation. The expected signs of all the coefficients should be considered. All relevant multipliers, short-run and long-run, should be identified and considered.
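
As an illustration of what "an algebraic, linear, stochastic equation" can look like, the block below writes out a generic model with three regressors, followed by a dynamic variant that makes the short-run and long-run multipliers explicit. The symbols (the betas, lambda, and the error term epsilon) are notation assumed here for illustration; the handout does not prescribe a particular specification, and the dynamic variant applies only if you work with time-series data.

```latex
% Static multiple regression with three regressors; \varepsilon_i is the stochastic error term
Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i

% Dynamic variant with a lagged dependent variable (time-series data only)
Y_t = \beta_0 + \beta_1 X_t + \lambda Y_{t-1} + \varepsilon_t, \qquad |\lambda| < 1

% Short-run (impact) multiplier and long-run multiplier of X on Y
\frac{\partial Y_t}{\partial X_t} = \beta_1, \qquad
\text{long-run multiplier} = \frac{\beta_1}{1 - \lambda}
```

The verbal statement that accompanies the static equation would read along the lines of: "holding the other regressors fixed, a one-unit increase in X1 is associated with a change of beta_1 units in Y." Stating the expected sign of each beta before estimating gives you something concrete to compare the results against.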

Project Checkpoint 4: Data Sets

Before finding a data set, you must be aware of what data will help you to answer the question you are investigating. It helps to understand how you intend to perform your analysis. What unit of observation would be most useful (for example, local government data or international data)?

In order for you to choose the right data set, you must be clear about what variables you are using before you search for your data set. You should already know what you are using for your dependent variable and what variables will help you answer the research question most effectively.

The UIUC library would be a good place to start in your search for data. In addition to the material resources available there, you can also seek assistance from the data librarian, who will point you in the right direction.

Here are some ideas for data sources that are available for public use:

Statistical Abstract of the US

Statistical Handbooks

Statistical Yearbooks

Federal Reserve Economic Data

International Economic Conditions

National Statistical Abstract

Center for Research in Securities Prices

Project Checkpoint 5: Regression Analysis

The first step in your empirical analysis is getting familiar with the data. Plot the data, using histograms and/or scatterplots. Are there big outliers, and if so, are those observations accurately recorded, or are they typographical or data manipulation errors? Be very careful when you input your data, as any errors may completely throw off your analysis. Once you feel that the data are error-free, you can start looking at specific relationships. Are the units of the data the ones you expected, and are they the ones you want to use? Do the relations you see in the scatterplots make sense? Do relationships look linear, or do they look nonlinear?
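
Alongside the plots, a quick numerical screen can help locate entries worth double-checking against the original source. The sketch below flags observations that lie more than a chosen number of sample standard deviations from the mean. The function name flag_outliers, the threshold of 2.0, and the sample values are all assumptions made for illustration; this is a rough screen, not a substitute for looking at the histograms and scatterplots.

```python
import statistics


def flag_outliers(values, threshold=2.0):
    """Return indices of observations whose standardized distance
    from the mean exceeds the threshold in absolute value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # All observations are identical, so nothing can stand out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > threshold]


# Hypothetical series with one suspicious entry (120 may be a typo for 12).
observations = [10, 12, 11, 13, 12, 11, 10, 120]
for i in flag_outliers(observations):
    print(f"Check observation {i}: value {observations[i]}")
```

A flagged value is only a prompt to go back to the source. If the entry turns out to be a typographical error, correct it; if it is accurately recorded, keep it and mention it in the Data Description section. Note that in small samples a single extreme value inflates the standard deviation, so a lower threshold may be needed.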

Once you are acquainted with your data set, you can begin your regression analysis. This is the point at which all the previous work you have done preparing your study begins to pay off.
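
A minimal sketch of the estimation step is shown below, assuming NumPy is available. It fits a linear model with an intercept by ordinary least squares and returns, for each coefficient, the estimate, its conventional standard error (valid under homoskedastic errors), and the t statistic. The function name ols, the variable names x1 to x3, and the simulated data are hypothetical and serve only to make the example runnable; in practice you would use your own data and most likely the statistical package used in class.

```python
import numpy as np


def ols(y, X, names):
    """Fit y = b0 + X b + e by least squares.

    Returns a list of (name, coefficient, standard error, t statistic),
    with the intercept reported first under the name 'const'.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend the intercept column
    k = design.shape[1]
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    sigma2 = residuals @ residuals / (n - k)  # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))
    labels = ["const"] + list(names)
    return [(name, b, s, b / s) for name, b, s in zip(labels, beta, se)]


# Simulated data for illustration only: three regressors with known coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

print(f"{'Variable':<10}{'Coef.':>10}{'Std. Err.':>12}{'t':>10}")
for name, b, s, t in ols(y, X, ["x1", "x2", "x3"]):
    print(f"{name:<10}{b:>10.3f}{s:>12.3f}{t:>10.2f}")
```

The printed table follows the tabular layout asked for in the Empirical Results section. When you discuss it, compare the sign of each coefficient with the sign you expected, and comment on the size of the estimated effect in real-world units as well as on its statistical significance.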

