Sunday, February 27, 2011

Gray income and national income distribution

Wang Xiaolu
Introduction
In 2005-2006 we surveyed the income and consumption of urban residents in dozens of cities of different sizes across the country. On this basis a research report was completed in 2007 (National Economic Research Institute, with China Reform Foundation funding). According to that report's projection of urban residents' disposable income in 2005, about 4.8 trillion yuan of hidden income was not reflected in the income statistics, and it was concentrated mainly in the high-income class. Once this hidden income is counted, the income gap between the highest-income and lowest-income 10% of urban households widens from 9 times to 31 times, and for the country as a whole the gap between the highest-income and lowest-income 10% of households widens from 21 times to 55 times. This means that the imbalance in China's national income distribution is more serious than was generally understood.
The report also used data on household car ownership, commercial housing sales, the number of private trips abroad, and the distribution of bank deposits to estimate residents' income (mainly the income of high-income residents) from different perspectives, and used these estimates to cross-validate the projected results. It also gave a preliminary analysis of the sources of gray income, concluding that gray income arises mainly from institutional shortcomings and that structural reform is urgently needed to address the imbalance in the distribution of national income.
Are the projection method and its results credible? Has the national income distribution changed in the years since? To answer these questions, in 2009 we organized a new survey of urban household income and expenditure (collecting data for 2008) and carried out new research and analysis on the survey data. This report describes the scope of the survey, the methods of investigation and analysis, and the results, and on this basis further analyzes the imbalance in the distribution of national income and the response it requires.
The first part of this report describes our survey method for urban residents' income and the sample distribution. The second part covers the analysis methods and reports the results of the technical analysis. The third part uses the survey data and the technical analysis to project the true disposable income of urban residents and the true level of income inequality. The fourth part analyzes the sources of gray income further. The fifth part examines the effect of gray income on the distribution of national income. The sixth part is the conclusion.
First, survey methods for urban residents' income and sample distribution
1. How to obtain real survey data
The author believes that the current statistical survey data on income are significantly distorted. The distortion is especially serious for the disposable income of high-income urban residents. This does not mean that the survey methodology or the calculations are wrong. The sample for the current urban and rural household surveys is drawn by statistical random sampling, and this method in itself carries no systematic error. But the following must be noted:
1. The existing household survey sample is formed on a voluntary basis. A large proportion of high-income residents are unwilling to take part, so sample households have to be replaced during sampling, and high-income residents drop out of the sample in the course of this replacement.
2. Among the high-income residents who are included in the survey sample, many are reluctant to report their real income. In the income they report, wage income is relatively accurate, while other income may be understated; in particular, some of them have large amounts of income of unclear origin that they do not want exposed.
This problem is difficult to resolve thoroughly within the existing statistical household sample and the existing survey methods, and other approaches need to be explored. For this reason, the primary purpose of our study is to obtain data on real income. In the 2005-2006 survey we drew on sociological survey methods: professional investigators in various places surveyed the income and expenditure of families they knew well, such as relatives, friends, colleagues, and neighbors. We also took a number of supplementary measures to ensure the authenticity of the survey data. The facts show that the method is feasible and that the resulting data are relatively credible.
In the 2009 survey we used the same survey method but adopted more stringent quality control measures and enlarged the sample. It must be said that because this method differs from random sampling, we cannot use the survey sample data directly to calculate the overall income distribution of urban residents in China and must rely on other projection methods. The projection method is described in detail in the second part of this report. The main purpose of this part is to describe the survey methodology and the sample distribution.
Before the survey, we trained the investigators in each location on the questionnaire and the survey method. To remove respondents' possible concerns, the questionnaire was anonymous, and before the survey began respondents were given a commitment that the data would be used only for research and that sample data would be kept confidential. The survey method also included a number of measures to reduce the sensitivity of the investigation and make it easier to obtain real data. For example, the stated objective of the survey emphasized the study of consumption structure rather than income levels; in the questionnaire design, consumption questions came before income questions, and both consumption and income were asked by specific sub-item. On income sources, the questionnaire asked only for a simple classification of income (wage income, part-time and labor income, income from operating a business, financial investment income, property income, intellectual property income, the various types of transfer income, and other income not included above) and did not ask for the specific source. At the end of the visit, investigators were asked to record their relationship to the respondent and their own estimate of the credibility of the survey results (including the likely direction and extent of any deviation) as reference information for the questionnaire.
After the survey was completed, we carried out a comprehensive quality inspection. In addition to checks of the completeness and accuracy of information at the survey sites, we designed a set of inspection procedures to check whether the logical relationships among the questions and the quantitative relationships between income and consumption data were reasonable. Questionnaires that did not meet the quality requirements (including those with missing information, alterations, anomalous data, logical contradictions between items where the correct information could not be identified, or respondents who were not urban residents within the scope of the survey) and questionnaires whose accuracy was questionable were removed.
2. Survey sample
The survey sample is distributed across 64 cities of different sizes and 14 county seats and towns in 19 provinces (including municipalities).
The provinces (including municipalities) are Beijing, Shanghai, Shandong, Jiangsu, Zhejiang, Guangdong, Shanxi, Henan, Hubei, Anhui, Jiangxi, Liaoning, Heilongjiang, Sichuan, Chongqing, Yunnan, Shaanxi, Gansu, and Qinghai. This ensures that the eastern, central, western, and northeastern regions each have a certain number of samples and that both north and south are covered.
The cities, which include municipalities and provincial capitals, are Beijing, Shanghai, Jinan, Nanjing, Hangzhou, Guangzhou, Taiyuan, Zhengzhou, Wuhan, Hefei, Nanchang, Shenyang, Harbin, Chengdu, Chongqing, Kunming, Xi'an, Lanzhou, Xining, Shenzhen, Qingdao, Suzhou, Datong, Anshan, Fushun, Qiqihar, Daqing, Xuzhou, Yangzhou, Fuyang, Wuhu, Lu'an, Rizhao, Xiangfan, Yichang, Dongguan, Zhongshan, Mianyang, Xinzhou, Kaifeng, Sanmenxia, Zhumadian, Xiaogan, Yidu, Pizhou, Fuyang, Jinhua, Shaoxing, Shaoguan, Chaohu, Chuzhou, Ganzhou, Ji'an, Jingdezhen, Jiujiang, Dandong, Tieling, Mudanjiang, Xichang, Xianyang, Baiyin, Jiayuguan, Tianshui, and Yuxi.
The counties are Fanshi County in Shanxi Province; Pei County in Jiangsu Province; Xiangshan County in Zhejiang Province; Pingyuan County and Qihe County in Shandong Province; Hua County in Henan Province; Dawu County in Hubei Province; Kaixian and Zhong County in Chongqing; Liquan County in Xianyang, Shaanxi Province; Gaolan County in Gansu Province; and Minhe Hui and Tu Autonomous County in Qinghai Province. The geographic distribution of the counties is also fairly balanced.
A relatively large number of cities were selected for this survey, with the sample in each city kept small and dispersed, for two reasons. First, if the sample in a single city were too large, there would be no guarantee that the sample households were families well known to the professional investigators, which would contradict the original intent of the survey design. Second, including more cities in the sample also makes it more representative.
Our survey method also has drawbacks. A major one is that it is a one-time survey in which respondents provide household income and consumption data from memory (although family members who did not know enough about the household's income and consumption were excluded when respondents were selected). Compared with a bookkeeping-based sample survey, this produces larger data errors. However, a bookkeeping survey costs more than a one-time survey, takes longer, is more difficult, and, because of the sensitivity of the content, more easily leads to systematic bias. The errors caused by inaccurate memory in a one-time survey are generally randomly distributed rather than systematic. In the sample average, positive and negative random errors largely offset each other, whereas systematic errors do not cancel automatically. Given the subject and the research conditions of this project, we therefore adopted the one-time survey approach.
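The argument that random recall errors average out while systematic bias does not can be illustrated with a small simulation. This is only a sketch: the true income, the error sizes, and the 20% under-reporting rate are hypothetical values chosen for illustration, not figures from the survey.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is reproducible

true_income = 30_000.0  # hypothetical true per capita income (yuan)
n = 4_000               # hypothetical number of sample households

# One-time recall survey: large but random (zero-mean) memory errors.
recall = [true_income + random.gauss(0, 5_000) for _ in range(n)]

# Systematically biased reporting: every household under-reports by 20%
# (an assumed rate), with only small random noise on top.
biased = [true_income * 0.8 + random.gauss(0, 500) for _ in range(n)]

recall_gap = mean(recall) - true_income  # shrinks toward 0 as n grows
biased_gap = mean(biased) - true_income  # stays near -6,000 however large n is

print(f"random recall error, gap in mean: {recall_gap:.0f} yuan")
print(f"systematic bias, gap in mean:     {biased_gap:.0f} yuan")
```

The noisy recall data give a sample mean close to the true value, while the low-noise but biased data stay far from it, which is the reason given above for preferring the one-time survey.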
The survey covered 4,909 sample households. After strict quality inspection, 689 questionnaires that did not meet the quality requirements were removed, and another 25 samples with negative income were excluded from the analysis (analysis shows that under normal circumstances most of these are not low-income families; temporary negative income is mainly due to operating losses). The effective sample actually used in the analysis is 4,195.
Table 1 (omitted) lists the geographical distribution of all survey samples and of the effective samples, their distribution by city size, the distribution of respondents by age and residence status, the distribution of sample households by highest educational level, the distribution of respondent households by per capita disposable income, and so on. As it shows, the sample is fairly evenly distributed across regions of the country, across cities of different sizes, and across respondents' ages and education levels. However, the share of respondents who live in larger cities, have higher education, or are engaged in business or white-collar professional occupations is higher than the share of these groups in the urban population. This is because, according to the results of the 2007 research report, the statistical deviation in urban residents' income lies mainly among high-income residents. To ensure a sufficient number of high-income residents in the sample for analysis, the survey deliberately increased the number of samples from this part of the population. The analytical methods we use ensure that these differences do not affect the projected national distribution of urban per capita income.
Second, analysis methods and technical analysis
1. The basic calculation method used in this report
The approach of using Engel coefficients calculated from the survey sample data to project the disposable income of urban residents can be summarized as follows:
First, the purpose of our survey of urban residents' income is not to infer the overall income distribution of urban residents directly from the survey sample, but to use authentic data to estimate the relationship between income level and certain consumption parameters. One key consumption parameter is the Engel coefficient (household food expenditure as a share of total household consumption expenditure). The Engel coefficient is associated with income level and tends to fall as income rises, a fact widely recognized in economics. This is because once basic needs for food and clothing are met, residents gradually turn to other needs, such as travel and communications, luxury goods, and higher-level needs such as education, culture, and entertainment. As income rises, the share of incremental spending that goes to food therefore declines while the share going to higher-level consumption increases, changing their proportions in total consumption.
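The definition of the Engel coefficient given above can be written as a one-line function. The three households below are hypothetical figures, chosen only to show the coefficient falling as income rises; they are not survey data.

```python
def engel_coefficient(food_spending, total_consumption):
    """Food expenditure as a share of total consumption expenditure."""
    if total_consumption <= 0:
        raise ValueError("total consumption must be positive")
    return food_spending / total_consumption

# Hypothetical households: (per capita income, food spending, total consumption), in yuan.
households = [
    (10_000, 4_500, 9_000),
    (30_000, 8_000, 22_000),
    (100_000, 15_000, 60_000),
]

coefficients = [engel_coefficient(food, total) for _, food, total in households]
for (income, _, _), eng in zip(households, coefficients):
    print(f"income {income:>7}: Engel coefficient {eng:.3f}")
```

Food spending rises in absolute terms across the three households, but its share of consumption falls, which is the pattern the method relies on.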
Based on this principle, we can use a reasonably credible and representative survey sample to calculate Engel coefficients and household disposable income levels, and use statistical or econometric methods to identify the statistical relationship between the two. With this relationship in hand, we can test any set of income statistics. In other words, as long as we have a reasonably reliable Engel coefficient for a group of samples, we can use it to project the approximate true per capita income of that group. We therefore take the Engel coefficients of the urban household groups published by the National Bureau of Statistics, project the average income of each group, and compare these projections with the published income statistics for the same groups to detect whether the statistics contain systematic errors and how large they are. We call this the Engel coefficient method.
Of course, this presupposes that the Engel coefficient of the statistical sample is reliable. A natural question is: if the income data of a statistical sample are systematically biased, are its Engel coefficient data not biased as well? Indeed, if the income data are biased (for example, underestimated), then the data on consumption expenditure and food expenditure are likely to be biased too. But first, as long as the deviations in consumption expenditure and food expenditure are in the same direction and, in a statistical sense, roughly proportional, the group's average Engel coefficient remains basically credible. In this case we can still use the Engel coefficient to project the true income level.
Second, even if the deviations in consumption expenditure and food expenditure are not in the same proportion, deviations in the same direction still largely offset each other when the Engel coefficient is calculated, so the deviation of the Engel coefficient is far smaller than that of income. The Engel coefficient can therefore still be used to project income levels, although the projection is less accurate.
The 2007 research found that in the statistical data on residents' income and expenditure, the income data of high-income residents show the largest deviation and are significantly lower than their true income. Their consumption expenditure and food expenditure are also underestimated to some degree, but far less than their income, and the deviation in food expenditure is smaller than the deviation in total expenditure. This means that the Engel coefficient calculated from these data may be slightly too high, and the income level projected from it may therefore be slightly too low; even so, the projection corrects the error in the original income data to a large extent. We should nonetheless bear in mind that our corrected income levels may still be somewhat lower than true income.
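A small numerical sketch shows why the Engel coefficient is distorted much less than income when both food and total consumption are under-reported in the same direction. All figures and under-reporting rates here are assumptions made for illustration, following the ordering described above (income understated most, total consumption less, food least).

```python
def engel(food, total_consumption):
    return food / total_consumption

# Hypothetical true annual per capita figures (yuan) for a high-income household.
true_income, true_consumption, true_food = 150_000.0, 80_000.0, 20_000.0

# Assumed under-reporting rates: income 60%, total consumption 20%, food 10%.
under_income, under_consumption, under_food = 0.60, 0.20, 0.10

reported_income = true_income * (1 - under_income)
reported_consumption = true_consumption * (1 - under_consumption)
reported_food = true_food * (1 - under_food)

true_engel = engel(true_food, true_consumption)              # 0.25
reported_engel = engel(reported_food, reported_consumption)  # about 0.281

income_error = reported_income / true_income - 1  # -60%
engel_error = reported_engel / true_engel - 1     # about +12.5%

print(f"income understated by {-income_error:.0%}")
print(f"Engel coefficient overstated by {engel_error:.1%}")
```

Under these assumptions the reported Engel coefficient is somewhat too high, so an income projected from it would be somewhat too low, but the distortion is far smaller than the distortion in reported income.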
It should also be noted that testing the income statistics with this method cannot quantify the omission of high-income residents from the statistical sample; it can only correct the systematic bias in the income data of the existing statistical sample. The corrected results may therefore still underestimate the income of high-income residents to some extent (because of the missing samples).
Below we use two specific methods to establish the relationship between the Engel coefficient and income levels and to test the income statistics. Both are Engel coefficient methods, but their analytical tools and procedures differ. In what follows, for simplicity, the National Bureau of Statistics urban household sample is called the "statistical sample", our survey sample is called the "survey sample", and the per capita income obtained from the projection is called the projected income.
2. Group comparison method
This is the first method used. The steps are as follows:
First step: calculate per capita income and the Engel coefficient for every household in the survey sample.
Second step: calculate the Engel coefficient of each group in the statistical sample. The grouped income data for urban residents published annually by the National Bureau of Statistics divide the country's urban households into seven groups by per capita income. The lowest-income, low-income, high-income, and highest-income groups are defined by deciles, each containing 10% of urban households. The three middle groups (lower-middle income, middle income, upper-middle income) are defined by quintiles, each containing 20% of urban households (data source: National Bureau of Statistics, various years).
Third step: sort all effective survey samples by per capita income from low to high and group them. The grouping starts from the lowest income and accumulates samples one by one until the group's average Engel coefficient equals the average Engel coefficient of the corresponding group in the statistical sample. We call this set of samples the corresponding group of the survey sample, and calculate its average Engel coefficient and average per capita income.
Fourth step: for the reasons explained above, we assume that the Engel coefficient of a group of residents has a unique relationship with the group's income level. That is, given the Engel coefficient of a group of residents, the group's per capita income should be the per capita income that our calculation associates with that Engel coefficient.
Fifth step: compare the per capita income of each survey sample group with that of the corresponding statistical sample group; the difference reflects the income missing from the statistical sample data.
Table 2 (omitted) shows the distribution of the survey sample and the statistical sample across groups. As it shows, once the groups are matched on Engel coefficient, the share of the survey sample falling in each group differs from that of the statistical sample. In addition, after the seven groups are formed by Engel coefficient, a part of the survey sample with much higher per capita income is left outside the seven groups (because its Engel coefficient is lower). Annual disposable income in this group exceeds 400,000 yuan, with a maximum of 1.76 million yuan. Table 3 (omitted) compares the Engel coefficients and per capita incomes of the statistical sample and the survey sample.
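The grouping procedure in the steps above can be sketched in a few lines. This is an illustrative reading of the description, not the author's actual program: the function name `match_groups`, the use of a simple mean of household Engel coefficients as the group average, and the rule "stop as soon as the running mean falls to the target" are all assumptions, and the six households and two target values are made up.

```python
from statistics import mean

def match_groups(households, target_engels):
    """Group survey households so that each group's mean Engel coefficient
    reaches the Engel coefficient of the corresponding statistical group.

    households: list of (per_capita_income, engel_coefficient) pairs.
    target_engels: Engel coefficients of the statistical groups,
                   ordered from the lowest-income group upward.
    Returns (groups, leftover), where leftover holds the households that
    remain above the last group.
    """
    ordered = sorted(households, key=lambda h: h[0])
    groups, start = [], 0
    for target in target_engels:
        end, engel_sum = start, 0.0
        # Accumulate households from the lowest remaining income until the
        # running mean Engel coefficient has fallen to the target.
        while end < len(ordered):
            engel_sum += ordered[end][1]
            end += 1
            if engel_sum / (end - start) <= target:
                break
        members = ordered[start:end]
        if not members:
            break
        groups.append({
            "target_engel": target,
            "mean_engel": mean(e for _, e in members),
            "mean_income": mean(y for y, _ in members),
            "size": len(members),
        })
        start = end
    return groups, ordered[start:]

# Made-up survey households and statistical-group Engel coefficients.
survey = [(1000, 0.50), (2000, 0.46), (3000, 0.42),
          (4000, 0.38), (5000, 0.34), (6000, 0.30)]
groups, leftover = match_groups(survey, [0.485, 0.405])
for g in groups:
    print(g)
print("left outside the groups:", leftover)
```

Each group's `mean_income` is what would be compared with the published income of the matching statistical group; the households in `leftover` correspond to the high-income samples that fall outside the seven groups.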
As Table 3 shows, with the Engel coefficients of corresponding groups equal one to one, the per capita income of every survey sample group is higher than that of the corresponding statistical sample group, and the gap widens regularly from group to group. The gap is largest for the highest-income group: per capita income of the highest-income group is only 43,000 yuan in the statistical sample but 164,000 yuan in the survey sample, nearly 3.8 times the former. This difference accounts for two-thirds of the total difference across all groups. This is almost the same as what the 2007 study found. The only difference is that the gaps for the low-income groups differ to some extent from those projected in the 2007 report. The results therefore basically confirm the credibility of the 2007 report. To make the projection more credible, however, the next section uses an alternative method to verify it.
It should also be noted that after the 2007 report was published, a few readers assumed that the study, like ordinary surveys, used the survey sample directly to calculate the overall income distribution of urban residents, and therefore questioned the credibility of its findings, because neither the survey method (non-random sampling) nor the sample size (small) is suitable for projecting the overall income distribution of urban residents directly. This misconception stems from a lack of understanding of the Engel coefficient method used in this study, and in particular of the group comparison analysis. In fact, even a reader who does not understand the substantive difference between our group analysis and a direct projection from the survey sample to the whole population can see it clearly by comparing the results of the two methods. In Table 4 (omitted), the author uses the data from this survey to compare the results of the group analysis with those of a direct overall projection. The two methods differ not only in the Engel coefficient of each group but even more in per capita income. For the highest-income group, per capita income is 164,000 yuan under the group analysis but 294,000 yuan under the direct overall projection. The difference is obvious.
3. Model analysis method
The group comparison method of projecting income levels also has a disadvantage: it assumes that the Engel coefficient is related only to income level. In fact, the Engel coefficient may also be affected by other factors, such as consumer prices and differences in eating habits among residents of different places. Does a given Engel coefficient really correspond to only one income level? This is open to doubt.
We therefore use a second specific calculation method, which can be called model analysis. This approach is based on an econometric model and can include variables other than income that may affect the Engel coefficient as control variables, so that the effects of these additional factors are excluded when the relationship between the Engel coefficient and income is estimated. It avoids the shortcomings of the group comparison method and has clear advantages. The basic steps can be summarized as follows:
First step: determine the control variables. We need to run a regression on the Engel coefficients and per capita incomes of the survey sample to identify the effect of per capita income on the Engel coefficient. To do this, we must find the other factors that may affect the Engel coefficient and include them in the model as control variables so that their effects are estimated; only then can the correct coefficient for the income effect be obtained.
First, price levels for consumer goods differ widely across cities of different sizes, which may affect the Engel coefficient. For example, large cities are farther from where agricultural products are grown, so transport costs and losses in transit are higher; food prices may therefore be significantly higher than in small cities, and by a wider margin than other consumer prices (because vegetables, meat, and other agricultural products are perishable, with high storage costs and transit losses). Other conditions being equal, the Engel coefficient of residents of large cities may thus be higher than that of residents of medium and small cities. Because data on absolute price levels are not available, the model includes a city-size variable, city, which takes the values 1, 2, 3, and 4 for four classes: county seats and towns, cities with fewer than 1 million people (here called medium and small cities), cities with 1 to 2 million people (here called large cities), and cities with more than 2 million people (here called very large cities).
Second, residents of different regions have different spending habits. Residents of some places have a stronger preference for food than those elsewhere and may spend more on food. Analyzing the survey sample data, the author found that, other conditions being equal, the Engel coefficients of Shanghai, Jiangxi, and Sichuan are significantly higher than the average; these three provinces are represented by the dummy variable H1. The Engel coefficients of Beijing, Shandong, Hubei, Guangdong, Chongqing, and Henan are somewhat higher than average; these provinces are represented by the dummy variable H2. The Engel coefficients of Liaoning and Shanxi are below average; both are represented by L1. These dummy variables are included in the model. The samples from the remaining provinces (Jiangsu, Zhejiang, Anhui, Heilongjiang, Yunnan, Shaanxi, Gansu, Qinghai) serve as the base group.
Third, household size may affect the Engel coefficient, because larger families may enjoy economies of scale in food spending and so save on food. The model therefore includes a household-size variable, family.
Fourth, the average education level of family members is likely to affect the Engel coefficient, because better-educated residents may lean toward non-material needs such as communications, education, culture, and entertainment, while less-educated residents may have less demand in these areas and spend more on food, alcohol, and tobacco. The model therefore includes a variable for the average education level of adult household members, edu18, measured as the average years of education of family members aged 18 and above.
Fifth, the Engel coefficient may also be related to the household employment ratio (employed family members as a proportion of all family members). The reasons are complex. On the one hand, a higher employment ratio may reduce food expenditure, because employed members may eat at their work unit and benefit from food subsidies there. On the other hand, a higher employment ratio could lead to more eating out (because cooking at home takes more time), raising food expenditure. Which factor prevails has to be established by testing. The model includes a household employment variable, emp.
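The regional dummies and the household-level controls described above could be built from a household record roughly as follows. The province lists come from the text; the function name `control_variables` and its inputs are hypothetical, and the city and edu18 variables are left out because their coding depends on details not fully given here.

```python
# Province groups as listed in the text; all other provinces form the base group.
H1_PROVINCES = {"Shanghai", "Jiangxi", "Sichuan"}
H2_PROVINCES = {"Beijing", "Shandong", "Hubei", "Guangdong", "Chongqing", "Henan"}
L1_PROVINCES = {"Liaoning", "Shanxi"}

def control_variables(province, household_size, employed_members):
    """Return the dummy and household controls for one survey household."""
    if household_size <= 0:
        raise ValueError("household size must be positive")
    return {
        "H1": int(province in H1_PROVINCES),
        "H2": int(province in H2_PROVINCES),
        "L1": int(province in L1_PROVINCES),
        "family": household_size,
        # Employment ratio: employed members as a share of all members.
        "emp": employed_members / household_size,
    }

print(control_variables("Sichuan", 4, 2))
print(control_variables("Jiangsu", 3, 2))  # base-group province: all dummies are 0
```

Note that Shanxi (in L1) and Shaanxi (in the base group) are different provinces, so the lookup has to use the exact spelling.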
Second step: set the functional form of the model. From the data we can judge intuitively that the relationship between the Engel coefficient and per capita income is non-linear. We therefore estimated four functional forms: semi-logarithmic, semi-logarithmic quadratic, quadratic, and cubic. In all of them the Engel coefficient is the explained variable (denoted eng). The semi-logarithmic function, function (1), takes the logarithm of per capita income, lnY, and the control variables city, family, edu18, emp, H1, H2, and L1 as explanatory variables. The semi-logarithmic quadratic function, function (2), adds the square of lnY to function (1). The quadratic function, function (3), takes per capita income and its square, together with the control variables and their squares, as explanatory variables. The cubic function, function (4), adds the cubic terms of the explanatory variables to the quadratic function. Functions (2) and (3) are given below; functions (1) and (4) are omitted.
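$$\mathrm{eng} = C_1 + a_1 \ln Y + a_2\,\mathrm{city} + a_3\,\mathrm{family} + a_4\,\mathrm{edu18} + a_5\,\mathrm{emp} + a_6\,\mathrm{H2} + a_7\,\mathrm{H1} + a_8\,\mathrm{L1} + a_9 (\ln Y)^2 \tag{2}$$

$$\mathrm{eng} = C_2 + b_1 Y + b_2\,\mathrm{city} + b_3\,\mathrm{family} + b_4\,\mathrm{edu18} + b_5\,\mathrm{emp} + b_6\,\mathrm{H2} + b_7\,\mathrm{H1} + b_8\,\mathrm{L1} + b_9 Y^2 + b_{10}\,\mathrm{city}^2 + b_{11}\,\mathrm{family}^2 + b_{12}\,\mathrm{edu18}^2 + b_{13}\,\mathrm{emp}^2 + b_{14}\,\mathrm{H2}^2 + b_{15}\,\mathrm{H1}^2 + b_{16}\,\mathrm{L1}^2 \tag{3}$$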
Third step: estimate the four models. The estimation results are in Table 5 (omitted). The initial regressions showed that some of the quadratic or cubic terms in the quadratic and cubic functions were not statistically significant, with very low t values. These terms were therefore removed from the regressions reported in Table 5.
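As a rough illustration of the estimation step, the sketch below fits a cut-down version of function (2) by ordinary least squares using only the Python standard library. It is not the author's estimation: only the constant, lnY, (lnY)^2, and city are included, the data are synthetic and noise-free, and the "true" coefficients used to generate them are hypothetical values, not the Table 5 estimates.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def design_row(income, city):
    """Regressors of the reduced semi-log quadratic form: 1, lnY, (lnY)^2, city."""
    ln_y = math.log(income)
    return [1.0, ln_y, ln_y ** 2, float(city)]

# Synthetic households generated from hypothetical coefficients.
true_coef = [1.6, -0.20, 0.007, -0.01]
data = [(income, city)
        for income in (5_000, 10_000, 20_000, 40_000, 80_000, 160_000)
        for city in (1, 2, 3, 4)]
rows = [design_row(income, city) for income, city in data]
eng = [sum(b * x for b, x in zip(true_coef, row)) for row in rows]

coef = ols(rows, eng)
fitted = [sum(b * x for b, x in zip(coef, row)) for row in rows]
print([round(c, 4) for c in coef])
```

With real survey data one would add the remaining controls as extra columns and use a statistics package that also reports t values and adjusted R2, which is what the significance screening described above relies on.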
Judging from the regression results in Table 5, although the adjusted R2 of the four models is not very high, most variables are highly statistically significant. This not only shows a significant negative correlation between the Engel coefficient and per capita income but also shows that the Engel coefficient is affected by family members' education level, household size, the household employment ratio, city size, and regional characteristics. Model (2) has the highest adjusted R2. Calculations show that the results of models (1) and (2) are very close to each other and, in the middle and low income range, close to the statistical data, whereas the simulation results of models (3) and (4) differ considerably from the statistical data over the whole income range, and at high income levels their Engel coefficient no longer decreases monotonically, which contradicts the facts. The following analysis therefore adopts the results of model (2).
Figure 1 (omitted) shows the curves relating the Engel coefficient to income level as simulated by functions (1) to (3); functions (1) and (2) are very similar. The vertical axis is the Engel coefficient and the horizontal axis is per capita income (yuan).
Fourth step: to turn the regression coefficients into final effects and solve for the income level of urban residents nationwide that corresponds to each Engel coefficient, we also need to assign each control variable a value that represents the national average.
According to 2007 statistics, urban residents are distributed across very large cities, large cities, medium and small cities, and county seats and towns (values 1, 2, 3, and 4 in the model, respectively) in proportions of roughly 21%, 25%, 33%, and 21%. The weighted average value is about 2.5. But we know that residents of different income groups are distributed differently across cities: high-income residents are more concentrated in very large and large cities, while low-income residents are more concentrated in small cities and towns. Therefore, based on the data analysis, the city-size value assigned to residents ranked from low to high per capita income changes smoothly from 3.3 to 1.3.
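The weighted average of the city-size variable quoted above can be checked directly from the shares given in the text:

```python
# Share of urban residents by city-size value, as quoted in the text.
shares = {1: 0.21, 2: 0.25, 3: 0.33, 4: 0.21}

weighted_city = sum(value * share for value, share in shares.items())
print(f"weighted average city value: {weighted_city:.2f}")
```

The result, 2.54, rounds to the "about 2.5" used as the national average.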
For the average education level of urban residents (aged 18 and above), the model uses values from 1 to 5 to represent primary school and below, junior secondary, senior secondary and secondary vocational, junior college and undergraduate, and master's and doctoral qualifications. The national urban average is estimated at about 3. But education levels differ across income groups, so the average education value is set to change smoothly from 2.6 to 3.8 from the lowest to the highest income level.
For the household employment ratio of urban residents, statistics put the national average at 0.5 or slightly below, but it also differs across income groups, changing from 0.38 to 0.62 from low to high income.
For the size of urban households, statistics show a national average of 2.9 people, but low-income households are relatively large and high-income households small, with values ranging from 3.3 to 2.6.
Finally, for the differences in eating habits among urban residents of different regions, the model divides the provinces into four groups according to whether their Engel coefficient, other conditions being equal, is highest, high, normal, or low; the coefficients of their dummy variables range from 0.071 to -0.039. The national average value is taken as 0.01.
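The "smooth change" of control values across income groups can be sketched as a simple interpolation between the two end values given in the text. Linear interpolation over seven income groups is an assumption made here for illustration; the text does not say which smoothing rule was actually used, and the function name `smooth_values` is hypothetical.

```python
def smooth_values(lowest_group_value, highest_group_value, n_groups=7):
    """Linearly interpolate a control value from the lowest to the highest income group."""
    step = (highest_group_value - lowest_group_value) / (n_groups - 1)
    return [lowest_group_value + i * step for i in range(n_groups)]

# End values from the text: (lowest-income value, highest-income value).
control_ranges = {
    "city": (3.3, 1.3),
    "edu18": (2.6, 3.8),
    "emp": (0.38, 0.62),
    "family": (3.3, 2.6),
}

assigned = {name: smooth_values(low, high) for name, (low, high) in control_ranges.items()}
for name, values in assigned.items():
    print(name, [round(v, 2) for v in values])
```

Each list gives one value per income group, from the lowest-income group to the highest, ready to be multiplied by the corresponding regression coefficient.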
Once the control variables have been assigned, the model parameters obtained from the regression and the values of each factor can be used to solve, with the effects of the other factors on the Engel coefficient taken into account, for the income level of urban residents that corresponds to each Engel coefficient. The results are reported in the next section.
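Solving model (2) for income amounts to solving a quadratic equation in lnY. The sketch below shows one way to do this; the coefficient values are hypothetical (the Table 5 estimates are not reproduced here), `controls_effect` stands for the summed contribution of all control variables at their assigned values, and choosing the root on the declining branch of the curve is an assumption consistent with the Engel coefficient falling as income rises.

```python
import math

def engel_from_income(income, c1, a1, a9, controls_effect):
    """Model (2) with the control terms collapsed into a single number."""
    x = math.log(income)
    return c1 + a1 * x + a9 * x * x + controls_effect

def income_from_engel(eng, c1, a1, a9, controls_effect):
    """Solve eng = c1 + a1*lnY + a9*(lnY)^2 + controls_effect for Y."""
    c = c1 + controls_effect - eng
    if a9 == 0:
        return math.exp(-c / a1)  # plain semi-log form, as in function (1)
    disc = a1 * a1 - 4 * a9 * c
    if disc < 0:
        raise ValueError("no income level gives this Engel coefficient")
    roots = [(-a1 - math.sqrt(disc)) / (2 * a9),
             (-a1 + math.sqrt(disc)) / (2 * a9)]
    # Keep the root where the Engel coefficient is still falling with income.
    declining = [x for x in roots if a1 + 2 * a9 * x < 0]
    if not declining:
        raise ValueError("no root on the declining branch")
    return math.exp(declining[0])

# Hypothetical coefficients, for illustration only.
params = dict(c1=1.6, a1=-0.20, a9=0.007, controls_effect=-0.03)

eng_at_40k = engel_from_income(40_000, **params)
recovered = income_from_engel(eng_at_40k, **params)
print(f"Engel coefficient at 40,000 yuan: {eng_at_40k:.4f}")
print(f"income recovered from that coefficient: {recovered:.0f} yuan")
```

Applied to the Engel coefficient of each statistical group, with `controls_effect` computed from that group's assigned control values, this yields the group's projected income.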
Note: This article is the Economic Reform project report on gray income; the text was published in the CITIC Series ...
