The project is put forward by Researchers who come from both academia and national research institutes and are organized into two research units (University of Perugia, UNIPG, and University of Pisa, UNIPI). Researchers in the two units come from different Universities, and each has been involved because of his/her experience and skills in dealing with particular challenges of the project. Most of the Researchers involved have already worked together successfully and produced important contributions to estimation from sample surveys. UNIPG hosts the fundamental contribution of the Researchers from ISTAT who are in charge of the production of small area estimates from EUSILC and LFS, and of a Researcher from the Bank of Italy who is in charge of the design and the estimation strategy for SHIW. The PI has already collaborated with the Researchers from ISTAT and the Bank of Italy on small area estimation from the Italian LFS and on non-sampling errors in SHIW (see the PI list of papers). UNIPG will also hire a postdoctoral fellow to work on the project and envisions the involvement of PhD students from the PhD school in Mathematics and Statistics at the Departments of Economics, Finance and Statistics of the University of Perugia. As mentioned earlier, the activities of the project are organized in two Work Packages (WPs) that will be carried out through an intense collaboration between the two research units. Each research unit will be in charge of particular tasks within the WPs and also envisions partnerships with international universities where outstanding expertise on the topics of the project can be found and collaborations are already established (Prof. Ray Chambers, Un. of Wollongong, Australia; Dr. Nikos Tzavidis, Southampton Un., UK; Dr. Alina Matei, Un. of Neuchatel, Switzerland; Profs. Jean D. Opsomer and F. Jay Breidt, Colorado State Un., US; Prof. Peter G.M. Van der Heijden, Utrecht Un., The Netherlands; Dr. Fabian Sobotka and Prof. Thomas Kneib, Georg-August-Universität Göttingen, Germany).

WP1 deals with the estimation of labor market indicators (BES-3 indicators in particular) from the Italian LFS for planned/unplanned domains, with particular attention to the situation of young people. First, details on the LFS and on the precision of the direct estimates will be studied (UNIPG via the ISTAT personnel). Both units will then review small area estimation (SAE) methods that borrow strength from auxiliary information to produce model-based estimates of labor market indicators (see Task 1.1 and Task 1.2 below). The units will then develop different approaches to SAE, based on mixed models (UNIPG) and on M-quantile models (UNIPI), to suit the LFS challenges (Task 1.3), and will consider parametric and non-parametric regression models as well as the inclusion of temporal or spatial correlation. Robust and Bayesian methods will also be investigated (UNIPI). As many survey variables in the LFS are categorical, empirical best predictors based on generalized linear mixed models (GLMMs) should be used. However, estimates of GLMM parameters can be very sensitive to outliers or departures from distributional assumptions. Hence, robust SAE based on GLMMs in the frequentist framework will be developed (UNIPI). A new approach to SAE for categorical variables based on M-quantile modeling will be studied to allow for simultaneous estimation of unemployment, employment and inactivity rates/totals. SAE based on finite mixture models will also be considered, in which a discrete nonparametric distribution is assumed for the area effects, allowing for an automatic clustering of the small areas. In addition, the project aims at developing SAE methodologies that account for the complex sampling design used in the LFS (UNIPG, Task 1.3).
The issue of benchmarking will be addressed by developing methods for estimating indicators for unplanned domains that are coherent with the direct design-based estimates released for higher-level planned domains (UNIPG, Task 1.3). Using information from the different waves of the LFS, latent variable models for longitudinal data can be employed to investigate how the clustering of the small areas evolves over time and how an area moves from one cluster to another (UNIPG, Task 1.3). When dealing with area level models, direct estimates can be treated as variables observed at consecutive time occasions that depend on unobservable characteristics, and their evolution can be described by hidden Markov models (UNIPG, Task 1.3).
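
As an illustration of the benchmarking requirement described above, the following minimal Python sketch applies a simple ratio adjustment: model-based rates for the small areas are rescaled so that their population-weighted average reproduces the direct estimate for the larger planned domain. This is only one elementary option and is not presented as the method the project will develop; the function name, the area rates, the population sizes and the direct estimate are all hypothetical values chosen for the example.

```python
def ratio_benchmark(area_rates, area_pops, direct_rate):
    """Rescale small area rates so that their population-weighted
    average equals the direct estimate for the larger domain."""
    total_pop = sum(area_pops)
    # Population-weighted aggregate of the model-based small area rates
    aggregated = sum(r * n for r, n in zip(area_rates, area_pops)) / total_pop
    if aggregated == 0:
        raise ValueError("aggregated model-based rate is zero; ratio undefined")
    # One common factor is applied to every area
    factor = direct_rate / aggregated
    return [r * factor for r in area_rates]


# Hypothetical inputs: three small areas nested in one planned domain
model_rates = [0.30, 0.22, 0.41]   # model-based youth unemployment rates
pop = [12000, 30000, 8000]         # population sizes of the areas
direct_rate = 0.27                 # direct estimate for the planned domain

benchmarked = ratio_benchmark(model_rates, pop, direct_rate)
print(benchmarked)
```

A single multiplicative factor preserves the ranking of the areas but ignores the different precision of the individual estimates, which is one reason why more refined benchmarking methods are of interest.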

WP1: Analysis of geographical and temporal patterns of labor market indicators, with a particular focus on youth unemployment.
Task 1.1 (Month 1 - Month 4) Overview on the features of the Italian LFS.
- Analysis of the details of the survey design employed and the microdata available
- Preliminary analysis on the level of error for direct estimates of a selection of indicators for sub-regional local levels (BES-3 indicators)
Task 1.2 (M5-M12) Overview on existing small area methodologies for labor force indicators. Thematic literature review and outline of small area models, and of statistical and computational approaches including robust and Bayesian methods.

Task 1.3 (M13-M32) Modeling labor force indicators.
- Development of new models for SAE of BES-3 indicators
- Development of methods for estimating indicators for unplanned domains that are coherent with direct design-based estimates released for higher-level planned domains (benchmarking)
- Development of models for estimating unemployment rates that account for spatial and temporal correlation
- Software development 

UNIPG will lead Task 1.1 and UNIPI will lead Task 1.2, while the models of Task 1.3 will be developed jointly by UNIPI and UNIPG, also in collaboration with researchers from the University of Southampton, the University of Wollongong and Colorado State University.

The purpose of WP2 is to analyze and evaluate indicators of social sustainability at the local level, focusing on living conditions, income and poverty (BES-4 indicators). The first contribution of WP2 is to review the precision of the indicators available in official statistics at the local level in Italy (UNIPI, Task 2.1). Then, new SAE methods will be studied by developing ad-hoc statistical models for the EUSILC and SHIW variables involved in poverty and inequality indicators, based on GLMMs and M-quantile models (UNIPI) and on latent variable models (UNIPG) (Task 2.2).

The estimation of indicators based on income and wealth variables is exposed to bias for two reasons. Households may refuse to disclose such information (nonresponse) and, when they do cooperate, rich households tend to understate it in order to reduce their tax liability, while poor households may overstate it out of a sense of shame or to avoid controls (measurement error). In the presence of such sensitive variables, the process that generates nonresponse is said to be non-ignorable, as it depends on the variables of interest. The project intends to introduce new methods that use all the available auxiliary data within latent variable models to measure the propensity to respond (UNIPG, Task 2.3). The possibility of collecting and using information on those subsets of the population which are known to have high nonresponse rates (elusive populations) by means of link-tracing sampling methods will also be pursued (UNIPI, Task 2.3). Finally, in order to increase respondents' cooperation and obtain truthful data on sensitive variables, the use of Randomized Response Techniques will be explored and tailored to the estimation of the aforementioned indicators of economic inequality and poverty (UNIPG, Task 2.3).
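
To make the idea behind Randomized Response Techniques concrete, the sketch below implements the classical Warner design for a binary sensitive attribute. Each respondent is directed by a private random device to answer, with known probability p, the question "Do you have the attribute?" and, with probability 1 - p, its complement, so that the interviewer never learns which question was answered. This textbook design is shown only as an illustration and is not the tailored technique the project intends to develop; the function name and the numbers used are hypothetical.

```python
def warner_estimate(yes_count, n, p):
    """Warner randomized response estimator of a sensitive proportion.

    yes_count: number of 'yes' answers observed
    n:         number of respondents
    p:         known probability that the device selects the direct
               question (must differ from 0.5)
    Returns the estimated proportion and an estimate of its variance.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the proportion unidentifiable")
    if n < 2:
        raise ValueError("at least two respondents are needed")
    lam = yes_count / n  # observed share of 'yes' answers
    # Invert lam = p * pi + (1 - p) * (1 - pi) to recover pi
    pi_hat = (lam - (1 - p)) / (2 * p - 1)
    # Variance estimate; it grows quickly as p approaches 0.5
    var_hat = lam * (1 - lam) / ((n - 1) * (2 * p - 1) ** 2)
    return pi_hat, var_hat


# Hypothetical survey: 1000 respondents, 400 'yes' answers, p = 0.7
pi_hat, var_hat = warner_estimate(400, 1000, 0.7)
print(pi_hat, var_hat)
```

The privacy protection has a cost: the closer p is to 0.5, the stronger the protection but the larger the variance, and in small samples the estimate may fall outside the interval [0, 1].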

WP2: Mapping and monitoring households' economic wellbeing.
Task 2.1 (M1-M4) Critical review of the precision of the available indicators (BES-4 indicators) in official statistics at the local level:
- Critical analysis of the data available in EUSILC and SHIW
- Poverty indicators, social cohesion/sustainability (BES-4 indicators)
Task 2.2 (M5-M30) Modeling poverty, inequality and income indicators:
- Development of models for SAE of BES-4 indicators
- Development of models to estimate cumulative distribution function and quantiles
- Development of SAE methods based on latent variable models
- Software development
Task 2.3 (M13-M36) Treatment of non-sampling errors.
- Development of methods to deal with nonignorable nonresponse
- Development of generalized calibration estimators that use latent constructs as instrumental variables
- Development of randomized response methods to increase data quality
- Development of finite mixture models for SAE methods that account for measurement error
- Software development

Task 2.1 will be coordinated by UNIPI. Both partners will concentrate on the development of the methodological issues of Task 2.2. Task 2.3 will be coordinated by UNIPG, also in collaboration with the personnel at the University of Granada, University of Neuchatel and Utrecht University.

The aim of the project is to provide producers of official statistics with a set of useful and usable tools to routinely disseminate estimates of relevant socio-economic indicators for small geographical regions or other subsets of the national population. The relevant indicators will be chosen within the set considered by ISTAT and CNEL for the measurement of Equitable and Sustainable wellbeing (BES). The availability of reliable estimates of relevant indicators for small sub-populations will provide the public (policy makers, press, researchers, interested citizens) with information about detailed geographical and social patterns and the time evolution of (un)employment, poverty and social cohesion in Italy. In turn, this may form the basis for the analysis of regional disparities, the identification of disadvantaged social groups and the apportioning of public funds according to efficient criteria. The research topics addressed in this project are in line with some of the central themes of Horizon 2020, which aims at promoting innovative and inclusive European societies. To reach the broadest possible audience, possibly beyond the country's borders, newly proposed procedures will be made available to the public via the project web site and R software packages. The project has the potential to advance beyond the state of the art from a scientific point of view because:

(1) Labor market indicators obtained from the Italian LFS are statistically reliable at national, regional and provincial level (LAU1). The novelty of the project is to provide appropriate methodology to obtain statistically sound estimates for unplanned domains, i.e. smaller sub-populations obtained by cross-classifying the indicators by groups (gender, age) and geography, or smaller geographical units such as local labour market areas. The focus will be especially on the new indicators introduced by the BES project to complement traditional employment and unemployment rates. The project starts from the foreground obtained by the European projects SAMPLE, AMELI and ESSnet-SAE and goes beyond their methodological results. In particular, extensions of existing methodologies are required to handle variables that are categorical in nature and present clear spatio-temporal patterns, to produce model-based estimates that properly take into account the complex sampling design used in the Italian LFS, and to achieve benchmarking with estimates released for higher-level planned domains and thus ensure coherence.

(2) The most popular indicators of poverty and inequality are based on monetary variables such as income or consumption. The project will also take into consideration those indicators of social cohesion based on non-monetary aspects of the phenomenon, such as social exclusion, vulnerability, fragility and material deprivation. It aims at developing procedures for reliably estimating these indicators for both large domains (NUTS2) and small areas (LAU1). Different measures describe different aspects of households' living conditions and individual wellbeing that may be assumed to be related to underlying latent variables. Moreover, measuring the adverse effect of deprivation with respect to several dimensions at the same time requires multidimensional poverty indicators. A final goal of the project for this objective is the definition of ad-hoc small area estimation methods based on latent variable models to account for the multidimensional and latent nature of the variables of interest. Even when based on a multidimensional approach, the calculation of average indicators overlooks their distribution in the population of concern. This project therefore aims at the calculation of all poverty indicators for different quantiles of the population. Indicators specifically aimed at measuring inequality in the level of economic wellbeing will also be considered.

(3) EUSILC and SHIW surveys are likely to be affected by non-sampling errors as they deal with sensitive items (income, wealth, living and health conditions). Ignoring the impact of non-sampling errors may severely bias the final estimates, frustrating the improvements achieved by the application of small area methodologies. The novelty of the project is in the definition of new methodologies that (a) evaluate and treat unit nonresponse when the probability of response can either take extreme values (exactly zero, i.e. we are dealing with elusive populations) or depend on the variables of interest (non-ignorable nonresponse), using perspectives very different from those employed so far in the literature, and (b) improve response rates and measurement quality by means of improved randomized response methodologies.

Statistics plays a crucial role in providing a deep insight into economic and social phenomena, by supplying quantitative methods and reliable data. In particular, National Institutes in charge of the production of official statistics now face a growing need for timely, high quality and relevant estimates of parameters of interest obtained from sample surveys. However, these requirements must also be reconciled with the need to moderate costs, reduce the response burden on units, and fully exploit the opportunities provided by developments in technology. The objective of the project stems from open research issues concerning these general topics and will focus on the statistical estimation of indicators of social cohesion and sustainability, with special attention to youth unemployment and economic wellbeing, and will be achieved by means of two Work Packages of research.

The first Work Package (WP1) aims at the estimation of labour market indicators by fully exploiting the information gathered by the Italian Labor Force Survey (LFS) to provide tools to monitor their evolution over time and at local geographical levels. LFS is designed to provide reliable estimates of such indicators at the Province level (LAU1). Therefore, computing estimates of the unemployment rate for young people (aged 15-24) at LAU1 level, or at finer geographical definitions such as local labour market areas or metropolitan cities, requires ad-hoc statistical tools for such unplanned domains (or small areas). Several aspects concerning estimation for such small areas will be analyzed thoroughly, with a particular emphasis on issues related to the categorical nature of most of the variables involved in the computation of the indicators from the LFS, the spatial and temporal structure of the data and the coherence between aggregated model-based small area estimates and direct (design-based) estimates for larger or planned areas (benchmarking property).

The second objective (WP2) focuses on tools to map and monitor households' wealth in Italy. Computation of indicators of economic wellbeing is mainly based on data coming from the European Survey on Income and Living Conditions (EUSILC) run in Italy by the National Institute of Statistics (ISTAT) and the Survey of Households Income and Wealth (SHIW) run by the Bank of Italy. The former provides reliable estimates at regional level, while the latter does not. Therefore, as a final objective of WP2, a more detailed geographical level analysis will be undertaken in order to identify the critical areas towards which specific policies should be directed and for which novel ad-hoc small area estimation techniques are required. A peculiarity of EUSILC and SHIW is that they survey sensitive items (income, wealth and living conditions) that may introduce non-sampling errors: a main objective is to handle the bias introduced by non-ignorable nonresponse and to introduce novel techniques to reduce nonresponse rates and increase data quality.

In 2006 the Council of the European Union stated that for the social model to be sustainable, Europe needs to step up its efforts to create more economic growth, a higher level of employment and productivity, while strengthening social inclusion and social protection in line with the objectives provided for the Social Agenda. In the seven years after this statement, a profound economic crisis has hit Europe, making it more urgent and critical for the countries of the continent to address this challenge in the coming decades. From the point of view of scientific research, addressing these challenges requires a multi-disciplinary perspective and an evidence-based approach to provide policy guidelines aimed at improving welfare systems and at promoting public interventions. Policymakers should be provided with a set of tools for decision making to enhance living conditions and achieve higher levels of employment. Official statistics plays a crucial role in providing a deep insight into economic and social phenomena, by supplying reliable data and quantitative analytic methods. The objectives of the project stem from several open research issues concerning these general topics and will focus on the analysis of indicators of social cohesion and sustainability, with special attention to unemployment, social exclusion and economic inequalities, and on their statistical estimation. In this regard, the project will build on the results of the European projects SAMPLE, AMELI and ESSnet-SAE, which involved a number of the Researchers of this group, and of other projects that focused on the definition of indicators of wellbeing and sustainability. In particular, since a focus on the peculiarities of the Italian situation is of concern, the project will look at the set of indicators for an Equitable and Sustainable Wellbeing (BES) recently identified by ISTAT and CNEL.
The production of reliable estimates of these indicators for the territorial or population disaggregation level most relevant for policy design and monitoring is of paramount importance for policymakers. These estimates may also provide the basis for efficient fund allocation.

The aforementioned general aims will be achieved through the following detailed objectives, which correspond to two Work Packages of research. The first Work Package (WP1) aims at the measurement of labour market indicators, with a special emphasis on the condition of disadvantaged social groups such as women and young people. Particular attention will be paid to the set of BES indicators (group 3 - Labor and reconciliation of work and family life) related to employment and tailored to the description of the Italian scenario. These indicators complement traditional employment and unemployment rates with the aim of shedding light on groups such as discouraged, forced part-time, underemployed and temporary workers. WP1 aims at fully exploiting the information gathered by the Italian LFS to estimate traditional indicators, along with those added by the BES project, at local geographical levels. The LFS is designed to provide reliable estimates of such indicators at the Province level (LAU1). This means that subpopulations within provinces, such as young people (aged 15-24), females, and travel-to-work metropolitan areas, are unplanned domains. The estimation of labour market indicators for these domains and the monitoring of their evolution over time require ad-hoc statistical tools. In this regard, several aspects concerning estimation for such unplanned domains (or small areas) will be analyzed thoroughly, with a particular emphasis on issues related to the categorical nature of most of the variables involved in the computation of the indicators from the LFS, the spatial and temporal structure of the available data and the coherence between aggregated model-based small area estimates and direct (design-based) estimates for larger or planned areas (benchmarking property). Cutting edge research directions will also be pursued that employ (discrete) latent variable models to cluster geographical areas and to monitor the evolution of such classification over time.
These aims will be achieved by the following detailed objectives.

WP1: Analysis of geographical and temporal patterns of labor market indicators, with a particular focus on youth unemployment.
(1.1) Outline of the properties (in terms of error) of labor market indicators (e.g. BES-3) computed using a direct estimator from data from the Italian LFS at different geographical resolutions and for different subpopulations (e.g. young people, women).
(1.2) Evaluation and proposal of Generalized Linear Mixed Models (GLMMs) and M-quantile models that borrow strength from auxiliary information to produce model-based estimates of the indicators of objective (1.1) with a smaller error. Auxiliary information may be taken from related variables, cross-sectional or spatial structure, time dependency and additional sources such as administrative registers. The categorical nature of the variables involved in the computation of the indicators requires the development of ad-hoc robust statistical models. These models will also be tailored to suit the features of the complex sampling scheme adopted for the Italian LFS. Since ISTAT disseminates direct estimates at planned domain level as official statistics, it is important for SAE estimates at finer levels to be consistent with them. Therefore, the benchmarking property is an important issue to be taken into account in the production of SAE estimates, and new methods for unit level models need to be developed to this end.
(1.3) Development of finite mixture models for SAE to allow for a nonparametric modeling of the area random effects and their automatic clustering in homogeneous groups. The evolution of such classification over time can be described using Hidden Markov Models. These methodologies have never been applied to SAE and will be tailored to the problems at hand.
(1.4) An R package with functions to estimate and map labor force indicators at local level will be produced.
(1.5) Development of a set of tools to routinely produce estimates of the most relevant indicators selected from those of objective (1.1) for subpopulations given by gender and/or age class at LAU1 level.
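
The notion of borrowing strength in objective (1.2) can be illustrated with the composite estimator that arises under a basic area-level linear mixed model (of Fay-Herriot type): the direct estimate for an area is shrunk towards a regression-based synthetic estimate, with a weight that depends on how noisy the direct estimate is. The Python sketch below is a deliberately simplified illustration, not one of the models proposed in the project: it treats the sampling variance and the variance of the area effects as known, whereas in practice they must be estimated, and the function name and all numbers are hypothetical.

```python
def composite_estimate(direct, synthetic, sampling_var, model_var):
    """Shrink a direct area estimate towards a synthetic one.

    direct:       direct (design-based) estimate for the area
    synthetic:    regression-based prediction from auxiliary data
    sampling_var: sampling variance of the direct estimate (assumed known)
    model_var:    variance of the area random effect (assumed known)
    """
    if sampling_var < 0 or model_var < 0:
        raise ValueError("variances must be non-negative")
    if sampling_var + model_var == 0:
        raise ValueError("at least one variance must be positive")
    # Weight given to the direct estimate: close to 1 when it is precise,
    # close to 0 when the area sample is small and the estimate is noisy
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1 - gamma) * synthetic


# Hypothetical small area with a noisy direct unemployment rate
estimate = composite_estimate(direct=0.40, synthetic=0.30,
                              sampling_var=0.03, model_var=0.01)
print(estimate)
```

In this hypothetical case the direct estimate receives a weight of one quarter, so the composite estimate lies much closer to the synthetic value; an area with a large sample would instead keep an estimate close to its direct value.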

The second objective (WP2) focuses on tools to map and monitor poverty and economic inequalities in Italy. A detailed analysis of indicators of income distribution allows a better evaluation of the level of welfare of society, especially in times of severe economic crisis like the current one, characterized by a larger concentration of wealth. Several indicators can be taken into account in order to satisfy local information needs. In December 2001 the European Council agreed on a list of social indicators (the so-called Laeken indicators) that reflect the standard of living and the possibly unequal distribution of income. In addition, reference will be made to BES indicators (group 4 - Economic wellbeing) that are mainly based on data coming from two official surveys: the European Survey on Income and Living Conditions (EUSILC) run in Italy by ISTAT and the Survey of Households Income and Wealth (SHIW) run by the Bank of Italy. The former provides reliable estimates at regional level, while the latter does not. Therefore, as a final objective of WP2, a more detailed geographical level analysis (LAU1 for indicators from EUSILC, and NUTS2/LAU1 for indicators from SHIW) will be undertaken in order to identify the critical areas towards which specific policies should be directed and for which novel ad-hoc SAE techniques are required. A peculiarity of EUSILC and SHIW is that they survey sensitive items (income, wealth, living and health conditions) that may introduce non-sampling errors. A main objective is to improve the quality of estimation of indicators in WP2 by proper treatment of nonresponse and measurement error. The aims of WP2 will be achieved through the following detailed objectives.

WP2: Mapping and monitoring households' economic wellbeing.
(2.1) Outline of the properties (in terms of error) of economic wellbeing indicators (e.g. BES-4) computed using a direct estimator from data from the Italian EUSILC and the SHIW at different geographical resolutions. Some alternative indicators based on monetary and non-monetary variables will also be investigated to measure, e.g., households' financial fragility, defined as a situation in which expenses outpace disposable income.
(2.2) Development and proposal of GLMMs and M-quantile models for SAE tailored to this context. Finite mixture models will also be considered as a tool to model the measurement error that is likely to be present when surveying income and financial assets.
(2.3) An R package with functions to estimate and map economic wellbeing indicators at local level will be produced.
(2.4) Proposals of methods to obtain domain estimates of multidimensional indicators based on latent constructs. Households' economic and living conditions are quantities that cannot be measured directly, but are latent variables hidden behind a set of manifest variables or questionnaire items. Latent class models (for categorical latent variables), latent trait models (for continuous latent variables) and structural equation models can be employed to this end, but must be extended to suit the SAE framework.
(2.5) Improvement of the quality of estimation of indicators from objective (2.1) by proper treatment of non-sampling errors. In particular, novel methodologies will be developed to handle the bias introduced by non-ignorable nonresponse, based on techniques such as latent variable models and link-tracing sampling designs that have never been applied in this field. In addition, methods based on randomized response theory will be developed to reduce the nonresponse rate and collect more truthful data by increasing respondent cooperation.

Household wealth and youth unemployment: new survey methods to meet current challenges

ERC sector 

SH - Social Sciences and Humanities

  1. SH1_4 Econometrics, statistical methods

  2. SH1_12 Income distribution, poverty

Keywords

  1. SURVEY METHODOLOGY

  2. SMALL AREA ESTIMATION

  3. LATENT VARIABLE MODELS

  4. DEPRIVATION AND SOCIAL EXCLUSION

  5. NON-IGNORABLE NONRESPONSE