Text, functional and other high-dimensional data in econometrics: New models, methods, applications (HiTEc)
This Action integrates cutting-edge analytic developments involving innovative sources of information, such as text, functions, perceptions or imprecise data, in econometrics. High-dimensional, complex and unstructured economic datasets cannot yet be fully exploited by existing methodologies. An international network of experts, spanning the disciplines of econometrics, mathematics, statistics and computer science, will be created, with the aim of establishing and implementing new efficient inferential procedures for using such information in econometric modelling and forecasting.
The research tasks will be carried out by five WGs:
WG 1 – Complex and text data. This WG will investigate data generation and management. It will assist in the establishment of generalized models to be used afterwards for inference. It will have strong connections with WG 5, whose members will be the data producers. Different sub-groups of this WG will work in parallel to cover different kinds of complex data and their inherent characteristics. Special attention will be paid to the case of text data.
WG 2 – Econometric models in generalized spaces and inference. Separable Hilbert spaces will be considered as a framework for complex elements such as time series, functional data, latent variables associated with text mining models, images, perceptions, or non-precise data.
WG 3 – High-dimensional data analysis. The aim will be to contribute to this crucial big data problem that genuinely arises when handling complex data, especially in the case of text data and some approaches to functional data. Common topics, such as model/variable selection which relies on proper methods for dimension reduction, sparsity, and estimation in high- and ultra-high-dimensions will be approached in generalized spaces taking inspiration from powerful computational techniques.
WG 4 – Algorithms and software. This WG will take care of the algorithmic and technological advances: scalable algorithms and software. In-depth knowledge of the methods is required to produce a competitive product through an efficient implementation. Thus, close connections to the other WGs must be kept.
WG 5 – Applications and transfer. This WG starts and closes the working plan cycle. It is also the bridge to society through stakeholders and a variety of practitioners. This WG will identify potentially useful sources of information for their specific problems and will be responsible for the data management. Practitioners and stakeholders will collaborate to identify relevant datasets. The scientists will also be able to generate relevant datasets by mining open-source or easy-to-access sources indicated by the stakeholders, such as specialized publications or the EU Open Data Portal. A dialogue with the methodological researchers will motivate the particular data processing and tools to be developed/applied. Once methods, algorithms and software have been delivered, they will come back to this group, and from here to society.
Project funding:
EU COST Programme
Project results:
                            
An international network of experts, spanning the disciplines of econometrics, mathematics, statistics and computer science, will be created, with the aim of establishing and implementing new efficient inferential procedures for using such information in econometric modelling and forecasting. User-friendly and freely available software will be produced. These results will enable applied econometricians to mine textual information gathered from newspapers, articles, opinions and sentiments recorded by polls, in combination with other complex and traditional data. New techniques for analyzing the evolution of economic indicators will help to improve forecasting. Valuable insights into economic issues will provide ample prospects for further research, as vast sources of data are still noticeably under-exploited. The potential to enhance economic data analysis will be fostered by a training programme for Early Career Investigators, and by intensifying connections among academics, stakeholders, and policy-makers. The impact will not be limited to economics and finance. The interaction with experts in other areas, such as environmental sciences or health, will facilitate the transfer of knowledge and technology. Emphasis will be given to sensor data and indicators that will alert to the vulnerability of commercial enterprises and social groups to extreme events associated with environmental hazards. Such indicators will include those relating to mortality risks.
The analysis of the data generation process of complex data will lead to the establishment of new models that describe the economic reality better, which will reduce uncertainty and improve forecasting. The joint analysis of complex data through Hilbert spaces is a pioneering concept. It constitutes a non-incremental advance that will allow the formalization of innovative econometric models suitable for handling the useful information sources available nowadays. The consideration of new processes unifying the analysis of complex data through Hilbert spaces will channel past and current efforts and will mark a methodological milestone. A fundamental innovation will concern the management of high-dimensionality through the combination of efficient matrix computations, graph strategies, Hilbertian optimization, and quantum computing. The significant advance of this approach is the possibility to bound the error, which is not possible with techniques such as lasso or genetic algorithms. On the other hand, heuristics, deep learning and coresets are faster and offer better possibilities for scalability, which is undoubtedly relevant for big data.
Period of project implementation: 2022-10-10 - 2026-10-09
Project partners: Czech Republic, Estonia, Luxembourg, Malta, Montenegro, North Macedonia, Poland, Serbia, United Kingdom, Albania, Denmark, France, Ireland, Israel, Moldova, Portugal, Romania, Slovenia, Sweden, Austria, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Cyprus, Finland, Germany, Greece, Hungary, Italy, Netherlands, Norway, Slovakia, Spain, Switzerland, Turkey