From Data to Knowledge, from Data to Decision The advent of massive data is pushing technical limits in many areas, particularly in decision-making and optimization. On the one hand, mainstream statistical models and decision-making algorithms are challenged by heterogeneous, semi-structured, complex, incomplete and/or uncertain data. On the other hand, data management raises new operability constraints in terms of security, integrity and traceability. In order to generate knowledge, build models and make informed decisions, statistical validity, robustness, computational tractability, and --to some extent-- causal modeling are mandatory; but the acceptability of the results (knowledge, models and decisions) also requires privacy, traceability, and fairness to be enforced. In parallel, classic, heuristic and agile optimization schemes require new developments.
Deep learning toward Artificial Intelligence In the last few years, Deep Learning has achieved several breakthroughs in Data Science, yielding performance jumps in computer vision and natural language processing. The reasons for these achievements --besides massive data, computational power, and design efforts-- are not yet fully understood and raise three research challenges. The first one is to set up a learning theory suited to analyzing deep architectures. The second one concerns their compositionality and their ability to represent higher-order logic models. The third one concerns their interpretation: opening the black box to make the underlying representations explicit.
Digital Trust and Appropriation Digital trust may arise from:
Privacy by Design is a form of regulation that includes personal data protection in the very first stages of data collection and processing. Tracing tools have to be developed to facilitate model explanation at various levels, for both experts and users. Common privacy principles, though easy to formulate, require a shift in data storage and processing infrastructures, with major legislative and economic impacts.
Data economy and regulation Companies in the data economy need to continuously rethink the way they are organized, because agility and flexibility require a project-oriented organization with constant changes in resource allocation. The data economy also raises concerns about concentration and monopoly. A small number of companies (the so-called GAFAM) hold most of the data, which they aggregate from different sources (Internet, smartphones, connected devices). This market concentration may lead to unfair competition, possibly slowing down innovation by, for example, European companies. Citizens expect governments to intervene in the digital economy in order to avoid excessive concentration of data-based economic power. Governments also need to retain fine-grained knowledge of their citizens' data in order to preserve state sovereignty as well as compliance with civil rights rules and regulations.