Overarching challenges

The DATAIA Institute builds on disciplinary foundations such as statistics, strategy, data science, and law, and takes on interdisciplinary challenges in order to share its expertise with all of its scientific partners.

  • Machine learning toward artificial intelligence
  • From data to knowledge, from data to decision
  • Transparency, responsible AI and ethics
  • Data protection, regulation and economy

Machine learning toward artificial intelligence

Deep learning research has recently made dramatic advances in computer vision and natural language processing. Beyond the availability of massive data, increased computing power and design efforts, the causes of this progress remain poorly understood and raise at least three questions:

What theory of machine learning can account for deep architectures?

How can the compositionality of these architectures, and their capacity to capture more complex objects, be managed?

How can the black box be opened to reveal the learned representations?

Challenges:

  • Innovative machine learning and AI: common sense, adaptability, generalization
  • Deep learning and adversarial learning
  • Automated machine learning and hyperparameter optimization
  • Optimization for learning (e.g. improvements in stochastic gradient methods, Bayesian optimization), combinatorial optimization
  • Links between learning and modeling, integration of prior knowledge into learning
  • Reproducibility and robust learning
  • Statistical inference and validation
  • Compositionality of deep architectures

From data to knowledge, from data to decision

The growing availability of massive data pushes technical boundaries in many fields. On the one hand, the heterogeneous, semi-structured, incomplete or uncertain nature of the data calls into question the usual statistical models as well as the algorithms used for decision-making. On the other hand, data management raises new operational constraints such as security, integrity and traceability.

In addition, producing knowledge requires building models that provide explainable, statistically valid and computable decisions. Acceptance of the results also requires that confidentiality and loyalty be strengthened.

At the same time, new developments in optimization should make it possible to improve estimation procedures.

Challenges:

  • Heterogeneous, complex, incomplete, semi-structured and/or uncertain data
  • Fast big data: structuring the data to exploit it
  • E-learning, methodology for big data, efficient methods
  • Improved storage, calculation and estimation for data science
  • Modeling interactions between agents (human or artificial) using game theory
  • Multi-scale and multimodal representations and algorithms
  • Theoretical analysis of heuristic methods (complexity theory, information geometry, Markov chain theory)
  • Human-machine coevolution in autonomous systems: conversational agents, cars, social robots

Transparency, responsible AI and ethics

Digital trust is built by implementing ethically responsible methodologies: transparency and accountability of algorithmic systems; regulation of the collection, use and processing of personal data; and reinforcement of regulation through appropriate digital procedures.

Confidentiality by design is a form of regulation that builds the protection of personal data into every stage of collection and processing. Tracing of the tools applied to the data must also be developed, so that models are easier to explain to experts and users alike and algorithmic systems become auditable. The principles of confidentiality, while easy to formulate, require changes to storage and processing infrastructures, with significant legislative, sociological and economic impacts. Transparency techniques for algorithmic systems will be developed with a focus on fairness, loyalty, non-discrimination and accountability-by-construction.

Challenges:

  • Responsibility-by-design, explainability-by-design
  • Transparency-by-design, equity-by-design
  • Audit of algorithmic systems: non-discrimination, loyalty, technical bias, neutrality, equity
  • Measuring trust and digital appropriation
  • "Progressive user-centric analytics" (interactive monitoring of decision systems: data visualization dashboards, human-machine interfaces)
  • Responsibility for information processing and decision making: data usage control and fact-checking
  • Causal discovery, traceability of inferences from source data, interpretability of deep architectures

Data protection, regulation and economy

Businesses involved in the data economy must continually rethink their structure: they need a project-oriented organization that allows rapid changes in resource allocation. The data economy also raises problems of concentration and monopoly.

A small number of companies (GAFAM) hold most of the data. This market concentration can lead to unfair competition, from which innovation in small and medium-sized enterprises is likely to suffer. Citizens expect governments to intervene in the digital economy to prevent excessive concentration and monopoly. Governments must also prevent information leakage in order to preserve state sovereignty and regulatory compliance.

Challenges:

  • "Privacy-by-design", GDPR
  • Privacy-friendly learning ("differential privacy")
  • Development of ethically responsible methodologies and technologies to regulate the collection, use and processing of personal data, and the exploitation of knowledge derived from these data
  • Computer security of data processing chains
  • Security and cryptography: blockchain and trusted third parties