Questions and Answers
What is computer data?
Data is information that computers can interpret and use. It is a set of facts, such as numbers, words, measurements, observations, or even simple descriptions of things.
By nature, how many types of data are there?
- four
- three
- one
- two (correct)
What are the two types of data by nature?
Textual data and multimedia data.
What are the three types of data by structure?
What is a hypertext document?
The notion of hypertext is necessarily tied to the presence of the internet.
What is a hypermedia document?
What is a hypermedia system?
Define structured data.
Define semi-structured data.
Which new technologies have emerged to solve the problem of semi-structured data?
What is the objective of the course?
Flashcards
What are computer data?
Information that can be interpreted and used by computers, including numbers, text, and observations.
What is text-based data?
Data that is written and stored in text format, including texts and numbers.
What is multimedia data?
Data that includes text, images, audio, and video, combining multiple formats.
What is structured data?
What is unstructured data?
What is semi-structured data?
What is a hypertext document?
What is a Hypertext system?
What are Hypermedia Documents?
What is a Hypermedia System?
Tabular structured data?
What are semi-structured data?
Representation of Semi-Structured Data?
What happens during semi-structured data updates?
Benefits of semi-structured data?
Examples of semi-structured data formats?
What is the course objective?
Study Notes
- These notes cover the topic of semi-structured data
- They are divided into sections that first introduce general concepts about data and then cover semi-structured data
- They are intended as an overview of the subject
Computer Data
- Computer data is information that computers can interpret and use
- Data consists of facts such as numbers, words, measurements, observations, or simple descriptions
- It can take the form of numbers, text, images, audio, or video
- Once collected and organized, data forms the basis of a computer system
Data Classification by Nature
- By nature, data falls into two types: textual and multimedia
- Textual data is written and stored in text format, and includes both text and numbers
- Multimedia data combines several formats: text, images, audio, and video
Data Classification by Structure
- By structure, data falls into three types: structured, unstructured, and semi-structured
- Structured data is given a predefined, precise format before it is placed on physical media
- Unstructured data has no defined structure or schema; examples include reports, text files, comments and opinions on social networks, and emails
- Semi-structured data has a format or structure, but that structure is not fixed or rigid
Hypertext
- A hypertext document lets the reader move from one piece of information to another through hyperlinks
- Example: a webpage
- A hypertext system contains documents linked together by hyperlinks
- Following a hyperlink takes the user automatically to another related document
- Hypertext navigation is a non-linear mode of consultation
- Example: a web browser
- Hypertext is not necessarily internet-based
- Examples include browsing local files in a web browser, or following links in a PDF reader
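To illustrate non-linear navigation, a hypertext system can be sketched as a set of documents plus the hyperlinks between them. This is only a toy model: the document names and the `follow` helper are hypothetical, and no network is involved, which matches the point that hypertext does not require the internet.

```python
# Hypothetical hypertext system: document name -> documents it links to.
links = {
    "home": ["intro", "xml"],
    "intro": ["xml", "json"],
    "xml": ["home"],
    "json": [],
}


def follow(start, choices):
    """Return the documents visited when the reader picks link number i at each step."""
    visited = [start]
    current = start
    for i in choices:
        current = links[current][i]
        visited.append(current)
    return visited


# Two readers start on the same page but take different routes.
route_a = follow("home", [0, 1])
route_b = follow("home", [1, 0, 0])
print(route_a)
print(route_b)
```

Each reader chooses their own path through the same set of documents, which is what distinguishes hypertext consultation from reading a document front to back.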
Hypermedia
- A hypermedia document lets users navigate between its sections or to other documents
- Example: a YouTube video
- A hypermedia system is designed to store hypermedia documents and present them to users
- The reader can directly access other related information
- Examples include YouTube and Google Earth
Structured Data
- Structured data uses a predefined and expected format
- It may come from different sources, but the common factor is that its fields are fixed
Structured Data Characteristics
- Data is in table form with rows and columns clearly defining data attributes
- A rigid structure is defined before data is populated
- Data of the same attribute (column) are of the same type
- It is easy to search, process, and analyze
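As a minimal sketch of these characteristics, the example below uses Python's built-in `sqlite3` module. The `student` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical example table; the names are illustrative, not from the course.
conn = sqlite3.connect(":memory:")

# The rigid structure is declared before any data is loaded.
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER NOT NULL)"
)

# Every row supplies the same fixed fields.
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [(1, "Amina", 21), (2, "Karim", 23), (3, "Lina", 20)],
)

# Searching is straightforward because the columns are known in advance.
older_than_20 = conn.execute(
    "SELECT name FROM student WHERE age > 20 ORDER BY name"
).fetchall()
print(older_than_20)
```

Note that SQLite is lenient about column types by default, so it illustrates the fixed table layout better than strict per-column typing; other database engines enforce the declared types more strictly.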
Database Reminder
- Reviews the basics of databases, including the definitions of a (relational) database (French: base de données, BDD(R)) and a (relational) database management system (French: système de gestion de base de données, SGBD(R))
- Covers data modeling of an information system (French: SI) using the entity-relationship model (E/A) or UML, the relational model, schemas, primary and foreign keys, and normalization
- The physical level includes tables, indexes, and SQL
Semi-Structured Data
- Semi-structured data is not captured or formatted in a conventional way: it does not follow a tabular model or fit a relational database, because the dataset has no fixed schema
- Entries from the same data source can have different structures
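The absence of a fixed schema can be illustrated with a few JSON records. This is a sketch using Python's standard `json` module; the records and field names are hypothetical.

```python
import json

# Hypothetical records from a single source; the field names are illustrative.
raw = """
[
  {"name": "Amina", "email": "amina@example.com"},
  {"name": "Karim", "phones": ["0555-0100", "0555-0101"]},
  {"name": "Lina", "address": {"city": "Algiers"}}
]
"""

records = json.loads(raw)

# Each record carries its own structure, so the set of fields varies per entry.
field_sets = [sorted(record.keys()) for record in records]
print(field_sets)

# Code that reads such data has to tolerate missing fields.
emails = [record.get("email") for record in records]
print(emails)
```

All three records come from the same list, yet only `name` is common to them; any consumer has to check for the other fields rather than assume them.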
Semi-Structured Data Representation
- Semi-structured data is represented with the hierarchical model
- Leaves of the model represent the data
- Nodes and links represent the data structure.
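The hierarchical view can be sketched by walking a nested Python structure: dictionaries and lists play the role of nodes and links, and the values at the ends are the leaves. The `walk` helper and the sample document are hypothetical.

```python
def walk(node, path=()):
    """Yield (path, value) for every leaf of a nested dict/list structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        # A leaf: the actual data value.
        yield path, node


# Hypothetical document; names and values are illustrative only.
document = {
    "student": {
        "name": "Amina",
        "courses": [{"title": "XML", "grade": 15}, {"title": "SQL"}],
    }
}

leaves = list(walk(document))
for path, value in leaves:
    print(path, "->", value)
```

Each path records the structure leading to a value, so the structure travels with the data instead of being declared separately.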
Structural Comparisons of Data
- Structured data uses the relational model, with rigid schemas defined before the data is loaded
- Updates do not affect the structure of the data, and the data is easy to find
- Data is represented as a flat table, which makes it difficult to manage missing, multi-valued, or multi-order attributes
- Scaling up a structured database (BDD) schema is very difficult
- Semi-structured data uses hierarchical models, with flexible and extensible schemas defined implicitly in the data
- Updates can change the structure of the data, which makes them complicated
- Complex data is easy to represent, and attributes with varying values are easy to manage
- Scaling is easy compared with structured data
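The difference in handling missing and multi-valued attributes can be sketched with plain Python data. The contact data below is hypothetical, and tuples stand in for rows of a flat table.

```python
# Hypothetical contact data; names and fields are illustrative only.

# Flat, table-like form: every row must fill the same fixed columns.
columns = ("name", "phone_1", "phone_2")
flat_rows = [
    ("Amina", "0555-0100", "0555-0101"),
    ("Karim", None, None),  # missing phones need placeholder values
]

# Hierarchical form: each record holds only the values it actually has.
nested = [
    {"name": "Amina", "phones": ["0555-0100", "0555-0101"]},
    {"name": "Karim"},
]

# Count the placeholders the flat form needs for missing values.
placeholders = sum(value is None for row in flat_rows for value in row)

# Count phone numbers per person in the hierarchical form.
phone_counts = {r["name"]: len(r.get("phones", [])) for r in nested}

print(placeholders)
print(phone_counts)
```

The flat form also caps the number of phones at the number of `phone_*` columns, whereas the nested list can hold any number of values.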
Semi-Structured Data Problems
- The web is an important source of semi-structured data
- The importance and volume of this data have grown with the development of the internet
- New kinds of client data, such as navigation history, cookies, and sessions, need to be stored and manipulated
- Adequate tools are needed to manipulate semi-structured data
- Using a conventional database management system (SGBD) is problematic, because this data is completely different from structured data
- Tools and methods are needed to manage this type of data efficiently
Solutions for Semi-Structured Data
- Technologies introduced to address the problem include XML (eXtensible Markup Language), JSON (JavaScript Object Notation), CSV (Comma-Separated Values), TSV (Tab-Separated Values), the Parquet format, and YAML (YAML Ain't Markup Language, originally Yet Another Markup Language)
Examples of XML and JSON
- The source text shows a sample dataset represented in both XML and JSON
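The sample dataset from the source text is not reproduced in these notes. As a stand-in, the sketch below represents one small, hypothetical dataset in both XML and JSON and reads each with Python's standard library (`xml.etree.ElementTree` and `json`).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical dataset; this is not the sample from the source text.
xml_text = """
<students>
  <student id="1">
    <name>Amina</name>
    <course>XML</course>
    <course>SQL</course>
  </student>
  <student id="2">
    <name>Karim</name>
  </student>
</students>
""".strip()

json_text = """
{"students": [
  {"id": 1, "name": "Amina", "courses": ["XML", "SQL"]},
  {"id": 2, "name": "Karim"}
]}
"""

# XML form: repeated <course> elements model a multi-valued attribute.
root = ET.fromstring(xml_text)
from_xml = {
    s.findtext("name"): [c.text for c in s.findall("course")]
    for s in root.findall("student")
}

# JSON form: a missing "courses" key is treated as an empty list.
from_json = {
    s["name"]: s.get("courses", [])
    for s in json.loads(json_text)["students"]
}

print(from_xml)
print(from_xml == from_json)
```

Both notations carry the same information; the second student simply omits the courses, with no placeholder needed in either format.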
Course Objectives
- The objective of the course is to familiarize students with semi-structured data
- It teaches them to manipulate such data using XML technology
- The module is organized into four chapters: general concepts, the XML core, the XML galaxy, and XML with databases (BDD)