Chapter 11: The semi-structured data model
- Structured data
- XML (http://www.w3.org/XML/)
- Document Type Definitions
- XML Schema
Graphs of Semistructured Data
- Nodes = objects.
- Labels on arcs (like attribute names).
- Atomic values at leaf nodes (nodes with no arcs out).
- Flexibility: no restriction on:
  - Labels out of a node.
  - Number of successors with a given label.
Example: Data Graph
[Figure: a data graph rooted at "root". Arc labels: beer, beer, bar, manf, manf, servedAt, name, name, name, addr, prize, year, award. Leaf values: Bud, A.B., 1995, Gold, Joe's, Maple, M'lob. Annotations: "The beer object for Bud"; "The bar object for Joe's Bar"; "Notice a new kind of data".]
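A data graph like this one can be stored as a list of labeled arcs. The following is a minimal Python sketch, not taken from the slides. The node identifiers (`b1`, `b2`, `bar1`, `p1`) and the helpers `build_index`, `successors`, and `is_leaf` are hypothetical, and the wiring of arcs to nodes is an assumption reconstructed from the arc labels and leaf values of the example.

```python
from collections import defaultdict

# Each arc is (source node, label, target).
# A target is either another node id or an atomic value.
# ASSUMPTION: node ids and the arc wiring are illustrative guesses.
ARCS = [
    ("root", "beer", "b1"),
    ("root", "beer", "b2"),      # same label twice: allowed
    ("root", "bar", "bar1"),
    ("b1", "name", "Bud"),
    ("b1", "manf", "A.B."),
    ("b1", "prize", "p1"),       # only one beer has a prize
    ("b1", "servedAt", "bar1"),
    ("p1", "year", 1995),
    ("p1", "award", "Gold"),
    ("b2", "name", "M'lob"),
    ("b2", "manf", "A.B."),
    ("bar1", "name", "Joe's"),
    ("bar1", "addr", "Maple"),
]


def build_index(arcs):
    """Group arcs by source node, then by label."""
    index = defaultdict(lambda: defaultdict(list))
    for source, label, target in arcs:
        index[source][label].append(target)
    return index


def successors(index, node, label):
    """Return every target reached from node by an arc with this label."""
    if node not in index:
        return []
    return list(index[node].get(label, []))


def is_leaf(index, target):
    """A leaf has no arcs out, so it never appears as a source."""
    return target not in index


INDEX = build_index(ARCS)
BEERS = successors(INDEX, "root", "beer")
BEER_NAMES = [successors(INDEX, b, "name")[0] for b in BEERS]
```

Nothing forces the two beer objects to carry the same labels: asking for the `prize` of `b2` simply returns an empty list, which is the flexibility the model allows.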
XML
- XML = Extensible Markup Language.
- While HTML uses tags for formatting (e.g., "italic"), XML uses tags for semantics (e.g., "this is an address").
- Key idea: create tag sets for a domain, and translate all data into properly tagged XML documents.
XML: Motivation
- Data interchange is critical in today's networked world. Examples:
  - Banking: funds transfer
  - Order processing (especially inter-company orders)
  - Scientific data
    - Chemistry: ChemML, …
    - Genetics: BSML (Bio-Sequence Markup Language), …
- Paper flow of information between organizations is being replaced by electronic flow of information.
- Each application area has its own set of standards for representing information.
- XML has become the basis for all new-generation data interchange formats.
XML: Motivation (Cont.)
- Earlier-generation formats were based on plain text with line headers indicating the meaning of fields.
  - Similar in concept to email headers.
  - They do not allow for nested structures and have no standard "type" language.
- Each XML-based standard defines what the valid elements are, using:
  - XML type specification languages to specify the syntax:
    - DTD (Document Type Definition)
    - XML Schema
  - Plus textual descriptions of the semantics.
- XML allows new tags to be defined as required.
  - However, this may be constrained by DTDs.
- A wide variety of tools is available for parsing, browsing, and querying XML documents/data (next chapter).
Comparison with Relational Data
- Unlike relational tuples, XML data is self-documenting due to the presence of tags.
- Non-rigid format: tags can be added.
- Allows nested structures.
- Wide acceptance, not only in database systems, but also in browsers, tools, and applications.
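To make "self-documenting" concrete, the sketch below turns one relational tuple into an XML element using Python's standard `xml.etree.ElementTree` module. The relation name `BAR`, the column names `NAME` and `ADDR`, the sample row, and the helper `tuple_to_xml` are assumptions chosen for illustration, not part of the slides.

```python
import xml.etree.ElementTree as ET


def tuple_to_xml(relation, columns, row):
    """Wrap each value of a tuple in a tag named after its column."""
    element = ET.Element(relation)
    for column, value in zip(columns, row):
        ET.SubElement(element, column).text = str(value)
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(element, encoding="unicode")


# ASSUMPTION: a hypothetical Bars(name, addr) relation with one row.
BAR_ROW = ("Joe's Bar", "Maple")
BAR_XML = tuple_to_xml("BAR", ["NAME", "ADDR"], BAR_ROW)
```

The bare tuple only makes sense next to its schema, while the XML form carries the column names along with every value.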
Well-Formed and Valid XML
- Well-Formed XML allows you to invent your own tags.
- Valid XML conforms to a certain DTD or XML Schema.
- From strict structure to loose structure: relational database, then valid XML, then well-formed XML.
Well-Formed XML
- Start the document with a declaration, surrounded by <?xml … ?>.
- Normal declaration is: <?xml version="1.0" standalone="yes"?>
  - "standalone" = "no DTD provided."
- Balance of document is a root tag surrounding nested tags.
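The shape described here, a declaration followed by one root tag that surrounds nested tags, can be sketched with Python's standard `xml.etree.ElementTree` parser. The tag names (`BARS`, `BAR`, `NAME`, `BEER`), the sample content, and the helper `outline` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a made-up document; tag names are the author's own invention,
# which well-formed XML permits.
DOC = """<?xml version="1.0" standalone="yes"?>
<BARS>
  <BAR>
    <NAME>Joe's Bar</NAME>
    <BEER><NAME>Bud</NAME></BEER>
  </BAR>
</BARS>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and everything nested in it."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


ROOT = ET.fromstring(DOC)   # the single root element, here BARS
OUTLINE = outline(ROOT)
```

Here `OUTLINE` lists `BARS` at depth 0 with every other tag nested beneath it, which mirrors the root-plus-nesting rule.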
Tags
- Tags are normally matched pairs, as <FOO> … </FOO>.
- Unmatched tags also allowed, as <FOO/>.
- Tags may be nested arbitrarily.
- XML tags are case-sensitive.
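These tag rules can be tried out with a small well-formedness check built on Python's standard `xml.etree.ElementTree` parser, which raises `ParseError` on malformed input. The tag names and the helper `parses` are placeholders assumed for this sketch. The check covers well-formedness only, not validity against a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# ASSUMPTION: FOO, BAR, A, B are placeholder tag names.
MATCHED = "<FOO><BAR>text</BAR></FOO>"      # matched pairs, nested
EMPTY = "<FOO><BAR/></FOO>"                 # a single tag with no partner
WRONG_CASE = "<FOO><BAR>text</bar></FOO>"   # BAR and bar are different tags
OVERLAP = "<FOO><A><B></A></B></FOO>"       # tags must nest, not overlap

RESULTS = {
    "matched": parses(MATCHED),
    "empty": parses(EMPTY),
    "wrong_case": parses(WRONG_CASE),
    "overlap": parses(OVERLAP),
}
```

The first two inputs parse, while the case mismatch and the overlapping pair are rejected.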